HunyuanVideo
Stableby Tencent
Large-scale video generation model from Tencent. 13B parameter 3D DiT with Hunyuan-Large MLLM text encoder (~7B, not T5-based). Strong motion quality and visual fidelity up to 720p.
- 13B params for high visual fidelity
- Up to 720p resolution
- Strong motion quality and temporal coherence
- Up to 129 frames per generation
Your hardware
Detecting...
Image Quality Benchmarks
Measured quality metrics for HunyuanVideo outputs.
How often humans prefer this model's output (0-100%)
Visual quality and composition rating (5-9 scale)
VRAM by Scenario
VRAM estimates at FP16 and FP8 precision. FP8 uses ~40% less memory with minimal quality loss. Grade shows how well each GPU handles the generation workload.
FP16 (full precision)
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 20.9 GB | A● | F● | F● | F● |
| 768×512 · 25 frames | 21.9 GB | A● | F● | F● | F● |
| 768×512 · 100 frames | 24.7 GB | B● | F● | F● | F● |
| 1280×720 · 25 frames | 25.7 GB | B● | F● | F● | F● |
FP8 (quantized — ~40% less VRAM)
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 34.8 GB | F● | F● | F● | F● |
| 768×512 · 25 frames | 37.7 GB | F● | F● | F● | F● |
| 768×512 · 100 frames | 46.1 GB | F● | F● | F● | F● |
| 1280×720 · 25 frames | 49.0 GB | F● | F● | F● | F● |
Optimization Tips
Turbo / LCM distillation
Use distilled scheduler at 4-8 steps for faster iteration
Run with Python
from diffusers import HunyuanVideoPipeline
import torch
pipe = HunyuanVideoPipeline.from_pretrained(
"tencent/HunyuanVideo",
torch_dtype=torch.float16
)
pipe.to("cuda")
frames = pipe(
prompt="your prompt here",
num_inference_steps=50,
guidance_scale=6.0,
num_frames=129,
).frames[0]
# Save frames or export as videoGet started
Setup instructions for running HunyuanVideo locally
1. Download the model
Get the checkpoint from HuggingFace
2. Place in:
ComfyUI/models/checkpoints/3. Launch ComfyUI
python main.pyMemory Breakdown
VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB
Estimated Generation Time
25 frames at 768×512, 30 steps, FP16.
Sample Outputs
Available Formats & Downloads
Download HunyuanVideo in different precisions. Lower precision = less VRAM but slight quality loss.
| 格式 | 精度 | 大小 | 提供商 | |
|---|---|---|---|---|
| safetensors推荐 | FP16 | 26.0 GB | official | 下载 |
LoRA Ecosystem
Growing EcosystemGrowing LoRA ecosystem with character and style LoRAs.
Related Workflows
You might also like
Frequently asked questions
FAQ — HunyuanVideo
How much VRAM does HunyuanVideo need for video?
HunyuanVideo (13B parameters) requires approximately 21.9 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.
Can I run HunyuanVideo on RTX 4090?
Yes, the RTX 4090 (24 GB VRAM) can run HunyuanVideo at FP16. Expected generation time is around ~3m 3s for a 25-frame clip.
How long does it take to generate a video with HunyuanVideo?
On a reference GPU (RTX 4090 24GB), HunyuanVideo generates a 25-frame video at 768×512 in approximately ~3m 3s at FP16 with 30 inference steps. Faster GPUs with higher memory bandwidth will reduce generation time.
What resolution and frame count does HunyuanVideo support?
HunyuanVideo supports up to 1280×720 resolution and 129 frames per generation at 24 FPS. Higher resolutions and frame counts require proportionally more VRAM.
About HunyuanVideo
See also