HunyuanVideo 1.5

Name: HunyuanVideo 1.5
Author: Tencent

Stable

Consumer-oriented successor to HunyuanVideo 13B from Tencent. 8.3B parameter 3D DiT supporting both text-to-video and image-to-video (T2V + I2V). Step-distilled variant runs 480p at ~75s on RTX 4090; minimum ~14GB VRAM with offload in FP16.

8.3B params: lighter and more consumer-accessible than the 13B predecessor
Text-to-video and image-to-video (T2V + I2V)
Step-distilled variant: ~75s for 480p on RTX 4090
Minimum ~14GB VRAM with offload in FP16

Parameters: 8.3B

Max Resolution: 1280×720

Max Frames: 129

FPS: 24

Architecture: 3D DiT

License: tencent-hunyuan-community

Image Quality Benchmarks

Measured quality metrics for HunyuanVideo 1.5 outputs.

Human Preference Score: 84%

How often humans prefer this model's output (0-100%)

Aesthetic Score: 7.4

Visual quality and composition rating (5-9 scale)

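At FP16 without offloading, basic video generation (25 frames at 768×512) needs an estimated 39.2 GB of VRAM. A GPU with 24 GB or more is recommended, which in practice means running at FP8 or with CPU offload.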

VRAM by Scenario

VRAM estimates at FP16 and FP8 precision. In the scenarios below, FP8 needs about 16 GB less than FP16 (34-43% less memory) with minimal quality loss. Grade shows how well each GPU handles the generation workload.

FP16 (half precision, unquantized)

Scenario	VRAM	RTX 4090 24GB	RTX 3060 12GB	RTX 4060 8GB	MacBook Pro M4 Pro 24GB
512×512 · 25 frames	37.1 GB	F	F	F	F
768×512 · 25 frames	39.2 GB	F	F	F	F
768×512 · 100 frames	45.5 GB	F	F	F	F
1280×720 · 25 frames	47.6 GB	F	F	F	F

FP8 (quantized, about 16 GB less VRAM than FP16)

Scenario	VRAM	RTX 4090 24GB	RTX 3060 12GB	RTX 4060 8GB	MacBook Pro M4 Pro 24GB
512×512 · 25 frames	21.0 GB	A	F	F	D
768×512 · 25 frames	23.1 GB	B	F	F	D
768×512 · 100 frames	29.4 GB	D	F	F	F
1280×720 · 25 frames	31.6 GB	D	F	F	F
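The FP8 saving can be checked directly against the two tables above. The following is a minimal sketch that uses only the listed VRAM estimates; the names `SCENARIOS`, `fp8_savings`, and `fits` are illustrative and not part of any tool.

```python
# VRAM estimates in GB as (FP16, FP8), copied from the two tables above.
SCENARIOS = {
    "512x512, 25 frames": (37.1, 21.0),
    "768x512, 25 frames": (39.2, 23.1),
    "768x512, 100 frames": (45.5, 29.4),
    "1280x720, 25 frames": (47.6, 31.6),
}


def fp8_savings(scenarios):
    """Return {scenario: (GB saved, percent saved)} for FP8 relative to FP16."""
    result = {}
    for name, (fp16_gb, fp8_gb) in scenarios.items():
        saved = fp16_gb - fp8_gb
        result[name] = (round(saved, 1), round(100 * saved / fp16_gb, 1))
    return result


def fits(scenarios, vram_gb, precision="fp8"):
    """List the scenarios whose estimate is at or below vram_gb."""
    idx = 0 if precision == "fp16" else 1
    return [name for name, est in scenarios.items() if est[idx] <= vram_gb]


if __name__ == "__main__":
    for name, (gb, pct) in fp8_savings(SCENARIOS).items():
        print(f"{name}: {gb} GB saved ({pct}%)")
    print("Fits in 24 GB at FP8:", fits(SCENARIOS, 24.0))
    print("Fits in 24 GB at FP16:", fits(SCENARIOS, 24.0, "fp16"))
```

A fit here only means the estimate is at or below the VRAM budget. The letter grades in the tables are not derived from this check.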

Optimization Tips

Turbo / LCM distillation

Use the step-distilled variant at 4-8 inference steps for faster iteration

Run with Python (diffusers)

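from diffusers import HunyuanVideoPipeline
from diffusers.utils import export_to_video
import torch

# Pipeline class and repo id are kept as listed on this page; confirm both
# against the diffusers documentation for the 1.5 checkpoint before running.
pipe = HunyuanVideoPipeline.from_pretrained(
    "tencent/HunyuanVideo-1.5",
    torch_dtype=torch.float16,
)
# Offload idle components to the CPU instead of calling pipe.to("cuda"):
# the FP16 estimates on this page exceed the VRAM of a 24 GB card.
pipe.enable_model_cpu_offload()

frames = pipe(
    prompt="your prompt here",
    num_inference_steps=50,
    guidance_scale=6.0,
    num_frames=129,
).frames[0]

# Write the frames to an MP4 at the model's 24 FPS
export_to_video(frames, "output.mp4", fps=24)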

Get started

Setup instructions for running HunyuanVideo 1.5 locally

1. Download the model

Get the checkpoint from HuggingFace

2. Place in:

ComfyUI/models/checkpoints/

3. Launch ComfyUI

python main.py

Note: Video generation requires a video output node. SaveAnimatedWEBP ships with ComfyUI; for VHS_VideoCombine, install ComfyUI-VideoHelperSuite from the ComfyUI Manager.
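Before launching, it can help to confirm that the download landed in the folder named in step 2. The sketch below uses only the standard library; `find_checkpoints` is a hypothetical helper, and the `ComfyUI` root path is an assumption about where the install lives.

```python
from pathlib import Path


def find_checkpoints(comfy_root):
    """Return sorted .safetensors file names in <comfy_root>/models/checkpoints."""
    ckpt_dir = Path(comfy_root) / "models" / "checkpoints"
    if not ckpt_dir.is_dir():
        return []
    return sorted(p.name for p in ckpt_dir.glob("*.safetensors"))


if __name__ == "__main__":
    # Assumes the script runs from the directory that contains ComfyUI/.
    found = find_checkpoints("ComfyUI")
    if found:
        print("Checkpoints found:", ", ".join(found))
    else:
        print("No .safetensors files in ComfyUI/models/checkpoints/")
```

An empty result means either the folder does not exist or no safetensors file has been placed in it yet.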

Memory Breakdown

VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB

Required: 39.2 GB
Available: 24.0 GB

Weights: 16.6 GB

VAE: 0.2 GB

Text Encoder: 14.0 GB

Activations: 6.0 GB

Overhead: 0.5 GB
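The breakdown can be sanity-checked with simple arithmetic. The sketch below assumes 2 bytes per parameter at FP16 and decimal gigabytes, which reproduces the 16.6 GB weights figure from the 8.3B parameter count. It also shows that the five listed components add up to 37.3 GB, so 1.9 GB of the 39.2 GB total is not itemized on this page. All names are illustrative.

```python
# Component estimates in GB, copied from the breakdown above.
BREAKDOWN_GB = {
    "weights": 16.6,
    "vae": 0.2,
    "text_encoder": 14.0,
    "activations": 6.0,
    "overhead": 0.5,
}
REQUIRED_GB = 39.2   # "Required" figure from the page
AVAILABLE_GB = 24.0  # RTX 4090


def weights_gb(params_billion, bytes_per_param):
    """Weight memory in decimal GB: billions of parameters times bytes each."""
    return round(params_billion * bytes_per_param, 1)


itemized_gb = round(sum(BREAKDOWN_GB.values()), 1)
unitemized_gb = round(REQUIRED_GB - itemized_gb, 1)
shortfall_gb = round(REQUIRED_GB - AVAILABLE_GB, 1)

if __name__ == "__main__":
    print("FP16 weights from 8.3B params:", weights_gb(8.3, 2), "GB")
    print("Sum of listed components:", itemized_gb, "GB")
    print("Not itemized:", unitemized_gb, "GB")
    print("Shortfall on a 24 GB card:", shortfall_gb, "GB")
```

The weights and text encoder together account for 30.6 GB, which is why precision and offloading matter more here than resolution or frame count.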

Estimated Generation Time

25 frames at 768×512, 30 steps, FP16.

RTX 4090 24GB: ~2m 33s

RTX 3060 12GB: ~9m 40s

RTX 4060 8GB: ~14m 33s

MacBook Pro M4 Pro 24GB: ~20m 43s
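These estimates can be turned into a rough per-step cost and rescaled to the 4-8 step range suggested under Optimization Tips. The sketch below assumes generation time is proportional to the step count. That is a simplification: text encoding and VAE decoding do not shrink with fewer steps, and a distilled checkpoint may not have the same per-step cost. The function names are illustrative.

```python
import re

# Estimated times from the list above (25 frames, 768x512, 30 steps, FP16).
ESTIMATES = {
    "RTX 4090 24GB": "2m 33s",
    "RTX 3060 12GB": "9m 40s",
    "RTX 4060 8GB": "14m 33s",
    "MacBook Pro M4 Pro 24GB": "20m 43s",
}
BASELINE_STEPS = 30


def to_seconds(text):
    """Convert a duration such as '2m 33s' to seconds."""
    match = re.fullmatch(r"(\d+)m (\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


def scaled_estimate(text, steps, baseline_steps=BASELINE_STEPS):
    """ASSUMPTION: time is proportional to step count, with no fixed overhead."""
    return to_seconds(text) * steps / baseline_steps


if __name__ == "__main__":
    for gpu, text in ESTIMATES.items():
        per_step = to_seconds(text) / BASELINE_STEPS
        at_8 = scaled_estimate(text, 8)
        print(f"{gpu}: {per_step:.1f} s/step, about {at_8:.0f} s at 8 steps")
```

Treat the rescaled numbers as a lower bound on wall-clock time rather than a prediction.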

Available Formats & Downloads

Download HunyuanVideo 1.5 in different precisions. Lower precision = less VRAM but slight quality loss.

Format	Precision	Size	Provider
safetensors (recommended)	FP16	16.0 GB	official

LoRA Ecosystem

growing

Early LoRA ecosystem following on from the HunyuanVideo community.

Related Workflows

Image-to-Video

SkyReels V2 14B · 14B · Skywork
Mochi 1 Preview · 10B · Genmo
Wan2.2 TI2V 5B · 5B · Wan-AI

Frequently asked questions

How much VRAM does HunyuanVideo 1.5 need for video?

HunyuanVideo 1.5 (8.3B parameters) requires approximately 39.2 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.

Can I run HunyuanVideo 1.5 on RTX 4090?

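At FP16, every listed scenario exceeds the RTX 4090's 24 GB of VRAM. At FP8, the 25-frame scenarios at 512×512 (21.0 GB) and 768×512 (23.1 GB) fit, and CPU offload lowers the minimum further (about 14 GB). Otherwise, consider reducing resolution or frame count, or using a GPU with more VRAM.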

How long does it take to generate a video with HunyuanVideo 1.5?

On a reference GPU (RTX 4090 24GB), HunyuanVideo 1.5 is estimated to generate a 25-frame video at 768×512 in approximately 2m 33s at FP16 with 30 inference steps. Faster GPUs with higher memory bandwidth will reduce generation time.

What resolution and frame count does HunyuanVideo 1.5 support?

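HunyuanVideo 1.5 supports up to 1280×720 resolution and 129 frames per generation at 24 FPS. Higher resolutions and frame counts require more VRAM, though less than proportionally: at FP16 and 768×512, going from 25 to 100 frames raises the estimate from 39.2 GB to 45.5 GB.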
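The clip length implied by these limits follows from dividing the frame count by the frame rate. A small sketch, with `clip_seconds` as an illustrative helper name:

```python
def clip_seconds(num_frames, fps=24):
    """Playback duration in seconds for num_frames shown at fps frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    # Frame counts that appear on this page, played back at 24 FPS.
    for frames in (25, 100, 129):
        print(f"{frames} frames: {clip_seconds(frames):.2f} s")
```

At the 129-frame maximum, a single generation covers a little over five seconds of video.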

About HunyuanVideo 1.5

Use cases

video-generation, text-to-video, image-to-video

Recommended runtimes

comfyui, diffusers