FramePack I2V
Frontier · by lllyasviel
Viral low-VRAM video generation model based on HunyuanVideo architecture. Uses a novel next-frame prediction approach that inverts the diffusion process to pack future frames into the noise of the current frame, enabling video generation with only 6GB VRAM. Image-to-video with strong motion quality.
- Generates AI video with only 6GB VRAM
- Based on HunyuanVideo architecture optimized for low VRAM
- Novel next-frame prediction packs future frames into noise
- Image-to-video with strong temporal coherence
- Apache 2.0 licensed — fully open source
Image Quality Benchmarks
Measured quality metrics for FramePack I2V outputs.
How often humans prefer this model's output (0-100%)
Visual quality and composition rating (5-9 scale)
VRAM by Scenario
VRAM estimates at FP16 and FP8 precision. FP8 uses ~40% less memory with minimal quality loss. Grade shows how well each GPU handles the generation workload.
FP16 (full precision)
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 6.4 GB | S | S | S | F |
| 768×512 · 25 frames | 6.7 GB | S | S | A | F |
| 768×512 · 100 frames | 7.6 GB | S | S | B | F |
| 1280×720 · 25 frames | 7.8 GB | S | S | B | F |
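To check a card that is not listed, the FP16 estimates above can be compared against its available VRAM. The sketch below is a minimal illustration: the dictionary copies the four FP16 rows, while `headroom_gb` and `fits` are hypothetical helper names and the optional `margin_gb` parameter is an assumption. It does not reproduce the S/A/B/F grades, whose thresholds are not stated here.

```python
# FP16 VRAM estimates copied from the table above: (width, height, frames) -> GB.
FP16_VRAM_GB = {
    (512, 512, 25): 6.4,
    (768, 512, 25): 6.7,
    (768, 512, 100): 7.6,
    (1280, 720, 25): 7.8,
}


def headroom_gb(gpu_vram_gb, scenario):
    """Return the VRAM (GB) left over after the estimated workload."""
    return round(gpu_vram_gb - FP16_VRAM_GB[scenario], 1)


def fits(gpu_vram_gb, scenario, margin_gb=0.0):
    """True if the estimate, plus an optional assumed margin, fits on the GPU."""
    return headroom_gb(gpu_vram_gb, scenario) >= margin_gb


if __name__ == "__main__":
    # Example: an 8 GB card against every scenario in the table.
    for scenario in FP16_VRAM_GB:
        width, height, frames = scenario
        print(f"{width}x{height}, {frames} frames: "
              f"headroom {headroom_gb(8, scenario)} GB, fits={fits(8, scenario)}")
```

Passing a non-zero `margin_gb` is a simple way to leave room for the desktop compositor or other processes sharing the GPU.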
FP8 (quantized — ~40% less VRAM)
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 25.9 GB | B | F | F | F |
| 768×512 · 25 frames | 28.0 GB | D | F | F | F |
| 768×512 · 100 frames | 34.3 GB | F | F | F | F |
| 1280×720 · 25 frames | 36.5 GB | F | F | F | F |
Optimization Tips
Turbo / LCM distillation
Use distilled scheduler at 4-8 steps for faster iteration
Run with Python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "lllyasviel/FramePackI2V_HY",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

frames = pipe(
    prompt="your prompt here",
    num_inference_steps=25,
    guidance_scale=7.5,
    num_frames=129,
).frames[0]
# Save frames or export as video

Get started
Setup instructions for running FramePack I2V locally
1. Download the model
Get the checkpoint from HuggingFace
2. Place in:
ComfyUI/models/checkpoints/
3. Launch ComfyUI
python main.py
Memory Breakdown
VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB
Estimated Generation Time
25 frames at 768×512, 30 steps, FP16.
Available Formats & Downloads
Download FramePack I2V in different precisions. Lower precision = less VRAM but slight quality loss.
| Format | Precision | Size | Provider |
|---|---|---|---|
| safetensors (official weights, recommended) | FP16 | 26.0 GB | official |
| safetensors (community conversion) | FP8 | 13.0 GB | community |
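The file sizes in the table follow from the parameter count. As a rough check, assuming 13B parameters (the figure quoted in the FAQ), 2 bytes per FP16 weight, 1 byte per FP8 weight, and decimal gigabytes, the sketch below reproduces the 26.0 GB and 13.0 GB sizes. `weight_size_gb` is an illustrative helper, not part of any library.

```python
# Bytes used to store one weight at each precision.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_size_gb(num_params, precision):
    """Approximate checkpoint size in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    params = 13e9  # assumed: 13B parameters
    print(weight_size_gb(params, "fp16"))  # 26.0
    print(weight_size_gb(params, "fp8"))   # 13.0
```

This covers weights only; activations and any text or image encoders loaded alongside the model add to the total at run time.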
LoRA Ecosystem
Limited. Very new model; the LoRA ecosystem is still emerging.
Frequently asked questions: FramePack I2V
How much VRAM does FramePack I2V need for video?
FramePack I2V (13B parameters) requires approximately 6.7 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.
Can I run FramePack I2V on RTX 4090?
Yes, the RTX 4090 (24 GB VRAM) can run FramePack I2V at FP16. Expected generation time is about 3m 3s for a 25-frame clip.
How long does it take to generate a video with FramePack I2V?
On a reference GPU (RTX 4090 24GB), FramePack I2V generates a 25-frame video at 768×512 in approximately 3m 3s at FP16 with 30 inference steps. Faster GPUs with higher memory bandwidth will reduce generation time.
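The reference timing can be broken down per frame and per step. The sketch below does that arithmetic for the quoted 3m 3s run (25 frames, 30 steps). The helper names are hypothetical, and the estimate for other step counts assumes run time scales linearly with the number of steps, which ignores fixed costs such as model loading and VAE decoding.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes-and-seconds duration to total seconds."""
    return minutes * 60 + seconds


# Reference run quoted above: 3m 3s for 25 frames at 30 steps.
REFERENCE_SECONDS = to_seconds(3, 3)
REFERENCE_FRAMES = 25
REFERENCE_STEPS = 30


def seconds_per_frame():
    return REFERENCE_SECONDS / REFERENCE_FRAMES


def seconds_per_step():
    return REFERENCE_SECONDS / REFERENCE_STEPS


def estimate_seconds(steps):
    """Rough run time for a different step count (assumes linear scaling)."""
    return steps * seconds_per_step()


if __name__ == "__main__":
    print(f"total: {REFERENCE_SECONDS} s")
    print(f"per frame: {seconds_per_frame():.2f} s")
    print(f"per step: {seconds_per_step():.2f} s")
    print(f"estimated at 8 steps: {estimate_seconds(8):.1f} s")
```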
What resolution and frame count does FramePack I2V support?
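FramePack I2V supports up to 1280×720 resolution and 129 frames per generation at 30 FPS. Higher resolutions and frame counts require more VRAM, although the FP16 estimates show the increase is modest: 6.7 GB for 25 frames versus 7.6 GB for 100 frames at 768×512.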
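Frame count and frame rate together determine clip length. As a quick illustration (the function name is hypothetical), the sketch below converts the 129-frame maximum and the 25-frame benchmark clip into seconds at 30 FPS.

```python
def clip_seconds(num_frames, fps=30):
    """Playback duration in seconds for a clip of num_frames at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    print(f"129 frames: {clip_seconds(129):.2f} s")
    print(f"25 frames: {clip_seconds(25):.2f} s")
```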