Mochi 1 Preview
Stable · by Genmo
A 10B-parameter video generation model from Genmo, built on the AsymmDiT architecture with a T5-XXL text encoder. It generates 848×480 video at 30 fps with strong motion quality and is released under the Apache 2.0 license.
- 10B AsymmDiT — strong motion quality
- 848x480 at 30fps
- Apache 2.0 — fully open for commercial use
- ~22GB VRAM with model offloading
Image Quality Benchmarks
Measured quality metrics for Mochi 1 Preview outputs.
How often humans prefer this model's output (0-100%)
Visual quality and composition rating (5-9 scale)
VRAM by Scenario
VRAM estimates at FP16 and FP8 precision. In the scenarios below, FP8 needs about 15 GB less memory than FP16 (roughly 33% to 43% less), with minimal quality loss. The grade shows how well each GPU handles the generation workload.
FP16 (half precision, unquantized baseline)
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 35.8 GB | F | F | F | F |
| 768×512 · 25 frames | 37.9 GB | F | F | F | F |
| 768×512 · 100 frames | 44.2 GB | F | F | F | F |
| 1280×720 · 25 frames | 46.4 GB | F | F | F | F |
FP8 (quantized — ~40% less VRAM)
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 20.4 GB | A | F | F | D |
| 768×512 · 25 frames | 22.5 GB | B | F | F | D |
| 768×512 · 100 frames | 28.8 GB | D | F | F | F |
| 1280×720 · 25 frames | 30.9 GB | D | F | F | F |
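The fit check behind these tables can be approximated in a few lines. The sketch below copies the VRAM estimates from the two tables and reports, for a given GPU capacity, whether a scenario fits and how much FP8 saves relative to FP16. The names `VRAM_GB`, `fits`, and `fp8_savings` are illustrative, and the page's letter grades are not reproduced because the grading formula is not stated here.

```python
# VRAM estimates (GB) copied from the FP16 and FP8 tables above.
VRAM_GB = {
    "512x512 @ 25f": {"fp16": 35.8, "fp8": 20.4},
    "768x512 @ 25f": {"fp16": 37.9, "fp8": 22.5},
    "768x512 @ 100f": {"fp16": 44.2, "fp8": 28.8},
    "1280x720 @ 25f": {"fp16": 46.4, "fp8": 30.9},
}


def fits(scenario, precision, gpu_vram_gb):
    """Return True if the estimated VRAM is within the GPU's capacity."""
    return VRAM_GB[scenario][precision] <= gpu_vram_gb


def fp8_savings(scenario):
    """Return the fraction of FP16 VRAM saved by switching to FP8."""
    est = VRAM_GB[scenario]
    return (est["fp16"] - est["fp8"]) / est["fp16"]


if __name__ == "__main__":
    # Example: check every scenario against a 24 GB card at FP8.
    for name in VRAM_GB:
        print(name, fits(name, "fp8", 24.0), f"{fp8_savings(name):.0%}")
```

On a 24 GB card this reports the 512×512 and 768×512 25-frame scenarios as fitting at FP8 and nothing as fitting at FP16. It compares raw capacity only and ignores memory used by the operating system or other processes.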
Optimization Tips
Turbo / LCM distillation
Use a distilled scheduler with 4-8 steps for faster iteration
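A separate, memory-side option is the model offloading mentioned in the summary (~22GB VRAM). The following is a minimal sketch of one way to enable it, assuming the diffusers `MochiPipeline` together with its `enable_model_cpu_offload()` and `enable_vae_tiling()` helpers. The function name `generate_with_offload` and the output path are hypothetical, and the sketch makes no promise of hitting the ~22GB figure on any particular machine.

```python
def generate_with_offload(prompt, output_path="mochi.mp4"):
    """Generate a clip with CPU offloading and VAE tiling to lower peak VRAM."""
    # Imports are local so this module loads even without a GPU stack installed.
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video

    # Same checkpoint and dtype as the basic example on this page.
    pipe = MochiPipeline.from_pretrained(
        "genmo/mochi-1-preview",
        torch_dtype=torch.float16,
    )
    # Keep sub-models on the CPU and move each to the GPU only while it runs
    # (this path requires the accelerate package).
    pipe.enable_model_cpu_offload()
    # Decode the video latents in tiles instead of all at once.
    pipe.enable_vae_tiling()

    frames = pipe(prompt=prompt, num_frames=84).frames[0]
    export_to_video(frames, output_path, fps=30)
    return output_path
```

Offloading trades speed for memory: each sub-model is copied to the GPU when needed, so generation is typically slower than keeping the whole pipeline resident with `pipe.to("cuda")`.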
Run with Python
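import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

# Load the pipeline weights in 16-bit precision.
pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Run the denoising loop and take the first (only) video in the batch.
frames = pipe(
    prompt="your prompt here",
    num_inference_steps=64,
    guidance_scale=4.5,
    num_frames=84,
).frames[0]

# Export the frames as a 30 fps video file.
export_to_video(frames, "mochi.mp4", fps=30)

Get started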
Setup instructions for running Mochi 1 Preview locally
1. Download the model
Get the checkpoint from HuggingFace
2. Place in:
ComfyUI/models/checkpoints/
3. Launch ComfyUI
python main.py
Memory Breakdown
VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB
Estimated Generation Time
25 frames at 768×512, 30 steps, FP16.
Available Formats & Downloads
Download Mochi 1 Preview in different precisions. Lower precision uses less VRAM at the cost of a slight quality loss.
| Format | Precision | Size | Provider | |
|---|---|---|---|---|
| safetensors (recommended) | FP16 | 20.0 GB | official | Download |
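The 20.0 GB size is consistent with a simple back-of-the-envelope estimate: parameter count times bytes per parameter. The sketch below applies that estimate to the 10B figure quoted on this page. The helper `weight_size_gb` is illustrative, it uses decimal gigabytes, and it covers weights only, so it is not a substitute for the VRAM tables, which are higher.

```python
def weight_size_gb(num_params, bytes_per_param):
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Storage cost per parameter for common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}
MOCHI_PARAMS = 10e9  # 10B parameters, as stated on this page

if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        print(precision, weight_size_gb(MOCHI_PARAMS, nbytes), "GB")
```

By the same estimate, an FP8 copy of the weights would be about 10 GB, although only the FP16 file is listed above.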
Frequently asked questions
How much VRAM does Mochi 1 Preview need for video?
Mochi 1 Preview (10B parameters) requires approximately 37.9 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.
Can I run Mochi 1 Preview on RTX 4090?
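At FP16, Mochi 1 Preview exceeds the RTX 4090's 24 GB of VRAM in every scenario listed (35.8 GB and up). At FP8, the 512×512 and 768×512 25-frame scenarios fit (20.4 GB and 22.5 GB). Otherwise, consider reducing resolution or frame count, or using a GPU with more VRAM.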
How long does it take to generate a video with Mochi 1 Preview?
On a reference GPU (RTX 4090 24GB), Mochi 1 Preview generates a 25-frame video at 768×512 in approximately 2m 45s at FP16 with 30 inference steps. Faster GPUs with higher memory bandwidth will reduce generation time.
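To get a rough feel for other step counts, the reference time can be scaled linearly. This is only a sketch under a stated assumption: that time grows in proportion to the number of inference steps, which ignores fixed costs such as text encoding and VAE decoding, and it applies only to the reference scenario (25 frames at 768×512). The helpers `estimate_seconds` and `fmt` are illustrative.

```python
def estimate_seconds(steps, ref_seconds=165.0, ref_steps=30):
    """Scale the reference time (2m 45s at 30 steps) linearly with step count."""
    return ref_seconds / ref_steps * steps


def fmt(seconds):
    """Format a duration in seconds as 'Xm YYs'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs:02d}s"


if __name__ == "__main__":
    for steps in (30, 64):
        print(steps, "steps:", fmt(estimate_seconds(steps)))
```

The reference works out to 5.5 seconds per step, so 64 steps in the same scenario would take about 5m 52s under this assumption.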
What resolution and frame count does Mochi 1 Preview support?
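Mochi 1 Preview supports up to 848×480 resolution and 84 frames per generation at 30 FPS. Higher resolutions and frame counts require more VRAM, though not in strict proportion: in the FP16 table, going from 25 to 100 frames at 768×512 raises the estimate from 37.9 GB to 44.2 GB.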
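These limits translate directly into clip length and per-frame pixel count. The short sketch below does that arithmetic. The helper names `clip_seconds` and `pixels_per_frame` are illustrative.

```python
def clip_seconds(num_frames, fps=30):
    """Playback duration in seconds of a clip with the given frame count."""
    return num_frames / fps


def pixels_per_frame(width, height):
    """Number of pixels in a single frame."""
    return width * height


if __name__ == "__main__":
    print(f"84 frames at 30 FPS: {clip_seconds(84):.1f} s")
    print(f"848x480: {pixels_per_frame(848, 480):,} pixels per frame")
```

At the maximum of 84 frames, a generation is a 2.8-second clip at 30 FPS.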