by Sand AI
24B autoregressive diffusion model for streaming video generation. Produces high-quality cinematic video with strong temporal coherence. Requires 80GB+ VRAM for full inference. Apache 2.0 licensed.
Measured quality metrics for MAGI-1 outputs.
How often humans prefer this model's output (0-100%)
Visual quality and composition rating (5-9 scale)
VRAM estimates at FP16 and FP8 precision. FP8 uses roughly 40-46% less memory than FP16 in the scenarios below, with minimal quality loss. The grade shows how well each GPU handles the generation workload.
FP16

| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 65.2 GB | F | F | F | F |
| 768×512 · 25 frames | 67.3 GB | F | F | F | F |
| 768×512 · 100 frames | 73.6 GB | F | F | F | F |
| 1280×720 · 25 frames | 75.8 GB | F | F | F | F |

FP8

| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 35.1 GB | F | F | F | F |
| 768×512 · 25 frames | 37.2 GB | F | F | F | F |
| 768×512 · 100 frames | 43.5 GB | F | F | F | F |
| 1280×720 · 25 frames | 45.6 GB | F | F | F | F |
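
The estimates above can be turned into a quick fit check. The sketch below copies the table values into a dictionary and compares them with a GPU's memory. It assumes the first table is FP16 and the second is FP8 (the FAQ's 67.3 GB FP16 figure for 768×512 at 25 frames matches the first table); the names `VRAM_GB`, `fits`, and `fp8_saving` are hypothetical helpers, not part of any MAGI-1 tooling.

```python
# VRAM estimates in GB, copied from the two tables above.
# Assumption: the first table is FP16 and the second is FP8.
VRAM_GB = {
    "512x512, 25 frames": {"fp16": 65.2, "fp8": 35.1},
    "768x512, 25 frames": {"fp16": 67.3, "fp8": 37.2},
    "768x512, 100 frames": {"fp16": 73.6, "fp8": 43.5},
    "1280x720, 25 frames": {"fp16": 75.8, "fp8": 45.6},
}


def fits(scenario: str, precision: str, gpu_vram_gb: float) -> bool:
    """Return True if the estimated requirement fits in the GPU's VRAM."""
    return VRAM_GB[scenario][precision] <= gpu_vram_gb


def fp8_saving(scenario: str) -> float:
    """Fraction of memory saved by FP8 relative to FP16 for one scenario."""
    row = VRAM_GB[scenario]
    return 1.0 - row["fp8"] / row["fp16"]


for scenario in VRAM_GB:
    print(
        f"{scenario}: FP8 saves {fp8_saving(scenario):.0%}, "
        f"fits 24 GB at FP8: {fits(scenario, 'fp8', 24.0)}, "
        f"fits 80 GB at FP16: {fits(scenario, 'fp16', 80.0)}"
    )
```

With these numbers, no listed scenario fits a 24 GB card even at FP8, while every scenario fits within 80 GB at FP16, which is consistent with the F grades in the tables.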
Turbo / LCM distillation
Use a distilled scheduler at 4-8 steps for faster iteration.
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "sand-ai/MAGI-1",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

frames = pipe(
    prompt="your prompt here",
    num_inference_steps=50,
    guidance_scale=7.5,
    num_frames=120,
).frames[0]
# Save frames or export as video
Get started
Setup instructions for running MAGI-1 locally
1. Download the model
Get the checkpoint from HuggingFace
2. Place in:
ComfyUI/models/checkpoints/
3. Launch ComfyUI
python main.py
VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB
25 frames at 768×512, 30 steps, FP16.
Download MAGI-1 in different precisions. Lower precision = less VRAM but slight quality loss.
| Format | Precision | Size | Provider |
|---|---|---|---|
| safetensors (recommended) | BF16 | 48.0 GB | official |
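
The 48.0 GB figure is consistent with a simple estimate: parameter count times bytes per parameter. The sketch below reproduces it, assuming decimal gigabytes (1 GB = 10^9 bytes) and ignoring file metadata. The FP8 line is a hypothetical illustration only; the table lists no FP8 checkpoint.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weights_size_gb(num_params: float, precision: str) -> float:
    """Approximate checkpoint size in decimal GB (weights only, no metadata)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


print(weights_size_gb(24e9, "bf16"))  # 48.0, matching the table
print(weights_size_gb(24e9, "fp8"))   # 24.0, hypothetical FP8 weights
```

The gap between the weight size and the larger VRAM estimates reflects memory used during generation beyond the weights themselves.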
Frequently asked questions
MAGI-1 (24B parameters) requires approximately 67.3 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.
MAGI-1 exceeds the RTX 4090's 24 GB VRAM at FP16 for video generation. Consider reducing resolution, frame count, or using a GPU with more VRAM.
On a reference GPU (RTX 4090 24GB), MAGI-1 generates a 25-frame video at 768×512 in approximately 3m 55s at FP16 with 30 inference steps. Faster GPUs with higher memory bandwidth will reduce generation time.
MAGI-1 supports up to 1280×720 resolution and 120 frames per generation at 24 FPS. Higher resolutions and frame counts require more VRAM.
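
As a quick illustration of these limits, the sketch below converts a frame count to clip length at 24 FPS and checks a request against the stated maximums. It assumes the 1280×720 limit applies to width and height separately, which the text above does not spell out; `clip_seconds` and `within_limits` are hypothetical helper names.

```python
# Limits as stated above; per-dimension interpretation is an assumption.
MAX_WIDTH, MAX_HEIGHT = 1280, 720
MAX_FRAMES = 120
FPS = 24


def clip_seconds(num_frames: int, fps: int = FPS) -> float:
    """Length of the generated clip in seconds."""
    return num_frames / fps


def within_limits(width: int, height: int, num_frames: int) -> bool:
    """Check a request against the stated resolution and frame limits."""
    return (
        0 < width <= MAX_WIDTH
        and 0 < height <= MAX_HEIGHT
        and 0 < num_frames <= MAX_FRAMES
    )


print(clip_seconds(120))              # 5.0 seconds at the maximum frame count
print(within_limits(768, 512, 100))   # True
print(within_limits(1920, 1080, 25))  # False
```

At the maximum of 120 frames, a single generation therefore covers 5 seconds of video.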