by guoyww
A motion adapter module that plugs into any SD 1.5 checkpoint to generate short animated clips. It adds only 0.4B parameters on top of the SD 1.5 base model and generates 16 frames at 8 fps (2-second clips).
Measured quality metrics for AnimateDiff v1.5.3 outputs.
How often humans prefer this model's output (0-100%)
Visual quality and composition rating (5-9 scale)
VRAM estimates at FP16 and FP8 precision. FP8 uses ~40% less memory with minimal quality loss. Grade shows how well each GPU handles the generation workload.
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 12.0 GB | S | B | F | S |
| 768×512 · 25 frames | 25.6 GB | B | F | F | F |
| 768×512 · 100 frames | 41.9 GB | F | F | F | F |
| 1280×720 · 25 frames | 47.5 GB | F | F | F | F |
| Scenario | VRAM | RTX 4090 24GB | RTX 3060 12GB | RTX 4060 8GB | MacBook Pro M4 Pro 24GB |
|---|---|---|---|---|---|
| 512×512 · 25 frames | 3.3 GB | S | S | S | S |
| 768×512 · 25 frames | 4.4 GB | S | S | S | S |
| 768×512 · 100 frames | 7.5 GB | S | S | B | S |
| 1280×720 · 25 frames | 8.6 GB | S | S | B | S |
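As a rough way to read the figures in the first table (whose 25.6 GB row matches the FP16 number quoted in the FAQ), the sketch below compares each scenario's estimated VRAM against a card's capacity. The page does not say how its S/B/F grades are computed, so `fits_in_vram` and `scenarios_that_fit` are hypothetical helpers that only report whether an estimate fits outright. They do not reproduce the grading.

```python
# VRAM estimates copied from the first table above, in GB.
SCENARIOS_GB = {
    "512x512, 25 frames": 12.0,
    "768x512, 25 frames": 25.6,
    "768x512, 100 frames": 41.9,
    "1280x720, 25 frames": 47.5,
}

# Memory capacities from the table header, in GB.
# Note: the MacBook figure is unified memory shared with the CPU.
GPUS_GB = {
    "RTX 4090": 24,
    "RTX 3060": 12,
    "RTX 4060": 8,
    "MacBook Pro M4 Pro": 24,
}


def fits_in_vram(required_gb, capacity_gb):
    """Hypothetical helper: True when the estimate fits without offloading."""
    return required_gb <= capacity_gb


def scenarios_that_fit(gpu):
    """Return the scenarios whose estimate fits in the given GPU's memory."""
    capacity = GPUS_GB[gpu]
    return [name for name, need in SCENARIOS_GB.items()
            if fits_in_vram(need, capacity)]


if __name__ == "__main__":
    for gpu in GPUS_GB:
        print(gpu, "->", scenarios_that_fit(gpu) or "nothing fits outright")
```

A scenario that does not fit outright may still run with offloading, only more slowly, which is why this check is stricter than the grades in the table.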
Turbo / LCM distillation
Use distilled scheduler at 4-8 steps for faster iteration
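A minimal sketch of the distilled-scheduler tip, under stated assumptions. It swaps in the separately published AnimateLCM motion adapter and LoRA (`wangfuyun/AnimateLCM`) rather than the v1.5.3 adapter, because a distilled scheduler needs weights distilled to match it. The repository name, the `AnimateLCM_sd15_t2v_lora.safetensors` file name, and the `beta_schedule="linear"` setting are recalled from the diffusers documentation and should be checked before use. `lcm_call_kwargs` and `build_lcm_pipeline` are hypothetical helper names.

```python
def lcm_call_kwargs(prompt, steps=6, guidance_scale=2.0):
    """Build pipeline call arguments for a distilled (LCM) run."""
    if not 4 <= steps <= 8:
        raise ValueError("distilled schedulers are meant for 4-8 steps")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        # LCM-style sampling generally uses a much lower guidance scale
        # than the 7.5 typical of the standard 25-step run.
        "guidance_scale": guidance_scale,
        "num_frames": 16,
    }


def build_lcm_pipeline(base_model_id):
    """Assemble an AnimateDiff pipeline with an LCM scheduler.

    Imports are local so that the helper above works without diffusers
    installed; calling this function needs diffusers, torch, and a CUDA GPU.
    """
    import torch
    from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter

    # Assumption: the AnimateLCM adapter replaces the v1.5.3 adapter here.
    adapter = MotionAdapter.from_pretrained(
        "wangfuyun/AnimateLCM", torch_dtype=torch.float16
    )
    pipe = AnimateDiffPipeline.from_pretrained(
        base_model_id, motion_adapter=adapter, torch_dtype=torch.float16
    )
    pipe.scheduler = LCMScheduler.from_config(
        pipe.scheduler.config, beta_schedule="linear"
    )
    pipe.load_lora_weights(
        "wangfuyun/AnimateLCM",
        weight_name="AnimateLCM_sd15_t2v_lora.safetensors",
        adapter_name="lcm-lora",
    )
    return pipe.to("cuda")


# Usage (not executed here):
#   pipe = build_lcm_pipeline("<an SD 1.5 checkpoint in diffusers format>")
#   frames = pipe(**lcm_call_kwargs("your prompt here")).frames[0]
```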
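import torch
from diffusers import AnimateDiffPipeline, MotionAdapter
from diffusers.utils import export_to_gif

# The guoyww repository holds only the motion adapter, not a full pipeline,
# so load it separately and attach it to an SD 1.5 base checkpoint.
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-3",
    torch_dtype=torch.float16,
)
# Assumption: any SD 1.5 checkpoint in diffusers format works as the base;
# the model ID below is only an example.
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
pipe.to("cuda")
frames = pipe(
    prompt="your prompt here",
    num_inference_steps=25,
    guidance_scale=7.5,
    num_frames=16,
).frames[0]
# Export the frames as an animated GIF
export_to_gif(frames, "animation.gif")
Get started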
Setup instructions for running AnimateDiff v1.5.3 locally
1. Download the model
Get the checkpoint from HuggingFace
2. Place in:
ComfyUI/models/checkpoints/
3. Launch ComfyUI
python main.py
VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB
25 frames at 768×512, 30 steps, FP16.
Download AnimateDiff v1.5.3 in different precisions. Lower precision = less VRAM but slight quality loss.
| Format | Precision | Size | Provider | |
|---|---|---|---|---|
| safetensors (recommended) | FP16 | 1.5 GB | official | Download |
AnimateDiff-specific motion LoRAs plus full SD 1.5 LoRA ecosystem for style. Motion LoRAs control camera movement and animation style.
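To illustrate how a motion LoRA controls camera movement, here is a hedged sketch built on the diffusers LoRA loading calls (`load_lora_weights` and `set_adapters`). The two repository IDs are the motion LoRAs used in the diffusers AnimateDiff examples; whether they behave well with the v1.5.3 adapter in particular is an assumption, since those examples pair them with an earlier adapter. `motion_lora_repo` and `apply_motion_lora` are hypothetical helper names.

```python
# Motion LoRA repositories on the Hugging Face Hub, keyed by camera motion.
MOTION_LORA_REPOS = {
    "zoom-out": "guoyww/animatediff-motion-lora-zoom-out",
    "pan-left": "guoyww/animatediff-motion-lora-pan-left",
}


def motion_lora_repo(motion):
    """Look up the repository ID for a named camera motion."""
    try:
        return MOTION_LORA_REPOS[motion]
    except KeyError:
        known = ", ".join(sorted(MOTION_LORA_REPOS))
        raise ValueError(f"unknown motion {motion!r}; known: {known}") from None


def apply_motion_lora(pipe, motion, weight=1.0):
    """Load one motion LoRA into an existing AnimateDiff pipeline.

    `pipe` is expected to be an AnimateDiffPipeline that has already been
    built with a motion adapter and an SD 1.5 base checkpoint.
    """
    pipe.load_lora_weights(motion_lora_repo(motion), adapter_name=motion)
    # The adapter weight scales how strongly the camera motion is applied.
    pipe.set_adapters([motion], adapter_weights=[weight])
    return pipe
```

Style LoRAs from the SD 1.5 ecosystem can be loaded the same way under a different adapter name and combined by passing several names to `set_adapters`.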
Browse all LoRAs on CivitAI
Frequently asked questions
AnimateDiff v1.5.3 (0.4B parameters) requires approximately 25.6 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.
AnimateDiff v1.5.3 can run on the RTX 4090 with sequential offloading, though generation will be significantly slower than when the whole workload fits in VRAM.
On a reference GPU (RTX 4090 24GB), AnimateDiff v1.5.3 generates a 25-frame video at 768×512 in approximately 45 s at FP16 with 30 inference steps. Faster GPUs with higher memory bandwidth will reduce generation time.
AnimateDiff v1.5.3 supports up to 512×512 resolution and 16 frames per generation at 8 FPS. Higher resolutions and frame counts require proportionally more VRAM.