AnimateDiff v1.5.3

Name: AnimateDiff v1.5.3
Author: guoyww

Stable

Motion adapter module that plugs into any SD 1.5 checkpoint to generate short animated clips. Only 0.4B extra parameters on top of the SD 1.5 base model. Generates 16 frames at 8fps (2-second clips).

Motion adapter — plugs into any SD 1.5 model
Only 0.4B extra params
16 frames at 8fps
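
The 2-second clip length follows directly from the frame count and the frame rate. A minimal sketch of that arithmetic, where the function name `clip_seconds` is illustrative and not part of any library:

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Return the playback duration of a clip in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# 16 frames at 8 fps, the figures quoted above
print(clip_seconds(16, 8))  # 2.0
```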

2K downloads · 10 likes

Parameters: 0.4B
Max Resolution: 512×512
Max Frames: 16
FPS: 8
Architecture: UNET
License: apache-2.0

Image Quality Benchmarks

Measured quality metrics for AnimateDiff v1.5.3 outputs.

Human Preference Score: 50%

How often humans prefer this model's output (0-100%)

Aesthetic Score: 5.5

Visual quality and composition rating (5-9 scale)

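At FP16, generating 25 frames at 768×512 needs about 25.6 GB of VRAM, slightly more than a 24 GB card provides; 512×512 at 25 frames needs about 12.0 GB. A GPU with 24 GB or more of VRAM is recommended.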

VRAM by Scenario

VRAM estimates at FP16 and FP8 precision. FP8 uses ~40% less memory with minimal quality loss. Grade shows how well each GPU handles the generation workload.

FP16 (half precision, unquantized)

Scenario	VRAM	RTX 4090 24GB	RTX 3060 12GB	RTX 4060 8GB	MacBook Pro M4 Pro 24GB
512×512 · 25 frames	12.0 GB	S	B	F	S
768×512 · 25 frames	25.6 GB	B	F	F	F
768×512 · 100 frames	41.9 GB	F	F	F	F
1280×720 · 25 frames	47.5 GB	F	F	F	F

FP8 (quantized — ~40% less VRAM)

Scenario	VRAM	RTX 4090 24GB	RTX 3060 12GB	RTX 4060 8GB	MacBook Pro M4 Pro 24GB
512×512 · 25 frames	3.3 GB	S	S	S	S
768×512 · 25 frames	4.4 GB	S	S	S	S
768×512 · 100 frames	7.5 GB	S	S	B	S
1280×720 · 25 frames	8.6 GB	S	S	B	S
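
To check a scenario against a particular GPU, the estimates in the two tables can be put into a small lookup. The sketch below is a plain "estimate at or below available VRAM" comparison. It does not reproduce the S/B/F grades, whose thresholds are not stated on this page, and the names `VRAM_GB`, `fits_in_vram`, and `scenarios_that_fit` are illustrative.

```python
# VRAM estimates in GB, copied from the FP16 and FP8 tables above.
# Keys are (width, height, frames).
VRAM_GB = {
    "fp16": {
        (512, 512, 25): 12.0,
        (768, 512, 25): 25.6,
        (768, 512, 100): 41.9,
        (1280, 720, 25): 47.5,
    },
    "fp8": {
        (512, 512, 25): 3.3,
        (768, 512, 25): 4.4,
        (768, 512, 100): 7.5,
        (1280, 720, 25): 8.6,
    },
}


def fits_in_vram(width, height, frames, precision, gpu_vram_gb):
    """True if the tabulated estimate is at or below the GPU's VRAM."""
    return VRAM_GB[precision][(width, height, frames)] <= gpu_vram_gb


def scenarios_that_fit(precision, gpu_vram_gb):
    """List the tabulated scenarios whose estimate fits in gpu_vram_gb."""
    return [s for s, gb in VRAM_GB[precision].items() if gb <= gpu_vram_gb]


print(fits_in_vram(768, 512, 25, "fp16", 24.0))  # False: 25.6 GB > 24 GB
print(scenarios_that_fit("fp8", 8.0))
```

A scenario that does not fit by this test may still run with offloading, at reduced speed.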

Optimization Tips

Turbo / LCM distillation

Use distilled scheduler at 4-8 steps for faster iteration

Run with Python (diffusers)

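import torch
from diffusers import AnimateDiffPipeline, MotionAdapter
from diffusers.utils import export_to_gif

# This repo holds only the motion adapter, so load it separately.
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-3",
    torch_dtype=torch.float16,
)

# Any SD 1.5 checkpoint can serve as the base; this model ID is only an example.
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
pipe.to("cuda")

frames = pipe(
    prompt="your prompt here",
    num_inference_steps=25,
    guidance_scale=7.5,
    num_frames=16,
).frames[0]

# Export the 16 frames as a GIF at 8 fps (a 2-second clip)
export_to_gif(frames, "animation.gif", fps=8)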

Get started

Setup instructions for running AnimateDiff v1.5.3 locally

1. Download the model

Get the checkpoint from HuggingFace

2. Place in:

ComfyUI/models/checkpoints/

3. Launch ComfyUI

python main.py

Note: Video generation requires video output nodes. Install ComfyUI-VideoHelperSuite from the ComfyUI Manager for SaveAnimatedWEBP or VHS_VideoCombine nodes.

Memory Breakdown

VRAM allocation for 25 frames at 768×512 on RTX 4090 24GB

Required: 25.6 GB
Available: 24.0 GB

Weights: 0.8 GB
VAE: 0.2 GB
Text Encoder: 0.2 GB
Activations: 3.0 GB
Overhead: 0.5 GB
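
The figures above can be tallied with a few lines of Python. This is only arithmetic on the numbers listed here, and the variable names are illustrative. The itemized components come to 4.7 GB; the page does not say how the rest of the 25.6 GB requirement is attributed.

```python
# Itemized components in GB, as listed above.
components_gb = {
    "weights": 0.8,
    "vae": 0.2,
    "text_encoder": 0.2,
    "activations": 3.0,
    "overhead": 0.5,
}
required_gb = 25.6
available_gb = 24.0

itemized_gb = round(sum(components_gb.values()), 1)
shortfall_gb = round(required_gb - available_gb, 1)

print(f"Itemized total: {itemized_gb} GB")
print(f"Shortfall on a 24 GB card: {shortfall_gb} GB")
```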

Estimated Generation Time

25 frames at 768×512, 30 steps, FP16.

RTX 4090 24GB: ~45s
RTX 3060 12GB: ~2m 53s
RTX 4060 8GB: ~4m 20s
MacBook Pro M4 Pro 24GB: ~6m 10s
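
To compare these estimates, the sketch below converts them to seconds and expresses each as a multiple of the RTX 4090 time. It also scales a 30-step time to the 4-8 step range suggested under Optimization Tips, assuming that generation time grows linearly with step count. That assumption ignores fixed costs such as text encoding and VAE decoding, so the result is a rough lower bound. All function names are illustrative.

```python
import re


def parse_duration(text: str) -> int:
    """Convert strings like '~45s' or '~2m 53s' to whole seconds."""
    match = re.fullmatch(r"~?\s*(?:(\d+)m)?\s*(?:(\d+)s)?", text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


# Estimates listed above (25 frames at 768×512, 30 steps, FP16).
ESTIMATES = {
    "RTX 4090 24GB": "~45s",
    "RTX 3060 12GB": "~2m 53s",
    "RTX 4060 8GB": "~4m 20s",
    "MacBook Pro M4 Pro 24GB": "~6m 10s",
}

seconds = {gpu: parse_duration(t) for gpu, t in ESTIMATES.items()}
baseline = seconds["RTX 4090 24GB"]
slowdown = {gpu: round(s / baseline, 2) for gpu, s in seconds.items()}


def scale_to_steps(seconds_at_30_steps: float, steps: int) -> float:
    """Assumption: time is proportional to the number of inference steps."""
    return seconds_at_30_steps * steps / 30


print(slowdown)
print(scale_to_steps(baseline, 8))  # 12.0
```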

Available Formats & Downloads

Download AnimateDiff v1.5.3 in different precisions. Lower precision = less VRAM but slight quality loss.

Format	Precision	Size	Provider
safetensors (recommended)	FP16	1.5 GB	official

LoRA Ecosystem

Growing Ecosystem

AnimateDiff-specific motion LoRAs are available alongside the full SD 1.5 LoRA ecosystem for style. Motion LoRAs control camera movement and animation style.

Related Workflows

CogVideoX 2B (2B · THUDM)
Wan Video 2.1 1.3B (1.3B · Alibaba)
FramePack I2V (13B · lllyasviel)

Frequently asked questions

How much VRAM does AnimateDiff v1.5.3 need for video?

AnimateDiff v1.5.3 (0.4B parameters) requires approximately 25.6 GB of VRAM at FP16 precision for generating 25 frames at 768×512. Video generation typically requires more VRAM than image generation due to temporal attention layers.

Can I run AnimateDiff v1.5.3 on RTX 4090?

AnimateDiff v1.5.3 can run on the RTX 4090 with sequential offloading, though generation will be significantly slower than when the whole workload fits in VRAM.
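
As a rough illustration of sequential offloading, the sketch below assumes the diffusers runtime with the accelerate package installed. `build_offloaded_pipeline` and its `base_model_id` argument are hypothetical names, and the base model must be an SD 1.5 checkpoint that you supply.

```python
def build_offloaded_pipeline(
    base_model_id: str,
    adapter_id: str = "guoyww/animatediff-motion-adapter-v1-5-3",
):
    """Build an AnimateDiff pipeline that moves submodules to the GPU on demand."""
    # Imported lazily so that defining this function needs no GPU stack.
    import torch
    from diffusers import AnimateDiffPipeline, MotionAdapter

    adapter = MotionAdapter.from_pretrained(adapter_id, torch_dtype=torch.float16)
    pipe = AnimateDiffPipeline.from_pretrained(
        base_model_id, motion_adapter=adapter, torch_dtype=torch.float16
    )
    # Do not call pipe.to("cuda") here: offloading manages device placement itself.
    pipe.enable_sequential_cpu_offload()
    # Decode frames in slices to lower peak VAE memory.
    pipe.vae.enable_slicing()
    return pipe
```

Calling `build_offloaded_pipeline("<your SD 1.5 model ID>")` returns a pipeline that can be invoked like any other AnimateDiff pipeline. It trades speed for a lower peak VRAM footprint.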

How long does it take to generate a video with AnimateDiff v1.5.3?

On a reference GPU (RTX 4090 24GB), AnimateDiff v1.5.3 generates a 25-frame video at 768×512 in approximately 45 seconds at FP16 with 30 inference steps. GPUs with more compute and higher memory bandwidth will reduce generation time.

What resolution and frame count does AnimateDiff v1.5.3 support?

AnimateDiff v1.5.3 supports up to 512×512 resolution and 16 frames per generation at 8 FPS. Higher resolutions and frame counts require more VRAM, as the estimates in VRAM by Scenario show.

About AnimateDiff v1.5.3

Use cases

video-generation, animation, motion

Recommended runtimes

comfyui, diffusers