Animate still images into video with natural motion. Feed a photo or illustration and get smooth camera movement and subject animation.
RTX 4090 (24 GB): FP8 · N/A per clip
A100 (80 GB): FP16 · N/A per clip
RTX 4090 (24 GB): FP16 · N/A per clip
A100 (80 GB): FP16 · N/A per clip
Recommended runtime: ComfyUI with video nodes
Wan 14B with image-to-video support produces the most natural animations from still images. It preserves the input image's composition while adding coherent motion. LTX Video also supports image conditioning with faster generation times.
Image-to-video requires slightly more VRAM than text-to-video because the image encoder runs alongside the video model. Wan 14B i2v needs 24-48 GB VRAM. Smaller models with i2v support can work on 12-16 GB GPUs.
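As a rough sanity check on these figures, the memory taken by the model weights alone can be estimated from the parameter count and the bytes per parameter. The sketch below is back-of-envelope arithmetic, not a measurement: the helper name `weight_memory_gb` is made up for illustration, and it assumes 2 bytes per parameter for FP16 and 1 byte for FP8. It ignores activations, the image and text encoders, and the VAE, so real usage will be higher.

```python
# Back-of-envelope estimate of weight memory only.
# Assumption: FP16 stores 2 bytes per parameter, FP8 stores 1 byte.
# Activations, encoders, and the VAE are NOT included.
BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}


def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Return weight memory in GB (1 GB = 10^9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return params_billion * bytes_per_param


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(14, nbytes)
        print(f"14B weights at {precision}: about {gb:.0f} GB")
```

Under these assumptions, a 14B model holds about 28 GB of weights at FP16 and about 14 GB at FP8, which is broadly in line with FP8 being the precision paired with 24 GB cards.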
Yes, motion can be steered through the text prompt. Most image-to-video models respond to motion prompts such as 'slow zoom in', 'camera pans left', or 'subject turns head'. Some workflows also accept explicit camera trajectory inputs for precise control.
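One simple way to keep motion prompts consistent across clips is to assemble them from separate parts. The helper below is a hypothetical sketch: `build_motion_prompt` is not part of ComfyUI or any model's API, and the comma-separated layout is an assumption, since the best phrasing varies by model.

```python
# Hypothetical helper for composing a motion prompt from separate parts.
# The comma-separated format is an assumption, not a model requirement.
def build_motion_prompt(subject: str, camera: str = "", action: str = "") -> str:
    """Join a subject description with optional camera and subject motion."""
    parts = [subject.strip(), camera.strip(), action.strip()]
    # Drop any empty parts so unused options leave no stray commas.
    return ", ".join(p for p in parts if p)


if __name__ == "__main__":
    prompt = build_motion_prompt(
        "portrait of a woman",
        camera="slow zoom in",
        action="subject turns head",
    )
    print(prompt)
```

The resulting string would go into whatever text prompt field the chosen workflow exposes.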