Question 1

What is Llama 4?

Accepted Answer

Llama 4 is Meta's model lineup: Scout (109B MoE) and Maverick (400B MoE), with up to 10M tokens of context. It offers strong multilingual and reasoning performance under a permissive community license.

Question 2

How many variants are in the Llama 4 family?

Accepted Answer

We track 2 canonical Llama 4 variants in our catalog, spanning 109B to 400B total parameters. Both are MoE models with 17B active parameters.

Question 3

Which Llama 4 variant should I run?

Accepted Answer

Pick the largest variant that fits on your hardware at Q4_K_M quantization. Llama 4 has no small dense variants: at Q4, Scout needs roughly 66.5 GB and Maverick roughly 244 GB, so neither fits entirely in the VRAM of an 8 GB or 24 GB GPU. For a workstation or Mac Studio with enough memory, Scout is the practical starting point; Maverick needs roughly 244 GB or more. Use our fit calculator on each model page for the exact verdict.

Variant	Total params	Active params	VRAM @ Q4	Architecture
Llama 4 Scout 17B 16E	109B	17B	~66.5 GB	MoE
Llama 4 Maverick 17B 128E	400B	17B	~244.0 GB	MoE
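
The "largest variant that fits" rule can be sketched in a few lines of Python. This is only an illustration, not the site's fit calculator: the memory figures are copied from the table above, while the names `Variant`, `pick_variant`, and the `headroom_gb` parameter are hypothetical. The table does not say how much memory to reserve for the KV cache or runtime overhead, so `headroom_gb` defaults to zero and is left for you to set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    total_params_b: float   # total parameters, in billions
    active_params_b: float  # parameters active per token, in billions
    vram_q4_gb: float       # approximate memory at Q4, from the table


# Figures copied from the variant table.
VARIANTS = [
    Variant("Llama 4 Scout 17B 16E", 109, 17, 66.5),
    Variant("Llama 4 Maverick 17B 128E", 400, 17, 244.0),
]


def pick_variant(available_gb: float, headroom_gb: float = 0.0) -> Optional[Variant]:
    """Return the largest variant whose Q4 footprint fits, or None.

    headroom_gb is a hypothetical allowance for KV cache and runtime
    overhead; it is an assumption, not a value from the catalog.
    """
    budget = available_gb - headroom_gb
    fitting = [v for v in VARIANTS if v.vram_q4_gb <= budget]
    # "Largest" is judged by total parameter count.
    return max(fitting, key=lambda v: v.total_params_b, default=None)


if __name__ == "__main__":
    # Example memory sizes are illustrative, not recommendations.
    for gb in (24, 96, 512):
        choice = pick_variant(gb)
        print(gb, "GB ->", choice.name if choice else "no variant fits")
```

With these numbers, a 24 GB budget returns no variant, which matches the answer above: neither model fits on a single consumer GPU at Q4 without offloading.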
