Will It Run AI
mistral, vram, gpu-requirements, reasoning, moe

Mistral Small 4 VRAM Requirements - 119B Hardware Guide

Exact VRAM for Mistral Small 4 119B at Q4_K_M, Q5_K_M, Q6_K, Q8_0, and FP16. See whether 80GB GPUs or high-memory Macs can run it locally.

If you are searching for Mistral Small 4 VRAM requirements, the short answer is: this is not a 24GB or 48GB consumer-GPU model.

Quick answers

  • Q4_K_M: ~72.6 GB
  • Q5_K_M: ~85.7 GB
  • Q6_K: ~97.6 GB
  • Q8_0: ~127.3 GB
  • FP16: ~244.0 GB

Mistral Small 4 119B sits in a very different class from Mistral Small 24B. It is a high-end local model for 80GB GPUs, high-memory Macs, or larger serving setups.

Mistral Small 4 VRAM by Quantization

These numbers are weights-only estimates. Add more memory for KV cache, runtime overhead, and serving headroom.

Quantization   VRAM (weights only)
Q4_K_M         72.6 GB
Q5_K_M         85.7 GB
Q6_K           97.6 GB
Q8_0           127.3 GB
FP16           244.0 GB
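
The KV cache term mentioned above grows linearly with context length. As a rough sketch, assuming standard grouped-query attention and a 16-bit cache, it can be estimated as below. The layer count, KV head count, and head dimension in the example are hypothetical placeholders, not Mistral Small 4's published configuration; substitute the values from the model's own config.

```python
def kv_cache_gb(n_layers, n_kv_heads, head_dim, context_tokens,
                bytes_per_elem=2, batch=1):
    """Estimate KV cache size in GB (1 GB = 1e9 bytes).

    Each layer stores one key tensor and one value tensor, each of shape
    (batch, context_tokens, n_kv_heads, head_dim).
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_tokens * bytes_per_elem * batch)
    return total_bytes / 1e9


# Hypothetical architecture values for illustration only.
example_gb = kv_cache_gb(n_layers=60, n_kv_heads=8, head_dim=128,
                         context_tokens=32_768)
print(f"{example_gb:.2f} GB")
```

Whatever the real architecture values are, this figure is added on top of the weights-only numbers in the table.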

Why Mistral Small 4 Feels Confusing

The name makes it sound close to "Mistral Small 24B". It is not.

Mistral Small 4 is a frontier-tier 119B model with a much larger memory footprint. Searches for "mistral small 4 vram requirements" tend to surface broad Mistral requirement pages, while the answer people need is actually simple:

  • it is a serious model
  • it needs serious memory
  • it is not a clean fit for mainstream single-GPU local use

What Hardware Can Actually Run Mistral Small 4?

24GB and 48GB GPUs

This is the wrong tier.

  • A single RTX 4090 24GB is not enough
  • Even 48GB-class cards fall well short of the ~72.6 GB that Q4_K_M weights alone require

You can force the model to load by offloading most of the weights to system RAM, but that is slow and not a setup worth recommending.

80GB GPU Tier

This is where Mistral Small 4 starts making sense.

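If your goal is to run Mistral Small 4 locally without turning the setup into an experiment, 80GB is the real entry point. At Q4_K_M the weights take about 72.6 GB, which leaves roughly 7 GB on an 80GB card for KV cache and runtime overhead, so context length will be tight.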
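
The tier boundaries in this guide follow from simple subtraction against the weights-only figures listed earlier. A minimal sketch, where `reserve_gb` is a hypothetical allowance for KV cache and runtime overhead rather than a measured value:

```python
# Weights-only estimates from this guide, in GB.
WEIGHTS_GB = {
    "Q4_K_M": 72.6,
    "Q5_K_M": 85.7,
    "Q6_K": 97.6,
    "Q8_0": 127.3,
    "FP16": 244.0,
}


def headroom_gb(memory_gb, quant, reserve_gb=0.0):
    """Memory left after loading the weights and setting aside a reserve."""
    return memory_gb - WEIGHTS_GB[quant] - reserve_gb


def quants_that_fit(memory_gb, reserve_gb=0.0):
    """Quantizations whose weights plus the reserve fit in memory_gb."""
    return [q for q, gb in WEIGHTS_GB.items() if gb + reserve_gb <= memory_gb]


for memory in (24, 48, 80, 128, 192):
    print(memory, quants_that_fit(memory, reserve_gb=10))
```

With no reserve, only Q4_K_M fits in 80 GB. Reserving 10 GB pushes even Q4_K_M out, which is why an 80GB card is an entry point rather than a comfortable fit.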

High-Memory Apple Silicon

Apple Silicon is relevant here because its unified memory lets the GPU address most of the system RAM, so capacity is not capped at a discrete card's VRAM.

  • 128GB Macs are the first plausible single-machine Apple tier for Q4_K_M
  • 192GB Macs are much safer if you want headroom for context and runtime overhead

This is exactly the kind of model where Apple Silicon becomes attractive as a capacity-first local platform.

Serving and Multi-GPU Setups

If you care about throughput rather than "can I technically load it once", this is a serving-grade model.

That means:

  • multiple high-memory GPUs
  • tensor parallel or distributed inference
  • runtimes built for serving, not only personal chat

For that side of the problem, the right mental model is closer to workstation or lab infrastructure than enthusiast desktop hardware.
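
For a first sizing pass, the GPU count is bounded below by total weights divided by usable memory per GPU. A rough sketch, assuming the weights shard evenly across GPUs and that only a fraction of each card is available for them (`usable_fraction`, a hypothetical default of 0.9). It ignores KV cache, and tensor-parallel runtimes typically also require a GPU count that divides the attention head count, so treat the result as a floor:

```python
import math


def min_gpus(weights_gb, gpu_memory_gb, usable_fraction=0.9):
    """Smallest GPU count whose combined usable memory holds the weights."""
    usable_per_gpu = gpu_memory_gb * usable_fraction
    return math.ceil(weights_gb / usable_per_gpu)


# Weights-only figures from this guide, checked against 80GB cards.
for quant, gb in [("Q4_K_M", 72.6), ("Q8_0", 127.3), ("FP16", 244.0)]:
    print(quant, min_gpus(gb, gpu_memory_gb=80))
```

Under these assumptions Q4_K_M already needs two 80GB cards once 10% of each card is held back, and FP16 needs four.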

Is Mistral Small 4 Worth It for Local AI?

Only if you are already in the right hardware tier.

It makes sense when:

  • you already own 80GB+ GPU hardware
  • you have a 128GB-192GB Apple Silicon machine and want a bigger local reasoning model
  • you care about frontier-class local quality more than simple setup

It does not make sense when:

  • you are trying to stretch a 24GB consumer GPU
  • you want the easiest high-quality local Mistral experience
  • you are really looking for the best model that still feels sane on mainstream hardware

In that mainstream tier, the right answer is still Mistral Small 24B, not Mistral Small 4.

Better Alternatives for Smaller Hardware

If you do not have 80GB-class memory, the more realistic Mistral choice is Mistral Small 24B, the mainstream single-GPU option.


Bottom Line

Mistral Small 4 119B is a real local model only if you already have serious hardware.

  • Q4_K_M starts around 72.6GB
  • 24GB consumer GPUs are out
  • 80GB GPUs or high-memory Macs are the realistic starting point

If you are searching this because you want to know whether your normal desktop can run it, the practical answer is no. If you are already shopping in the 80GB or 128GB+ tier, then it becomes interesting.

Frequently Asked Questions

How much VRAM does Mistral Small 4 need?

Mistral Small 4 119B needs about 72.6GB at Q4_K_M, 85.7GB at Q5_K_M, 97.6GB at Q6_K, 127.3GB at Q8_0, and 244GB at FP16, plus runtime and context overhead.

Can an RTX 4090 run Mistral Small 4?

Not in a clean local setup. The model needs roughly 72.6GB at Q4_K_M before overhead, so a single 24GB GPU is far below the practical requirement.

What hardware do I need for Mistral Small 4?

The realistic local tier starts around 80GB GPUs such as the H100 80GB or A100 80GB, or high-memory Apple Silicon systems like a 128GB or 192GB Mac, depending on quantization and workload.

Is Mistral Small 4 the same as Mistral Small 24B?

No. Mistral Small 4 is a 119B model and a very different hardware tier. Mistral Small 24B is the mainstream single-GPU option, while Mistral Small 4 is a workstation-class or datacenter-class local model.