Mradermacher
Codestral 21B Pruned i1 (21B parameters) requires approximately 17.1 GB of VRAM with Q4_K_M quantization. For the best balance of quality and speed, we recommend hardware with at least 20 GB of VRAM.
Quantization options
| Quant | Bits | VRAM | Quality | Fit |
|---|---|---|---|---|
| Q2_K | 2 | 8.2 GB | Low | — |
| Q3_K_S | 3 | 10.3 GB | Low | — |
| NVFP4 | 4 | 11.8 GB | Medium | — |
| Q4_K_M | 4 | 12.8 GB | Medium | — |
| Q5_K_M | 5 | 15.1 GB | High | — |
| Q6_K | 6 | 17.2 GB | High | — |
| Q8_0 | 8 | 22.5 GB | Very High | — |
| F16 | 16 | 43.1 GB | Maximum | — |
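As a rough sanity check on the VRAM column, the sketch below compares each listed figure with a simple lower bound for weight storage: parameter count times nominal bits per weight, divided by 8. This is only an illustration under stated assumptions. The page does not say how its VRAM figures were computed, the "Bits" column is treated as an exact per-weight width (K-quant formats typically store somewhat more than their nominal bits per weight), and 1 GB is assumed to mean 10^9 bytes. The function names are hypothetical.

```python
# Nominal bits per weight and listed VRAM (GB), copied from the table above.
QUANT_TABLE = {
    "Q2_K": (2, 8.2),
    "Q3_K_S": (3, 10.3),
    "NVFP4": (4, 11.8),
    "Q4_K_M": (4, 12.8),
    "Q5_K_M": (5, 15.1),
    "Q6_K": (6, 17.2),
    "Q8_0": (8, 22.5),
    "F16": (16, 43.1),
}

PARAMS_BILLIONS = 21.0  # parameter count stated on this page


def weight_floor_gb(params_billions: float, bits: int) -> float:
    """Lower bound on weight storage in GB: parameters * bits / 8.

    Assumes 1 GB = 10^9 bytes and ignores KV cache, activations,
    and per-block quantization metadata.
    """
    return params_billions * bits / 8


def gap_report() -> dict:
    """Listed VRAM minus the weight floor, per quantization (GB)."""
    return {
        name: round(listed - weight_floor_gb(PARAMS_BILLIONS, bits), 2)
        for name, (bits, listed) in QUANT_TABLE.items()
    }


if __name__ == "__main__":
    for name, gap in gap_report().items():
        print(f"{name}: listed figure exceeds the weight floor by {gap} GB")
```

Every listed figure sits above this floor, which is consistent with the table including some overhead beyond raw weights, though the exact breakdown is not given here.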
Frequently asked questions
Codestral 21B Pruned i1 (21B parameters) requires approximately 17.1 GB of VRAM with Q4_K_M quantization. Lower-bit quantizations such as Q2_K or Q3_K_S use less memory but may reduce quality.
Yes, Intel Arc Pro B60 24GB can run Codestral 21B Pruned i1 with a compatibility score of 51/100. It provides 24 GB of memory and achieves approximately 19.2 tokens per second.
The recommended quantization for Codestral 21B Pruned i1 is Q4_K_M, which offers the best balance between model quality and memory efficiency. Higher quantizations preserve more quality but require more VRAM.
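To make the quality-versus-memory trade-off concrete, here is a minimal sketch that picks the largest quantization from the table that fits a given VRAM budget. It is not the page's own recommendation logic, which is not documented here. The `best_fit` function and its `headroom_gb` parameter are hypothetical, and the VRAM figures are taken from the table as-is.

```python
from typing import Optional

# (quantization, listed VRAM in GB), ordered from smallest to largest,
# copied from the quantization table on this page.
QUANTS = [
    ("Q2_K", 8.2),
    ("Q3_K_S", 10.3),
    ("NVFP4", 11.8),
    ("Q4_K_M", 12.8),
    ("Q5_K_M", 15.1),
    ("Q6_K", 17.2),
    ("Q8_0", 22.5),
    ("F16", 43.1),
]


def best_fit(vram_gb: float, headroom_gb: float = 0.0) -> Optional[str]:
    """Return the largest listed quantization that fits in vram_gb.

    headroom_gb is a hypothetical safety margin reserved for context,
    other processes, or the display; it is not a value from this page.
    Returns None if nothing fits.
    """
    usable = vram_gb - headroom_gb
    fitting = [name for name, need in QUANTS if need <= usable]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for budget in (8, 16, 24):
        print(f"{budget} GB card: {best_fit(budget)}")
```

Choosing the largest quantization that fits favors quality over speed; a smaller option such as Q4_K_M leaves more memory free for longer contexts, which is one reason it may still be the preferred pick on a 24 GB card.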
The top recommended hardware for Codestral 21B Pruned i1: RTX 4090 24GB (score: 55/100), RTX 5090 Laptop 24GB (score: 55/100), NVIDIA A30 24GB (score: 55/100). These provide the best combination of memory, bandwidth, and compute for running this model locally.
Yes, Codestral 21B Pruned i1 is well-suited for coding as well as RAG (retrieval-augmented generation). It was designed with these use cases in mind.