NousResearch
Hermes 4.3 36B (36B parameters) requires approximately 28.0 GB of VRAM with Q4_K_M quantization. For the best balance of quality and speed, we recommend hardware with at least 33 GB of VRAM.
Quantization options
The VRAM column shows raw estimates.
| Quant | Bits | VRAM | Quality |
|---|---|---|---|
| Q2_K | 2 | 14.0 GB | Low |
| Q3_K_S | 3 | 17.6 GB | Low |
| NVFP4 | 4 | 20.2 GB | Medium |
| Q4_K_M | 4 | 22.0 GB | Medium |
| Q5_K_M | 5 | 25.9 GB | High |
| Q6_K | 6 | 29.5 GB | High |
| Q8_0 | 8 | 38.5 GB | Very High |
| F16 | 16 | 73.8 GB | Maximum |
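To choose a row from the table for a given card, one approach is to pick the largest quantization whose estimate fits the available memory. The sketch below is illustrative only: the function name `largest_fitting_quant` and the `headroom_gb` parameter are hypothetical, and the figures are copied from the table. Because those figures are raw estimates, how much headroom to reserve is an assumption left to the reader.

```python
# Raw VRAM estimates in GB, copied from the quantization table.
QUANT_VRAM_GB = {
    "Q2_K": 14.0,
    "Q3_K_S": 17.6,
    "NVFP4": 20.2,
    "Q4_K_M": 22.0,
    "Q5_K_M": 25.9,
    "Q6_K": 29.5,
    "Q8_0": 38.5,
    "F16": 73.8,
}


def largest_fitting_quant(vram_gb, headroom_gb=0.0):
    """Return the largest quant whose estimate fits in vram_gb, or None.

    headroom_gb is a hypothetical safety margin subtracted from the
    available memory before comparing against the table estimates.
    """
    budget = vram_gb - headroom_gb
    fitting = [
        (size, name) for name, size in QUANT_VRAM_GB.items() if size <= budget
    ]
    if not fitting:
        return None
    # Estimates are unique, so the largest size identifies a single quant.
    return max(fitting)[1]


if __name__ == "__main__":
    for vram in (16, 24, 48):
        print(vram, "GB ->", largest_fitting_quant(vram))
```

With no headroom, a 24 GB budget selects Q4_K_M and a 48 GB budget selects Q8_0. Reserving 4 GB on a 24 GB card drops the choice to Q3_K_S.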
Frequently asked questions
Hermes 4.3 36B (36B parameters) requires approximately 28.0 GB of VRAM with Q4_K_M quantization. Lower-bit quantizations such as Q2_K or Q3_K_S use less memory but may reduce quality.
Yes, Mac mini M4 64GB can run Hermes 4.3 36B with a compatibility score of 49/100. It provides 64 GB of memory and achieves approximately 7.1 tokens per second.
The recommended quantization for Hermes 4.3 36B is Q4_K_M, which offers the best balance between model quality and memory efficiency. Higher-bit quantizations such as Q6_K or Q8_0 preserve more quality but require more VRAM.
The top recommended hardware for Hermes 4.3 36B: NVIDIA A100 40GB (score: 55/100), RTX PRO 5000 Blackwell 48GB (score: 54/100), RTX 6000 Ada 48GB (score: 53/100). These provide the best combination of memory, bandwidth, and compute for running this model locally.
Yes, Hermes 4.3 36B is well-suited for chat as well as reasoning. It was designed with these use cases in mind.
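To put the throughput figure quoted for the Mac mini M4 64GB (approximately 7.1 tokens per second) in practical terms, it can be converted into a rough wait time for a reply. This is a back-of-the-envelope sketch: the helper `generation_seconds` is hypothetical, the 500-token reply length is an arbitrary example, and the calculation assumes a constant generation rate and ignores prompt processing time.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Estimate seconds needed to generate num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # 7.1 tokens/s is the figure quoted for the Mac mini M4 64GB.
    print(round(generation_seconds(500, 7.1), 1))
```

At that rate, a 500-token reply takes a little over 70 seconds.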