NVIDIA

Nemotron Nano 8B

Name: Nemotron Nano 8B
Rating: 83 (185 reviews)
Author: NVIDIA

48.0KDownloads220LikesMar 2025Veröffentlicht131K TokenKontextNVIDIA Open ModelLizenz82 StarkQualität

Nemotron Nano 8B (8B parameters) requires approximately 8.6 GB of VRAM with Q4_K_M quantization. For the best balance of quality and speed, we recommend hardware with at least 10 GB of VRAM.

Loslegen

— kopieren & einfügen, um lokal auszuführen

Copy-paste commands to run Nemotron Nano 8B on your machine.

Run

lms load Llama-3.1-Nemotron-Nano-8B-v1 && lms server start

Quick specs

Parameters8B

Architecturedense

Context131K tokens

Modalitytext

Min RAM3.1 GB

Rec. RAM4.9 GB (Q4_K_M)

LicenseNVIDIA Open Model

FamilyNemotron

✓ Chat✓ Reasoning

About this model

Nemotron Nano 8B is NVIDIA's reasoning model derived from Llama 3.1 8B Instruct, post-trained for switchable reasoning with on/off modes. Achieves 95.4% on MATH-500 and 54.1% on GPQA Diamond with reasoning enabled. Fits on a single RTX GPU for local deployment.

•Switchable reasoning: toggle detailed thinking on/off via system prompt
•95.4% on MATH-500 with reasoning on, 36.6% with reasoning off
•Derived from Llama 3.1 8B with multi-phase post-training
•Fits on a single RTX GPU for local inference

Verwandte Modelle

Schnellauswahl

Bestes BudgetS

Intel Arc B570 10GB~$219 — 45 tok/s

Beste GesamtwahlS

RTX 3080 Ti 12GB~$1,199 — 112 tok/s

Beste Hardware

Top-Empfehlungen für Nemotron Nano 8B

Dieses Modell ausführen

Nemotron Nano 8B on RTX 3080 Ti 12GB Nemotron Nano 8B on RTX 5070 12GB Nemotron Nano 8B on RTX 3080 12GB

Quantisierungsoptionen

VRAM-Schätzungen nach Quantisierungsstufe

No hardware detected — fit column shows raw VRAM estimates

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	3.1 GB	Low	—
Q3_K_S	3	3.9 GB	Low	—
NVFP4	4	4.5 GB	Medium	—
Q4_K_M	4	4.9 GB	Medium	—
Q5_K_M	5	5.8 GB	High	—
Q6_K	6	6.6 GB	High	—
Q8_0	8	8.6 GB	Very High	—
F16	16	16.4 GB	Maximum	—

Quality benchmarks

Nemotron Nano 8B benchmark scores

Benchmark verified

Reasoning

MMLU-Pro—

GPQA Diamond54.1%

MATH-50095.4%

ARC Challenge—

General

Chatbot Arena—

IFEval74.7%

Source: official · 2025-03-18

Hardware-Kompatibilität

Eignungsschätzungen für alle Hardware

Rechner öffnen

Computing compatibility...

Speicheraufschlüsselung

Reference: RTX 2060 6GB

Weights4.9 GB

KV Cache2.0 GB

Runtime1.2 GB

Headroom0.6 GB

Häufig gestellte Fragen

FAQ — Nemotron Nano 8B

Siehe auch

Quantisierungsleitfaden Bewertungsmethodik Rechner öffnen