NVIDIA

Nemotron Nano 8B

Name: Nemotron Nano 8B
Rating: 83 (185 reviews)
Author: NVIDIA

48.0KDescargas220Me gustaMar 2025Publicado131K tokensContextoNVIDIA Open ModelLicencia82 FuerteCalidad

Nemotron Nano 8B (8B parameters) requires approximately 8.6 GB of VRAM with Q4_K_M quantization. For the best balance of quality and speed, we recommend hardware with at least 10 GB of VRAM.

Comenzar

— copia y pega para ejecutar en local

Copy-paste commands to run Nemotron Nano 8B on your machine.

Run

lms load Llama-3.1-Nemotron-Nano-8B-v1 && lms server start

Quick specs

Parameters8B

Architecturedense

Context131K tokens

Modalitytext

Min RAM3.1 GB

Rec. RAM4.9 GB (Q4_K_M)

LicenseNVIDIA Open Model

FamilyNemotron

✓ Chat✓ Reasoning

About this model

Nemotron Nano 8B is NVIDIA's reasoning model derived from Llama 3.1 8B Instruct, post-trained for switchable reasoning with on/off modes. Achieves 95.4% on MATH-500 and 54.1% on GPQA Diamond with reasoning enabled. Fits on a single RTX GPU for local deployment.

•Switchable reasoning: toggle detailed thinking on/off via system prompt
•95.4% on MATH-500 with reasoning on, 36.6% with reasoning off
•Derived from Llama 3.1 8B with multi-phase post-training
•Fits on a single RTX GPU for local inference

Modelos relacionados

Selecciones rápidas

Mejor económicoS

Intel Arc B570 10GB~$219 — 45 tok/s

Mejor en generalS

RTX 3080 Ti 12GB~$1,199 — 112 tok/s

Mejor hardware

Mejores opciones para Nemotron Nano 8B

Ejecutar este modelo

Nemotron Nano 8B on RTX 3080 Ti 12GB Nemotron Nano 8B on RTX 5070 12GB Nemotron Nano 8B on RTX 3080 12GB

Opciones de cuantización

Estimaciones de VRAM por nivel de cuantización

No hardware detected — fit column shows raw VRAM estimates

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	3.1 GB	Low	—
Q3_K_S	3	3.9 GB	Low	—
NVFP4	4	4.5 GB	Medium	—
Q4_K_M	4	4.9 GB	Medium	—
Q5_K_M	5	5.8 GB	High	—
Q6_K	6	6.6 GB	High	—
Q8_0	8	8.6 GB	Very High	—
F16	16	16.4 GB	Maximum	—

Quality benchmarks

Nemotron Nano 8B benchmark scores

Benchmark verified

Reasoning

MMLU-Pro—

GPQA Diamond54.1%

MATH-50095.4%

ARC Challenge—

General

Chatbot Arena—

IFEval74.7%

Source: official · 2025-03-18

Compatibilidad de hardware

Estimaciones de encaje en todo el hardware

Abrir calculadora

Computing compatibility...

Desglose de memoria

Reference: RTX 2060 6GB

Weights4.9 GB

KV Cache2.0 GB

Runtime1.2 GB

Headroom0.6 GB

Preguntas frecuentes

FAQ — Nemotron Nano 8B

Ver también

Guía de cuantización Metodología de puntuación Abrir calculadora