NVIDIA

Nemotron 70B

Name: Nemotron 70B
Rating: 69 (44 reviews)
Author: NVIDIA

HuggingFace

Ollama

Downloads: 35
Likes: 568
Release date: Oct 2024
Context: 131K tokens
License: NVIDIA Open Model
Quality: 52 (Good)

Nemotron 70B (70B parameters) requires approximately 49.1 GB of VRAM with Q4_K_M quantization: 42.7 GB for the weights plus about 6.4 GB for KV cache, runtime, and headroom. For the best balance of quality and speed, we recommend hardware with at least 57 GB of VRAM.

Quick start

Copy-paste commands to run Nemotron 70B on your machine.

Run

ollama run nemotron
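
Once the model is running, it can also be queried from a script. The following is a minimal sketch, assuming a local Ollama server on its default port (11434) and the `nemotron` tag from the command above; the helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is serving locally on its default port and the
# "nemotron" model has already been pulled (e.g. via `ollama run nemotron`).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "nemotron") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "nemotron", timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (requires the server to be running):
# print(generate("Explain quantization in one sentence."))
```

The generous timeout is a guess: a 70B model can take a while to load and respond on first use, so adjust it to your hardware.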

Quick specs

Parameters: 70B
Architecture: dense
Context: 131K tokens
Modality: text
Min RAM: 27.3 GB
Rec. RAM: 42.7 GB (Q4_K_M)
License: NVIDIA Open Model
Family: Nemotron

✓ Chat  ✓ Reasoning

About this model

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries.

• Please sign up to get free and immediate access to NVIDIA NeMo Framework container. If you don’t have an NVIDIA NGC account, you will be...
• If you don’t have an NVIDIA NGC API key, sign into NVIDIA NGC, selecting organization/team: ea-bignlp/ga-participants and click Generate API key....
• On your machine, docker login to nvcr.io using

Best choices for Nemotron 70B

Run this model

Nemotron 70B on NVIDIA H100 80GB
Nemotron 70B on NVIDIA H800 80GB
Nemotron 70B on NVIDIA GH200 96GB

Quantization options

VRAM estimates by quantization level

No hardware detected; the Fit column shows raw VRAM estimates.

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	27.3 GB	Low	—
Q3_K_S	3	34.3 GB	Low	—
NVFP4	4	39.2 GB	Medium	—
Q4_K_M	4	42.7 GB	Medium	—
Q5_K_M	5	50.4 GB	High	—
Q6_K	6	57.4 GB	High	—
Q8_0	8	74.9 GB	Very High	—
F16	16	143.5 GB	Maximum	—
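
With no hardware detected, the Fit column can be approximated by hand. The sketch below copies the VRAM estimates from the table above and picks the largest quantization that fits a given budget. It assumes that a larger estimate means higher quality, which matches the Quality column here; the function name `best_fit` and the `overhead_gb` parameter are illustrative.

```python
from typing import Optional

# (quantization, nominal bits, estimated VRAM in GB), copied from the table above.
QUANTS = [
    ("Q2_K", 2, 27.3),
    ("Q3_K_S", 3, 34.3),
    ("NVFP4", 4, 39.2),
    ("Q4_K_M", 4, 42.7),
    ("Q5_K_M", 5, 50.4),
    ("Q6_K", 6, 57.4),
    ("Q8_0", 8, 74.9),
    ("F16", 16, 143.5),
]


def best_fit(vram_gb: float, overhead_gb: float = 0.0) -> Optional[str]:
    """Return the largest quantization whose estimate plus overhead fits.

    overhead_gb is an optional allowance for KV cache, runtime, and
    headroom; the default of 0 compares against the raw table values.
    """
    fitting = [q for q in QUANTS if q[2] + overhead_gb <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[2])[0]


print(best_fit(80.0))                   # Q8_0
print(best_fit(48.0))                   # Q4_K_M
print(best_fit(48.0, overhead_gb=6.4))  # NVFP4
print(best_fit(24.0))                   # None
```

Passing a nonzero `overhead_gb` gives a more conservative answer than the raw comparison, since the table lists estimates without an explicit allowance for KV cache or runtime.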

Quality benchmarks

Nemotron 70B benchmark scores

Benchmark verified

Reasoning

MMLU-Pro: 85.2%
GPQA Diamond: 1.1%
MATH-500: 42.7%
ARC Challenge: not reported

General

Chatbot Arena: not reported
IFEval: 73.8%

Source: community · 2024-10-16

Hardware compatibility

Fit estimates for all hardware

Detailed memory breakdown

Reference: RTX 2060 6GB

Weights: 42.7 GB
KV Cache: 4.9 GB
Runtime: 0.9 GB
Headroom: 0.6 GB
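
The four components above add up to the 49.1 GB quoted for Q4_K_M. As a quick check, this sketch sums them and reports how far a given card falls short; the values are copied from the breakdown, and the helper name `shortfall_gb` is illustrative.

```python
# Memory components in GB, copied from the breakdown above (Q4_K_M).
BREAKDOWN_GB = {
    "weights": 42.7,
    "kv_cache": 4.9,
    "runtime": 0.9,
    "headroom": 0.6,
}

# Round to one decimal to match the precision used on this page.
TOTAL_GB = round(sum(BREAKDOWN_GB.values()), 1)


def shortfall_gb(available_gb: float) -> float:
    """Return how many GB are missing on a device (0.0 if the model fits)."""
    return round(max(0.0, TOTAL_GB - available_gb), 1)


print(TOTAL_GB)            # 49.1
print(shortfall_gb(6.0))   # 43.1, for the 6 GB reference card
print(shortfall_gb(80.0))  # 0.0, for an 80 GB card
```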

See also

Quantization guide · Scoring methodology · Open calculator