Llama 3.3 70B Instruct

Name: Llama 3.3 70B Instruct
Rating: 46 (44 reviews)
Author: MaziyarPanahi

Limited data available: some specifications may be incomplete or estimated.

Context: 0K tokens · License: Unknown · Quality: 4 (Entry)

Llama 3.3 70B Instruct (70B parameters) requires approximately 52.7 GB of VRAM with Q4_K_M quantization: 42.7 GB for the weights, plus KV cache, runtime, and headroom. For the best balance of quality and speed, we recommend hardware with at least 61 GB of VRAM.

Quick specs

Parameters: 70B
Architecture: dense
Context: 0K tokens
Modality: text
Min RAM: 27.3 GB
Rec. RAM: 42.7 GB (Q4_K_M)
License: Unknown
Family: Llama

✓ Chat

Quick picks

Best budget (grade C): MacBook Pro M3 Max 128GB, ~$2,499, 6 tok/s

Best overall (grade B): NVIDIA H100 80GB, ~$40,000, 66 tok/s
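One rough way to compare the two picks is cost per unit of throughput. The sketch below divides each listed approximate price by its listed tokens per second. The `Pick` class and the `dollars_per_tok_s` method are illustrative names, not part of the site's scoring method, and the metric ignores power, batch throughput, and everything else a real purchase decision would weigh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pick:
    """A hardware pick with the approximate price and speed listed on this page."""
    name: str
    price_usd: float
    tok_per_s: float

    def dollars_per_tok_s(self) -> float:
        # Lower is better: dollars spent per token/second of generation speed.
        return self.price_usd / self.tok_per_s


# Values copied from the quick picks; both prices are approximate ("~").
PICKS = [
    Pick("MacBook Pro M3 Max 128GB", 2499, 6),
    Pick("NVIDIA H100 80GB", 40000, 66),
]

if __name__ == "__main__":
    for pick in sorted(PICKS, key=Pick.dollars_per_tok_s):
        print(f"{pick.name}: about ${pick.dollars_per_tok_s():.2f} per tok/s")
```

By this measure the budget pick costs less per token per second, while the H100 is roughly eleven times faster in absolute terms.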

Best hardware

Best options for Llama 3.3 70B Instruct

Run this model

Llama 3.3 70B Instruct on NVIDIA H100 80GB
Llama 3.3 70B Instruct on NVIDIA H800 80GB
Llama 3.3 70B Instruct on NVIDIA GH200 96GB

Quantization options

VRAM estimates by quantization level

No hardware detected — fit column shows raw VRAM estimates

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	27.3 GB	Low	—
Q3_K_S	3	34.3 GB	Low	—
NVFP4	4	39.2 GB	Medium	—
Q4_K_M	4	42.7 GB	Medium	—
Q5_K_M	5	50.4 GB	High	—
Q6_K	6	57.4 GB	High	—
Q8_0	8	74.9 GB	Very High	—
F16	16	143.5 GB	Maximum	—
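The table can be used to pick a quantization level for a given VRAM budget. The sketch below is a minimal illustration under stated assumptions: it treats the largest estimate that fits as the highest-quality choice, lets the caller reserve extra memory through a hypothetical `overhead_gb` parameter, and derives effective bits per weight assuming exactly 70 billion parameters and 1 GB = 10^9 bytes. The function names are invented for this example and do not reflect how the site computes its Fit column.

```python
# (name, nominal bits, estimated VRAM in GB), copied from the table above.
QUANTS = [
    ("Q2_K", 2, 27.3),
    ("Q3_K_S", 3, 34.3),
    ("NVFP4", 4, 39.2),
    ("Q4_K_M", 4, 42.7),
    ("Q5_K_M", 5, 50.4),
    ("Q6_K", 6, 57.4),
    ("Q8_0", 8, 74.9),
    ("F16", 16, 143.5),
]


def best_quant_for(vram_gb, overhead_gb=0.0):
    """Return the name of the largest quant that fits, or None if none fit.

    overhead_gb is an assumed allowance for KV cache, runtime, and headroom.
    """
    fitting = [q for q in QUANTS if q[2] + overhead_gb <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[2])[0]


def effective_bits_per_weight(vram_gb, params_billion=70.0):
    """Bits per parameter implied by a VRAM estimate (assumes 1 GB = 1e9 bytes)."""
    return vram_gb * 8 / params_billion


if __name__ == "__main__":
    for budget in (24, 48, 80):
        print(budget, "GB ->", best_quant_for(budget))
    for name, bits, gb in QUANTS:
        print(f"{name}: nominal {bits}, effective {effective_bits_per_weight(gb):.2f}")
```

The effective figures come out higher than the nominal bit widths, which is expected: K-quant formats store scales alongside the weights, and the estimates may include other overhead.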


Memory breakdown

Reference: RTX 2060 6GB

Weights: 42.7 GB
KV Cache: 8.2 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
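The four components add up to the 52.7 GB total quoted for Q4_K_M at the top of the page. A minimal sketch of that arithmetic, with a simple fit check against a GPU's VRAM, follows; the dictionary keys and function names are illustrative, and the comparison assumes the whole model must sit in a single device's memory with no offloading.

```python
# Component estimates in GB, copied from the memory breakdown above.
BREAKDOWN_GB = {
    "weights": 42.7,
    "kv_cache": 8.2,
    "runtime": 1.2,
    "headroom": 0.6,
}


def total_gb(breakdown):
    """Sum the components, rounded to one decimal to absorb float noise."""
    return round(sum(breakdown.values()), 1)


def fits(breakdown, gpu_vram_gb):
    """True if the full breakdown fits in the given VRAM (single device assumed)."""
    return total_gb(breakdown) <= gpu_vram_gb


if __name__ == "__main__":
    print("Total:", total_gb(BREAKDOWN_GB), "GB")
    print("Fits RTX 2060 6GB:", fits(BREAKDOWN_GB, 6.0))
    print("Fits 80 GB card:", fits(BREAKDOWN_GB, 80.0))
```

Against the 6 GB reference card the total is far out of reach, which matches the page's recommendation of hardware with at least 61 GB of VRAM.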
