Bartowski

Llama 3.2 3B Instruct

Name: Llama 3.2 3B Instruct
Rating: 47 (196 reviews)
Author: Bartowski

HuggingFace

Limited data available: some specifications may be incomplete or estimated.

Downloads: 407.1K
Likes: 193
Context: 0K tokens
License: Unknown
Quality: 5 (Initial)

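Llama 3.2 3B Instruct (3B parameters) requires approximately 4.3 GB of VRAM in total with Q5_K_M quantization: about 2.2 GB for the quantized weights, with the remainder going to the KV cache, runtime overhead, and headroom. For the best balance of quality and speed, we recommend hardware with at least 5 GB of VRAM.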

Get started

Copy-paste commands to run Llama 3.2 3B Instruct locally on your machine.

Run

# --run tells the :full image's entrypoint to start llama-cli with the remaining arguments
docker run --rm -it ghcr.io/ggerganov/llama.cpp:full \
  --run \
  --hf-repo "bartowski/Llama-3.2-3B-Instruct-GGUF" \
  --hf-file "Llama-3.2-3B-Instruct-Q5_K_M.gguf" \
  -c 4096 -ngl 99
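
To try a different quantization level, only the `--hf-file` argument needs to change. The Python sketch below assembles the model arguments for a given quant label so they can be appended to the command above. The helper name `llama_cli_args` and the file-name pattern are assumptions made for illustration, so check the repository's file list before relying on them.

```python
import shlex

# Repository name taken from the command above.
REPO = "bartowski/Llama-3.2-3B-Instruct-GGUF"


def llama_cli_args(quant, n_ctx=4096, n_gpu_layers=99):
    """Return the model-related arguments for one quantization level."""
    # ASSUMPTION: files follow the pattern Llama-3.2-3B-Instruct-<QUANT>.gguf.
    hf_file = f"Llama-3.2-3B-Instruct-{quant}.gguf"
    return [
        "--hf-repo", REPO,
        "--hf-file", hf_file,
        "-c", str(n_ctx),          # context window in tokens
        "-ngl", str(n_gpu_layers),  # number of layers to offload to the GPU
    ]


if __name__ == "__main__":
    # Print a shell-quoted argument string for the Q4_K_M variant.
    print(shlex.join(llama_cli_args("Q4_K_M")))
```

Keeping the arguments in a list rather than a single string avoids quoting mistakes if the values are later passed to `subprocess.run`.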

Quick specs

Parameters: 3B

Architecture: dense

Context: 0K tokens

Modality: text

Min RAM: 1.2 GB

Rec. RAM: 2.2 GB (Q5_K_M)

License: Unknown

Family: Llama

✓ Chat

Related models

Quick picks

Best value (C): Intel Arc A380 6GB, ~$139, 42 tok/s

Best overall (C): RTX 2060 6GB, ~$349, 42 tok/s

Best hardware

Top options for Llama 3.2 3B Instruct

Run this model

Llama 3.2 3B Instruct on RTX 2060 6GB
Llama 3.2 3B Instruct on RTX 4050 Laptop 6GB
Llama 3.2 3B Instruct on GTX 1060 6GB

Quantization options

VRAM estimates by quantization level

No hardware detected; the Fit column shows raw VRAM estimates.

Quant	Bits	VRAM (weights)	Quality	Fit
Q2_K	2	1.2 GB	Low	—
Q3_K_S	3	1.5 GB	Low	—
NVFP4	4	1.7 GB	Medium	—
Q4_K_M	4	1.8 GB	Medium	—
Q5_K_M	5	2.2 GB	High	—
Q6_K	6	2.5 GB	High	—
Q8_0	8	3.2 GB	Very High	—
F16	16	6.1 GB	Maximum	—
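
The table can be turned into a simple selection rule: pick the largest quantization whose weights, plus a fixed overhead, still fit in the available VRAM. The Python sketch below does that with the figures from the table. The function name `largest_quant_that_fits` is hypothetical, and the 2.2 GB overhead is an assumption borrowed from the page's memory breakdown for the RTX 2060 reference (KV cache 0.4 + runtime 1.2 + headroom 0.6); real overhead varies with context length and backend.

```python
from typing import Optional

# (quant label, estimated weight size in GB), copied from the table above
# and ordered from smallest to largest.
QUANT_WEIGHTS_GB = [
    ("Q2_K", 1.2), ("Q3_K_S", 1.5), ("NVFP4", 1.7), ("Q4_K_M", 1.8),
    ("Q5_K_M", 2.2), ("Q6_K", 2.5), ("Q8_0", 3.2), ("F16", 6.1),
]

# ASSUMPTION: a fixed non-weight overhead, taken from the RTX 2060 reference
# breakdown on this page (KV cache 0.4 + runtime 1.2 + headroom 0.6).
OVERHEAD_GB = 2.2


def largest_quant_that_fits(vram_gb: float,
                            overhead_gb: float = OVERHEAD_GB) -> Optional[str]:
    """Return the largest quant whose weights plus overhead fit in vram_gb."""
    best = None
    for name, weights_gb in QUANT_WEIGHTS_GB:
        if weights_gb + overhead_gb <= vram_gb:
            best = name  # later entries are larger, so keep overwriting
    return best


if __name__ == "__main__":
    for vram in (3.0, 4.5, 6.0, 12.0):
        print(f"{vram} GB VRAM:", largest_quant_that_fits(vram))
```

Under these assumptions a 6 GB card has room for Q8_0, while F16 needs noticeably more than 8 GB once the overhead is counted.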

Hardware compatibility

Compatibility estimates for all hardware


Memory breakdown

Reference: RTX 2060 6GB

Weights: 2.2 GB

KV Cache: 0.4 GB

Runtime: 1.2 GB

Headroom: 0.6 GB
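
As a worked check, the four components add up to about 4.4 GB, close to the roughly 4.3 GB total quoted at the top of the page (the small gap is presumably rounding). The Python sketch below sums the breakdown and also estimates the KV cache from first principles for a 4096-token context. The architecture values (28 layers, 8 key/value heads, head dimension 128, 16-bit cache) are assumptions taken from the published Llama 3.2 3B configuration, not from this page, and the function names are illustrative.

```python
# Component sizes in GB from the memory breakdown above (RTX 2060 6GB reference).
BREAKDOWN_GB = {"weights": 2.2, "kv_cache": 0.4, "runtime": 1.2, "headroom": 0.6}


def total_gb(breakdown):
    """Sum the components, rounded to one decimal like the page's figures."""
    return round(sum(breakdown.values()), 1)


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Size of the KV cache: one key and one value vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


if __name__ == "__main__":
    total = total_gb(BREAKDOWN_GB)
    print(f"total: {total} GB, fits in 6 GB: {total <= 6.0}")

    # ASSUMPTION: 28 layers, 8 KV heads, head_dim 128, fp16 cache entries.
    kv = kv_cache_bytes(n_ctx=4096, n_layers=28, n_kv_heads=8, head_dim=128)
    print(f"KV cache at 4096 tokens: {kv / 2**30:.2f} GiB")
```

With those assumed values the KV cache comes to about 0.44 GiB, which is consistent with the 0.4 GB listed above. Because the cache scales linearly with context length, doubling `-c` doubles this term.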

Frequently asked questions

FAQ: Llama 3.2 3B Instruct

See also

Quantization Guide
Scoring Methodology
Open calculator