InternLM Chat 7B

Name: InternLM Chat 7B
Rating: 69 (142 reviews)
Author: InternLM

Legacy

HuggingFace

Downloads: 34.8K
Likes: 101
Published: Jul 2023
Context: 8K tokens
License: Apache 2.0
Quality: 50 (Good)

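InternLM Chat 7B (7B parameters) requires approximately 13.9 GB of VRAM in total with Q4_K_M quantization: 4.3 GB for the weights, 7.8 GB for the KV cache, 1.2 GB of runtime overhead, and 0.6 GB of headroom. For the best balance of quality and speed, we recommend hardware with at least 16 GB of VRAM.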

Get started

Copy-paste commands to run InternLM Chat 7B on your machine.

Run

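# The "full" image dispatches on its first argument; --run starts the CLI.
# -ngl only offloads layers if the image and the container have GPU access.
docker run --rm -it ghcr.io/ggerganov/llama.cpp:full --run \
  --hf-repo "InternLM/InternLM-Chat-7B" \
  --hf-file "InternLM-Chat-7B-Q4_K_M.gguf" \
  -c 4096 -ngl 99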

Quick specs

Parameters: 7B

Architecture: dense

Context: 8K tokens

Modality: text

Min RAM: 2.7 GB

Rec. RAM: 4.3 GB (Q4_K_M)

License: Apache 2.0

Family: InternLM

✓ Chat ✓ Reasoning

About this model

InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

• It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
• It supports an 8K context window length, enabling longer input sequences and stronger reasoning capabilities.
• It provides a versatile toolset for users to flexibly build their own workflows.

Quick picks

Best value (A)

RX 7600 XT 16GB, ~$329, 39 tok/s

Best overall (A)

RX 7900 XT 20GB, ~$899, 98 tok/s
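
The page does not say how its "best value" pick is scored. As a rough sketch, assuming value means price per unit of throughput, the two quoted cards compare as follows (the prices are the approximate figures listed above, and `usd_per_tok_s` is an illustrative name, not a metric the site defines):

```python
# (approximate price in USD, throughput in tok/s) as quoted above.
picks = {
    "RX 7600 XT 16GB": (329, 39),
    "RX 7900 XT 20GB": (899, 98),
}

# Assumed value metric: dollars paid per token-per-second of throughput.
usd_per_tok_s = {name: price / tps for name, (price, tps) in picks.items()}

for name, ratio in usd_per_tok_s.items():
    print(f"{name}: ${ratio:.2f} per tok/s")

# Lower is better under this metric.
best_value = min(usd_per_tok_s, key=usd_per_tok_s.get)
print("Lowest cost per tok/s:", best_value)
```

Under this assumed metric the RX 7600 XT costs less per tok/s, while the RX 7900 XT delivers the higher absolute throughput.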

Best hardware

Top options for InternLM Chat 7B

Run this model

• InternLM Chat 7B on RX 7900 XT 20GB
• InternLM Chat 7B on RTX A4500 20GB
• InternLM Chat 7B on RTX 3090 24GB

Quantization options

VRAM estimates by quantization level

No hardware detected; the Fit column shows raw VRAM estimates.

Quant	Bits	VRAM (weights)	Quality	Fit
Q2_K	2	2.7 GB	Low	—
Q3_K_S	3	3.4 GB	Low	—
NVFP4	4	3.9 GB	Medium	—
Q4_K_M	4	4.3 GB	Medium	—
Q5_K_M	5	5.0 GB	High	—
Q6_K	6	5.7 GB	High	—
Q8_0	8	7.5 GB	Very High	—
F16	16	14.3 GB	Maximum	—
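
To make the table easier to apply, here is a minimal sketch that picks the largest listed quantization fitting a given VRAM budget. The function name `largest_quant_that_fits` and the `reserve_gb` parameter are hypothetical; the reserve stands in for KV cache and runtime overhead, which the table figures do not appear to include.

```python
from typing import Optional

# VRAM estimates in GB, copied from the quantization table.
QUANT_VRAM_GB = {
    "Q2_K": 2.7,
    "Q3_K_S": 3.4,
    "NVFP4": 3.9,
    "Q4_K_M": 4.3,
    "Q5_K_M": 5.0,
    "Q6_K": 5.7,
    "Q8_0": 7.5,
    "F16": 14.3,
}


def largest_quant_that_fits(budget_gb: float, reserve_gb: float = 0.0) -> Optional[str]:
    """Return the largest listed quant whose estimate fits in budget_gb - reserve_gb.

    reserve_gb is a hypothetical allowance for KV cache and runtime overhead.
    Returns None when nothing fits.
    """
    usable = budget_gb - reserve_gb
    fitting = [(size, name) for name, size in QUANT_VRAM_GB.items() if size <= usable]
    return max(fitting)[1] if fitting else None


print(largest_quant_that_fits(6.0))        # Q6_K
print(largest_quant_that_fits(16.0, 9.6))  # Q6_K (6.4 GB left after the reserve)
print(largest_quant_that_fits(2.0))        # None
```

The 9.6 GB reserve in the second call is the sum of the KV cache, runtime, and headroom figures the page quotes (7.8 + 1.2 + 0.6 GB); other context lengths would need a different reserve.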

Hardware compatibility

Compatibility estimates for all hardware

Memory breakdown

Reference: RTX 2060 6GB

Weights: 4.3 GB

KV Cache: 7.8 GB

Runtime: 1.2 GB

Headroom: 0.6 GB
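
As a quick check, the four components above can be summed to see how they relate to the 13.9 GB figure and the 16 GB recommendation quoted earlier on the page. This is a small sketch; the dictionary keys are illustrative labels, not names used by any tool.

```python
# Component estimates in GB, copied from the memory breakdown.
breakdown_gb = {
    "weights": 4.3,
    "kv_cache": 7.8,
    "runtime": 1.2,
    "headroom": 0.6,
}

# Round to one decimal to match the precision of the quoted figures.
total_gb = round(sum(breakdown_gb.values()), 1)
print(f"Total: {total_gb} GB")  # Total: 13.9 GB

# Share of each component in the total, as a percentage.
for name, gb in breakdown_gb.items():
    print(f"{name:9s} {gb:4.1f} GB  {100 * gb / total_gb:5.1f}%")

# Compare against the recommended 16 GB of VRAM.
print("Fits in 16 GB:", total_gb <= 16.0)
```

The KV cache is the largest single term here, larger than the quantized weights themselves, which is why the total is well above the 4.3 GB shown for Q4_K_M in the quantization table.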
