InternLM Chat 7B

Name: InternLM Chat 7B
Rating: 69 (142 reviews)
Author: InternLM

Legacy

HuggingFace

34.8K downloads · 101 likes · Published Jul 2023 · 8K-token context · Apache 2.0 license · Quality: 50 (Good)

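InternLM Chat 7B (7B parameters) needs approximately 13.9 GB of VRAM in total with Q4_K_M quantization: about 4.3 GB for the weights, with the remainder going to the KV cache, runtime overhead, and headroom. For the best balance of quality and speed, we recommend hardware with at least 16 GB of VRAM.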

Get started

Copy-paste commands to run InternLM Chat 7B on your machine.

Run

# The :full image dispatches on its first argument; --run selects the CLI.
docker run --rm -it ghcr.io/ggerganov/llama.cpp:full \
  --run \
  --hf-repo "InternLM/InternLM-Chat-7B" \
  --hf-file "InternLM-Chat-7B-Q4_K_M.gguf" \
  -c 4096 -ngl 99

Quick specs

Parameters: 7B
Architecture: dense
Context: 8K tokens
Modality: text
Min RAM: 2.7 GB
Rec. RAM: 4.3 GB (Q4_K_M)
License: Apache 2.0
Family: InternLM
Capabilities: Chat, Reasoning

About this model

InternLM has open-sourced a 7-billion-parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

• It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
• It supports an 8K context window length, enabling longer input sequences and stronger reasoning capabilities.
• It provides a versatile toolset for users to flexibly build their own workflows.

Quick picks

Best budget (A): RX 7600 XT 16GB, ~$329, 39 tok/s

Best overall (A): RX 7900 XT 20GB, ~$899, 98 tok/s
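
One way to compare the two picks is price per unit of throughput. The following is a minimal Python sketch using the approximate prices and token rates listed above; the helper name `dollars_per_tok_s` is an illustrative assumption, and the result is only as accurate as the rounded figures it starts from.

```python
# Approximate price (USD) and throughput (tokens/s) from the quick picks above.
PICKS = {
    "RX 7600 XT 16GB": (329, 39),
    "RX 7900 XT 20GB": (899, 98),
}


def dollars_per_tok_s(price_usd, tok_per_s):
    """Price divided by throughput; lower means more speed per dollar."""
    return price_usd / tok_per_s


if __name__ == "__main__":
    for name, (price, rate) in PICKS.items():
        print(f"{name}: {dollars_per_tok_s(price, rate):.2f} USD per tok/s")
```

On these numbers the budget card costs slightly less per token per second, while the larger card is faster in absolute terms.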

Best hardware

Top recommendations for InternLM Chat 7B

Run this model

InternLM Chat 7B on RX 7900 XT 20GB
InternLM Chat 7B on RTX A4500 20GB
InternLM Chat 7B on RTX 3090 24GB

Quantization options

VRAM estimates by quantization level

Quant	Bits	VRAM (weights)	Quality
Q2_K	2	2.7 GB	Low
Q3_K_S	3	3.4 GB	Low
NVFP4	4	3.9 GB	Medium
Q4_K_M	4	4.3 GB	Medium
Q5_K_M	5	5.0 GB	High
Q6_K	6	5.7 GB	High
Q8_0	8	7.5 GB	Very High
F16	16	14.3 GB	Maximum
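
The table can be turned into a small lookup that picks the highest-precision quantization whose estimate fits a given memory budget. The following is a minimal Python sketch using the figures from the table. The function name `best_quant_for` and the `overhead_gb` parameter are illustrative assumptions and do not belong to any tool mentioned on this page. The table figures appear to cover the weights only, so real usage will be higher once the KV cache and runtime are added.

```python
# VRAM estimates copied from the quantization table above: (name, bits, GB).
QUANTS = [
    ("Q2_K", 2, 2.7),
    ("Q3_K_S", 3, 3.4),
    ("NVFP4", 4, 3.9),
    ("Q4_K_M", 4, 4.3),
    ("Q5_K_M", 5, 5.0),
    ("Q6_K", 6, 5.7),
    ("Q8_0", 8, 7.5),
    ("F16", 16, 14.3),
]


def best_quant_for(budget_gb, overhead_gb=0.0):
    """Return the largest listed quant whose estimate plus overhead fits, or None.

    overhead_gb is a hypothetical allowance for KV cache and runtime.
    """
    fitting = [q for q in QUANTS if q[2] + overhead_gb <= budget_gb]
    if not fitting:
        return None
    # In the table, a larger estimate corresponds to an equal or higher quality tier.
    return max(fitting, key=lambda q: q[2])[0]


if __name__ == "__main__":
    for budget in (4.0, 6.0, 8.0, 16.0):
        print(budget, "GB ->", best_quant_for(budget))
```

Passing a non-zero `overhead_gb` gives a more conservative pick, for example when a long context is planned.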

Memory breakdown

Reference: RTX 2060 6GB

Weights: 4.3 GB
KV Cache: 7.8 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
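
The four components add up to the roughly 13.9 GB quoted near the top of the page, which is more than the 6 GB of the reference card. The following is a small Python sketch of that arithmetic; the helper names are illustrative assumptions, and the 16 GB comparison uses the recommended figure from the summary.

```python
# Component estimates from the memory breakdown above, in GB.
BREAKDOWN_GB = {
    "weights": 4.3,
    "kv_cache": 7.8,
    "runtime": 1.2,
    "headroom": 0.6,
}


def total_gb(breakdown):
    """Sum the components, rounded to one decimal like the page's figures."""
    return round(sum(breakdown.values()), 1)


def fits(breakdown, vram_gb):
    """True if the summed estimate fits within the given VRAM."""
    return total_gb(breakdown) <= vram_gb


if __name__ == "__main__":
    print("total:", total_gb(BREAKDOWN_GB), "GB")
    for vram in (6, 16):
        print(vram, "GB card fits:", fits(BREAKDOWN_GB, vram))
```

The KV cache is the largest term here, so a shorter context is one plausible way to bring the total down on smaller cards.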
