DeepSeek
DeepSeek V4 Flash
Frontier · 3.4M downloads · 1.3K likes · Released Apr 2026 · 1.0M-token context · MIT license · Quality: 98 (Outstanding)
DeepSeek V4 Flash (284B parameters) requires approximately 160.8 GB of VRAM with NVFP4 quantization. It is a Mixture of Experts model with 13B active parameters, so each token uses far less compute than the total parameter count suggests, but all 284B parameters must still fit in memory. For the best balance of quality and speed, we recommend hardware with at least 185 GB of VRAM.
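As a rough sanity check on these figures, weight memory can be approximated as parameter count times bits per weight. The sketch below is a back-of-envelope estimate, not the estimator this page uses. The function name `nominal_weight_gb` is hypothetical, and it assumes 1 GB = 10^9 bytes while ignoring quantization scales, KV cache, and runtime overhead, so it is expected to land below the reported number.

```python
def nominal_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Nominal weight size in GB (assuming 1 GB = 1e9 bytes), with no overhead."""
    return params_billions * bits_per_weight / 8


# All 284B parameters are stored, whether or not an expert is active.
total_4bit_gb = nominal_weight_gb(284, 4)

# Only about 13B parameters are read for any single token.
active_4bit_gb = nominal_weight_gb(13, 4)

print(f"nominal 4-bit weights: {total_4bit_gb} GB")
print(f"nominal 4-bit active weights per token: {active_4bit_gb} GB")
```

The nominal 4-bit total (142 GB) sits below the roughly 160 GB quoted for NVFP4, which is consistent with the reported estimate including overhead that this sketch leaves out.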
Get started
Copy-paste commands to run DeepSeek V4 Flash on your machine.
Run
docker run --rm -it ghcr.io/ggerganov/llama.cpp:full \
--hf-repo "deepseek-ai/DeepSeek-V4-Flash" \
--hf-file "DeepSeek-V4-Flash-NVFP4.gguf" \
-c 4096 -ngl 99
Quick specs
Parameters: 284B (13B active)
Architecture: MoE
Context: 1.0M tokens
Modality: text
Min RAM: 110.8 GB
Rec. RAM: 159 GB (NVFP4)
License: MIT
Family: DeepSeek
✓ Code ✓ Reasoning
About this model
- 284B total / 13B active sparse MoE: 256 routed + 1 shared expert
- Native FP4 experts: ~158 GB on disk
- 1M-token context with near-frontier coding quality
- Runs on a single 192 GB unified-memory box or a small GPU server
Quantization options
VRAM estimates by quantization level
No hardware detected; the Fit column shows raw VRAM estimates.
| Quant | Bits | VRAM | Quality | Fit |
|---|---|---|---|---|
| Q2_K | 2 | 110.8 GB | Low | — |
| Q3_K_S | 3 | 139.2 GB | Low | — |
| NVFP4 | 4 | 159.0 GB | Medium | — |
| Q4_K_M | 4 | 173.2 GB | Medium | — |
| Q5_K_M | 5 | 204.5 GB | High | — |
| Q6_K | 6 | 232.9 GB | High | — |
| Q8_0 | 8 | 303.9 GB | Very High | — |
| F16 | 16 | 582.2 GB | Maximum | — |
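With no hardware detected, the Fit column is empty, so the comparison has to be done by hand. A minimal sketch of that check follows, assuming "fits" simply means the table's VRAM estimate is no larger than the memory available. The helper `largest_fitting_quant` is a hypothetical name, not part of this site or of llama.cpp, and the sketch leaves no extra margin for longer contexts.

```python
from typing import Optional

# VRAM estimates in GB, copied from the table above in ascending order.
QUANT_VRAM_GB = [
    ("Q2_K", 110.8),
    ("Q3_K_S", 139.2),
    ("NVFP4", 159.0),
    ("Q4_K_M", 173.2),
    ("Q5_K_M", 204.5),
    ("Q6_K", 232.9),
    ("Q8_0", 303.9),
    ("F16", 582.2),
]


def largest_fitting_quant(available_gb: float) -> Optional[str]:
    """Return the largest listed quant whose estimate fits, or None if none do."""
    fitting = [name for name, gb in QUANT_VRAM_GB if gb <= available_gb]
    return fitting[-1] if fitting else None


# Example: the 192 GB unified-memory machine mentioned under "About this model".
print(largest_fitting_quant(192))
```

Under these assumptions a 192 GB machine tops out at Q4_K_M, and anything below 110.8 GB fits none of the listed quantizations.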
Quality benchmarks
DeepSeek V4 Flash benchmark scores
Coding
SWE-bench Verified: n/a
HumanEval+: n/a
Aider Polyglot: n/a
LiveCodeBench: 91.6%
Reasoning
MMLU-Pro: 86.2%
GPQA Diamond: n/a
MATH-500: n/a
ARC Challenge: n/a
Source: vendor-reported · 2026-04-24
Memory breakdown
Reference: RTX 2060 6GB
Weights: 158.0 GB
KV Cache: 1.3 GB
Runtime: 0.9 GB
Headroom: 0.6 GB
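The four components above can be summed to cross-check the headline requirement. The short sketch below does only that arithmetic; the dictionary keys are illustrative names, and the values are copied from the breakdown.

```python
# Components in GB, copied from the memory breakdown above.
breakdown_gb = {
    "weights": 158.0,
    "kv_cache": 1.3,
    "runtime": 0.9,
    "headroom": 0.6,
}

# Round to one decimal place to avoid floating-point noise in the sum.
total_gb = round(sum(breakdown_gb.values()), 1)
weights_share = breakdown_gb["weights"] / total_gb

print(f"total: {total_gb} GB")
print(f"weights share: {weights_share:.1%}")
```

The total comes to 160.8 GB, matching the NVFP4 requirement quoted at the top of the page, with the weights accounting for nearly all of it.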