Alibaba

Qwen 3.5 35B A3B

State of the art
Released: Jun 2025 · Context: 131K tokens · License: Apache 2.0 · Quality: 97 (Excellent)

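Qwen 3.5 35B A3B (35B total parameters) requires approximately 24.6 GB of VRAM with Q4_K_M quantization: 21.3 GB for the weights plus KV cache, runtime, and headroom. As a Mixture of Experts model with 3B active parameters, it computes with only a fraction of its weights per token, but all 35B parameters must still be loaded, so memory use follows the total parameter count rather than the active one. For the best balance of quality and speed, we recommend hardware with at least 29 GB of VRAM.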

Getting started

Copy-paste commands to run Qwen 3.5 35B A3B on your machine.

Run

ollama run qwen3.5:35b-a3b
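
Once the model is available locally, it can also be called from a script through Ollama's local HTTP API. The sketch below assumes a default Ollama install listening on localhost:11434 and reuses the model tag from the command above. `build_request` and `generate` are illustrative helper names, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag copied from the run command on this page.
MODEL_TAG = "qwen3.5:35b-a3b"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Explain KV cache in one sentence.")` should return the model's reply as a string. Setting `"stream": False` keeps the example short by returning one JSON object instead of a stream of chunks.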

Quick specs

Parameters: 35B (3B active)
Architecture: MoE
Context: 131K tokens
Modality: text
Min RAM: 13.7 GB
Rec. RAM: 21.3 GB (Q4_K_M)
License: Apache 2.0
Family: Qwen
Tags: Code, Chat

About this model

Qwen 3.5 35B A3B is a Mixture-of-Experts model with only 3B active parameters per token, offering strong performance at low per-token compute cost.

  • MoE architecture: 35B total params but only 3B active per token, so inference is fast for its size
  • Quality comparable to dense 7-8B models at a fraction of the compute
  • Needs about 21.3 GB for weights at Q4_K_M, so it does not fit entirely in 8 GB of VRAM
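
To make "only 3B active" concrete, here is a toy sketch of top-k expert routing, the general mechanism MoE layers use to pick a few experts per token. The expert count (8), `k=2`, and the gate scores are hypothetical values chosen for illustration. They are not this model's actual configuration, and `route_top_k` is an illustrative name.

```python
import math


def route_top_k(gate_scores: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and renormalize their softmax weights."""
    # Softmax over all experts (subtract the max for numerical stability).
    peak = max(gate_scores)
    exps = [math.exp(s - peak) for s in gate_scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts; the rest are skipped for this token.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in chosen)
    return {i: probs[i] / kept for i in chosen}


# Hypothetical router output for one token over 8 experts.
scores = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
weights = route_top_k(scores, k=2)

# Share of parameters used per token, from the figures on this page (3B of 35B).
active_fraction = 3 / 35
```

Only the selected experts run for a given token, which is why compute scales with the active parameters. Every expert still has to be stored, which is why memory scales with the total.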

Quantization options

Estimated VRAM by quantization level

Quant    Bits  VRAM     Quality
Q2_K     2     13.7 GB  Low
Q3_K_S   3     17.2 GB  Low
NVFP4    4     19.6 GB  Medium
Q4_K_M   4     21.3 GB  Medium
Q5_K_M   5     25.2 GB  High
Q6_K     6     28.7 GB  High
Q8_0     8     37.5 GB  Very High
F16      16    71.8 GB  Maximum
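
One way to use these estimates is to pick the highest-quality quantization that fits a given VRAM budget. The sketch below makes two assumptions. It treats the VRAM column as weights only, since the Q4_K_M figure matches the "Weights" entry in the memory breakdown on this page. It also applies the same 3.3 GB overhead (KV cache 1.5 + runtime 1.2 + headroom 0.6) to every level, although real overhead varies with context length and runtime. `largest_fitting_quant` is an illustrative helper name.

```python
from typing import Optional

# (name, weights in GB) copied from the estimates on this page, smallest first.
QUANTS = [
    ("Q2_K", 13.7), ("Q3_K_S", 17.2), ("NVFP4", 19.6), ("Q4_K_M", 21.3),
    ("Q5_K_M", 25.2), ("Q6_K", 28.7), ("Q8_0", 37.5), ("F16", 71.8),
]

# Assumption: fixed overhead taken from the page's Q4_K_M memory breakdown.
OVERHEAD_GB = 3.3


def largest_fitting_quant(vram_gb: float, overhead_gb: float = OVERHEAD_GB) -> Optional[str]:
    """Return the largest quantization whose weights plus overhead fit, or None."""
    best = None
    for name, weights_gb in QUANTS:
        if weights_gb + overhead_gb <= vram_gb:
            best = name  # list is sorted, so later matches are larger
    return best
```

For example, `largest_fitting_quant(24)` selects NVFP4 under these assumptions, because Q4_K_M plus overhead comes to 24.6 GB, just over a 24 GB card.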

Quality benchmarks

Qwen 3.5 35B A3B benchmark scores

Benchmark verified

Coding

SWE-bench Verified: 69.2%
HumanEval+: not listed
Aider Polyglot: not listed
LiveCodeBench: 74.6%

Reasoning

MMLU-Pro: 85.3%
GPQA Diamond: 84.2%
MATH-500: not listed
ARC Challenge: not listed

General

Chatbot Arena: not listed
IFEval: 91.9%

Source: official · 2026-03-01

Memory breakdown

Reference: RTX 2060 6GB

Weights: 21.3 GB
KV Cache: 1.5 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
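
As a quick check, the four components can be summed and compared against the reference GPU. The figures are taken from this page, the variable names are illustrative, and the 6 GB capacity is read off the reference card's name.

```python
# Memory components in GB, as listed in the breakdown (Q4_K_M).
breakdown_gb = {"weights": 21.3, "kv_cache": 1.5, "runtime": 1.2, "headroom": 0.6}

# Total requirement, rounded to one decimal to match the page's precision.
total_gb = round(sum(breakdown_gb.values()), 1)

# Reference GPU from this page: RTX 2060 6GB.
reference_gpu_gb = 6.0
shortfall_gb = round(total_gb - reference_gpu_gb, 1)
```

The total matches the 24.6 GB quoted at the top of the page, and the reference card falls 18.6 GB short of holding everything in VRAM.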
