Qwen3.6-35B-A3B Release Date — Open-Weight & GGUF Timeline (April 2026)
qwen3.6-35b-a3b release date status: API launched April 2, 2026; open-weight GGUF for the 35B-A3B MoE expected late April / early May 2026. Latest timeline and links.
This page tracks the Qwen3.6-35B-A3B release date and the full Qwen 3.6 family timeline, including GGUF quantizations and client integrations. Updated daily as the timeline develops.
Current status (April 23, 2026) — RELEASED
| Channel | Status | Date |
|---|---|---|
| Alibaba Cloud API (Qwen 3.6 Plus Preview) | ✅ Live | March 30, 2026 |
| Open-weight Qwen3.6-35B-A3B on Hugging Face | ✅ Live | April 16, 2026 |
| Open-weight Qwen3.6-27B dense on Hugging Face | ✅ Live | April 22, 2026 |
| GGUF quantizations (unsloth, ggml-org, bartowski) | ✅ Live | Within 24-48h of HF |
| vLLM (≥0.19.0) | ✅ Supported | April 17, 2026 |
| SGLang (≥0.5.10) | ✅ Supported | April 17, 2026 |
| LM Studio | ✅ Supported | Rolling |
| Jan | ✅ Supported | Rolling |
| Ollama library | ⏳ In progress | Pending mmproj vision file support |
Actual release timeline
The family rolled out in three waves:
- March 30-31, 2026 — Qwen 3.6 Plus API preview on Alibaba Cloud + free access via OpenRouter
- April 16, 2026 — Qwen3.6-35B-A3B open weights under Apache 2.0
- April 22, 2026 — Qwen3.6-27B dense open weights, with surprising flagship-level coding benchmarks
The API-to-open-weight gap was 17 days for the 35B-A3B (longer than the 11 days of Qwen 3.5 → Qwen 3.5 OW), and a further 6 days for the 27B dense variant.
Download Qwen 3.6 now
Since open weights shipped, the fastest paths to run Qwen 3.6 locally:
Qwen3.6-35B-A3B MoE (~21 GB Q4, best for fast chat):
# Unsloth GGUF
huggingface-cli download unsloth/Qwen3.6-35B-A3B-GGUF Qwen3.6-35B-A3B-Q4_K_M.gguf
# Or via vLLM
pip install "vllm>=0.19.0"
vllm serve Qwen/Qwen3.6-35B-A3B --max-model-len 262144
Qwen3.6-27B dense (~16.8 GB Q4, best for coding — fits 16 GB GPUs):
# Unsloth GGUF
huggingface-cli download unsloth/Qwen3.6-27B-GGUF Qwen3.6-27B-UD-Q4_K_XL.gguf
# Or via vLLM
vllm serve Qwen/Qwen3.6-27B --max-model-len 262144 --reasoning-parser qwen3
See the dedicated pages for exact quantization tables and buyer advice:
- Qwen3.6-27B VRAM & Hardware Requirements
- Qwen 3.6 VRAM & Hardware Requirements (35B-A3B, with buyer guide)
Monitor future releases
- Qwen HF organization — official weights
- Unsloth Qwen 3.6 docs — recommended GGUFs
- Alibaba QwenLM GitHub — source + training details
Key differences vs Qwen 3.5 35B-A3B
- 1M-token native context (vs 262K in Qwen 3.5). Long documents, multi-file codebases, and agentic workflows benefit the most.
- Same 35B total / 3B active parameters — VRAM and tokens/sec are nearly identical at short context.
- KV cache grows significantly at full 1M context — plan for 20-40 GB of extra VRAM if you push context past 256K.
Related pages
- Qwen 3.6 VRAM & Hardware Requirements (35B-A3B) — exact Q4/Q5/Q6/Q8/FP16 numbers + which GPU to buy
- Qwen3.6-27B VRAM & Hardware Requirements — the dense 27B sibling
- Qwen 3.5 35B-A3B VRAM Requirements — current-generation sibling
- Qwen 3 / 3.5 Family GPU Requirements — original family overview