Browse AI Models

374 models available

Mistral · Devstral 2 123B Instruct
123B · 256K ctx · 75 GB · frontier
dense · Top tier

Devstral is an agentic LLM for software engineering tasks. Devstral 2 excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench.

Moonshot AI · Kimi K2.5
1000B (32B active) · 256K ctx · 610 GB · frontier
moe · Top tier

Kimi K2.5 is Moonshot AI's advanced reasoning model with strong performance in math, coding, and multilingual tasks. It features long-context understanding and agentic capabilities for complex multi-step problem solving.

Moonshot AI · Kimi K2.6
1000B (32B active) · 256K ctx · 610 GB · frontier
moe · Top tier

Kimi K2.6 is Moonshot AI's open-weight multimodal agentic model, focused on long-horizon coding, coding-driven design, autonomous execution, and swarm-style task orchestration.

Alibaba · Qwen 3.5 397B A17B
397B (17B active) · 131K ctx · 242.2 GB · frontier
moe · Top tier

Qwen3.5 397B A17B is the flagship of the Qwen3.5 family: a 397B-parameter MoE model (17B active) with frontier-level quality across all tasks.

Alibaba · Qwen3-Coder 30B A3B Instruct
30.5B (3.3B active) · 256K ctx · 18.6 GB · frontier
moe · Top tier

Qwen3-Coder is available in multiple sizes. Qwen3-Coder-30B-A3B-Instruct is a streamlined variant that maintains impressive performance and efficiency.

DeepSeek · DeepSeek V4 Pro
1600B (49B active) · 1.0M ctx · 976 GB · frontier
moe · Top tier

DeepSeek V4 Pro is a 1.6T-parameter sparse MoE (49B active, 384 routed + 1 shared expert) built for million-token agentic reasoning. Experts ship natively in FP4, so the real on-disk footprint is roughly 862 GB (FP4 experts + FP8 attention) rather than the trillion-scale FP16 size. It is still a server/workstation deployment: realistic local use targets 8x 80 GB GPUs or 1 TB+ unified memory, and at long Think Max contexts the KV cache dominates.
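The footprint figures in this entry come down to simple arithmetic on parameter count and bits per parameter. The following is a minimal sketch, not the catalog's actual method: the helper names `size_gb` and `effective_bits` are hypothetical, and it assumes decimal gigabytes (10^9 bytes) and ignores file metadata.

```python
def size_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight size in decimal GB at a uniform precision."""
    return params_billion * bits_per_param / 8.0


def effective_bits(size_in_gb: float, params_billion: float) -> float:
    """Average bits per parameter implied by a size and a parameter count."""
    return size_in_gb * 8.0 / params_billion


PARAMS_B = 1600  # DeepSeek V4 Pro total parameters in billions (from the listing)

fp16_gb = size_gb(PARAMS_B, 16)              # hypothetical uniform FP16 weights
native_bits = effective_bits(862, PARAMS_B)  # quoted on-disk footprint (~862 GB)
listed_bits = effective_bits(976, PARAMS_B)  # size shown in the listing (976 GB)

print(f"FP16: {fp16_gb:.0f} GB")
print(f"native: {native_bits:.2f} bits/param, listed: {listed_bits:.2f} bits/param")
```

Under these assumptions, uniform FP16 weights would take about 3,200 GB, while the quoted 862 GB works out to roughly 4.3 bits per parameter on average across FP4 experts and FP8 attention.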

Alibaba · Qwen 3.5 27B
27B · 131K ctx · 16.5 GB · frontier
dense · Top tier

Qwen3.5 27B provides frontier-level quality in a size that fits on high-end consumer GPUs. Excellent for coding, reasoning, and RAG workloads.

Alibaba · Qwen 3.6 27B
27B · 262K ctx · 16.5 GB · frontier
dense · Top tier

Qwen 3.6 27B is a dense, multimodal 27B model that beats the previous-gen Qwen 3.5 397B-A17B MoE flagship on SWE-bench Verified (77.2% vs 76.2%) while fitting on a 16-24 GB consumer GPU. It uses a Gated DeltaNet + Gated Attention hybrid, ships a vision encoder, and supports 262K native context extensible to ~1M tokens via YaRN.

Alibaba · Qwen 3.5 122B A10B
122B (10B active) · 131K ctx · 74.4 GB · frontier
moe · Top tier

Qwen3.5 122B A10B is a high-quality MoE model with 10B active parameters. Strong performance across coding, reasoning, and RAG with moderate inference cost.

DeepSeek · DeepSeek V4 Flash
284B (13B active) · 1.0M ctx · 173.2 GB · frontier
moe · Top tier

DeepSeek V4 Flash is the lighter 284B-parameter sparse MoE sibling of V4 Pro (13B active, 256 routed + 1 shared expert) with the same 1M-token context. Experts ship natively in FP4, so the real on-disk footprint is roughly 158 GB rather than the FP16 size. It fits a single 192 GB unified-memory machine or a 2-4 GPU server while keeping near-frontier reasoning and coding quality.

Alibaba · Qwen 3.6 35B A3B
35B (3B active) · 262K ctx · 21.3 GB · frontier
moe · Top tier

Qwen 3.6 35B A3B is the first open-weight Qwen 3.6 model, a multimodal MoE release focused on stronger agentic coding, long-context reasoning, and more stable repository-scale workflows.

Alibaba · Qwen3-VL 30B A3B Instruct
30B (3B active) · 256K ctx · 18.3 GB · frontier
moe · Top tier

Qwen3-VL is the most powerful vision-language model in the Qwen series to date.

Alibaba · Qwen 3.5 9B
9B · 131K ctx · 5.5 GB · frontier
dense · Top tier

Qwen3.5 9B is a strong all-rounder at the popular 7-9B parameter sweet spot, with significant improvements in reasoning, code generation, and instruction following.

Alibaba · Qwen 3.5 35B A3B
35B (3B active) · 131K ctx · 21.3 GB · frontier
moe · Top tier

Qwen3.5 35B A3B is a Mixture-of-Experts model with only 3B active parameters per token, offering surprisingly strong performance at very low inference cost.
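To make the "low inference cost" claim concrete, here is a rough sketch comparing per-token compute for this MoE model and a dense model of similar size. It relies on the common rule of thumb of about 2 FLOPs per active parameter per generated token, which ignores attention over the context and routing overhead; the function names are hypothetical.

```python
def active_fraction(total_billion: float, active_billion: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_billion / total_billion


def forward_gflops_per_token(active_billion: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per active parameter.

    With parameters given in billions, the result is in GFLOPs per token.
    """
    return 2.0 * active_billion


moe_share = active_fraction(35, 3)          # Qwen 3.5 35B A3B (from the listing)
moe_cost = forward_gflops_per_token(3)      # only the 3B active parameters count
dense_cost = forward_gflops_per_token(27)   # Qwen 3.5 27B, dense, for comparison

print(f"active share: {moe_share:.1%}")
print(f"MoE: {moe_cost:.0f} GFLOPs/token, dense 27B: {dense_cost:.0f} GFLOPs/token")
```

Note that all 35B parameters still have to be held in memory; only the compute per token scales with the active count.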

Mistral · Magistral Small 2507
24B · 131K ctx · 14.6 GB · legacy
dense · Top tier

Magistral Small builds on Mistral Small 3.1 (2503) and adds reasoning capabilities through SFT on Magistral Medium traces followed by RL. It is a small, efficient reasoning model with 24B parameters.

Mistral · Devstral Small 2 24B Instruct
24B · 256K ctx · 14.6 GB · frontier
dense · Top tier

Devstral is an agentic LLM for software engineering tasks. Devstral Small 2 excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench.

Alibaba · Qwen 3 32B
32B · 131K ctx · 19.5 GB · frontier
dense · Top tier

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers advances in reasoning, instruction following, agent capabilities, and multilingual support.

Alibaba · Qwen 3 14B
14B · 131K ctx · 8.5 GB · frontier
dense · Top tier

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers advances in reasoning, instruction following, agent capabilities, and multilingual support.

Mistral · Mistral Small 4 119B
119B (6.5B active) · 256K ctx · 72.6 GB · frontier
moe · Top tier

Mistral Small 4 is a powerful hybrid model capable of acting as both a general instruction model and a reasoning model. It unifies the capabilities of three model families, namely Instruct, Reasoning (previously called Magistral), and Devstral, into a single model.

Alibaba · Qwen 3 30B A3B
30.5B (3.3B active) · 131K ctx · 18.6 GB · frontier
moe · Top tier

Qwen3-30B-A3B-Instruct-2507 is the updated version of the Qwen3-30B-A3B non-thinking mode.

Cohere · Command A 111B
111B · 262K ctx · 67.7 GB · frontier
dense · Top tier

Command A is Cohere's latest flagship model with 111B parameters, designed for agentic enterprise applications. It features advanced tool use, multi-step reasoning, and retrieval-augmented generation.

OpenAI · GPT-OSS 120B
117B (5.1B active) · 131K ctx · 71.4 GB · frontier
moe · Top tier

GPT-OSS 120B is OpenAI's large open-weight model, offering strong reasoning and coding capabilities.

Alibaba · Qwen 2.5 VL 72B
72B · 33K ctx · 43.9 GB · frontier
dense · Top tier

License: other (qwen), https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct/blob/main/LICENSE. Language: en. Pipeline tag: image-text-to-text. Tags: multimodal. Library: transformers.

NVIDIA · Nemotron 3 Nano 30B
30B · 131K ctx · 18.3 GB · frontier
dense · Top tier

Nemotron 3 Nano 30B is NVIDIA's mid-size reasoning model delivering strong performance across coding, math, and agentic tasks. Fits on a 24 GB GPU at Q4_K_M.
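A quick way to sanity-check "fits on a 24 GB GPU" claims is to estimate the weight size and leave headroom for the KV cache and runtime. The sketch below is an assumption-laden illustration: the 0.61 GB per billion parameters factor is inferred from the sizes shown in this listing (for example 30B at 18.3 GB), not a documented formula, and the `headroom_gb` default is a hypothetical placeholder.

```python
# Inferred from the listing (30B -> 18.3 GB, 9B -> 5.5 GB); an assumption,
# roughly in line with a ~4.9 bits/param 4-bit quantization.
GB_PER_BILLION_PARAMS = 0.61


def estimated_size_gb(params_billion: float) -> float:
    """Estimated weight size in GB using the listing's apparent ratio."""
    return params_billion * GB_PER_BILLION_PARAMS


def fits(params_billion: float, vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if weights plus a fixed headroom fit in the given memory.

    headroom_gb is a hypothetical allowance for KV cache and runtime
    buffers; real needs grow with context length.
    """
    return estimated_size_gb(params_billion) + headroom_gb <= vram_gb


for vram in (16, 24):
    print(f"30B on {vram} GB GPU: {fits(30, vram)}")
```

With these assumptions, a 30B model is estimated at about 18.3 GB, so it passes the check for a 24 GB card and fails it for a 16 GB card.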