Browse AI models
374 models available
Qwen3.5 4B offers a sweet spot between capability and efficiency, handling coding and general tasks well on modest hardware.
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:
Today, we're announcing Qwen3-Coder-Next, an open-weight language model designed specifically for coding agents and local development. It features the following key enhancements:
> [!IMPORTANT]
> To fully take advantage of the model's capabilities, inference must use `temperature=0.8`, `top_k=50`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
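
The recommended sampling settings above can be collected into one keyword-argument dictionary. This is a minimal sketch, assuming the parameter names are passed to a Hugging Face-style `generate()` call. The helper name `build_generation_kwargs` and the `complex_query` flag are hypothetical, and no token limit is set for ordinary queries because the source does not give one.

```python
from typing import Any, Dict


def build_generation_kwargs(complex_query: bool = False) -> Dict[str, Any]:
    """Return the sampling settings recommended in the note above.

    The function name and the `complex_query` flag are illustrative
    assumptions, not part of any library.
    """
    kwargs: Dict[str, Any] = {
        "temperature": 0.8,
        "top_k": 50,
        "top_p": 0.95,
        "do_sample": True,
    }
    if complex_query:
        # Larger budget so a long chain-of-thought is not cut off.
        kwargs["max_new_tokens"] = 32768
    return kwargs


if __name__ == "__main__":
    print(build_generation_kwargs(complex_query=True))
```

The resulting dictionary would typically be unpacked into the generation call, for example `model.generate(**inputs, **build_generation_kwargs(True))`, where `model` and `inputs` are whatever objects the chosen inference stack provides.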
Devstral is an agentic LLM for software engineering tasks, built through a collaboration between Mistral AI and All Hands AI 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench, which positions it as the #1 open source model on this benchmark.
GLM-5.1 is Z.ai's next-generation flagship MoE model for agentic engineering, with significantly stronger coding capabilities than GLM-5. It achieves state-of-the-art performance on SWE-Bench Pro and sustains optimization over hundreds of rounds and thousands of tool calls on long-horizon agentic tasks.
Pixtral-Large-Instruct-2411 is a 124B multimodal model built on top of Mistral Large 2, i.e., Mistral-Large-Instruct-2407. Pixtral Large is the second model in our multimodal family and demonstrates frontier-level image understanding. In particular, the model can understand documents, charts, and natural images while maintaining the leading text-only understanding of Mistral Large 2.
DeepSeek V3.2 is a 671B MoE model with 37B active parameters per token, using DeepSeek Sparse Attention and Multi-head Latent Attention. 128K context window. MIT licensed. Requires multi-GPU or high-memory Macs for local inference.
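To make the "37B active out of 671B" figure concrete, the share of parameters used per token can be computed directly. A small sketch; the helper name `active_fraction` is hypothetical, and the inputs are simply the parameter counts quoted above, in billions.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of an MoE model's parameters that are active per token."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


if __name__ == "__main__":
    frac = active_fraction(671, 37)
    print(f"{frac:.1%}")  # prints 5.5%
```

Only this fraction of the weights takes part in each token's forward pass, but all 671B parameters still have to be held in memory, which is consistent with the multi-GPU or high-memory requirement.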
GPT-OSS 20B is OpenAI's first open-weight model, a 21B-parameter mixture-of-experts model with 3.6B active parameters per token. Features configurable reasoning effort (low/medium/high), full chain-of-thought visibility, and agentic capabilities including function calling. Runs on devices with 16GB of memory using MXFP4 quantization.
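As a rough plausibility check on the 16GB figure, the weight footprint can be estimated from the parameter count and the bits stored per weight. This is a back-of-the-envelope sketch, not a measurement: the helper `weight_memory_gb` is hypothetical, it assumes every weight is stored in MXFP4 (4-bit values plus one 8-bit scale shared by each block of 32, about 4.25 bits per weight), and it ignores the KV cache, activations, and any layers kept at higher precision.

```python
# MXFP4: 4-bit elements plus one 8-bit scale shared by a block of 32 elements.
MXFP4_BITS_PER_WEIGHT = 4 + 8 / 32  # 4.25


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    estimate = weight_memory_gb(21e9, MXFP4_BITS_PER_WEIGHT)
    print(f"{estimate:.1f} GB")  # prints 11.2 GB
```

Under these assumptions the weights alone come to roughly 11 GB, which leaves some headroom on a 16GB device for the runtime and context.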
We introduce the updated version of the Qwen3-235B-A22B non-thinking mode, named Qwen3-235B-A22B-Instruct-2507, featuring the following key enhancements:
Today, we're announcing Qwen3-Coder, our most agentic code model to date. Qwen3-Coder is available in multiple sizes, but we're excited to introduce its most powerful variant first: Qwen3-Coder-480B-A35B-Instruct, featuring the following key enhancements:
NVIDIA Nemotron Cascade 2 is a 30B MoE model with 3B active parameters, using a Mamba-2 + Transformer hybrid architecture. Gold medal at IMO 2025 and IOI 2025. 92% AIME 2025, 87% LiveCodeBench. Fits on a single RTX 4090.
Phi-4 Mini Reasoning is Microsoft's compact reasoning model with chain-of-thought capabilities at just 3.8B parameters.
Gemma 4 31B is the largest and most capable open Gemma model. Dense architecture with 30.7B parameters. 256K context window. Achieves 2150 Codeforces ELO and 89.2% AIME 2026. Apache 2.0 licensed.
jina-embeddings-v3: Multilingual Embeddings With Task LoRA
Self-evolving agent model with 230B total / 10B active MoE architecture. SOTA on SWE-Pro (56.2%) and Terminal Bench 2 (57%). Runs locally on 128GB Mac with Dynamic 4-bit GGUF.
Leanstral is Mistral's open-weight proof and code agent for Lean 4 workflows, built on the Mistral Small 4 family with multimodal input, tool use, and long-context support.
We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2 while maintaining comparable performance in general language tasks.
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing.
DeepSeek V3.1 (V3-0324) is a major update to the DeepSeek V3 family, with substantial improvements in instruction following, coding, creative writing, and agentic capabilities.
Nemotron Nano 8B is NVIDIA's reasoning model derived from Llama 3.1 8B Instruct, post-trained for switchable reasoning with on/off modes. Achieves 95.4% on MATH-500 and 54.1% on GPQA Diamond with reasoning enabled. Fits on a single RTX GPU for local deployment.
EXAONE 4.0 is LG AI Research's flagship language model. The 32B variant offers strong multilingual performance with particular strength in Korean and English tasks.