Will It Run AI

Explore AI models

374 models available

Alibaba | Qwen 3 4B
4B | 33K ctx | 2.4 GB | current
dense | High

We introduce the updated version of the Qwen3-4B non-thinking mode, named Qwen3-4B-Instruct-2507, featuring the following key enhancements:

Mistral AI | Mistral Small 3.1 24B
24B | 131K ctx | 14.6 GB | frontier
dense | High

Mistral Small 3.1 is an updated version of Mistral Small with improved instruction following and vision capabilities.

Meta | Llama 3.1 70B
70B | 128K ctx | 42.7 GB | legacy
dense | High

Llama 3.1 70B is Meta's high-capability open model with 128K context window. Excels at complex reasoning, multilingual tasks, code generation, and tool use with quality competitive with leading proprietary models.

Alibaba | Qwen 2.5 72B
72B | 131K ctx | 43.9 GB | current
dense | High

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

Alibaba | Qwen 2.5 14B
14B | 131K ctx | 8.5 GB | current
dense | High

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

OpenBMB | MiniCPM-V 2.6 8B
8B | 2K ctx | 4.9 GB | current
dense | High

MiniCPM-V 2.6 is OpenBMB's compact multimodal model supporting image and video understanding alongside text. Delivers strong visual reasoning and OCR capabilities at 8B parameter scale.

Mistral | Ministral 3 8B
8B | 262K ctx | 4.9 GB | frontier
multimodal | High

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

Google | Gemma 3 12B
12B | 131K ctx | 7.3 GB | current
dense | High

Gemma 3 12B is Google's mid-range Gemma 3 model with vision capabilities. Offers strong reasoning, code generation, and image understanding balanced with practical resource requirements.

IBM | Granite Code 20B
20B | 8K ctx | 12.2 GB | current
dense | High

Granite-20B-Code-Instruct-8K is a 20B-parameter model fine-tuned from Granite-20B-Code-Base-8K on a combination of permissively licensed instruction data to enhance instruction-following capabilities, including logical reasoning and problem-solving skills.

Alibaba | Qwen 2.5 VL 7B
7B | 33K ctx | 4.3 GB | current
dense | High

License: apache-2.0. Language: en. Pipeline tag: image-text-to-text. Tags: multimodal. Library: transformers.

Defog | SQLCoder 7B
7B | 8K ctx | 4.3 GB | current
dense | High

The model weights were updated at 7 AM UTC on Feb 7, 2024. The new weights yield a much more performant model, particularly for joins.

NVIDIA | Nemotron Nano 9B v2
9B | 131K ctx | 5.5 GB | frontier
dense | Medium

Nemotron Nano 9B v2 is an updated version of NVIDIA's compact reasoning model with improved instruction following, coding, and math capabilities.

DeepSeek | DeepSeek Coder V2 16B
16B (2.4B active) | 131K ctx | 9.8 GB | current
moe | Medium

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks.

Tsinghua/Zhipu | CodeGeeX 4 9B
9B | 131K ctx | 5.5 GB | current
dense | Medium

We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on GLM-4-9B, which significantly enhances its code generation capabilities. A single CodeGeeX4-ALL-9B model supports comprehensive functions such as code completion and generation, code interpreter, web search, function calling, and repository-level code Q&A, covering various scenarios of software development. CodeGeeX4-ALL-9B has achieved highly competitive performance on public benchmarks such as BigCodeBench and NaturalCodeBench.

Meta | Llama 4 Scout 17B 16E
109B (17B active) | 10.5M ctx | 66.5 GB | frontier
moe | Medium

Llama 4 Scout is Meta's efficient Mixture-of-Experts model with 17B active parameters across 16 experts. Supports a 10M token context window and natively handles text, images, and video inputs.

Mistral AI | Magistral 7B
7B | 8K ctx | 4.3 GB | legacy
dense | Medium

Magistral 7B is Mistral AI's reasoning-focused model designed for complex analytical and mathematical tasks. Features chain-of-thought capabilities for step-by-step problem solving.

Alibaba | Qwen 2.5 Coder 32B
32B | 131K ctx | 19.5 GB | current
dense | Medium

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

Google | Gemma 4 E4B
8B | 128K ctx | 4.9 GB | frontier
dense | Medium

Gemma 4 E4B is Google's mid-range on-device model with 8B total parameters (4.5B effective). Default Gemma 4 model on Ollama. Supports text and image. Apache 2.0 licensed.

Sentence Transformers | All MiniLM L6 v2
0.02B | 0K ctx | 0 GB | current
dense | Medium

This is a sentence-transformers model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.

IBM | Granite Code 34B
34B | 8K ctx | 20.7 GB | current
dense | Medium

Granite Code 34B is IBM's largest code generation model, strong across 100+ programming languages.

AllenAI | OLMo 2 13B
13B | 33K ctx | 7.9 GB | current
dense | Medium

OLMo 2 13B is AI2's fully open research model with transparent training data and methodology. Designed for reproducible research with competitive performance on reasoning and general knowledge tasks.

Cohere | Command R 35B
35B | 131K ctx | 21.3 GB | current
dense | Medium

Command R is Cohere's retrieval-augmented generation model optimized for enterprise use. Excels at long-context document processing, tool use, and grounded generation with citation support.

DeepSeek | DeepSeek R1 Distill 70B
70B | 131K ctx | 42.7 GB | frontier
dense | Medium

DeepSeek R1 Distill 70B is a distilled reasoning model based on Llama 70B, offering strong chain-of-thought reasoning at a practical size.

Alibaba | Qwen 2.5 7B
7B | 131K ctx | 4.3 GB | current
dense | Medium

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: