AIモデル一覧
374モデルsが利用可能
SmolLM3 is a fully open 3B-parameter language model with dual-mode reasoning, 128K context via YARN extrapolation, and native support for 6 languages. Pretrained on 11.2T tokens with a staged curriculum of web, code, math, and reasoning data. Post-trained with 140B reasoning tokens and Anchored Preference Optimization.
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:
Aya Expanse 32B is Cohere's massively multilingual model supporting 23 languages with strong instruction following.
Gemma 2 2B is Google's lightweight model designed for on-device and edge deployment. Delivers strong text generation and reasoning performance at minimal resource cost.
Model Summary: Granite-3.1-8B-Instruct is a 8B parameter long-context instruct model finetuned from Granite-3.1-8B-Base using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets tailored for solving long context problems. This model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging.
Llama 3.2 1B is Meta's smallest text model designed for on-device inference. Optimized for multilingual text generation, summarization, and instruction following on resource-constrained hardware.
> [!Warning] > > > 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks. > >
Qwen3.5 is the latest generation of Alibaba's Qwen large language model family, bringing major improvements in reasoning, instruction following, and multilingual capability across both dense and MoE architectures.
Qwen 2.5 0.5B is an ultra-lightweight model for edge and IoT deployment, offering basic chat capability at minimal resource cost.
Advancing Open-source Language Models with Mixed-Quality Data
Aya Expanse 8B is Cohere's multilingual model supporting 23 languages with strong cross-lingual transfer. Designed for global applications requiring high-quality generation across diverse languages.
- they might want nothing more than destruction itself rather then anything else from their quest after immortality (and maybe someone should tell them about modern medicine)? In any event though – one thing remains true regardless : whether or not success comes easy depends entirely upon how much effort we put into conquering whatever challenges lie ahead along with having faith deep down inside ourselves too ;) So let’s get started now shall We?" pipeline_tag: text-generation
- Project Website: bigcode-project.org - Paper: Link - Point of Contact: [email protected] - Languages: 600+ Programming languages
Nemotron-Mini-4B-Instruct is a model for generating responses for roleplaying, retrieval augmented generation, and function calling. It is a small language model (SLM) optimized through distillation, pruning and quantization for speed and on-device deployment. It is a fine-tuned version of nvidia/Minitron-4B-Base, which was pruned and distilled from Nemotron-4 15B using our LLM compression technique. This instruct model is optimized for roleplay, RAG QA, and function calling in English. It supports a context length of 4,096 tokens. This model is ready for commercial use.
*In the tapestry of Greek mythology, Hermes reigns as the eloquent Messenger of the Gods, a deity who deftly bridges the realms through the art of communication. It is in homage to this divine mediator that I name this advanced LLM "Hermes," a system crafted to navigate the complex intricacies of human discourse with celestial finesse.*
Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations
- Model type: A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets. - Language(s) (NLP): Primarily English - License: MIT - Finetuned from model: mistralai/Mistral-7B-v0.1