Will It Run AI · Calculator

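Tell us what hardware you have and what you want to do. We'll rank the best local models for it.

Start from your hardware and workload, and get a shortlist based on fit, speed, and runtime support instead of relying on generic model lists or benchmark screenshots.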

Start with your hardware · See how ranking works

Live catalog snapshot: 196 hardware profiles, 380 models, 24 runtimes. The calculator is updated against the current catalog rather than a static benchmark list.

Evaluating: RTX 4070 12GB

Workload: Coding

Runtime: llama.cpp

Operating mode: Balanced

Inputs

Select the hardware, runtime, and workload you want to test.

If the detected hardware is correct, use it as is. If not, change it, then rerun the ranking to compare your local AI options.


Hardware

Custom hardware specs

Runtime · Workload · Operating mode

Balanced for general local use. Keeps the ranking neutral across personal and serving workflows.

Update the hardware or workload and recalculate to refresh the ranking.

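1. Fit

Memory fit and headroom determine whether a model can realistically run on the selected hardware.

2. Workload

Models that match the selected task get a score bonus, and older model families are penalized when a newer specialized release exists.

3. Speed

Decode throughput and time to first token (TTFT) narrow the candidates to models that are actually usable, not just theoretically runnable.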
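The three steps above can be sketched roughly as a filter followed by a sort. This is a minimal illustration, not the calculator's actual scoring code: the `Candidate` fields, the 10% headroom, the speed thresholds, and the `hypothetical-70b` entry are all assumptions made for the example, and the real ranking adds score bonuses and penalties rather than sorting on a tuple. The Qwen and Gemma figures are taken from the recommendation cards on this page.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    weights_gb: float       # size of the quantized weights
    kv_cache_gb: float      # KV cache at the context length being evaluated
    decode_tok_s: float     # estimated decode throughput
    ttft_ms: float          # estimated time to first token
    matches_workload: bool  # True if the model targets the selected task


def fits(c: Candidate, vram_gb: float, headroom_frac: float = 0.10) -> bool:
    """Step 1: weights plus KV cache must leave some VRAM free (assumed 10%)."""
    return c.weights_gb + c.kv_cache_gb <= vram_gb * (1.0 - headroom_frac)


def usable(c: Candidate, min_decode_tok_s: float = 10.0,
           max_ttft_ms: float = 5000.0) -> bool:
    """Step 3: drop models that fit but would be too slow (assumed thresholds)."""
    return c.decode_tok_s >= min_decode_tok_s and c.ttft_ms <= max_ttft_ms


def shortlist(candidates: list[Candidate], vram_gb: float) -> list[Candidate]:
    """Filter by fit and speed, then put workload matches (step 2) and faster models first."""
    kept = [c for c in candidates if fits(c, vram_gb) and usable(c)]
    return sorted(kept, key=lambda c: (c.matches_workload, c.decode_tok_s),
                  reverse=True)


candidates = [
    Candidate("Gemma 4 E4B", 4.9, 1.3, 55.7, 3474, True),
    Candidate("Qwen 3.5 9B", 5.5, 2.2, 71.5, 2708, True),
    # Made-up entry that is too large and too slow for a 12 GB card.
    Candidate("hypothetical-70b", 40.0, 4.0, 5.0, 20000, True),
]
ranked_names = [c.name for c in shortlist(candidates, vram_gb=12.0)]
print(ranked_names)
```

With these assumed thresholds, the oversized entry is dropped at the fit step and the two remaining models are ordered by decode throughput.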

Qwen

Qwen 3.5 9B

Frontier · Released Jun 2025 · Hugging Face · Ollama · LM Studio

Why it's recommended

This model is a direct match for coding. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, ollama, lm-studio.

Capacity: Roomy · Bandwidth: Medium · Stack: Standard

Interactive: Good · Light API: Great · Bottleneck: Balanced

Rank #1

Grade S · Runs · Est.

Score: 130.7

Fit status: Runs well

Fit: Runs well, safe context 32K.

Runtime support: unknown, via n/a on unknown.

Runtime: llama.cpp

Artifact: n/a

Quantization: Q4_K_M

Decode: 71.5 tok/s

Safe context: 32K

Official context: 131K

Support: n/a

TTFT: 2708 ms

Weights: 5.5 GB

KV cache: 2.2 GB

Backend: unknown
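To make the Decode and TTFT figures on this card more concrete, the following sketch estimates how long a complete response would take. It assumes a simple model (not stated on the page) in which total time is the TTFT plus the output length divided by decode throughput; the 500-token output length is an arbitrary example.

```python
def response_time_s(ttft_ms: float, decode_tok_s: float, output_tokens: int) -> float:
    """TTFT plus decode time for the output.

    Counts the first token in the decode term too, which slightly overestimates.
    """
    return ttft_ms / 1000.0 + output_tokens / decode_tok_s


# Figures from the Qwen 3.5 9B card: TTFT 2708 ms, decode 71.5 tok/s.
qwen_500 = response_time_s(ttft_ms=2708, decode_tok_s=71.5, output_tokens=500)
print(f"{qwen_500:.1f} s")  # prints "9.7 s"
```

Under this assumption, most of the wait for a long answer comes from decoding rather than from the time to first token.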

Current limits

This setup is broadly balanced for this model.

No major red flags

This recommendation has enough memory headroom and acceptable estimated speed for the selected workload.

Best next improvements

The score of 130.7 combines workload fit, catalog freshness, fit safety, context coverage, artifact selection, memory utilization, throughput, and latency.

Gemma

Gemma 4 E4B

Frontier · Released Apr 2026 · Hugging Face · Ollama · LM Studio

Why it's recommended

This model is a direct match for coding. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, ollama, lm-studio.

Capacity: Roomy · Bandwidth: Medium · Stack: Standard

Interactive: Good · Light API: Great · Bottleneck: Balanced

Rank #2

Grade A · Runs · Est.

Score: 112.1

Fit status: Runs well

Fit: Runs well, safe context 63K.

Runtime support: unknown, via n/a on unknown.

Runtime: llama.cpp

Artifact: n/a

Quantization: Q4_K_M

Decode: 55.7 tok/s

Safe context: 63K

Official context: 128K

Support: n/a

TTFT: 3474 ms

Weights: 4.9 GB

KV cache: 1.3 GB

Backend: unknown

Current limits

This setup is broadly balanced for this model.

No major red flags

This recommendation has enough memory headroom and acceptable estimated speed for the selected workload.

Best next improvements

The score of 112.1 combines workload fit, catalog freshness, fit safety, context coverage, artifact selection, memory utilization, throughput, and latency.

CodeGeeX

CodeGeeX 4 9B

Current · Released Jul 2024 · Hugging Face · Ollama

Why it's recommended

This model is still usable for coding, but it is not the most specialized pick. It sits in the middle of the current model mix. It fits natively with comfortable headroom. Known channels: huggingface, ollama.

Capacity: Roomy · Bandwidth: Medium · Stack: Standard

Interactive: Good · Light API: Great · Bottleneck: Balanced

Rank #3

Grade A · Runs · Est.

Score: 108.4

Fit status: Runs well

Fit: Runs well, safe context 116K.

Runtime support: unknown, via n/a on unknown.

Runtime: llama.cpp

Artifact: n/a

Quantization: Q4_K_M

Decode: 69.3 tok/s

Safe context: 116K

Official context: 131K

Support: n/a

TTFT: 2794 ms

Weights: 5.5 GB

KV cache: 0.6 GB

Backend: unknown

Current limits

This setup is broadly balanced for this model.

No major red flags

This recommendation has enough memory headroom and acceptable estimated speed for the selected workload.

Best next improvements

The score of 108.4 combines workload fit, catalog freshness, fit safety, context coverage, artifact selection, memory utilization, throughput, and latency.
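The weight and KV cache figures on the three cards above can be compared against the 12 GB of the selected GPU. The sketch below adds the two numbers per model and reports what is left. It assumes the KV cache figure corresponds to the context length the page used for its estimate, and it ignores runtime overhead such as activations and CUDA buffers, so real headroom will be somewhat smaller.

```python
VRAM_GB = 12.0  # RTX 4070 12GB

# (weights GB, KV cache GB) as listed on each recommendation card.
cards = {
    "Qwen 3.5 9B": (5.5, 2.2),
    "Gemma 4 E4B": (4.9, 1.3),
    "CodeGeeX 4 9B": (5.5, 0.6),
}


def footprint(weights_gb: float, kv_cache_gb: float, vram_gb: float = VRAM_GB):
    """Return (GB used, GB free, fraction of VRAM used) for weights plus KV cache."""
    used = weights_gb + kv_cache_gb
    return used, vram_gb - used, used / vram_gb


for name, (weights, kv) in cards.items():
    used, free, frac = footprint(weights, kv)
    print(f"{name}: {used:.1f} GB used, {free:.1f} GB free, {frac:.0%} of VRAM")
```

All three land between roughly half and two thirds of the card's memory, which is consistent with the "Capacity: Roomy" label on each card.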

All 380 models

Full compatibility grid for RTX 4070 12GB

246 models fit · 9 excellent · 38 great

Grade | Model | Params | Tasks | Q4 VRAM | Decode | Context | Memory | Fit