8.9× cheaper than cloud
$1.13 / 1M tokens local
Top models you can run
- [S] Qwen3-Coder 30B A3B Instruct · 131 tok/s · 102K ctx · 30.5B params
- [S] Qwen3-VL 30B A3B Instruct · 188 tok/s · 105K ctx · 30B params
- [S] Qwen 3.5 27B · 59 tok/s · 58K ctx · 27B params
Interactive tool
Tell us your budget and what you'll use it for. We'll rank GPUs and Macs by how well they run models for that workload, show you which specific models will run great, and compare cost-per-token against cloud APIs.
18.6× cheaper than cloud · $0.54 / 1M tokens local
18.7× cheaper than cloud · $0.53 / 1M tokens local
16.5× cheaper than cloud · $0.61 / 1M tokens local
15.2× cheaper than cloud · $0.66 / 1M tokens local
How we rank
We filter hardware by MSRP under your budget, then score each candidate by the best-fit model for your workload — combining fit status (native / tight / offload), decode throughput, VRAM utilization, and model quality tier. ROI uses $10/1M tokens as the cloud reference (GPT-4o / Claude Sonnet tier) with 36-month amortization and $0.15/kWh electricity. Used-market prices are flagged when applicable (e.g., RTX 3090 at ~$800).
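The ROI arithmetic described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the $10/1M cloud reference, 36-month amortization, and $0.15/kWh rate come from the text, while the `Rig` class, the power draw, the daily usage hours, and the 30-day month are assumptions introduced here to make the calculation concrete.

```python
from dataclasses import dataclass

# Constants stated in the text.
CLOUD_USD_PER_MTOK = 10.0     # cloud reference price per 1M tokens
AMORTIZATION_MONTHS = 36      # hardware cost is spread over this period
USD_PER_KWH = 0.15            # electricity price


@dataclass
class Rig:
    """Hypothetical description of one hardware candidate."""
    price_usd: float       # MSRP or flagged used-market price
    decode_tok_s: float    # decode throughput for the best-fit model
    power_watts: float     # ASSUMPTION: average draw while generating
    hours_per_day: float   # ASSUMPTION: hours per day spent generating


def local_usd_per_mtok(rig: Rig) -> float:
    """Amortized hardware cost plus electricity, per 1M generated tokens."""
    # ASSUMPTION: 30-day months over the whole amortization window.
    hours = rig.hours_per_day * 30 * AMORTIZATION_MONTHS
    tokens_millions = rig.decode_tok_s * 3600 * hours / 1e6
    energy_usd = rig.power_watts / 1000 * hours * USD_PER_KWH
    return (rig.price_usd + energy_usd) / tokens_millions


def cloud_multiple(rig: Rig) -> float:
    """How many times cheaper local generation is than the cloud reference."""
    return CLOUD_USD_PER_MTOK / local_usd_per_mtok(rig)


if __name__ == "__main__":
    # Hypothetical inputs; only the ~$800 used price echoes the text.
    rig = Rig(price_usd=800, decode_tok_s=100, power_watts=350, hours_per_day=4)
    print(f"${local_usd_per_mtok(rig):.2f} / 1M tokens local")
    print(f"{cloud_multiple(rig):.1f}x cheaper than cloud")
```

The result is sensitive to the usage assumption: halving `hours_per_day` halves the tokens generated over 36 months, so the hardware share of the cost per token doubles while the electricity share stays the same.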