Refreshed daily
LLM comparison for builders and product teams
Prices come from the public OpenRouter model catalog. Ratings, notes, and use-case guidance are editorial so the table stays useful instead of becoming a raw API dump.
Latest snapshot: April 8, 2026
Previous snapshot: April 7, 2026
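The per-1M-token prices on this page can be derived from catalog entries along the lines of the sketch below. It assumes each catalog entry is a JSON object with an `id`, a `pricing` object holding per-token USD prices as strings under `prompt` and `completion`, and a `context_length` field. Those field names, the `per_million` helper, and the sample entry (including its `id`) are assumptions made for illustration, not a confirmed description of how this page is built.

```python
from decimal import Decimal


def per_million(entry: dict) -> dict:
    """Convert an (assumed) per-token catalog entry to per-1M-token prices."""
    pricing = entry["pricing"]
    million = Decimal(1_000_000)
    return {
        "id": entry["id"],
        # Decimal avoids float rounding noise on very small per-token prices.
        "input_per_1m": Decimal(pricing["prompt"]) * million,
        "output_per_1m": Decimal(pricing["completion"]) * million,
        "context": entry["context_length"],
    }


# Hypothetical entry; the values are chosen to mirror the GPT-5.4 figures on this page.
sample = {
    "id": "openai/gpt-5.4",
    "pricing": {"prompt": "0.0000025", "completion": "0.000015"},
    "context_length": 1_100_000,
}

row = per_million(sample)
print(f"{row['id']}: ${row['input_per_1m']:.2f} in, ${row['output_per_1m']:.2f} out per 1M tokens")
```

Keeping the conversion separate from any network call makes it easy to check against a saved snapshot before trusting a refreshed table.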
GPT-5.4 (OpenAI): Strong general-purpose frontier model with excellent tooling fit. Input: $2.50 / 1M tokens. Context: 1.1M.
GPT-5.4 Mini (OpenAI): Good default when you want quality without frontier-model spend. Input: $0.75 / 1M tokens. Context: 400k.
Claude Sonnet 4.6 (Anthropic): A strong pick for engineering teams that care about clarity and structure. Input: $3.00 / 1M tokens. Context: 1M.
10 models shown
| Model | Provider | Rating | Input / 1M tokens | Output / 1M tokens | Context | Strengths | Best for |
|---|---|---|---|---|---|---|---|
| GPT-5.4: Strong general-purpose frontier model with excellent tooling fit. | OpenAI | 9.7 | $2.50 | $15.00 | 1.1M | reasoning, coding, long context | production copilots, deep debugging, agent workflows |
| Claude Sonnet 4.6: A strong pick for engineering teams that care about clarity and structure. | Anthropic | 9.5 | $3.00 | $15.00 | 1M | repo reasoning, writing, clean refactors | large codebases, spec writing, review-heavy work |
| Claude Opus 4.6: Premium model tier when the extra reasoning depth is worth the cost. | Anthropic | 9.4 | $5.00 | $25.00 | 1M | deep reasoning, analysis, high-complexity tasks | research synthesis, architecture tradeoffs, hard debugging |
| Gemini 2.5 Pro: Excellent value when context length matters more than ultra-specific coding behavior. | Google | 9.2 | $1.25 | $10.00 | 1M | very long context, multimodal, analysis | document-heavy work, RAG, planning |
| Grok 4.20: Strong all-rounder if you want a different frontier-model tradeoff profile. | xAI | 8.9 | $2.00 | $6.00 | 2M | speed, general reasoning, large context | rapid analysis, chat products, broad assistant tasks |
| GPT-5.4 Mini: Good default when you want quality without frontier-model spend. | OpenAI | 8.8 | $0.75 | $4.50 | 400k | cost efficiency, speed, strong coding baseline | cron jobs, bulk edits, assistant backends |
| DeepSeek R1: Popular when you want a reasoning-heavy option without premium pricing. | DeepSeek | 8.8 | $0.45 | $2.15 | 164k | reasoning, price-to-quality, math-heavy tasks | budget reasoning, batch analytics, tool-driven workflows |
| Gemini 2.5 Flash: Useful for high-volume pipelines that still need decent quality. | Google | 8.7 | $0.30 | $2.50 | 1M | speed, multimodal, cost | lightweight assistants, classification, cheap refresh jobs |
| Mistral Medium 3.1: A pragmatic middle tier when you want good quality without frontier pricing. | Mistral | 8.4 | $0.40 | $2.00 | 131k | balanced latency, European stack fit, good value | assistant APIs, cost-aware product features, summaries |
| Llama 4 Maverick: Worth tracking for teams that want a broader open-model option set. | Meta | 8.2 | $0.15 | $0.60 | 1M | open ecosystem, cost, large context | experimentation, self-hosting comparisons, budget features |
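
To turn the per-1M-token prices above into a per-request figure, multiply each token count by its price and divide by one million. The sketch below copies a few rows from the table; the `request_cost` helper and the workload of 20,000 input tokens and 2,000 output tokens are hypothetical, picked only to illustrate the arithmetic.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
    "DeepSeek R1": (0.45, 2.15),
    "Llama 4 Maverick": (0.15, 0.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-1M-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 20,000 input tokens and 2,000 output tokens per request.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 20_000, 2_000):.4f} per request")
```

For this workload GPT-5.4 comes to about $0.08 per request and GPT-5.4 Mini to about $0.024. Output tokens cost several times more than input tokens on every row, so generation-heavy workloads will shift the comparison.
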
Recent snapshot history
April 8, 2026: 10 tracked models. Top models: GPT-5.4, GPT-5.4 Mini, Claude Sonnet 4.6
April 7, 2026: 11 tracked models. Top models: GPT-5.4, GPT-5.4 Mini, Claude Sonnet 4.6
April 6, 2026: 11 tracked models. Top models: GPT-5.4, GPT-5.4 Mini, Claude Sonnet 4.6