LLM Compare
Refreshed daily

LLM comparison for builders and product teams

Prices come from the public OpenRouter model catalog. Ratings, notes, and use-case guidance are editorial so the table stays useful instead of becoming a raw API dump.
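As a rough illustration of how a catalog entry could map onto the per-1M price columns used on this page, here is a minimal sketch. It assumes the catalog reports prices as USD-per-token strings under `pricing.prompt` and `pricing.completion`; that entry shape, the `example/model` id, and the `per_million` helper are assumptions for illustration, not a confirmed description of how this page is built.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def per_million(price_per_token: str) -> Decimal:
    """Convert a USD-per-token price string to USD per 1M tokens."""
    return Decimal(price_per_token) * TOKENS_PER_MILLION


# Hypothetical catalog entry; the field names are an assumption.
entry = {
    "id": "example/model",
    "context_length": 400_000,
    "pricing": {"prompt": "0.00000075", "completion": "0.0000045"},
}

row = {
    "model": entry["id"],
    "input_per_1m": per_million(entry["pricing"]["prompt"]),
    "output_per_1m": per_million(entry["pricing"]["completion"]),
    "context": entry["context_length"],
}
```

Using `Decimal` rather than `float` keeps very small per-token prices exact when they are scaled up for display.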

Latest snapshot: April 8, 2026

Previous snapshot: April 7, 2026

GPT-5.4 (OpenAI), rating 9.7
Strong general-purpose frontier model with excellent tooling fit.
Input / 1M: $2.500 | Context: 1.1M

GPT-5.4 Mini (OpenAI), rating 8.8
Good default when you want quality without frontier-model spend.
Input / 1M: $0.750 | Context: 400k

Claude Sonnet 4.6 (Anthropic), rating 9.5
A strong pick for engineering teams that care about clarity and structure.
Input / 1M: $3.000 | Context: 1M

10 models shown

| Model | Provider | Rating | Input / 1M | Output / 1M | Context | Strengths | Best for | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.4 | OpenAI | 9.7 | $2.500 | $15.00 | 1.1M | reasoning, coding, long context | production copilots, deep debugging, agent workflows | Strong general-purpose frontier model with excellent tooling fit. |
| Claude Sonnet 4.6 | Anthropic | 9.5 | $3.000 | $15.00 | 1M | repo reasoning, writing, clean refactors | large codebases, spec writing, review-heavy work | A strong pick for engineering teams that care about clarity and structure. |
| Claude Opus 4.6 | Anthropic | 9.4 | $5.000 | $25.00 | 1M | deep reasoning, analysis, high-complexity tasks | research synthesis, architecture tradeoffs, hard debugging | Premium model tier when the extra reasoning depth is worth the cost. |
| Gemini 2.5 Pro | Google | 9.2 | $1.250 | $10.00 | 1M | very long context, multimodal, analysis | document-heavy work, RAG, planning | Excellent value when context length matters more than ultra-specific coding behavior. |
| Grok 4.20 | xAI | 8.9 | $2.000 | $6.000 | 2M | speed, general reasoning, large context | rapid analysis, chat products, broad assistant tasks | Strong all-rounder if you want a different frontier-model tradeoff profile. |
| GPT-5.4 Mini | OpenAI | 8.8 | $0.750 | $4.500 | 400k | cost efficiency, speed, strong coding baseline | cron jobs, bulk edits, assistant backends | Good default when you want quality without frontier-model spend. |
| DeepSeek R1 | DeepSeek | 8.8 | $0.450 | $2.150 | 164k | reasoning, price-to-quality, mathy tasks | budget reasoning, batch analytics, tool-driven workflows | Popular when you want a reasoning-heavy option without premium pricing. |
| Gemini 2.5 Flash | Google | 8.7 | $0.300 | $2.500 | 1M | speed, multimodal, cost | lightweight assistants, classification, cheap refresh jobs | Useful for high-volume pipelines that still need decent quality. |
| Mistral Medium 3.1 | Mistral | 8.4 | $0.400 | $2.000 | 131k | balanced latency, European stack fit, good value | assistant APIs, cost-aware product features, summaries | A pragmatic middle tier when you want good quality without frontier pricing. |
| Llama 4 Maverick | Meta | 8.2 | $0.150 | $0.600 | 1M | open ecosystem, cost, large context | experimentation, self-hosting comparisons, budget features | Worth tracking for teams that want a broader open-model option set. |
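
To make the Input / 1M and Output / 1M prices above easier to compare for a concrete workload, here is a minimal sketch of a per-request cost estimate. It assumes the listed prices are USD per one million tokens; the `PRICES` subset, the `estimate_cost` helper, and the token counts are illustrative choices, not part of the page's data pipeline.

```python
# (input USD / 1M tokens, output USD / 1M tokens), copied from the table above.
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 20,000 input tokens and 2,000 output tokens per request.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 20_000, 2_000):.4f} per request")
```

Under these assumptions, the same request costs roughly $0.08 on GPT-5.4 and about $0.024 on GPT-5.4 Mini. Output tokens cost several times more than input tokens for every model listed, so the input/output mix of a workload matters as much as the headline input price.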

Recent snapshot history

April 8, 2026

10 tracked models

Top models: GPT-5.4, GPT-5.4 Mini, Claude Sonnet 4.6

April 7, 2026

11 tracked models

Top models: GPT-5.4, GPT-5.4 Mini, Claude Sonnet 4.6

April 6, 2026

11 tracked models

Top models: GPT-5.4, GPT-5.4 Mini, Claude Sonnet 4.6