LLM comparison for builders and product teams

Prices come from the public OpenRouter model catalog. Ratings, notes, and use-case guidance are editorial, so the table stays useful rather than becoming a raw API dump.

Latest snapshot: July 17, 2026

Previous snapshot: July 16, 2026
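
As a rough illustration of how catalog prices could be turned into the per-1M-token figures shown on this page, here is a minimal sketch. It assumes each catalog entry carries an `id`, a `context_length`, and a `pricing` object whose `prompt` and `completion` values are USD-per-token strings. That shape is an assumption about the catalog response, not something this page states, and `to_table_row` and the sample entry are hypothetical names used only for illustration.

```python
from decimal import Decimal

TOKENS_PER_MILLION = 1_000_000


def to_table_row(entry):
    """Convert one catalog entry into per-1M-token prices.

    Assumed (not confirmed by this page): `entry` has `id`,
    `context_length`, and a `pricing` dict with per-token USD
    strings under `prompt` and `completion`.
    """
    pricing = entry["pricing"]
    return {
        "id": entry["id"],
        # Decimal avoids float rounding noise on very small per-token prices.
        "input_per_1m": Decimal(pricing["prompt"]) * TOKENS_PER_MILLION,
        "output_per_1m": Decimal(pricing["completion"]) * TOKENS_PER_MILLION,
        "context": entry["context_length"],
    }


# Hypothetical entry for illustration only; not real catalog data.
sample = {
    "id": "example/model",
    "context_length": 400_000,
    "pricing": {"prompt": "0.00000075", "completion": "0.0000045"},
}
row = to_table_row(sample)
```

Parsing the prices as `Decimal` rather than `float` keeps the converted figures exact, which matters when two snapshots are compared for small price changes.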

GPT-5.4 (OpenAI), rating 9.7
Strong general-purpose frontier model with excellent tooling fit.
Input: $2.50 / 1M tokens. Context: 1.1M

GPT-5.4 Mini (OpenAI), rating 8.8
Good default when you want quality without frontier-model spend.
Input: $0.75 / 1M tokens. Context: 400k

Claude Sonnet 4.6 (Anthropic), rating 9.5
A strong pick for engineering teams that care about clarity and structure.
Input: $3.00 / 1M tokens. Context: 1M


10 models shown

Model	Notes	Provider	Rating	Input / 1M tokens	Output / 1M tokens	Context (tokens)	Strengths	Best for
GPT-5.4	Strong general-purpose frontier model with excellent tooling fit.	OpenAI	9.7	$2.50	$15.00	1.1M	reasoning, coding, long context	production copilots, deep debugging, agent workflows
Claude Sonnet 4.6	A strong pick for engineering teams that care about clarity and structure.	Anthropic	9.5	$3.00	$15.00	1M	repo reasoning, writing, clean refactors	large codebases, spec writing, review-heavy work
Claude Opus 4.6	Premium model tier when the extra reasoning depth is worth the cost.	Anthropic	9.4	$5.00	$25.00	1M	deep reasoning, analysis, high-complexity tasks	research synthesis, architecture tradeoffs, hard debugging
Gemini 2.5 Pro	Excellent value when context length matters more than ultra-specific coding behavior.	Google	9.2	$1.25	$10.00	1M	very long context, multimodal, analysis	document-heavy work, RAG, planning
Grok 4.20	Strong all-rounder if you want a different frontier-model tradeoff profile.	xAI	8.9	$1.25	$2.50	2M	speed, general reasoning, large context	rapid analysis, chat products, broad assistant tasks
GPT-5.4 Mini	Good default when you want quality without frontier-model spend.	OpenAI	8.8	$0.75	$4.50	400k	cost efficiency, speed, strong coding baseline	cron jobs, bulk edits, assistant backends
DeepSeek R1	Popular when you want a reasoning-heavy option without premium pricing.	DeepSeek	8.8	$0.50	$2.15	164k	reasoning, price-to-quality, mathy tasks	budget reasoning, batch analytics, tool-driven workflows
Gemini 2.5 Flash	Useful for high-volume pipelines that still need decent quality.	Google	8.7	$0.30	$2.50	1M	speed, multimodal, cost	lightweight assistants, classification, cheap refresh jobs
Mistral Medium 3.1	A pragmatic middle tier when you want good quality without frontier pricing.	Mistral	8.4	$0.40	$2.00	131k	balanced latency, European stack fit, good value	assistant APIs, cost-aware product features, summaries
Llama 4 Maverick	Worth tracking for teams that want a broader open-model option set.	Meta	8.2	$0.20	$0.80	1M	open ecosystem, cost, large context	experimentation, self-hosting comparisons, budget features
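
To compare models on real spend rather than list price, the per-1M-token figures in the table can be turned into a per-request or monthly estimate. The sketch below copies a few rows from the table and assumes cost is simply input tokens times input price plus output tokens times output price. It ignores caching discounts, batch pricing, and any provider-specific surcharges, and the helper names `request_cost` and `monthly_cost` are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# (input, output) USD prices per 1M tokens, copied from the table above.
PRICES = {
    "GPT-5.4": (Decimal("2.50"), Decimal("15.00")),
    "Claude Sonnet 4.6": (Decimal("3.00"), Decimal("15.00")),
    "GPT-5.4 Mini": (Decimal("0.75"), Decimal("4.50")),
    "Gemini 2.5 Flash": (Decimal("0.30"), Decimal("2.50")),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a single request for `model`."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_MILLION


def monthly_cost(model, input_tokens, output_tokens, requests_per_month):
    """Estimate monthly USD spend for a fixed request shape and volume."""
    return request_cost(model, input_tokens, output_tokens) * requests_per_month


# Example workload (assumed): 10,000 input and 1,000 output tokens per request.
frontier = request_cost("GPT-5.4", 10_000, 1_000)
budget = request_cost("GPT-5.4 Mini", 10_000, 1_000)
```

For that assumed workload, GPT-5.4 comes to $0.04 per request and GPT-5.4 Mini to $0.012, so at 50,000 requests a month the estimate is $2,000 versus $600. Output-heavy workloads shift the comparison, because output prices in the table are several times the input prices.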