o1 Benchmark Results: Coding Performance 2026

Performance Scores

Overall

83

Rank #3 of 45 (93rd percentile)

SWE-bench

80

Rank #3 of 45 (93rd percentile)

LiveCodeBench

84

Rank #4 of 45 (91st percentile)

HumanEval

95

Rank #3 of 45 (93rd percentile)

BigCodeBench

73

Rank #3 of 45 (93rd percentile)
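
The figures above appear to follow simple arithmetic, though the page does not state its formulas. The sketch below assumes two things: that the overall score is the unweighted mean of the four benchmark scores, and that the percentage next to each rank is the share of the 45 models ranked below o1, rounded to a whole number. Both assumptions reproduce the published numbers, but the function names are illustrative and not taken from the site.

```python
# Assumed formulas; the page does not document how it derives these values.

def overall_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the per-benchmark scores (assumption)."""
    return sum(scores.values()) / len(scores)


def percentile_from_rank(rank: int, total: int) -> int:
    """Percentage of models ranked below `rank` out of `total` (assumption)."""
    return round(100 * (total - rank) / total)


# Scores as listed on this page for o1.
o1_scores = {
    "SWE-bench": 80,
    "LiveCodeBench": 84,
    "HumanEval": 95,
    "BigCodeBench": 73,
}

print(overall_score(o1_scores))        # 83.0
print(percentile_from_rank(3, 45))     # 93
print(percentile_from_rank(4, 45))     # 91
```

Under these assumptions, rank #3 of 45 means o1 scores higher than roughly 93% of the listed models, which places it in about the top 7%.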

Strengths & Weaknesses

Strengths

Strong step-by-step reasoning
Best at math-heavy coding

Weaknesses

Expensive ($15.00 per 1M input tokens)
Slow

Compare with Similar-Priced Models

Model	Overall Score	Input Price ($ per 1M tokens)
o1	83	$15.00
Claude Opus 4	86	$15.00
Claude 3 Opus	78	$15.00
GPT-4 Turbo	70	$10.00
OpenAI o3	85	$10.00
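
One rough way to read this table is to divide each overall score by the input price. The sketch below does that with the rows as listed. The "points per dollar" metric is a hypothetical illustration and not something the page defines; it also ignores output-token pricing and latency, neither of which appears in the table.

```python
# Rows copied from the comparison table: (model, overall score, input $ per 1M tokens).
models = [
    ("o1", 83, 15.00),
    ("Claude Opus 4", 86, 15.00),
    ("Claude 3 Opus", 78, 15.00),
    ("GPT-4 Turbo", 70, 10.00),
    ("OpenAI o3", 85, 10.00),
]


def points_per_dollar(score: float, input_price: float) -> float:
    """Hypothetical value metric: overall score per dollar of input price."""
    return score / input_price


# Sort from best to worst value under this metric.
ranked = sorted(models, key=lambda m: points_per_dollar(m[1], m[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name}\t{points_per_dollar(score, price):.2f}")
```

On these numbers, OpenAI o3 comes out ahead because it pairs a high score with the lower $10.00 price, while o1 sits between Claude Opus 4 and Claude 3 Opus among the $15.00 models.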