o1 — Benchmark Results
Provider: OpenAI
Overall Score: 83/100
Performance Scores
| Benchmark | Score | Rank (of 45) | Percentile |
|---|---|---|---|
| Overall | 83 | #3 | 93rd |
| SWE-bench | 80 | #3 | 93rd |
| LiveCodeBench | 84 | #4 | 91st |
| HumanEval | 95 | #3 | 93rd |
| BigCodeBench | 73 | #3 | 93rd |
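The percentages shown next to each rank above are consistent with a simple calculation, sketched here. It assumes the percentage is the share of the 45 ranked models that sit below the given rank, and that the overall score is the unweighted mean of the four benchmark scores. The page states neither rule, and the function names are illustrative.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Share of models ranked below `rank`, as a whole percent (assumed rule)."""
    return round(100 * (total - rank) / total)


def mean_score(scores: dict[str, int]) -> float:
    """Unweighted mean of the benchmark scores (assumed aggregation)."""
    return sum(scores.values()) / len(scores)


# Scores copied from the Performance Scores section.
benchmarks = {
    "SWE-bench": 80,
    "LiveCodeBench": 84,
    "HumanEval": 95,
    "BigCodeBench": 73,
}

print(percentile_from_rank(3, 45))  # 93
print(percentile_from_rank(4, 45))  # 91
print(mean_score(benchmarks))       # 83.0
```

Under these assumptions, rank #3 of 45 corresponds to the top 7% of models (the 93rd percentile), and the four benchmark scores average to the listed overall score of 83.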
Strengths & Weaknesses
Strengths
- Strong step-by-step reasoning
- Best at math-heavy coding
Weaknesses
- Expensive ($15.00 per 1M input tokens)
- Slow
Compare with Similar-Priced Models
| Model | Overall Score | Input Price ($ per 1M tokens) |
|---|---|---|
| o1 | 83 | $15.00 |
| Claude Opus 4 | 86 | $15.00 |
| Claude 3 Opus | 78 | $15.00 |
| GPT-4 Turbo | 70 | $10.00 |
| OpenAI o3 | 85 | $10.00 |
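One rough way to read the table is to divide each overall score by its input price. The sketch below does that with the figures above. "Points per input dollar" is an illustrative metric, not one the page uses, and it ignores output-token pricing and latency.

```python
# (model, overall score, input price in $ per 1M tokens), copied from the table.
models = [
    ("o1", 83, 15.00),
    ("Claude Opus 4", 86, 15.00),
    ("Claude 3 Opus", 78, 15.00),
    ("GPT-4 Turbo", 70, 10.00),
    ("OpenAI o3", 85, 10.00),
]


def points_per_dollar(score: int, price: float) -> float:
    """Overall score divided by input price (illustrative metric)."""
    return score / price


# Sort from best to worst value under this metric.
ranked = sorted(models, key=lambda m: points_per_dollar(m[1], m[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name:<14} {points_per_dollar(score, price):.2f} points per $")
```

By this measure the two $10.00 models come out ahead of the three $15.00 models, with o1 falling between Claude Opus 4 and Claude 3 Opus.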