Claude 3 Opus — Benchmark Results
Developer: Anthropic. Overall Score: 78/100
Performance Scores
| Benchmark | Score | Rank | Percentile |
|---|---|---|---|
| Overall | 78 | #6 of 45 | 87th |
| SWE-bench | 74 | #6 of 45 | 87th |
| LiveCodeBench | 80 | #7 of 45 | 84th |
| HumanEval | 94 | #4 of 45 | 91st |
| BigCodeBench | 64 | #6 of 45 | 87th |
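The percentage reported next to each rank is consistent with a percentile (the share of the 45 ranked models that place lower), and the overall score of 78 matches the unweighted mean of the four benchmark scores. The page does not state either formula, so the sketch below is an inference from the published numbers, not the site's documented method. The names `percentile_from_rank` and `mean_score` are hypothetical.

```python
# Assumption: 45 ranked models, as stated in "Rank #N of 45".
TOTAL_MODELS = 45

# Benchmark scores copied from the Performance Scores section.
BENCHMARKS = {
    "SWE-bench": 74,
    "LiveCodeBench": 80,
    "HumanEval": 94,
    "BigCodeBench": 64,
}


def percentile_from_rank(rank, total=TOTAL_MODELS):
    """Percentage of ranked models placed below `rank`, rounded to a whole number.

    Assumed formula (not confirmed by the source): (total - rank) / total * 100.
    """
    return round((total - rank) / total * 100)


def mean_score(scores):
    """Unweighted mean of the benchmark scores (assumed aggregation)."""
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    for rank in (4, 6, 7):
        print(f"Rank #{rank} of {TOTAL_MODELS}: percentile {percentile_from_rank(rank)}")
    print(f"Mean of benchmark scores: {mean_score(BENCHMARKS)}")
```

Under these assumptions, ranks #4, #6, and #7 map to 91, 87, and 84, which are the figures shown on the page, and the mean of the four benchmarks is 78.0.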
Strengths & Weaknesses
Strengths
- Strong reasoning
- Proven track record
Weaknesses
- Older generation: Claude Opus 4 scores 86 overall at the same input price
- Expensive: $15.00 per million input tokens
Compare with Similar-Priced Models
| Model | Overall Score | Input Price ($ per 1M tokens) |
|---|---|---|
| Claude 3 Opus | 78 | $15.00 |
| Claude Opus 4 | 86 | $15.00 |
| GPT-4 Turbo | 70 | $10.00 |
| OpenAI o1 | 83 | $15.00 |
| OpenAI o3 | 85 | $10.00 |
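One way to read the comparison table is to divide each overall score by the input price. The page itself does not report this ratio, so the sketch below is only an illustration. The helpers `points_per_dollar` and `rank_by_value` are hypothetical, and the ratio ignores output-token pricing, which the table does not list.

```python
# Rows copied from the comparison table:
# (model, overall score, input price in $ per 1M tokens).
MODELS = [
    ("Claude 3 Opus", 78, 15.00),
    ("Claude Opus 4", 86, 15.00),
    ("GPT-4 Turbo", 70, 10.00),
    ("OpenAI o1", 83, 15.00),
    ("OpenAI o3", 85, 10.00),
]


def points_per_dollar(score, price):
    """Overall-score points per input dollar, rounded to two decimals."""
    return round(score / price, 2)


def rank_by_value(models):
    """Return the rows sorted from highest to lowest score-per-dollar ratio."""
    return sorted(models, key=lambda m: m[1] / m[2], reverse=True)


if __name__ == "__main__":
    for name, score, price in rank_by_value(MODELS):
        print(f"{name}: {points_per_dollar(score, price)} points per input dollar")
```

On this illustrative ratio, Claude 3 Opus comes last of the five (5.2 points per input dollar), which fits the "Expensive" entry under Weaknesses.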