Claude 3 Sonnet — Benchmark Results
Anthropic
Overall Score: 65/100
Performance Scores
| Benchmark | Score | Rank (of 45) | Ahead of |
|---|---|---|---|
| Overall | 65 | #22 | 51% of models |
| SWE-bench | 58 | #22 | 51% of models |
| LiveCodeBench | 68 | #22 | 51% of models |
| HumanEval | 85 | #22 | 51% of models |
| BigCodeBench | 50 | #23 | 49% of models |
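
The percentage shown beside each rank is consistent with the share of the 45 ranked models that place lower, rather than a top-N% cutoff (rank #22 gives 51%, rank #23 gives 49%). The page does not document its method, so the formula below is an assumption that happens to reproduce both figures, and the function name `share_ranked_below` is hypothetical.

```python
def share_ranked_below(rank: int, total: int) -> int:
    """Return the percentage of models ranked below `rank`, rounded to a whole number.

    Assumed formula: (total - rank) / total. This is inferred from the
    published figures, not taken from the site's documentation.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


if __name__ == "__main__":
    print(share_ranked_below(22, 45))  # 51
    print(share_ranked_below(23, 45))  # 49
```

Under this reading, a lower (better) rank always yields a higher percentage, which matches the ordering of the two values on the page.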
Strengths & Weaknesses
Strengths
- Reliable
- Good value
Weaknesses
- Two generations behind
Compare with Similar-Priced Models
| Model | Overall Score | Input Price ($ per 1M tokens) |
|---|---|---|
| Claude 3 Sonnet | 65 | $3.00 |
| Claude Sonnet 4 | 78 | $3.00 |
| Claude 3.5 Sonnet | 72 | $3.00 |
| GPT-4o | 75 | $2.50 |
| Qwen 3.6 Plus | 72 | $3.00 |
| Qwen Max | 68 | $1.60 |
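
One rough way to read the comparison table is overall score per input dollar. The sketch below uses only the figures listed in the table. The metric itself is an assumption of this example, not something the page reports, and it ignores output-token pricing and any other cost factors.

```python
# Rows copied from the comparison table:
# (model, overall score, input price in dollars as listed).
MODELS = [
    ("Claude 3 Sonnet", 65, 3.00),
    ("Claude Sonnet 4", 78, 3.00),
    ("Claude 3.5 Sonnet", 72, 3.00),
    ("GPT-4o", 75, 2.50),
    ("Qwen 3.6 Plus", 72, 3.00),
    ("Qwen Max", 68, 1.60),
]


def score_per_dollar(score: float, price: float) -> float:
    """Hypothetical value metric: overall score divided by input price."""
    return score / price


# Sort from highest to lowest score per input dollar.
ranked = sorted(MODELS, key=lambda row: score_per_dollar(row[1], row[2]), reverse=True)

if __name__ == "__main__":
    for name, score, price in ranked:
        print(f"{name:<18} {score_per_dollar(score, price):6.2f}")
```

A single ratio like this is only a starting point for comparison; a model with a lower ratio may still be the better choice when absolute score matters more than cost.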