DeepSeek Reasoner (R1) — Benchmark Results
DeepSeek Overall Score: 72/100
Performance Scores
| Benchmark | Score | Rank (of 45) | Models outranked |
|---|---|---|---|
| Overall | 72 | #10 | 78% |
| SWE-bench | 68 | #10 | 78% |
| LiveCodeBench | 76 | #10 | 78% |
| HumanEval | 90 | #9 | 80% |
| BigCodeBench | 56 | #12 | 73% |
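The percentage shown with each rank appears to be the share of the 45 models that rank below DeepSeek Reasoner (R1), not a "top N%" band. The sketch below reproduces the listed figures under that assumption. The formula and the helper name `percent_outranked` are illustrative and are not confirmed by the source.

```python
def percent_outranked(rank: int, total: int) -> int:
    """Share of models ranked below the given position, as a rounded percentage.

    Assumed formula: (total - rank) / total. The source does not state how
    its percentages are derived.
    """
    return round(100 * (total - rank) / total)


# Ranks copied from the scores above; every benchmark lists 45 models.
RANKS = {
    "Overall": 10,
    "SWE-bench": 10,
    "LiveCodeBench": 10,
    "HumanEval": 9,
    "BigCodeBench": 12,
}

for benchmark, rank in RANKS.items():
    share = percent_outranked(rank, 45)
    print(f"{benchmark}: rank {rank} of 45, ahead of {share}% of models")
```

Under this reading, rank #10 of 45 corresponds to 78%, rank #9 to 80%, and rank #12 to 73%, which matches the figures listed.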
Strengths & Weaknesses
Strengths
- Strong reasoning chain
- Good value: the highest overall score (72) in the similar-priced comparison, at $0.55 per 1M input tokens
Weaknesses
- Slow on simple tasks
Compare with Similar-Priced Models
| Model | Overall Score | Input price ($ per 1M tokens) |
|---|---|---|
| DeepSeek Reasoner (R1) | 72 | $0.55 |
| Claude 3.5 Haiku | 52 | $0.80 |
| Claude 3 Haiku | 45 | $0.25 |
| GPT-4o mini | 58 | $0.15 |
| GPT-3.5 Turbo | 40 | $0.50 |
| OpenAI o1-mini | 70 | $1.10 |
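One way to make the price comparison concrete is to compute score per dollar and to check which models are not beaten on both score and price at once. This is a minimal sketch using only the figures in the table. The two metrics and the helper names `score_per_dollar` and `is_dominated` are illustrative choices, not the site's own method, and only input price is considered because the table lists no output price.

```python
# (model, overall score, input price in USD per 1M tokens), copied from the table
MODELS = [
    ("DeepSeek Reasoner (R1)", 72, 0.55),
    ("Claude 3.5 Haiku", 52, 0.80),
    ("Claude 3 Haiku", 45, 0.25),
    ("GPT-4o mini", 58, 0.15),
    ("GPT-3.5 Turbo", 40, 0.50),
    ("OpenAI o1-mini", 70, 1.10),
]


def score_per_dollar(score: float, price: float) -> float:
    """Overall score points per dollar of input price (a naive value metric)."""
    return score / price


def is_dominated(model, others) -> bool:
    """True if some other model scores at least as high for no more money,
    and is strictly better on at least one of the two."""
    _, score, price = model
    return any(
        s >= score and p <= price and (s > score or p < price)
        for _, s, p in others
    )


by_ratio = sorted(MODELS, key=lambda m: score_per_dollar(m[1], m[2]), reverse=True)
undominated = [m[0] for m in MODELS if not is_dominated(m, MODELS)]

for name, score, price in by_ratio:
    print(f"{name}: {score_per_dollar(score, price):.1f} points per dollar")
print("Not dominated on score and price:", ", ".join(undominated))
```

On these figures the plain ratio favours the cheapest models, with GPT-4o mini first. The dominance check leaves only DeepSeek Reasoner (R1) and GPT-4o mini, since every other listed model is matched or beaten on both score and price by one of the two.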