GPT-4 — Benchmark Results
Provider: OpenAI
Overall Score: 68/100
Performance Scores
| Benchmark | Score | Rank (of 45) | Percentile |
|---|---|---|---|
| Overall | 68 | #18 | Top 60% |
| SWE-bench | 60 | #21 | Top 53% |
| LiveCodeBench | 70 | #18 | Top 60% |
| HumanEval | 86 | #18 | Top 60% |
| BigCodeBench | 54 | #17 | Top 62% |
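The source does not define its "Top N%" labels. The figures listed above appear consistent with the share of the 45 ranked models that place below GPT-4, rather than with rank divided by field size (which would give 40% for rank 18). The sketch below checks that reading; the formula and the function name `share_outranked` are assumptions for illustration, not the site's documented method.

```python
def share_outranked(rank: int, total: int) -> int:
    """Percentage of the field placed below `rank`, rounded to a whole number.

    Assumed formula (not stated in the source): (total - rank) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


# Ranks reported for GPT-4 in a field of 45 models.
for rank in (17, 18, 21):
    print(f"Rank #{rank} of 45 -> {share_outranked(rank, 45)}%")
```

Under this assumption, ranks 17, 18, and 21 reproduce the 62%, 60%, and 53% labels, so each label is probably best read as "ahead of N% of models".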
Strengths & Weaknesses
Strengths
- Original breakthrough model
Weaknesses
- Two generations behind
- Expensive ($30.00 per million input tokens)
Compare with Similar-Priced Models
| Model | Overall Score | Input price ($ per 1M tokens) |
|---|---|---|
| GPT-4 | 68 | $30.00 |