Performance Scores

Overall

80
Rank #4 of 45 (ahead of 91% of models)

SWE-bench

76
Rank #4 of 45 (ahead of 91% of models)

LiveCodeBench

82
Rank #5 of 45 (ahead of 89% of models)

HumanEval

94
Rank #4 of 45 (ahead of 91% of models)

BigCodeBench

68
Rank #4 of 45 (ahead of 91% of models)
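
The percentage shown beside each rank appears to be the share of the 45 ranked models that place lower, not a "top N%" band. The page does not state its formula, so the sketch below is an inferred reconstruction; the function name `percent_outranked` is hypothetical. It reproduces the published figures for ranks 4 and 5.

```python
def percent_outranked(rank: int, total: int) -> int:
    """Share of the field ranked below `rank`, rounded to a whole percent.

    Assumed formula (not confirmed by the source): (total - rank) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


# Ranks reported in the scores above, out of 45 models.
for rank in (4, 5):
    pct = percent_outranked(rank, 45)
    print(f"Rank #{rank} of 45: ahead of {pct}% of models")
```

Under this reading, rank #4 gives 41/45, about 91%, and rank #5 gives 40/45, about 89%, which matches the figures listed for each benchmark.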

Strengths & Weaknesses

Strengths

  • Latest GPT model
  • Strong across all benchmarks: ranks #4 or #5 of 45 on each of the four listed

Weaknesses

  • Premium pricing

Compare with Similar-Priced Models

Model            Overall Score   Input $/M
GPT-4.1          80              $2.00
GPT-4o           75              $2.50
OpenAI o1-mini   70              $1.10
OpenAI o3-mini   80              $1.10
OpenAI o4-mini   72              $1.10
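
One rough way to read the comparison is to divide each overall score by its input price. The sketch below assumes "Input $/M" means US dollars per million input tokens. The "score per input dollar" metric is a hypothetical illustration, not something the page reports, and it ignores output-token pricing.

```python
# (model, overall score, input price) copied from the comparison table.
models = [
    ("GPT-4.1", 80, 2.00),
    ("GPT-4o", 75, 2.50),
    ("OpenAI o1-mini", 70, 1.10),
    ("OpenAI o3-mini", 80, 1.10),
    ("OpenAI o4-mini", 72, 1.10),
]


def score_per_dollar(score: float, price: float) -> float:
    """Overall score divided by input price (hypothetical value metric)."""
    return score / price


# Sort from best to worst value under this metric.
ranked = sorted(models, key=lambda m: score_per_dollar(m[1], m[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name:<16} {score_per_dollar(score, price):6.1f}")
```

By this measure OpenAI o3-mini comes first (about 72.7 points per input dollar) and GPT-4o last (30.0), with GPT-4.1 at 40.0. A single ratio like this is only a starting point, since it weights every benchmark equally and leaves out output cost.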