Performance Scores

Overall

68
Rank #18 of 45 (ahead of 60% of models)

SWE-bench

62
Rank #18 of 45 (ahead of 60% of models)

LiveCodeBench

70
Rank #18 of 45 (ahead of 60% of models)

HumanEval

86
Rank #18 of 45 (ahead of 60% of models)

BigCodeBench

54
Rank #17 of 45 (ahead of 62% of models)
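
The percentage shown next to each rank appears to be the share of the 45 ranked models that place lower, not a top-N% position (rank 18 of 45 would otherwise sit in the top 40%). A minimal sketch of that arithmetic, assuming this interpretation; the function name is hypothetical and not taken from the source page:

```python
def share_ranked_below(rank: int, total: int) -> int:
    """Return the percentage of models ranked below `rank`, rounded to a whole number.

    Assumption: the page derives its percentage as (total - rank) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


print(share_ranked_below(18, 45))  # 60
print(share_ranked_below(17, 45))  # 62
```

Both results match the percentages listed for rank 18 and rank 17 above, which is what suggests this reading.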

Strengths & Weaknesses

Strengths

  • Solid performance: overall score of 68, with HumanEval (86) the strongest benchmark
  • Good context

Weaknesses

  • Superseded by Gemini 2.5

Compare with Similar-Priced Models

Model            Overall Score   Input $/M
Gemini 2.0 Pro   68