Performance Scores

Overall

65
Rank #22 of 45 — Top 51%

SWE-bench

58
Rank #22 of 45 — Top 51%

LiveCodeBench

68
Rank #22 of 45 — Top 51%

HumanEval

85
Rank #22 of 45 — Top 51%

BigCodeBench

50
Rank #23 of 45 — Top 49%
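
The page does not explain how the percentage beside each rank is derived. A minimal sketch, assuming it is the share of the 45 ranked models that place below this one, rounded to a whole percent, reproduces both values shown above. The formula is inferred from the figures, not stated by the source, and the function name share_ranked_below is hypothetical.

```python
def share_ranked_below(rank: int, total: int) -> int:
    """Percentage of models ranked strictly below `rank`, rounded to a whole percent.

    Assumed formula (inferred from the figures on this page, not documented):
        100 * (total - rank) / total
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


print(share_ranked_below(22, 45))  # 51
print(share_ranked_below(23, 45))  # 49
```

Under this reading, "Top 51%" means the model outranks about 51% of the field. It does not mean the model sits within the best 51% of positions.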

Strengths & Weaknesses

Strengths

  • Very cheap
  • Fast
  • Large context

Weaknesses

  • Weaker reasoning than Gemini 2.5 Pro

Compare with Similar-Priced Models

Model              Overall Score   Input $/M
Gemini 2.5 Flash   65