Performance Scores

Overall

55
Rank #32 of 45 (ahead of 29% of ranked models)

SWE-bench

48
Rank #32 of 45 (ahead of 29% of ranked models)

LiveCodeBench

58
Rank #32 of 45 (ahead of 29% of ranked models)

HumanEval

78
Rank #31 of 45 (ahead of 31% of ranked models)

BigCodeBench

40
Rank #32 of 45 (ahead of 29% of ranked models)
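
The percentage shown next to each rank matches the share of ranked models placed below that rank, not the rank's distance from the top. A minimal sketch of that arithmetic follows; the function name is hypothetical and the rounding rule is an assumption inferred from the figures on this page, not a documented formula.

```python
def share_ranked_below(rank: int, total: int) -> int:
    """Percentage of ranked models placed below `rank`, rounded to a whole number.

    Assumption: the page rounds to the nearest whole percent.
    """
    return round(100 * (total - rank) / total)


print(share_ranked_below(32, 45))  # 29
print(share_ranked_below(31, 45))  # 31
```

Under this reading, rank #32 of 45 leaves 13 models below it, which is about 29% of the field.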

Strengths & Weaknesses

Strengths

  • Fast
  • Good value
  • Improved over Claude 3.5 Haiku (overall score 55 vs. 52)

Weaknesses

  • Limited reasoning depth

Compare with Similar-Priced Models

Model               Overall Score   Input price ($ per 1M tokens)
Claude 4 Haiku      55              $0.80
Claude 3.5 Haiku    52              $0.80
Claude 3 Haiku      45              $0.25
GPT-4o mini         58              $0.15
GPT-3.5 Turbo       40              $0.50
OpenAI o1-mini      70              $1.10
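
One rough way to read this comparison is overall score per dollar of input price. The sketch below uses only the figures listed above. The ratio itself is an illustrative assumption rather than a metric this page defines, and it ignores output-token pricing, which the table does not list.

```python
# (model, overall score, input price in $ per 1M tokens), copied from the table above
models = [
    ("Claude 4 Haiku", 55, 0.80),
    ("Claude 3.5 Haiku", 52, 0.80),
    ("Claude 3 Haiku", 45, 0.25),
    ("GPT-4o mini", 58, 0.15),
    ("GPT-3.5 Turbo", 40, 0.50),
    ("OpenAI o1-mini", 70, 1.10),
]


def score_per_dollar(score: float, price: float) -> float:
    """Hypothetical value metric: overall score divided by input price."""
    return score / price


# Sort from best to worst score-per-dollar ratio.
ranked = sorted(models, key=lambda m: score_per_dollar(m[1], m[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name:<18} {score:>3}  ${price:.2f}  {score_per_dollar(score, price):7.1f}")
```

By this measure GPT-4o mini comes first and OpenAI o1-mini last, with Claude 4 Haiku (about 68.8 points per dollar) slightly ahead of Claude 3.5 Haiku (65.0) at the same input price.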