Performance Scores

Overall

42
Rank #42 of 45 (ahead of 7% of models)

SWE-bench

35
Rank #42 of 45 (ahead of 7% of models)

LiveCodeBench

44
Rank #42 of 45 (ahead of 7% of models)

HumanEval

65
Rank #42 of 45 (ahead of 7% of models)

BigCodeBench

28
Rank #42 of 45 (ahead of 7% of models)

Strengths & Weaknesses

Strengths

  • Cheapest option ($0.080 per million input tokens, the lowest input price in the comparison table below)
  • Fast

Weaknesses

  • Basic coding only (65 on HumanEval, but 35 on SWE-bench and 28 on BigCodeBench)

Compare with Similar-Priced Models

Model               Overall Score   Input $/M tokens
Qwen Turbo          42              $0.080
Claude 3.5 Haiku    52              $0.800
Claude 3 Haiku      45              $0.250
GPT-4o mini         58              $0.150
GPT-3.5 Turbo       40              $0.500
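
One rough way to read the comparison is to divide each overall score by the input price. The sketch below does that with the five rows above. The "score per dollar" ratio and the names `MODELS`, `score_per_dollar`, and `rank_by_value` are assumptions made for illustration, not a metric the page itself reports. It also assumes "$/M" means dollars per million input tokens.

```python
# Rows copied from the comparison table:
# (model, overall score, input price in $ per million tokens).
MODELS = [
    ("Qwen Turbo", 42, 0.080),
    ("Claude 3.5 Haiku", 52, 0.800),
    ("Claude 3 Haiku", 45, 0.250),
    ("GPT-4o mini", 58, 0.150),
    ("GPT-3.5 Turbo", 40, 0.500),
]


def score_per_dollar(score, price):
    """Hypothetical value metric: overall score points per $1 of input price."""
    return score / price


def rank_by_value(models):
    """Return the rows sorted from highest to lowest score per dollar."""
    return sorted(
        models,
        key=lambda row: score_per_dollar(row[1], row[2]),
        reverse=True,
    )


if __name__ == "__main__":
    for name, score, price in rank_by_value(MODELS):
        ratio = score_per_dollar(score, price)
        print(f"{name:<18} score={score:>3}  price=${price:.3f}  score/$={ratio:7.1f}")
```

This ratio ignores output-token pricing and treats score points as linearly comparable, so it is a coarse screen and not a recommendation.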