Performance Scores

Overall

50
Rank #37 of 45 (ahead of about 18% of models)

SWE-bench

42
Rank #37 of 45 (ahead of about 18% of models)

LiveCodeBench

52
Rank #37 of 45 (ahead of about 18% of models)

HumanEval

72
Rank #37 of 45 (ahead of about 18% of models)

BigCodeBench

36
Rank #37 of 45 (ahead of about 18% of models)
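
The percentage shown next to each rank is consistent with the share of models that rank lower, not with a position in the top 18%. The sketch below shows that arithmetic. The function name `share_ranked_below` is hypothetical, and the formula is an assumption inferred from the numbers on this page, not a documented scoring method.

```python
def share_ranked_below(rank: int, total: int) -> float:
    """Return the percentage of models ranked strictly below `rank`.

    Assumed formula: (total - rank) / total, inferred from the page's numbers.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100 * (total - rank) / total


pct_below = share_ranked_below(37, 45)
print(f"{pct_below:.0f}% of models rank lower")  # prints "18% of models rank lower"
```

Under this reading, rank #37 of 45 leaves 8 models behind it, which is where the 18% figure would come from.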

Strengths & Weaknesses

Strengths

  • Budget option
  • Fast

Weaknesses

  • Limited capabilities: ranks #37 of 45 overall and scores lowest on BigCodeBench (36)

Compare with Similar-Priced Models

Model               Overall Score   Input $/M
Grok 3 Mini         50              $0.300
Claude 3.5 Haiku    52              $0.800
Claude 3 Haiku      45              $0.250
GPT-4o mini         58              $0.150
GPT-3.5 Turbo       40              $0.500
OpenAI o1-mini      70              $1.10
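
One rough way to read the table is to divide each overall score by its input price. The sketch below does that with the figures copied from the table. It assumes "Input $/M" means US dollars per million input tokens. The score-per-dollar ratio is an illustrative metric, not one the page itself reports, and it ignores output pricing.

```python
# (model, overall score, input price), copied from the comparison table.
# Assumption: the price is USD per million input tokens.
MODELS = [
    ("Grok 3 Mini", 50, 0.300),
    ("Claude 3.5 Haiku", 52, 0.800),
    ("Claude 3 Haiku", 45, 0.250),
    ("GPT-4o mini", 58, 0.150),
    ("GPT-3.5 Turbo", 40, 0.500),
    ("OpenAI o1-mini", 70, 1.10),
]


def score_per_dollar(score: float, price: float) -> float:
    """Overall score points per dollar of input price (illustrative metric)."""
    return score / price


# Sort from best to worst value under this metric.
ranked = sorted(MODELS, key=lambda row: score_per_dollar(row[1], row[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name:<18} {score_per_dollar(score, price):7.1f}")
```

By this measure GPT-4o mini comes out ahead and OpenAI o1-mini last, even though o1-mini has the highest overall score. A ratio like this favors cheap models by construction, so it is best read alongside the raw scores.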