Performance Scores

Overall

52
Rank #35 of 45 (ahead of 22% of models)

SWE-bench

44
Rank #36 of 45 (ahead of 20% of models)

LiveCodeBench

54
Rank #36 of 45 (ahead of 20% of models)

HumanEval

76
Rank #35 of 45 (ahead of 22% of models)

BigCodeBench

38
Rank #35 of 45 (ahead of 22% of models)

Strengths & Weaknesses

Strengths

  • Open source
  • Self-hostable

Weaknesses

  • Requires own infrastructure

Compare with Similar-Priced Models

Model            Overall Score    Input $/M
Llama 3.3 70B    52