Performance Scores

Overall

68
Rank #18 of 45 (60th percentile)

SWE-bench

62
Rank #18 of 45 (60th percentile)

LiveCodeBench

70
Rank #18 of 45 (60th percentile)

HumanEval

86
Rank #18 of 45 (60th percentile)

BigCodeBench

54
Rank #17 of 45 (62nd percentile)
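
The figures above can be cross-checked with a little arithmetic. The Overall score of 68 equals the unweighted mean of the four benchmark scores, and the percentage shown beside each rank is consistent with (total - rank) / total, i.e. the share of models ranked below this one. Whether the page actually computes its numbers this way is an assumption; the function names below are illustrative only.

```python
# Benchmark scores as listed in this section.
scores = {
    "SWE-bench": 62,
    "LiveCodeBench": 70,
    "HumanEval": 86,
    "BigCodeBench": 54,
}


def mean_score(values):
    """Unweighted mean of the benchmark scores (assumed aggregation)."""
    values = list(values)
    return sum(values) / len(values)


def percentile_from_rank(rank, total):
    """Percentage of models ranked below the given rank (assumed formula)."""
    return round(100 * (total - rank) / total)


overall = mean_score(scores.values())
print(f"Overall: {overall:.0f}")
print(f"Rank 18 of 45: {percentile_from_rank(18, 45)}%")
print(f"Rank 17 of 45: {percentile_from_rank(17, 45)}%")
```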

Strengths & Weaknesses

Strengths

  • Strong Chinese language support
  • Good value

Weaknesses

  • Less tested on English coding

Compare with Similar-Priced Models

Model               Overall Score   Input $/M
Qwen Max            68              $1.60
Claude 3.5 Haiku    52              $0.80
GPT-4o              75              $2.50
OpenAI o1-mini      70              $1.10
OpenAI o3-mini      80              $1.10
OpenAI o4-mini      72              $1.10
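
One rough way to read this table is to divide each Overall score by its input price. The sketch below does that with the rows exactly as listed. "Score per input dollar" is an illustrative metric chosen here, not one the page defines, and it ignores output pricing and workload differences.

```python
# (model, overall score, input price in $ per M) copied from the table above.
models = [
    ("Qwen Max", 68, 1.60),
    ("Claude 3.5 Haiku", 52, 0.80),
    ("GPT-4o", 75, 2.50),
    ("OpenAI o1-mini", 70, 1.10),
    ("OpenAI o3-mini", 80, 1.10),
    ("OpenAI o4-mini", 72, 1.10),
]


def score_per_dollar(score, price):
    """Illustrative value metric: overall score divided by input price."""
    return score / price


# Sort from highest to lowest score per input dollar.
ranked = sorted(models, key=lambda m: score_per_dollar(m[1], m[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name:<18} {score_per_dollar(score, price):6.1f}")
```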