Performance Scores

Overall

72
Rank #10 of 45 (ahead of 78% of models)

SWE-bench

66
Rank #12 of 45 (ahead of 73% of models)

LiveCodeBench

74
Rank #12 of 45 (ahead of 73% of models)

HumanEval

88
Rank #14 of 45 (ahead of 69% of models)

BigCodeBench

58
Rank #10 of 45 (ahead of 78% of models)
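
The percentage shown beside each rank appears to be the share of the 45 ranked models that sit below this one, rather than a conventional "top N%" figure. The sketch below reproduces the listed percentages under that assumption; the formula is inferred from the numbers on this page, not taken from any published scoring method, and the function name `percent_outranked` is hypothetical.

```python
def percent_outranked(rank: int, total: int) -> int:
    """Whole-percent share of models that rank below `rank`.

    Assumed formula (inferred from the listed figures): (total - rank) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round((total - rank) / total * 100)


# Ranks as listed above, each out of 45 models.
ranks = {
    "Overall": 10,
    "SWE-bench": 12,
    "LiveCodeBench": 12,
    "HumanEval": 14,
    "BigCodeBench": 10,
}

percentiles = {name: percent_outranked(rank, 45) for name, rank in ranks.items()}

for name, pct in percentiles.items():
    print(f"{name}: ahead of {pct}% of models")
```

Under this reading, rank #10 of 45 corresponds to roughly the top 22% of models.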

Strengths & Weaknesses

Strengths

  • Latest Qwen architecture
  • Strong performance (overall score of 72, rank #10 of 45)

Weaknesses

  • Newer model

Compare with Similar-Priced Models

Model               Overall Score   Input $/M
Qwen3 6 Plus        72              $3.00
Claude Sonnet 4     78              $3.00
Claude 3.5 Sonnet   72              $3.00
Claude 3 Sonnet     65              $3.00
GPT-4o              75              $2.50
Qwen Max            68              $1.60
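
One rough way to read this comparison is overall score per dollar of input price. The sketch below uses only the figures in the table; the ratio is an illustrative metric assumed here, not one the page itself reports, and it ignores output pricing and anything else not listed.

```python
# (model, overall score, input price in $/M) copied from the table above.
models = [
    ("Qwen3 6 Plus", 72, 3.00),
    ("Claude Sonnet 4", 78, 3.00),
    ("Claude 3.5 Sonnet", 72, 3.00),
    ("Claude 3 Sonnet", 65, 3.00),
    ("GPT-4o", 75, 2.50),
    ("Qwen Max", 68, 1.60),
]

# Sort by score points per input dollar, highest first.
# sorted() is stable, so tied models keep their table order.
by_value = sorted(models, key=lambda m: m[1] / m[2], reverse=True)

for name, score, price in by_value:
    print(f"{name:<18} {score / price:5.1f} points per $")
```

A higher ratio only means more score per input dollar; it says nothing about whether the absolute score is high enough for a given task.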