Microsoft Phi-4 Benchmark Results — Coding Performance 2026 | AI Dev Tools — AI Coding Model Pricing Comparison 2026

Performance Scores

Overall

45

Rank #40 of 45 (ahead of 11% of ranked models)

SWE-bench

38

Rank #40 of 45 (ahead of 11% of ranked models)

LiveCodeBench

46

Rank #40 of 45 (ahead of 11% of ranked models)

HumanEval

68

Rank #40 of 45 (ahead of 11% of ranked models)

BigCodeBench

30

Rank #40 of 45 (ahead of 11% of ranked models)

Strengths & Weaknesses

Strengths

Small model (14B parameters) that can run locally

Weaknesses

Limited capacity, reflected in its rank of #40 of 45 on every benchmark listed above

Compare with Similar-Priced Models

Model	Overall Score	Input Price ($ per 1M tokens)
Microsoft Phi-4	45	$0.100
Claude 3.5 Haiku	52	$0.800
Claude 3 Haiku	45	$0.250
GPT-4o mini	58	$0.150
GPT-3.5 Turbo	40	$0.500
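
One way to read the table above is to divide each model's overall score by its input price. The sketch below does that with the five rows copied from the table. The "score per dollar" metric and the names `score_per_dollar` and `rank_by_value` are illustrative assumptions, not something this page defines, and the ratio ignores output pricing and per-benchmark differences.

```python
# Rows copied from the comparison table:
# (model, overall score, input price in $ per 1M tokens)
MODELS = [
    ("Microsoft Phi-4", 45, 0.100),
    ("Claude 3.5 Haiku", 52, 0.800),
    ("Claude 3 Haiku", 45, 0.250),
    ("GPT-4o mini", 58, 0.150),
    ("GPT-3.5 Turbo", 40, 0.500),
]


def score_per_dollar(score, input_price):
    """Hypothetical value metric: overall score points per $1 of input price."""
    return score / input_price


def rank_by_value(models):
    """Return the rows sorted from highest to lowest score per dollar."""
    return sorted(
        models,
        key=lambda row: score_per_dollar(row[1], row[2]),
        reverse=True,
    )


if __name__ == "__main__":
    # Print a tab-separated listing in the same style as the table above.
    print("Model\tOverall Score\tInput $/M\tScore per $")
    for name, score, price in rank_by_value(MODELS):
        value = score_per_dollar(score, price)
        print(f"{name}\t{score}\t${price:.3f}\t{value:.1f}")
```

On this ratio alone, Phi-4's low input price puts it first even though its overall score is among the lowest in the table, so the metric is best treated as a rough cost signal rather than a quality ranking.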