113 Total Models Tracked
20 Providers
22 Release Months

Release Timeline

September 2025

xAI

Grok 4

Next-generation xAI model with enhanced reasoning and coding capability. Competes with the Claude Opus and o3 Pro tier.

Input: $5.00/M Output: $25.00/M Context: 256K tokens
Qwen

Qwen 3 Max

Flagship Qwen 3 model. Top-tier reasoning and coding, competitive with Claude Opus and GPT-4.1.

Input: $5.00/M Output: $20.00/M Context: 256K tokens
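
Prices throughout this timeline are quoted per million tokens, so a per-request cost is simple arithmetic. The following is a minimal sketch; the `request_cost` helper and the 10,000/2,000 token workload are illustrative assumptions, not part of the tracker.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Listed prices from September 2025 above (USD per million tokens).
GROK_4 = (5.00, 25.00)
QWEN_3_MAX = (5.00, 20.00)

# Hypothetical workload: 10,000 prompt tokens and 2,000 completion tokens.
grok_cost = request_cost(10_000, 2_000, *GROK_4)
qwen_cost = request_cost(10_000, 2_000, *QWEN_3_MAX)
print(f"Grok 4: ${grok_cost:.4f}  Qwen 3 Max: ${qwen_cost:.4f}")
```

Under these assumptions the request costs about $0.10 on Grok 4 and $0.09 on Qwen 3 Max; the gap comes entirely from the output price.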

August 2025

Anthropic

Claude 4 Haiku

Updated Haiku model with improved reasoning over Claude 3.5 Haiku. Fast and affordable for high-volume tasks.

Input: $0.800/M Output: $4.00/M Context: 200K tokens
Qwen

Qwen 3 Coder

Latest Qwen coding-specialized model. Strong performance on HumanEval and competitive programming benchmarks.

Input: $0.500/M Output: $2.00/M Context: 256K tokens

July 2025

DeepSeek

DeepSeek V3.2

Updated V3 model with improved general reasoning and multilingual capability. Strong value proposition.

Input: $0.300/M Output: $1.20/M Context: 128K tokens
Anthropic

Claude Sonnet 4 Lite

Lighter version of Claude Sonnet 4. Good balance of quality and cost for day-to-day coding.

Input: $1.00/M Output: $5.00/M Context: 200K tokens

June 2025

Qwen

Qwen 3.6 Plus

Qwen's latest general-purpose model. Priced the same as the Claude Sonnet models listed here ($3.00/M input, $15.00/M output).

Input: $3.00/M Output: $15.00/M Context: 128K tokens
OpenAI

OpenAI o3 Pro

Top-tier reasoning model combining o3's coding strength with extended compute. The most powerful OpenAI model for reasoning-heavy coding.

Input: $20.00/M Output: $80.00/M Context: 200K tokens
Google

Gemini 2.5 Flash Lite

The most affordable Gemini model. Ultra-low cost for high-volume, simple coding and text tasks.

Input: $0.037/M Output: $0.150/M Context: 1M tokens
Mistral

Mistral Large 3

Latest Mistral flagship model. Improved coding and multilingual capability over Large 2.

Input: $2.00/M Output: $6.00/M Context: 128K tokens
Qwen

Qwen Coder Turbo V2

Updated Qwen Coder Turbo with improved code generation quality. Strong value for budget coding.

Input: $0.300/M Output: $1.20/M Context: 128K tokens
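
Every entry ends with a pricing line in the same `Input: … Output: … Context: …` shape, which makes the listing easy to load into structured data. The sketch below is one way to do that; the `parse_pricing` function name is hypothetical, and reading `K` and `M` as decimal thousands and millions is an assumption.

```python
import re

# Matches lines such as "Input: $0.037/M Output: $0.150/M Context: 1M tokens".
PRICING_RE = re.compile(
    r"Input:\s*\$(?P<input>[\d.]+)/M\s+"
    r"Output:\s*\$(?P<output>[\d.]+)/M\s+"
    r"Context:\s*(?P<size>[\d.]+)(?P<unit>[KM])\s+tokens"
)

# Assumption: "K" and "M" are decimal (1,000 and 1,000,000 tokens).
UNIT = {"K": 1_000, "M": 1_000_000}


def parse_pricing(line: str) -> dict:
    """Parse one pricing line into prices (USD per million) and context size."""
    m = PRICING_RE.search(line)
    if m is None:
        raise ValueError(f"not a pricing line: {line!r}")
    return {
        "input_per_m": float(m["input"]),
        "output_per_m": float(m["output"]),
        "context_tokens": int(float(m["size"]) * UNIT[m["unit"]]),
    }


row = parse_pricing("Input: $0.037/M Output: $0.150/M Context: 1M tokens")
print(row)
```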

May 2025

Anthropic

Claude Sonnet 4

Anthropic's balanced model for coding and general tasks. Best price-performance ratio in the Claude family.

Input: $3.00/M Output: $15.00/M Context: 200K tokens
Anthropic

Claude Opus 4

Anthropic's most powerful model. Best for complex reasoning and challenging coding tasks.

Input: $15.00/M Output: $75.00/M Context: 200K tokens
Qwen

Qwen 3 Turbo

Fast and affordable Qwen 3 generation model. Good for high-volume tasks with improved quality over Qwen Turbo.

Input: $0.150/M Output: $0.600/M Context: 128K tokens
xAI

Grok Code

xAI's coding-specialized model. Optimized for code generation, debugging, and software engineering tasks.

Input: $1.50/M Output: $7.50/M Context: 128K tokens
DeepSeek

DeepSeek Jiuge

Ultra-budget DeepSeek model for high-volume tasks. Priced the same as Gemini 2.5 Flash ($0.150/M input, $0.600/M output).

Input: $0.150/M Output: $0.600/M Context: 128K tokens
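
Because input and output are priced separately, comparing models usually means picking a token mix first. The sketch below ranks the May 2025 entries by a blended price; the 75% input share and the `blended_price` helper are assumptions chosen for illustration.

```python
# (input, output) list prices in USD per million tokens, from May 2025 above.
MAY_2025 = {
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Opus 4": (15.00, 75.00),
    "Qwen 3 Turbo": (0.150, 0.600),
    "Grok Code": (1.50, 7.50),
    "DeepSeek Jiuge": (0.150, 0.600),
}


def blended_price(input_price: float, output_price: float,
                  input_share: float = 0.75) -> float:
    """Weighted price per million tokens for a given input/output mix."""
    return input_share * input_price + (1 - input_share) * output_price


# sorted() is stable, so models with identical prices keep their listing order.
ranked = sorted(MAY_2025, key=lambda name: blended_price(*MAY_2025[name]))
for name in ranked:
    print(f"{name:<16} ${blended_price(*MAY_2025[name]):.4f}/M blended")
```

With this mix, Claude Opus 4 comes out at $30.00/M blended against $6.00/M for Claude Sonnet 4, the same 5x ratio as their list prices.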

April 2025

OpenAI

OpenAI o4-mini

Updated mini reasoning model. Same listed pricing as o3-mini, with updated capabilities.

Input: $1.10/M Output: $4.40/M Context: 200K tokens
Google

Gemini 2.5 Pro

Google's most capable model. Strong coding, multimodal understanding, and very competitive pricing.

Input: $1.25/M Output: $10.00/M Context: 1M tokens
Google

Gemini 2.5 Flash

Fast and affordable Google model. Great for high-volume coding and processing.

Input: $0.150/M Output: $0.600/M Context: 1M tokens
DeepSeek

DeepSeek Coder V3

Latest-generation DeepSeek coding model. Improved code understanding and generation over V2.

Input: $0.270/M Output: $1.10/M Context: 128K tokens
OpenAI

GPT-4.1

Updated GPT-4 generation with improved instruction following and reduced hallucination. Better coding accuracy than GPT-4o.

Input: $2.00/M Output: $8.00/M Context: 128K tokens
OpenAI

GPT-4.1 mini

Cost-optimized GPT-4.1 variant. Strong coding capability at budget pricing, replacing GPT-4o mini for many use cases.

Input: $0.400/M Output: $1.60/M Context: 128K tokens
xAI

Grok 3 Vision

xAI's multimodal vision model. Combines Grok 3's reasoning with image and diagram understanding.

Input: $5.00/M Output: $20.00/M Context: 128K tokens
OpenAI

GPT-4.1 Nano

OpenAI's smallest and cheapest GPT-4.1 model. Fast responses for simple tasks.

Input: $0.100/M Output: $0.400/M Context: 1M tokens

March 2025

OpenAI

OpenAI o1 Pro

Premium reasoning model with extended compute. Best-in-class for complex math, science, and advanced coding challenges.

Input: $20.00/M Output: $80.00/M Context: 200K tokens
Mistral

Mistral Medium

Mid-tier Mistral model between Small and Large. Strong coding capability at a moderate price point.

Input: $0.400/M Output: $2.00/M Context: 128K tokens
Google

Gemma 3 27B

Google's open-weight 27B model. Budget-friendly with strong coding capability and Google's research backing.

Input: $0.100/M Output: $0.400/M Context: 128K tokens
Cohere

Cohere Command A

Cohere's newest model with strong agentic capabilities. Optimized for tool use and autonomous tasks.

Input: $2.00/M Output: $8.00/M Context: 256K tokens
Google

Gemini 2.5 Flash

Google's latest Flash model with improved reasoning. Excellent price-performance for multimodal tasks.

Input: $0.150/M Output: $0.600/M Context: 1M tokens
Meta

Llama 4 Scout

Meta's Llama 4 mid-tier multimodal model. Native multimodal with efficient inference.

Input: $0.200/M Output: $0.800/M Context: 10M tokens
Meta

Llama 4 Maverick

Meta's Llama 4 flagship model. Strong multimodal and coding with MoE architecture.

Input: $0.400/M Output: $1.60/M Context: 10M tokens

February 2025

OpenAI

OpenAI o3

Next-generation reasoning model. Improved coding and math over o1.

Input: $10.00/M Output: $40.00/M Context: 200K tokens
xAI

Grok 3

xAI's flagship model. Strong general-purpose capability with real-time knowledge access through X platform integration.

Input: $3.00/M Output: $15.00/M Context: 128K tokens
xAI

Grok 3 Mini

Cost-effective xAI model for high-volume tasks. Good balance of capability and affordability.

Input: $0.300/M Output: $0.500/M Context: 128K tokens
Perplexity

Perplexity Sonar Reasoning Pro

Perplexity's most advanced search model with deep reasoning. Complex research tasks with cited sources.

Input: $5.00/M Output: $20.00/M Context: 128K tokens
Google

Gemini 2.0 Flash Lite

Google's most cost-effective Gemini model. Great for high-volume, latency-sensitive applications.

Input: $0.075/M Output: $0.300/M Context: 1M tokens

January 2025

OpenAI

OpenAI o3-mini

Affordable reasoning model for coding tasks. Best price-performance for algorithm-heavy work.

Input: $1.10/M Output: $4.40/M Context: 200K tokens
DeepSeek

DeepSeek Reasoner (R1)

DeepSeek's reasoning model. Comparable to OpenAI's o1 but at much lower cost.

Input: $0.550/M Output: $2.19/M Context: 128K tokens
Mistral

Mistral Small 3

Mistral's cost-effective model. Very affordable for general-purpose tasks.

Input: $0.100/M Output: $0.300/M Context: 32K tokens
Microsoft

Microsoft Phi-4

Microsoft's compact 14B model with strong reasoning and coding capability. Excellent value for small-scale deployments.

Input: $0.100/M Output: $0.300/M Context: 128K tokens
Amazon

Amazon Nova Premier

Amazon's most capable Nova model. Designed for complex reasoning and large-context enterprise tasks.

Input: $2.50/M Output: $12.50/M Context: 1M tokens
Together AI

Together Mistral Small 3

Mistral Small 3 via Together AI. Efficient mid-size model for general tasks.

Input: $0.800/M Output: $0.800/M Context: 32K tokens
OpenAI

o3-mini

OpenAI's reasoning model at lower cost. Strong at math, coding, and science tasks.

Input: $1.10/M Output: $4.40/M Context: 200K tokens
DeepSeek

DeepSeek V3

DeepSeek's latest general model. Competitive with Claude Sonnet at a fraction of the cost.

Input: $0.140/M Output: $0.280/M Context: 128K tokens
DeepSeek

DeepSeek R1

DeepSeek's reasoning model. Open-weight model that rivals o1 for complex reasoning tasks.

Input: $0.140/M Output: $0.550/M Context: 128K tokens
Mistral

Mistral Small 3

Mistral's efficient small model. Great performance for its size at very competitive pricing.

Input: $0.100/M Output: $0.300/M Context: 32K tokens
Microsoft

Phi-4 Mini

Microsoft's latest small model with improved coding ability. Better than Phi-3 for developer tasks.

Input: $0.100/M Output: $0.300/M Context: 128K tokens

December 2024

Google

Gemini 2.0 Flash

Low-cost Google model. Fast responses for simple coding tasks.

Input: $0.100/M Output: $0.400/M Context: 1M tokens
DeepSeek

DeepSeek Chat V3

Very affordable general-purpose model from DeepSeek. Strong coding and reasoning at low cost.

Input: $0.270/M Output: $1.10/M Context: 128K tokens
Google

Gemini 2.0 Pro

Mid-tier Gemini 2.0 model. Better quality than Flash at a competitive price point for enterprise coding tasks.

Input: $2.50/M Output: $10.00/M Context: 1M tokens
Meta

Llama 3.3 70B

Meta's open-weight 70B model. Strong coding and general capability, widely supported across AI platforms.

Input: $0.250/M Output: $1.00/M Context: 128K tokens
Amazon

Amazon Nova Micro

Amazon's most cost-effective model. Optimized for speed and low-cost text generation tasks.

Input: $0.035/M Output: $0.140/M Context: 128K tokens
Amazon

Amazon Nova Lite

Amazon's lightweight multimodal model. Good balance of cost and capability for image + text.

Input: $0.060/M Output: $0.240/M Context: 300K tokens
Amazon

Amazon Nova Pro

Amazon's flagship model via Bedrock. Competitive with GPT-4o and Claude Sonnet for enterprise workloads.

Input: $0.800/M Output: $3.20/M Context: 300K tokens
Groq

Groq Llama 3.3 70B

Llama 3.3 70B running on Groq's ultra-fast LPU inference. Sub-100ms responses for a 70B model.

Input: $0.590/M Output: $0.790/M Context: 128K tokens
Together AI

Together Llama 3.3 70B

Llama 3.3 70B via Together AI. Cost-effective inference for open models.

Input: $0.880/M Output: $0.880/M Context: 128K tokens
Qwen

QVQ 72B Preview

Qwen's visual reasoning model. Advanced image + text reasoning capabilities.

Input: $0.500/M Output: $1.50/M Context: 32K tokens
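
Llama 3.3 70B appears three times in December 2024 with different input/output price splits, so the cheapest listing depends on the token mix. The sketch below compares them for two hypothetical workloads; the `cheapest_host` helper and the token counts are assumptions.

```python
# (input, output) list prices in USD per million tokens, from December 2024 above.
HOSTS = {
    "Llama 3.3 70B (Meta)": (0.250, 1.00),
    "Groq Llama 3.3 70B": (0.590, 0.790),
    "Together Llama 3.3 70B": (0.880, 0.880),
}


def cheapest_host(input_tokens: int, output_tokens: int) -> str:
    """Return the listing with the lowest total cost for this token mix."""
    def cost(prices):
        input_price, output_price = prices
        return (input_tokens * input_price
                + output_tokens * output_price) / 1_000_000
    return min(HOSTS, key=lambda host: cost(HOSTS[host]))


print(cheapest_host(1_000_000, 200_000))  # prompt-heavy workload
print(cheapest_host(200_000, 1_000_000))  # output-heavy workload
```

At these list prices the $0.250/$1.00 listing wins for prompt-heavy traffic, while the Groq listing becomes cheaper once output tokens exceed roughly 1.6 times the input tokens.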

November 2024

Perplexity

Perplexity Sonar Pro

Perplexity's search-optimized model. Built for real-time web search with cited answers.

Input: $3.00/M Output: $15.00/M Context: 128K tokens
Mistral

Pixtral Large

Mistral's multimodal model with strong image understanding. Competitive with GPT-4o Vision.

Input: $2.00/M Output: $6.00/M Context: 128K tokens

October 2024

Anthropic

Claude 3.5 Sonnet

Previous-generation Sonnet. Still excellent for coding tasks at the same price point as Claude Sonnet 4.

Input: $3.00/M Output: $15.00/M Context: 200K tokens
Anthropic

Claude 3.5 Haiku

Fast, cost-effective model for high-volume tasks. Great for code review and simple queries.

Input: $0.800/M Output: $4.00/M Context: 200K tokens
Zhipu AI

GLM-4-AllTools

Zhipu AI's most capable model with full tool use support. Code interpreter, web search, and image generation.

Input: $7.00/M Output: $7.00/M Context: 128K tokens
Perplexity

Perplexity Sonar

Perplexity's standard search model. Fast, cited answers at lower cost than Sonar Pro.

Input: $1.00/M Output: $1.00/M Context: 128K tokens
Mistral

Pixtral 12B

Mistral's lightweight vision-language model. Affordable image understanding with good performance.

Input: $0.150/M Output: $0.150/M Context: 32K tokens
xAI

Grok 2 Vision

Grok 2 with image input support. Vision capabilities combined with real-time X knowledge.

Input: $2.00/M Output: $10.00/M Context: 128K tokens

September 2024

OpenAI

OpenAI o1

Reasoning model optimized for complex problem-solving. Excels at math, science, and advanced coding.

Input: $15.00/M Output: $60.00/M Context: 200K tokens
OpenAI

OpenAI o1-mini

Cost-effective reasoning model. Good for coding tasks that require logical reasoning.

Input: $1.10/M Output: $4.40/M Context: 128K tokens
Reka

Reka Flash

Reka's fast multimodal model. Compact and efficient for high-volume tasks with vision capability.

Input: $0.200/M Output: $0.800/M Context: 128K tokens
Zhipu AI

GLM-4-Plus

Zhipu AI's balanced model. Strong Chinese language understanding with competitive coding ability.

Input: $0.700/M Output: $0.700/M Context: 128K tokens
MiniMax

MiniMax-M1

MiniMax's flagship model. Strong performance in Chinese and English with competitive pricing.

Input: $0.150/M Output: $0.600/M Context: 4M tokens
OpenAI

o1-preview

OpenAI's first reasoning model. Strong at complex problem solving but expensive.

Input: $15.00/M Output: $60.00/M Context: 128K tokens
Qwen

Qwen 2.5 72B

Qwen's open-weight 72B model. Strong Chinese and English performance at competitive pricing.

Input: $0.400/M Output: $0.800/M Context: 128K tokens
Qwen

Qwen 2.5 Coder 32B

Qwen's code-specialized 32B model. Trained on 130+ programming languages.

Input: $0.200/M Output: $0.400/M Context: 128K tokens

August 2024

Zhipu AI

GLM-4-Flash

Zhipu AI's ultra-cheap model. Near-free pricing for high-volume Chinese and English text tasks.

Input: $0.010/M Output: $0.010/M Context: 128K tokens
xAI

Grok 2

xAI's previous-generation model. Strong performance with real-time X/Twitter knowledge access.

Input: $2.00/M Output: $10.00/M Context: 128K tokens

July 2024

OpenAI

GPT-4o mini

Affordable small model. Fast and cost-effective for high-volume coding tasks.

Input: $0.150/M Output: $0.600/M Context: 128K tokens
Mistral

Mistral Large 2

Mistral's flagship model. Strong multilingual and coding capability.

Input: $2.00/M Output: $6.00/M Context: 128K tokens
Mistral

Mistral Nemo

Compact 12B open-weight model co-developed with NVIDIA. Excellent coding performance at minimal cost.

Input: $0.150/M Output: $0.150/M Context: 128K tokens
Zhipu AI

GLM-4-Air

Zhipu AI's mid-tier model. Good balance of cost and performance for Chinese-language applications.

Input: $0.140/M Output: $0.140/M Context: 128K tokens
Groq

Groq Gemma 2 9B

Google's Gemma 2 9B on Groq's LPU. Extremely fast small model for simple tasks.

Input: $0.200/M Output: $0.200/M Context: 8K tokens
Databricks

Databricks Llama 3.1 405B

Meta's 405B model hosted on Databricks. Largest open-weight model available for enterprise use.

Input: $5.00/M Output: $15.00/M Context: 128K tokens
Mistral

Mistral Large 24.07

Mistral's enterprise-grade large model. Strong multilingual and coding capabilities.

Input: $2.00/M Output: $6.00/M Context: 128K tokens
Meta

Llama 3.1 8B

Meta's smallest Llama 3.1 model. Open weights, deploy anywhere. Great for self-hosted applications.

Input: $0.050/M Output: $0.100/M Context: 128K tokens
Meta

Llama 3.1 70B

Meta's mid-size Llama 3.1. Strong general performance with open weights for custom deployment.

Input: $0.200/M Output: $0.400/M Context: 128K tokens

June 2024

Qwen

Qwen Coder Plus

Qwen model specifically optimized for coding tasks.

Input: $0.800/M Output: $4.00/M Context: 128K tokens
Qwen

Qwen Coder Turbo

Fast coding model from Qwen. Good price-performance for code generation.

Input: $0.250/M Output: $1.25/M Context: 128K tokens
DeepSeek

DeepSeek Coder V2

DeepSeek's coding-specialized model. Open-source and very affordable.

Input: $0.270/M Output: $1.10/M Context: 128K tokens
Mistral

Codestral

Mistral's dedicated coding model. Open-weight and highly optimized for code generation and completion.

Input: $0.300/M Output: $0.900/M Context: 128K tokens
Cohere

Cohere Command R+

Cohere's premium model with higher accuracy. Optimized for complex reasoning and tool use tasks.

Input: $2.50/M Output: $10.00/M Context: 128K tokens
01.ai

Yi-Lightning

01.ai's cost-effective model. Competitive Chinese-English bilingual model at very low prices.

Input: $0.150/M Output: $0.600/M Context: 16K tokens
MiniMax

MiniMax Text 01

MiniMax's cost-effective text model. Optimized for high-volume Chinese text generation.

Input: $0.050/M Output: $0.200/M Context: 8K tokens
Reka

Reka Core

Reka's flagship multimodal model. Strong image and video understanding with multilingual support.

Input: $1.00/M Output: $2.50/M Context: 128K tokens

May 2024

OpenAI

GPT-4o

OpenAI's flagship multimodal model. Strong coding and reasoning at competitive pricing.

Input: $2.50/M Output: $10.00/M Context: 128K tokens
Google

Gemini 1.5 Flash

Low-cost Gemini model. Good for high-volume, simple tasks.

Input: $0.075/M Output: $0.300/M Context: 1M tokens
01.ai

Yi-Large

01.ai's flagship model. Strong bilingual capabilities for enterprise applications.

Input: $2.50/M Output: $10.00/M Context: 32K tokens
Microsoft

Phi-3 Medium

Microsoft's mid-size Phi-3 model. Better performance than Mini for moderate complexity tasks.

Input: $0.100/M Output: $0.300/M Context: 128K tokens

April 2024

OpenAI

GPT-4 Turbo

Previous-generation high-performance model. Good for complex reasoning tasks.

Input: $10.00/M Output: $30.00/M Context: 128K tokens
Google

Gemini 1.5 Pro

Previous-generation Google pro model. Good for general tasks.

Input: $1.25/M Output: $5.00/M Context: 1M tokens
Stability AI

Stable Code 3B

Stability AI's code-focused model. Small, efficient model for code completion and generation.

Input: $0.050/M Output: $0.200/M Context: 32K tokens
Stability AI

Stable LM 2

Stability AI's general-purpose language model. Open weights with competitive performance.

Input: $0.100/M Output: $0.400/M Context: 32K tokens
Microsoft

Phi-3 Mini

Microsoft's compact Phi-3 model. Small but capable model for edge and IoT deployment.

Input: $0.050/M Output: $0.100/M Context: 128K tokens
Reka

Reka Edge

Reka's lightweight multimodal model. Affordable image and video understanding.

Input: $0.400/M Output: $1.00/M Context: 32K tokens

March 2024

Anthropic

Claude 3 Haiku

Cheapest Claude model. Fast responses for simple tasks and basic coding.

Input: $0.250/M Output: $1.25/M Context: 200K tokens
Qwen

Qwen Max

Qwen's most powerful model. Strong reasoning and coding capabilities.

Input: $1.60/M Output: $6.40/M Context: 32K tokens
Cohere

Cohere Command R

Cohere's RAG-optimized model. Built for search, retrieval, and enterprise knowledge management.

Input: $0.150/M Output: $0.600/M Context: 128K tokens
Groq

Groq Mixtral 8x7B

Mixtral MoE on Groq's LPU. Fast, cost-effective inference for general tasks.

Input: $0.240/M Output: $0.240/M Context: 32K tokens
Databricks

Databricks DBRX Instruct

Databricks' open MoE model. Competitive with GPT-3.5 for coding and general tasks.

Input: $0.750/M Output: $2.25/M Context: 32K tokens

February 2024

Anthropic

Claude 3 Opus

First-generation Opus. Highest reasoning capability in the Claude 3 family.

Input: $15.00/M Output: $75.00/M Context: 200K tokens
Anthropic

Claude 3 Sonnet

First-generation Sonnet. Balanced performance for general tasks.

Input: $3.00/M Output: $15.00/M Context: 200K tokens

January 2024

Qwen

Qwen Plus

Balanced Qwen model for general tasks. Good price-performance ratio.

Input: $0.400/M Output: $1.20/M Context: 128K tokens
Qwen

Qwen Turbo

Fastest and cheapest Qwen model. Good for high-volume tasks.

Input: $0.080/M Output: $0.240/M Context: 1M tokens

March 2023

OpenAI

GPT-4

Original GPT-4. Most expensive OpenAI model, largely superseded by newer options.

Input: $30.00/M Output: $60.00/M Context: 8K tokens
OpenAI

GPT-3.5 Turbo

Budget model for simple tasks. Being phased out but still widely used.

Input: $0.500/M Output: $1.50/M Context: 16K tokens
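
The context figure matters as much as price for older models with small windows. The sketch below checks whether a request fits; it assumes the window is shared between prompt and completion and that `8K` and `16K` mean decimal thousands, and the `fits` helper is hypothetical.

```python
# Context windows from March 2023 above, read as decimal thousands (assumption).
CONTEXT_WINDOWS = {
    "GPT-4": 8_000,
    "GPT-3.5 Turbo": 16_000,
}


def fits(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """True if prompt plus requested output stays within the context window.

    Assumes the window covers input and output tokens combined.
    """
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOWS[model]


# Hypothetical request: a 7,000-token prompt with up to 2,000 output tokens.
for model in CONTEXT_WINDOWS:
    print(model, fits(model, 7_000, 2_000))
```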