Groq Llama 3.3 70B

Groq

Llama 3.3 70B running on Groq's ultra-fast LPU inference, with sub-100ms responses for a 70B model.

Context Window: 128K tokens

Released: 2024-12

Best For: Real-time applications, fast chat, low-latency coding

Ultra-fast inference

LPU hardware

Open weights

Groq Llama 3.3 70B Pricing

Token Type	Price per Million
Input tokens	$0.590
Output tokens	$0.790
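
As a rough illustration of how the per-million rates above translate into the cost of a single request, here is a minimal sketch. The function name `request_cost` and the example token counts are hypothetical, and the calculation ignores caching, free tiers, or any other billing detail not listed in the table.

```python
# Prices copied from the table above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.59
OUTPUT_PRICE_PER_M = 0.79


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates (no caching)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical example: a 4,000-token prompt with a 1,000-token reply.
cost = request_cost(4_000, 1_000)
print(f"${cost:.6f}")
```

At these rates a single chat-sized request costs a fraction of a cent, so totals only become noticeable at project-scale token volumes.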

Estimated Cost by Project Size

Realistic cost estimates for common coding scenarios. Assumes a 30% cache hit rate where caching is available.

Scenario	Token Usage	Estimated Cost
Small Script (1K lines)	50K input / 30K output	$0.04
Medium Feature (10K lines)	500K input / 200K output	$0.36
Large Project (50K lines)	2,500K input / 1,000K output	$1.82
Code Review (5K lines)	250K input / 25K output	$0.12
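
The page does not say how cached tokens are billed. The figures in the table above are reproduced if the 30% of input tokens that hit the cache are treated as free, so the sketch below makes that assumption. It is an inference from the numbers, not a documented billing rule, and the name `estimate_cost` is hypothetical.

```python
# Prices from the pricing table, in USD per million tokens.
INPUT_PRICE_PER_M = 0.59
OUTPUT_PRICE_PER_M = 0.79
CACHE_HIT_RATE = 0.30  # stated in the text above


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_hit_rate: float = CACHE_HIT_RATE) -> float:
    """Estimate USD cost, ASSUMING cached input tokens cost nothing."""
    billable_input = input_tokens * (1 - cache_hit_rate)
    total = billable_input * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Token usage per scenario, copied from the table: (input, output).
SCENARIOS = {
    "Small Script (1K lines)": (50_000, 30_000),
    "Medium Feature (10K lines)": (500_000, 200_000),
    "Large Project (50K lines)": (2_500_000, 1_000_000),
    "Code Review (5K lines)": (250_000, 25_000),
}

for name, (inp, out) in SCENARIOS.items():
    print(f"{name}: ${estimate_cost(inp, out):.2f}")
```

Passing `cache_hit_rate=0` gives the uncached cost, which for the medium feature scenario is about $0.45 instead of $0.36.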

Get Access to Groq Llama 3.3 70B

Ready to start using Groq Llama 3.3 70B? Get API access directly from Groq.


How Does Groq Llama 3.3 70B Compare?

Model	Input ($/M)	Medium Feature Cost
Groq Llama 3.3 70B	$0.590	$0.36
DeepSeek Reasoner (R1)	$0.550	$0.63
GPT-3.5 Turbo	$0.500	$0.48
Qwen 3 Coder	$0.500	$0.57
QVQ 72B Preview	$0.500	$0.48
GLM-4-Plus	$0.700	$0.38
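
The table lists only input prices, while the medium feature estimate also includes 200K output tokens, so the two columns can rank models differently. The sketch below sorts the rows both ways to make that visible. The data is copied from the table, and the variable names are hypothetical.

```python
# (model, input price in $/M, medium feature cost in $), copied from the table above.
MODELS = [
    ("Groq Llama 3.3 70B", 0.590, 0.36),
    ("DeepSeek Reasoner (R1)", 0.550, 0.63),
    ("GPT-3.5 Turbo", 0.500, 0.48),
    ("Qwen 3 Coder", 0.500, 0.57),
    ("QVQ 72B Preview", 0.500, 0.48),
    ("GLM-4-Plus", 0.700, 0.38),
]

# Rank once by input price and once by estimated medium feature cost.
by_input_price = sorted(MODELS, key=lambda m: m[1])
by_feature_cost = sorted(MODELS, key=lambda m: m[2])

print("Cheapest by input price: ", by_input_price[0][0])
print("Cheapest by feature cost:", by_feature_cost[0][0])
```

Among these six rows, Groq Llama 3.3 70B has the second-highest input price but the lowest medium feature estimate, so the input column on its own is a poor guide to project cost.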

Related Models

Claude Sonnet 4

Anthropic's balanced model for coding and general tasks. Best price-performance ratio in the Claude family.

$3.00/M input, $15.00/M output, ~$4.66 per medium feature

Claude Opus 4

Anthropic's most powerful model. Best for complex reasoning and challenging coding tasks.

$15.00/M input, $75.00/M output, ~$23.29 per medium feature

Claude 3.5 Sonnet

Previous generation Sonnet. Still excellent for coding tasks at the same price point.

$3.00/M input, $15.00/M output, ~$4.66 per medium feature

Claude 3.5 Haiku

Fast, cost-effective model for high-volume tasks. Great for code review and simple queries.

$0.800/M input, $4.00/M output, ~$1.24 per medium feature
