Groq Llama 3.3 70B

Groq

Llama 3.3 70B running on Groq's ultra-fast LPU inference hardware. Sub-100ms responses for a 70B model.

Context Window: 128K tokens
Released: 2024-12
Best For: Real-time applications, fast chat, low-latency coding
  • Ultra-fast inference
  • LPU hardware
  • Open weights

    Groq Llama 3.3 70B Pricing

    Token Type       Price per Million Tokens
    Input tokens     $0.590
    Output tokens    $0.790

    Estimated Cost by Project Size

    Realistic cost estimates for common coding scenarios. The estimates assume a 30% cache hit rate where caching is available.

    Scenario                      Token Usage                        Estimated Cost
    Small Script (1K lines)       50K input / 30K output             $0.04
    Medium Feature (10K lines)    500K input / 200K output           $0.36
    Large Project (50K lines)     2,500K input / 1,000K output       $1.82
    Code Review (5K lines)        250K input / 25K output            $0.12
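
The estimates above can be reproduced with a short calculation. The sketch below is a minimal illustration, not the page's actual method: it assumes that the 30% of input tokens served from cache are billed at zero, which is an inference from the listed figures rather than something the page states. The names `estimate_cost`, `CACHE_HIT_RATE`, and `SCENARIOS` are hypothetical.

```python
# Prices from the pricing table (USD per million tokens).
INPUT_PRICE_PER_M = 0.59
OUTPUT_PRICE_PER_M = 0.79

# Assumption: cached input tokens cost nothing. This matches the listed
# estimates after rounding, but the source does not confirm the rule.
CACHE_HIT_RATE = 0.30


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_hit_rate: float = CACHE_HIT_RATE) -> float:
    """Return the estimated cost in USD for one scenario."""
    # Only the non-cached share of the input is billed under this assumption.
    billable_input = input_tokens * (1.0 - cache_hit_rate)
    input_cost = billable_input / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Token counts copied from the scenario table: (input, output).
SCENARIOS = {
    "Small Script (1K lines)": (50_000, 30_000),
    "Medium Feature (10K lines)": (500_000, 200_000),
    "Large Project (50K lines)": (2_500_000, 1_000_000),
    "Code Review (5K lines)": (250_000, 25_000),
}

if __name__ == "__main__":
    for name, (inp, out) in SCENARIOS.items():
        print(f"{name}: ${estimate_cost(inp, out):.2f}")
```

Passing `cache_hit_rate=0.0` gives the uncached cost for comparison; for the Medium Feature scenario that is roughly $0.45 instead of $0.36.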

    How Does Groq Llama 3.3 70B Compare?

    Model                     Input ($/M)    Medium Feature Cost
    Groq Llama 3.3 70B        $0.590         $0.36
    DeepSeek Reasoner (R1)    $0.550         $0.63
    GPT-3.5 Turbo             $0.500         $0.48
    Qwen 3 Coder              $0.500         $0.57
    QVQ 72B Preview           $0.500         $0.48
    GLM-4-Plus                $0.700         $0.38
