Groq Free Tier - LPU-Powered Inference
Source: https://console.groq.com/
Description
Groq offers a permanent free tier for its LPU (Language Processing Unit) powered inference API. Sign up at console.groq.com, get an API key instantly, and start calling models with no credit card required. The free tier gives you access to all supported models — including Llama 3.1/3.3, Llama 4 Scout/Maverick, GPT-OSS, Qwen3, Kimi K2, Whisper, and Groq's agentic Compound system — subject to rate limits (~30 RPM, ~6K TPM for smaller models). The API is OpenAI-compatible, so most libraries and tools that work with OpenAI also work with Groq by changing the base URL. Groq's main selling point is speed: inference runs on custom LPU hardware at 200-1,000+ tokens/second depending on the model.
Getting started:
1. Go to console.groq.com
2. Click "Sign Up" (or use the direct link: console.groq.com/authenticate/signup)
3. Sign up with your email address or GitHub account — accept the Services Agreement and Privacy Policy
4. Verify your email if prompted
5. You are now on the Free tier — no credit card needed, no trial period
6. Go to the API Keys page and click "Create API Key"
7. Give your key a descriptive name and copy it immediately — it will not be shown again
8. Set it as an environment variable: export GROQ_API_KEY=your-key-here
9. Make your first request to https://api.groq.com/openai/v1/chat/completions
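The first request in step 9 can be sketched with only the Python standard library; the endpoint and payload shape follow the OpenAI-compatible API, and the model name is one of the free tier models listed below:

```python
import json
import os
import urllib.request

# Build a chat-completions request against Groq's OpenAI-compatible endpoint.
URL = "https://api.groq.com/openai/v1/chat/completions"
body = json.dumps({
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Hello!"}],
}).encode("utf-8")
req = urllib.request.Request(
    URL,
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}",
    },
)
# Uncomment to send (requires a valid GROQ_API_KEY and network access):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```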
Important:
• One free account per person — no special verification required
• The free tier has no expiration date — it is ongoing, not a trial
• You will never be charged unless you explicitly add payment info and upgrade to the Developer tier
• Over 2 million developers are already building on GroqCloud
Models:
Model | Speed | Context Window | Pricing (paid tier)
Llama 3.1 8B Instant | ~840 t/s | 131,072 | $0.05/$0.08 per 1M tokens
Llama 3.3 70B Versatile | ~394 t/s | 131,072 | $0.59/$0.79 per 1M tokens
GPT-OSS 120B | ~500 t/s | 131,072 | $0.15/$0.60 per 1M tokens
GPT-OSS 20B | ~1,000 t/s | 131,072 | $0.075/$0.30 per 1M tokens
Llama Guard 4 12B | ~1,200 t/s | 131,072 | Content moderation model
Whisper Large V3 | 217x realtime | — | $0.111/hour transcribed
Whisper Large V3 Turbo | 228x realtime | — | $0.04/hour transcribed
More models:

Model | Notes
Llama 4 Maverick 17Bx128E | ~562 t/s, 128K context
Llama 4 Scout 17Bx16E | ~594 t/s, 128K context
Qwen3 32B | ~662 t/s, 131K context
Kimi K2 (1T params) | ~200 t/s, 256K context
Llama Prompt Guard 2 | 22M and 86M variants
PlayAI TTS | Text-to-speech (English + Arabic)
Agentic systems:

System | Description
groq/compound | Agentic system with built-in web search + code execution; multiple tool calls per request
groq/compound-mini | Lighter version; single tool call per request, ~3x lower latency

Compound systems can autonomously search the web and execute Python code server-side — no client-side orchestration needed. Set model to groq/compound in your API call and the system handles the rest.
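Because the API is OpenAI-compatible, selecting a Compound system is just a change of the model field — a minimal sketch of the request payload:

```python
import json

# Identical chat payload as any other model; only the model name
# routes the request to the agentic Compound system.
payload = {
    "model": "groq/compound",
    "messages": [
        {"role": "user", "content": "Search the web for today's top AI news."}
    ],
}
print(json.dumps(payload, indent=2))
```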
Note: The model lineup changes frequently. Check the official models page for the current list.

Rate limits apply at the organization level (not per user) and vary by model. Smaller models get higher limits. Representative free tier limits:
Model | RPM | RPD | TPM | TPD
Llama 3.1 8B Instant | 30 | 14,400 | 6,000 | 500,000
Llama 3.3 70B Versatile | 30 | 1,000 | 12,000 | 100,000
Allam 2 7B | 30 | 7,000 | 6,000 | 500,000
Whisper Large V3 | 20 | 2,000 | — | —
Whisper Large V3 Turbo | 20 | 2,000 | — | —

(RPM = requests/minute, RPD = requests/day, TPM = tokens/minute, TPD = tokens/day)
Key details:
• Limits reset daily at 00:00 UTC
• Cached tokens do not count toward rate limits (prompt caching gives a 50% discount on the paid tier)
• Exceeding limits returns a 429 Too Many Requests error — you are never charged, just throttled
• Max output per request: 8,192 tokens
• The Developer tier offers roughly 10x higher limits across all models
• Exact limits change frequently — check your account limits page for current numbers
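Since a 429 simply throttles you rather than costing money, the usual pattern is to retry with exponential backoff. A minimal sketch — the status_code attribute is an assumption about the error object; adapt it to whatever your client library raises:

```python
import random
import time

def with_backoff(call, max_retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Retry `call` when it raises an error carrying status_code == 429."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as exc:
            # Re-raise non-rate-limit errors, or give up when retries are exhausted.
            if getattr(exc, "status_code", None) != 429 or attempt == max_retries:
                raise
            # Exponential growth with full jitter, capped at `cap` seconds.
            delay = min(cap, base * 2 ** attempt) * random.random()
            sleep(delay)
```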
OpenAI compatibility:

Groq's API is drop-in compatible with the OpenAI API format. This means:
• Endpoint: https://api.groq.com/openai/v1/chat/completions
• Works with the official OpenAI Python/Node SDKs — just change the base_url and API key
• Works with LangChain, LiteLLM, AI SDK, and most OpenAI-compatible tools
• Groq also provides its own Python SDK: pip install groq
Example (Groq Python SDK):

import os
from groq import Groq

# The SDK reads GROQ_API_KEY from the environment by default;
# it is passed explicitly here for clarity.
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

chat = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
    model="llama-3.3-70b-versatile",
)
print(chat.choices[0].message.content)
Developer tier:

If the free tier limits are too restrictive, the Developer tier offers:
• Up to 10x higher rate limits across all models
• Access to the Batch API for high-volume workloads (25% cost discount)
• Pay-as-you-go pricing — you only pay for tokens consumed
• Self-serve upgrade from the GroqCloud console
Pricing is very competitive: Llama 3.1 8B starts at $0.05 per million input tokens, and even the largest models (Kimi K2 at 1T params) cost only $1.00 per million input tokens.
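To make the pricing concrete, a quick back-of-the-envelope calculator; the default rates are the paid-tier Llama 3.1 8B prices from the model table above:

```python
def estimate_cost(input_tokens, output_tokens, in_rate=0.05, out_rate=0.08):
    """Dollar cost of a workload; rates are $ per 1M tokens
    (defaults: Llama 3.1 8B at $0.05 in / $0.08 out)."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. 10M input + 2M output tokens on Llama 3.1 8B:
print(f"${estimate_cost(10_000_000, 2_000_000):.2f}")  # $0.66
```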
Practical notes:
• Best for prototyping and experimentation — the free tier is generous enough for hobby projects, hackathons, and testing, but not for production workloads
• Speed is the main differentiator — Groq's LPU hardware delivers inference 5-10x faster than GPU-based providers. If latency matters, Groq is hard to beat
• Implement retry logic — you will hit rate limits on the free tier; use exponential backoff on 429 errors
• Token counting — both input and output tokens count toward TPM/TPD limits
• No SLA on the free tier — models in Preview status may be removed without notice
• Compound systems are billed per tool use on the paid tier — web search costs $5-8 per 1,000 requests, code execution $0.18/hour. On the free tier, these are available within your rate limits
• Prompt caching — cached tokens get a 50% discount on the paid tier; on the free tier they simply don't count toward your rate limits
• Groq vs. GROQ — Groq (the AI inference company) is unrelated to GROQ (Sanity's query language). Don't confuse them
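To stay under the RPM/TPM ceilings proactively rather than reacting to 429s, a client-side sliding-window tracker is a common sketch; the 30 RPM / 6,000 TPM defaults mirror the representative Llama 3.1 8B limits above:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Tracks requests and tokens used over the last `window` seconds."""

    def __init__(self, rpm=30, tpm=6000, window=60.0, clock=time.monotonic):
        self.rpm, self.tpm, self.window, self.clock = rpm, tpm, window, clock
        self.events = deque()  # (timestamp, tokens) per recorded request

    def _prune(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()

    def would_allow(self, tokens):
        """True if a request consuming `tokens` (input + output) fits right now."""
        now = self.clock()
        self._prune(now)
        used = sum(t for _, t in self.events)
        return len(self.events) < self.rpm and used + tokens <= self.tpm

    def record(self, tokens):
        self.events.append((self.clock(), tokens))
```

Check would_allow before each call, record after; remember the server-side limits are authoritative and organization-wide, so treat this as a local estimate only.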
Sources:
• GroqCloud Console
• Groq Supported Models
• Groq Rate Limits Documentation
• Groq Pricing
• Groq Quickstart Guide
• Groq Community FAQ - Free Tier
• Groq Community FAQ - Rate Limits
• Groq Compound Systems
• Groq Developer Tier Blog Post