Cerebras Inference - Free 1M Tokens/Day

AI API Free Tiers | Amount: 1 million tokens per day | AI-generated | 1/5 Instant: sign up and get credits instantly (no credit card, no approval) | active
2026-02-05

Source: https://inference.cerebras.ai/

Description

Cerebras offers 1 million free tokens per day through its Inference API, which runs on the company's proprietary Wafer-Scale Engine (WSE-3) hardware. No credit card is required and there is no waitlist (access has been fully open since June 2025). The API is OpenAI-compatible: point the base URL at https://api.cerebras.ai/v1, supply a Cerebras API key, and keep using your existing OpenAI SDK code. Speeds range from ~1,000 to ~3,000 tokens/second depending on the model, making it one of the fastest inference providers available. Supported models include the Llama, Qwen, GPT-OSS, and Z.ai GLM families.

Registration (Step-by-Step)

1. Go to cloud.cerebras.ai
2. Create an account (email signup)
3. Verify your email address
4. Navigate to "API Keys" in the left sidebar
5. Click "Create API Key", give it a name, and copy the key immediately
6. Set it as an environment variable: export CEREBRAS_API_KEY="your-key-here"
7. Done. You can start making API calls right away.
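
Once the key is exported as in step 6, client code can read it from the environment instead of hard-coding it. The following is a minimal sketch; the helper name load_cerebras_api_key and its error message are illustrative assumptions, not part of any Cerebras SDK.

```python
import os


def load_cerebras_api_key(env=None):
    """Return the Cerebras API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("CEREBRAS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            'CEREBRAS_API_KEY is not set; run: export CEREBRAS_API_KEY="your-key-here"'
        )
    return key
```

Failing early with an explicit message tends to be easier to debug than an authentication error returned by the first API call.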

Important:
• No credit card is required for the free tier
• No waitlist or approval process; access is instant
• The free tier resets daily (1M tokens per day, not cumulative)
• Unused tokens do not roll over

Available Models

Production Models

| Model | Model ID | Parameters | Speed | Context (Free) | Status |
|---|---|---|---|---|---|
| Llama 3.1 8B | llama3.1-8b | 8B | ~2,200 tok/s | 8,192 tokens | Active |
| Llama 3.3 70B | llama-3.3-70b | 70B | ~2,100 tok/s | 8,192 tokens | Deprecating Feb 16, 2026 |
| OpenAI GPT-OSS 120B | gpt-oss-120b | 120B | ~3,000 tok/s | 8,192 tokens | Active |
| Qwen 3 32B | qwen-3-32b | 32B | ~2,600 tok/s | 8,192 tokens | Deprecating Feb 16, 2026 |

Preview Models (evaluation only, may be discontinued)

| Model | Model ID | Parameters | Speed |
|---|---|---|---|
| Qwen 3 235B Instruct | qwen-3-235b-a22b-instruct-2507 | 235B (22B active MoE) | ~1,400 tok/s |
| Z.ai GLM 4.7 | zai-glm-4.7 | 355B | ~1,000 tok/s |

Note: Preview models have lower rate limits and are not recommended for production use. zai-glm-4.7 in particular is capped at 10 requests per minute (RPM), 100 requests per hour (RPH), and 100 requests per day (RPD).
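
Because two production models carry a deprecation date, it can help to choose the model ID in one place and fall back automatically. The sketch below copies the model IDs and dates from the production table; the helper names is_usable and pick_model are hypothetical, and treating the listed date as the first day the model is unavailable is an assumption.

```python
from datetime import date

# Model IDs and deprecation dates copied from the production table.
# None means no deprecation date is listed.
PRODUCTION_MODELS = {
    "llama3.1-8b": None,
    "llama-3.3-70b": date(2026, 2, 16),
    "gpt-oss-120b": None,
    "qwen-3-32b": date(2026, 2, 16),
}


def is_usable(model_id, today):
    """True if the model is listed and `today` is before its deprecation date."""
    if model_id not in PRODUCTION_MODELS:
        return False
    cutoff = PRODUCTION_MODELS[model_id]
    # Assumption: the model stops being available on the listed date itself.
    return cutoff is None or today < cutoff


def pick_model(preferred, fallback, today=None):
    """Return `preferred` while it is usable, otherwise `fallback`."""
    today = today or date.today()
    return preferred if is_usable(preferred, today) else fallback
```

For example, pick_model("llama-3.3-70b", "gpt-oss-120b") keeps using Llama 3.3 70B until the listed date and switches to the fallback afterwards.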

Free Tier Rate Limits

| Limit | Free Tier | Developer Tier (paid) |
|---|---|---|
| Tokens per Minute | 60,000 | 1,000,000 |
| Tokens per Hour | 1,000,000 | Unlimited |
| Tokens per Day | 1,000,000 | Unlimited (pay-as-you-go) |
| Requests per Minute | 30 | 1,000 |
| Requests per Hour | 900 | Unlimited |
| Requests per Day | 14,400 | Unlimited |
| Context Window | 8,192 tokens | Up to 128K+ |

Rate limits use a token bucket mechanism: capacity replenishes continuously rather than resetting at fixed intervals. Whichever limit (tokens or requests) is hit first restricts access.
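
To make the token bucket idea concrete, here is a client-side sketch that mirrors the free tier's per-minute limits (30 requests, 60,000 tokens) so a program can pace itself before the server rejects a call. The class names and the injectable clock are illustrative assumptions; the source does not describe how the server-side buckets are parameterised, so this is an approximation rather than a reproduction of Cerebras's limiter.

```python
import time


class TokenBucket:
    """Bucket whose level refills continuously at a fixed rate, up to `capacity`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.clock = clock
        self.level = float(capacity)  # start full
        self.updated_at = clock()

    def has(self, amount):
        """Refill based on elapsed time, then report whether `amount` is available."""
        now = self.clock()
        gained = (now - self.updated_at) * self.refill_per_second
        self.level = min(self.capacity, self.level + gained)
        self.updated_at = now
        return self.level >= amount

    def take(self, amount):
        self.level -= amount


class FreeTierLimiter:
    """Two buckets, one for requests and one for tokens; both must have room."""

    def __init__(self, clock=time.monotonic):
        self.requests = TokenBucket(30, 30 / 60, clock)         # 30 requests/minute
        self.tokens = TokenBucket(60_000, 60_000 / 60, clock)   # 60,000 tokens/minute

    def allow(self, estimated_tokens):
        """Consume from both buckets, or from neither if either limit would trip."""
        if self.requests.has(1) and self.tokens.has(estimated_tokens):
            self.requests.take(1)
            self.tokens.take(estimated_tokens)
            return True
        return False
```

A caller would check limiter.allow(estimated_tokens) before each request and sleep briefly when it returns False. Checking both buckets before consuming from either reflects the point that whichever limit is hit first blocks the request.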

Context Window Limits

The free tier context window is temporarily limited to 8,192 tokens across all models. This is a significant constraint for use cases that involve long documents or multi-turn conversations.

On the paid Developer tier, context windows expand substantially: up to 131K tokens for Qwen 3 235B Instruct and up to 128K for other models. If you need longer context, the Developer tier starts at $10.
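
For multi-turn conversations on the free tier, one common workaround is to drop the oldest turns until the history fits. The sketch below keeps a leading system message plus the newest messages that fit the budget. The 4-characters-per-token estimate, the reply reserve, and the function names are assumptions for illustration; real token counts depend on the model's tokenizer.

```python
FREE_TIER_CONTEXT = 8_192  # tokens, the free tier limit stated above


def estimate_tokens(text, chars_per_token=4):
    """Crude size estimate (assumption: ~4 characters per token), at least 1."""
    return max(1, -(-len(text) // chars_per_token))  # ceiling division


def trim_history(messages, reserve_for_reply=1_024, limit=FREE_TIER_CONTEXT):
    """Keep the leading system message (if any) and the newest messages that fit."""
    budget = limit - reserve_for_reply
    has_system = bool(messages) and messages[0]["role"] == "system"
    system = messages[:1] if has_system else []
    rest = messages[len(system):]

    used = sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return system + kept[::-1]  # restore chronological order
```

Reserving part of the window for the reply assumes that generated tokens count against the same 8,192-token limit as the prompt.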

OpenAI API Compatibility

Cerebras provides an OpenAI-compatible Chat Completions endpoint, making migration straightforward. Change two things in your existing code:

1. Set base_url to https://api.cerebras.ai/v1
2. Use your Cerebras API key instead of your OpenAI key

You will also need to pass a Cerebras model ID in place of an OpenAI model name.

    import os

    import openai

    # Point the OpenAI client at Cerebras and authenticate with the Cerebras key.
    client = openai.OpenAI(
        base_url="https://api.cerebras.ai/v1",
        api_key=os.environ["CEREBRAS_API_KEY"],  # exported during registration
    )

    response = client.chat.completions.create(
        model="llama-3.3-70b",  # deprecating Feb 16, 2026; use an active model ID after that
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

Unsupported OpenAI parameters and features:
• frequency_penalty, presence_penalty, logit_bias
• Streaming combined with JSON mode on reasoning models (streaming works fine with gpt-oss-120b and non-reasoning models)
• Text Completions endpoint (only Chat Completions is supported)
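
When porting existing OpenAI code, the unsupported parameters can be stripped in one place before a request is sent. This is a small sketch under that assumption; to_cerebras_kwargs is a hypothetical helper, and the source does not say whether the API rejects or silently ignores these parameters.

```python
# Parameters listed above as unsupported on the Cerebras endpoint.
UNSUPPORTED_PARAMS = {"frequency_penalty", "presence_penalty", "logit_bias"}


def to_cerebras_kwargs(openai_kwargs):
    """Split request kwargs into (supported, dropped) without mutating the input."""
    supported = {k: v for k, v in openai_kwargs.items() if k not in UNSUPPORTED_PARAMS}
    dropped = sorted(set(openai_kwargs) & UNSUPPORTED_PARAMS)
    return supported, dropped
```

The supported dictionary can then be passed as client.chat.completions.create(**supported), and the dropped list logged so that behavioural differences from the original OpenAI call are visible.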

Cerebras also provides native SDKs for Python (pip install cerebras_cloud_sdk) and Node.js (npm install @cerebras/cerebras_cloud_sdk).

Paid Tiers

| Tier | Cost | Key Differences |
|---|---|---|
| Free | $0 | 1M tokens/day, 8K context, community support |
| Developer | From $10 (pay-as-you-go) | 10x higher rate limits, up to 128K+ context, no daily cap |
| Enterprise | Custom pricing | Dedicated queue priority, custom model weights, fine-tuning, SLA support |

The Developer tier has no contracts: deposit $10 to your account and pay per token consumed. The free tier has no auto-billing traps.

Alternative Access Methods

You can also access Cerebras-powered inference through:
• OpenRouter: unified API supporting multiple providers
• Hugging Face Inference: access Cerebras models from the HF Hub
• AWS Marketplace: Cerebras Fast Inference Cloud available as a marketplace product

Additional Tips

• 1M tokens/day is enough for serious prototyping, small internal tools, or early-stage pilots, but the 8K context window on the free tier is the real bottleneck for many use cases
• Model deprecation: Llama 3.3 70B and Qwen 3 32B are scheduled for deprecation on February 16, 2026. Plan migrations to newer models
• gpt-oss-120b system role behavior: the system role maps to developer-level instructions with stronger influence than in the standard OpenAI API, so identical prompts may produce different results
• Cerebras Code: a separate product (VS Code extension) with Pro ($50/mo, 24M tokens/day) and Max ($200/mo, 120M tokens/day) plans for coding assistance
• No region restrictions: the API is globally available, with data centers across North America and Europe
• Speed advantage: Cerebras benchmarked Llama 4 Maverick at 2,522 tok/s vs. NVIDIA Blackwell at 1,038 tok/s for the same model
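
As a rough sense of scale for the first tip, the free tier figures quoted on this page can be turned into a quick budget calculation. The numbers are taken from the rate limit table; the assumption that prompt and completion tokens both count toward the daily quota is mine.

```python
DAILY_TOKENS = 1_000_000      # free tier tokens per day
TOKENS_PER_MINUTE = 60_000    # free tier tokens per minute
REQUESTS_PER_DAY = 14_400     # free tier requests per day
CONTEXT_WINDOW = 8_192        # free tier context window, in tokens

# Requests per day if every request fills the whole context window.
full_context_requests = DAILY_TOKENS // CONTEXT_WINDOW

# Minutes needed to spend the daily quota at the maximum per-minute rate.
minutes_at_full_rate = DAILY_TOKENS / TOKENS_PER_MINUTE

# Average tokens per request if all daily requests are used.
avg_tokens_per_request = DAILY_TOKENS / REQUESTS_PER_DAY

print(f"Full-context requests per day: {full_context_requests}")
print(f"Minutes to exhaust quota at full rate: {minutes_at_full_rate:.1f}")
print(f"Average tokens per request at the request cap: {avg_tokens_per_request:.1f}")
```

Under these assumptions the daily quota covers about 122 full-context requests, or roughly 17 minutes of sustained use at the per-minute token limit, which is why the token quota rather than the request count is usually the binding limit.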

Sources:
• Cerebras Inference
• Cerebras Pricing
• Cerebras Rate Limits
• Cerebras Quickstart
