Cloudflare Workers AI - 10,000 Free Neurons/Day
Source: https://developers.cloudflare.com/workers-ai
Description
Cloudflare Workers AI gives you 10,000 free Neurons per day to run AI models on Cloudflare's global edge network, with no credit card required and no time limit. Neurons measure GPU compute across all model types (LLMs, image generation, embeddings, speech-to-text, text-to-speech, and more). With the free daily allowance you can generate roughly 1,300 LLM responses, 8,300 image classifications, or 12,500 embeddings. The limit resets every day at 00:00 UTC. Workers Free plan users hit a hard limit after 10,000 neurons; Workers Paid plan users ($5/month) get the same 10,000 free neurons plus overflow at $0.011 per 1,000 neurons.

1. Go to dash.cloudflare.com/sign-up and create a free Cloudflare account (email + password, no credit card).
2. Verify your email address.
3. In the dashboard, navigate to Workers & Pages.
4. You now have three ways to start using Workers AI:

Option 1, dashboard template:
5. Click Create application and select the "LLM Chat App" template.
6. Name your Worker and click "Create and deploy".
7. Your app is live on a workers.dev subdomain. Click "Edit Code" to modify it.

Option 2, REST API:
5. In the dashboard, go to Workers AI and click "Use REST API".
6. Click "Create a Workers AI API Token", review the prefilled permissions (Workers AI - Read and Workers AI - Edit), and generate the token.
7. Copy your Account ID from the same page.
8. Make your first call, replacing {ACCOUNT_ID} and {API_TOKEN} with your own values:

    curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/meta/llama-3.1-8b-instruct \
      -H 'Authorization: Bearer {API_TOKEN}' \
      -d '{"prompt": "Hello, what can you do?"}'

Option 3, Workers and Wrangler:
5. Install Node.js (16.17+) and run npm create cloudflare@latest.
6. Add an [ai] binding to your wrangler.toml.
7. Call env.AI.run() in your Worker code.
8. Deploy with wrangler deploy.

Important:
• No credit card is required for the free plan.
• Using Workers AI locally via wrangler dev still consumes your Cloudflare account's neuron quota.
• The API token needs both Read and Edit permissions for Workers AI.

Workers AI supports OpenAI-compatible endpoints, so you can use the standard OpenAI SDK by swapping the base URL:

| Endpoint | URL |
| --- | --- |
| Chat Completions | https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/chat/completions |
| Text Completions | https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/completions |
| Embeddings | https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/embeddings |

All standard OpenAI parameters work: temperature, max_tokens, top_p, frequency_penalty, presence_penalty. Many models also support function calling, streaming, and batch processing.

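To illustrate the OpenAI-compatible endpoints, here is a rough sketch that builds a Chat Completions request with Python's standard library. The helper names (`openai_base_url`, `build_chat_request`) are hypothetical, and the payload follows the usual OpenAI chat shape (`model`, `messages`, plus optional parameters). With the OpenAI SDK you would instead pass the same base URL and your Workers AI token when constructing the client.

```python
import json
import urllib.request


def openai_base_url(account_id: str) -> str:
    """Base URL to swap into an OpenAI-style client for Workers AI."""
    return f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1"


def build_chat_request(account_id: str, api_token: str, model: str,
                       messages: list, **params) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    # Optional parameters such as temperature or max_tokens are merged in as-is.
    payload = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        openai_base_url(account_id) + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request. The model name is the same `@cf/...` identifier used by the native REST API, for example `@cf/meta/llama-3.1-8b-instruct`.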
Workers AI hosts 100+ open-source models across multiple task types. All models run on Cloudflare's edge GPUs, with no setup and no cold starts for popular models.

Text generation (LLMs):

| Model | Author | Notes |
| --- | --- | --- |
| gpt-oss-120b | OpenAI | Powerful reasoning, agentic tasks, general purpose |
| gpt-oss-20b | OpenAI | Lower latency, specialized use cases |
| Llama 4 Scout 17B | Meta | Mixture-of-experts, function calling |
| Llama 3.3 70B Instruct (FP8) | Meta | Quantized, batch + function calling |
| Llama 3.1 8B Instruct | Meta | Multilingual dialogue, multiple quantizations available |
| Llama 3.2 11B Vision | Meta | Vision + text, image reasoning |
| Qwen3 30B-A3B (FP8) | Qwen | Reasoning, instruction-following, agent capabilities |
| QwQ 32B | Qwen | Reasoning model for complex problems |
| Qwen 2.5 Coder 32B | Qwen | Code-specialized |
| Gemma 3 12B IT | Google | Multimodal, 128K context, 140+ languages |
| Mistral Small 3.1 24B | Mistral | Vision-capable, 128K tokens |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | Distilled reasoning model |
| Granite 4.0-H Micro | IBM | Agentic tasks, instruction following |
| Hermes 2 Pro Mistral 7B | Nous Research | Function calling + JSON mode |

Image generation:

| Model | Author | Notes |
| --- | --- | --- |
| FLUX.2 Klein 9B | Black Forest Labs | Enhanced quality, generation + editing |
| FLUX.2 Klein 4B | Black Forest Labs | Ultra-fast distilled |
| FLUX.2 Dev | Black Forest Labs | Highly realistic, multi-reference support |
| FLUX.1 Schnell | Black Forest Labs | 12B parameter, fast generation |
| Lucid Origin | Leonardo | Strong prompt adherence, text rendering |
| Phoenix 1.0 | Leonardo | Coherent text in images |
| Stable Diffusion XL Lightning | ByteDance | Lightning-fast 1024px generation |

Embeddings:

| Model | Author | Notes |
| --- | --- | --- |
| EmbeddingGemma 300M | Google | 100+ languages |
| Qwen3 Embedding 0.6B | Qwen | Text embedding and ranking |
| BGE-M3 | BAAI | Multi-lingual, multi-granularity |
| BGE Large/Base/Small EN v1.5 | BAAI | English embeddings (384–1024 dims) |

Speech-to-text:

| Model | Author | Notes |
| --- | --- | --- |
| Nova 3 | Deepgram | High-accuracy transcription |
| Flux | Deepgram | Conversational speech recognition for voice agents |
| Whisper Large v3 Turbo | OpenAI | Multilingual ASR and translation |
| Whisper / Whisper Tiny EN | OpenAI | General-purpose speech recognition |

Text-to-speech:

| Model | Author | Notes |
| --- | --- | --- |
| Aura 2 (EN/ES) | Deepgram | Context-aware, natural pacing, real-time |
| Aura 1 | Deepgram | Natural prosody |
| MeloTTS | MyShell AI | Multi-lingual TTS |

• Image Classification: ResNet-50 (Microsoft)
• Object Detection: DETR ResNet-50 (Facebook)
• Translation: M2M-100 1.2B (Meta), IndicTrans2 (AI4Bharat)
• Summarization: BART Large CNN (Facebook)
• Text Classification / Reranking: BGE Reranker, DistilBERT
• Image-to-Text: LLaVA 1.5 7B, UForm Gen2
• Safety: Llama Guard 3 8B (prompt/response classification)
• Voice Activity Detection: Pipecat Smart Turn v2

Neurons abstract GPU compute across model types. Here is what 10,000 neurons (one day's free allowance) roughly translates to:

| Task | Approximate Free Daily Volume |
| --- | --- |
| LLM responses (e.g., Llama 3.1 8B) | ~1,300 responses |
| Image classifications | ~8,300 classifications |
| Text embeddings | ~12,500 embeddings |
| Image generation (FLUX.1 Schnell, 512x512) | ~2,000 images |

Neuron cost varies significantly by model size. Smaller models (Llama 3.2 1B) consume far fewer neurons per request than large ones (Llama 3.3 70B or DeepSeek R1 32B). Choose your model strategically to maximize free usage.

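The approximate volumes above imply a rough per-unit neuron cost, which is handy for budgeting a day that mixes tasks. The sketch below simply divides the 10,000-neuron allowance by each quoted volume; the function names are hypothetical, and the results are only as precise as the rounded figures in the table.

```python
DAILY_FREE_NEURONS = 10_000

# Approximate free daily volumes quoted in the table above.
APPROX_DAILY_VOLUME = {
    "LLM response (Llama 3.1 8B)": 1_300,
    "image classification": 8_300,
    "text embedding": 12_500,
    "image generation (FLUX.1 Schnell, 512x512)": 2_000,
}


def neurons_per_unit(daily_volume: int) -> float:
    """Implied neurons consumed by one unit of a task."""
    return DAILY_FREE_NEURONS / daily_volume


def remaining_units(neurons_used: int, daily_volume: int) -> int:
    """Units of a task that still fit in today's free allowance."""
    remaining = max(DAILY_FREE_NEURONS - neurons_used, 0)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return remaining * daily_volume // DAILY_FREE_NEURONS


for task, volume in APPROX_DAILY_VOLUME.items():
    print(f"{task}: ~{neurons_per_unit(volume):.2f} neurons each")
```

For example, after spending 4,000 neurons, `remaining_units(4000, 1300)` estimates that about 780 more LLM responses fit in the day.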
| Model | Input (USD per M tokens) | Output (USD per M tokens) |
| --- | --- | --- |
| Llama 3.2 1B Instruct | $0.027 | $0.201 |
| Llama 3.2 3B Instruct | $0.051 | $0.335 |
| Llama 3.1 8B Instruct (FP8) | $0.045 | $0.384 |
| Llama 3.1 70B Instruct (FP8) | $0.293 | $2.253 |

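The per-token prices can be translated back into neurons to estimate how far the free allowance goes for a given request size. This sketch assumes the dollar prices convert to neurons at the $0.011 per 1,000 neurons overflow rate; the token counts in the usage note are hypothetical, and all function names are illustrative.

```python
USD_PER_1000_NEURONS = 0.011  # overflow rate quoted in this guide

# (input, output) prices in USD per million tokens, from the table above.
PRICES = {
    "Llama 3.2 1B Instruct": (0.027, 0.201),
    "Llama 3.2 3B Instruct": (0.051, 0.335),
    "Llama 3.1 8B Instruct (FP8)": (0.045, 0.384),
    "Llama 3.1 70B Instruct (FP8)": (0.293, 2.253),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def request_neurons(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated neurons for one request (assumes the $0.011/1,000 conversion)."""
    return request_cost_usd(model, input_tokens, output_tokens) / USD_PER_1000_NEURONS * 1000


def free_requests_per_day(model: str, input_tokens: int, output_tokens: int,
                          daily_neurons: int = 10_000) -> int:
    """Whole requests of this size that fit in the daily free allowance."""
    return int(daily_neurons // request_neurons(model, input_tokens, output_tokens))
```

With a hypothetical request of 100 input and 200 output tokens on Llama 3.1 8B Instruct (FP8), `request_neurons` gives about 7.4 neurons, which works out to roughly 1,350 requests per day, in line with the ~1,300 LLM responses quoted earlier.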
| Plan | Daily Neurons | After Limit | Per-Model Request Limit |
| --- | --- | --- | --- |
| Workers Free | 10,000 | Hard block (error returned) | 1,500–3,000 RPM depending on model |
| Workers Paid ($5/mo) | 10,000 free | $0.011 per 1,000 additional neurons | Higher RPM limits |

The daily neuron quota resets at 00:00 UTC. Per-model rate limits (requests per minute) are separate from the neuron budget and vary by model; check the docs for each model's specific RPM cap.

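Two quantities follow directly from the plan table: the overflow charge on the paid plan and the time left until the quota resets. The sketch below computes both; the function names are hypothetical, and it covers only the neuron overflow charge, not the $5/month plan fee.

```python
from datetime import datetime, timedelta, timezone

FREE_NEURONS_PER_DAY = 10_000
USD_PER_1000_EXTRA_NEURONS = 0.011


def overage_cost_usd(neurons_used: float) -> float:
    """Overflow charge for one day on the paid plan (0 within the free quota)."""
    extra = max(neurons_used - FREE_NEURONS_PER_DAY, 0)
    return extra / 1000 * USD_PER_1000_EXTRA_NEURONS


def time_until_reset(now: datetime) -> timedelta:
    """Time remaining until the next 00:00 UTC reset.

    `now` should be timezone-aware; it is converted to UTC first.
    """
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return next_midnight - now_utc


print(time_until_reset(datetime.now(timezone.utc)))
```

On the free plan, a client that receives the hard-block error could use `time_until_reset` to decide how long to pause before retrying, rather than polling the API.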
Workers AI integrates with other Cloudflare developer platform products on the free plan:

| Service | Free Tier |
| --- | --- |
| Workers (compute) | 100,000 requests/day, 10ms CPU time per invocation |
| Vectorize (vector database) | 5M stored vectors, 30M queried vectors/month |
| AI Gateway (observability) | Unlimited requests: caching, rate limiting, logging, model fallback |
| D1 (SQL database) | 5 GB storage, 5M rows read/day |
| R2 (object storage) | 10 GB storage, 1M Class A ops/month |
| KV (key-value store) | 100,000 reads/day |

This means you can build a full-stack AI application (LLM + embeddings + vector search + storage + caching) entirely on Cloudflare's free tier.

• No credit card required: the free plan is genuinely free with no payment info needed. The $5/month paid plan only matters if you need to exceed 10,000 neurons/day.
• LoRA support: you can bring your own fine-tuned LoRA adapters and run them on supported base models (Llama, Mistral, Gemma) at no extra cost beyond normal neuron usage.
• Edge latency: models run on Cloudflare's global network, so inference happens close to your users. Cold starts are minimal for popular models.
• AI Gateway for observability: route your requests through AI Gateway (free) to get logging, caching (avoid re-running identical prompts), and automatic retries/fallback across providers.
• Neuron tracking: monitor your usage in the Cloudflare dashboard under Workers AI analytics. The dashboard shows usage in both neurons and conventional units (tokens, seconds, images).
• Pricing units: Cloudflare is transitioning pricing from neurons to per-unit pricing (tokens, audio seconds, image tiles) for clarity, but neurons remain the underlying billing metric.
• Startup / enterprise programs: Cloudflare offers a Workers Launchpad program for startups building on its platform, which may include additional credits.
• Model availability: not every model is available in every Cloudflare data center. Popular models have wider distribution; less-used models may route to fewer locations with slightly higher latency.
