Hugging Face - Free Inference API Credits
Source: https://huggingface.co/inference-api
Description
Every Hugging Face user receives $0.10/month in free inference credits (subject to change) to experiment with Inference Providers, a unified API that routes requests to 200+ models across 18+ inference partners. No credit card required. Free users cannot continue past their monthly credit limit (there is no pay-as-you-go on the free tier). The PRO plan ($9/month) bumps credits to $2/month and unlocks pay-as-you-go billing after credits run out. Hugging Face charges provider rates with zero markup.
How to Sign Up

1. Go to huggingface.co/join
2. Sign up with email or sign in with Google, GitHub, or another OAuth provider
3. Confirm your email address
4. Go to Token Settings and create a fine-grained token with the "Make calls to Inference Providers" permission
5. Done: you can now use the Inference API immediately with your $0.10/month free credits

Important:
• No credit card required at any stage
• No approval process; access is instant
• Free credits reset monthly; unused credits do not roll over
• When credits run out, you get a 402 Payment Required error; free users cannot continue past the limit
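Since the free tier hard-stops at the limit, it is worth handling the router's 402 response explicitly instead of letting it crash an app. A minimal sketch; the helper and its action names are illustrative (not part of any HF SDK), and only the meaning of 402 comes from the behavior described above:

```python
def classify_router_status(status: int) -> str:
    """Map an HTTP status from router.huggingface.co to an app-level action.

    Only the 402 meaning (monthly free credits exhausted, hard stop) is from
    the docs above; the other mappings are ordinary HTTP semantics.
    """
    if status == 402:
        return "out-of-credits"   # free tier: blocked until the monthly reset
    if status == 401:
        return "bad-token"        # token missing or lacking the Inference Providers permission
    if 200 <= status < 300:
        return "ok"
    return "retry-or-error"       # e.g. 429/5xx: back off and retry

print(classify_router_status(402))  # → out-of-credits
```

A free-tier caller can then surface a friendly "credits exhausted, resets next month" message rather than a raw HTTP error.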
How It Works

Hugging Face Inference Providers is a proxy layer that sits between your app and multiple AI providers (Groq, Together AI, SambaNova, Fireworks, Replicate, etc.). You send requests with a single Hugging Face token, and the system routes them to the best available provider.

Two billing modes:

Mode                              How It Works                               Credits Apply?  Pay-as-you-go?
Routed by Hugging Face (default)  Request routes through HF to a provider    Yes             Only for PRO/Enterprise
Custom Provider Key               You use your own provider API key          No              Billed by provider directly

The API is OpenAI-compatible: you can swap in https://router.huggingface.co/v1 as the base URL with any OpenAI SDK.
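Because the endpoint is OpenAI-compatible, a request is just a standard chat-completions payload POSTed to the router. A sketch of the raw HTTP shape; the payload-builder helper is illustrative, while the router URL and bearer-token convention follow the OpenAI-compatible scheme described above:

```python
import json

# OpenAI-compatible chat endpoint on the HF router
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"

def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload for the HF router."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = chat_payload("deepseek-ai/DeepSeek-V3-0324", "Hello!")
body = json.dumps(payload)

# To actually send it (requires a token with the Inference Providers permission):
# POST ROUTER_URL with headers
#   Authorization: Bearer hf_YOUR_TOKEN
#   Content-Type: application/json
print(body)
```

The same payload works unchanged against any provider the router selects, which is the point of the proxy layer.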
Supported Providers & Models

Access 200+ models across these providers:

Provider      LLMs  Vision LLMs  Embeddings  Text-to-Image  Text-to-Video  Speech-to-Text
Cerebras      Yes   -            -           -              -              -
Cohere        Yes   Yes          -           -              -              -
Fal AI        -     -            -           Yes            Yes            Yes
Fireworks     Yes   Yes          -           -              -              -
Groq          Yes   Yes          -           -              -              -
HF Inference  Yes   Yes          Yes         Yes            -              Yes
Hyperbolic    Yes   Yes          -           -              -              -
Novita        Yes   Yes          -           -              Yes            -
Nscale        Yes   Yes          -           Yes            -              -
Replicate     -     -            -           Yes            Yes            Yes
SambaNova     Yes   -            Yes         -              -              -
Together AI   Yes   Yes          -           Yes            -              -

Popular models include DeepSeek-R1, DeepSeek-V3, the Llama 3/4 family, Mistral/Mixtral, Qwen 2.5/3, FLUX.1 (image generation), GPT-OSS-120B, and many more. Browse the full list at huggingface.co/inference/models.
Provider Selection

You can control which provider handles your request:

• Automatic (default): routes to the first available provider, based on your preference order
• :fastest suffix: selects the provider with the highest throughput (e.g., deepseek-ai/DeepSeek-R1:fastest)
• :cheapest suffix: selects the provider with the lowest price per output token
• Explicit provider: specify one directly (e.g., deepseek-ai/DeepSeek-R1:sambanova)
Free vs PRO

Feature                       Free ($0)                    PRO ($9/month)
Monthly inference credits     ~$0.10 (subject to change)   $2.00
Pay-as-you-go after credits   No (hard stop at limit)      Yes (billed at provider rates)
ZeroGPU Spaces usage          ~300 seconds, low priority   8x quota (~2,400 sec), highest priority
ZeroGPU Spaces hosting        Cannot host                  Up to 10 Spaces (H200 GPU)
Private storage               100 GB                       1 TB (10x)
Queue priority                Standard                     Highest
ZeroGPU Spaces

Separate from Inference Providers, Hugging Face offers ZeroGPU Spaces: public Gradio apps that dynamically allocate NVIDIA H200 GPUs on demand.

• Free users can use any public ZeroGPU Space, with a rate limit of ~300 seconds per session (refills at 1 ZeroGPU second per 30 real seconds)
• PRO users get 8x the quota and highest queue priority
• Only PRO users and Enterprise orgs can host ZeroGPU Spaces; anyone can use them
• Currently works only with the Gradio SDK
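The free-tier refill rate quoted above implies a long wait once the quota is drained. A quick back-of-the-envelope check, using the approximate figures from the bullets:

```python
FREE_QUOTA_S = 300   # approximate free ZeroGPU seconds per session (from above)
REFILL_RATIO = 30    # real seconds needed to refill 1 ZeroGPU second (from above)

full_refill_hours = FREE_QUOTA_S * REFILL_RATIO / 3600
print(full_refill_hours)  # 2.5 → a fully drained quota takes about 2.5 hours to refill
```

In other words, heavy back-to-back use of a GPU-hungry Space on the free tier means waiting a couple of hours between sessions.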
Billing Details

• Hugging Face charges zero markup on provider rates: you pay exactly what the provider charges
• Billing is per-request, based on compute time × hardware cost per second
• Example: a FLUX.1-dev image generation taking 10 seconds on a GPU priced at $0.00012/sec costs $0.0012 per image
• Track spending at huggingface.co/settings/billing
• A detailed per-model, per-provider breakdown is available at Inference Providers Settings
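The FLUX.1-dev example above is easy to reproduce, and the same arithmetic shows roughly how far the free credits stretch. The helper is illustrative; the duration and per-second rate are the figures from the example bullet:

```python
def request_cost(compute_seconds: float, price_per_second: float) -> float:
    """Per-request cost: compute time times hardware price per second (zero HF markup)."""
    return compute_seconds * price_per_second

cost = request_cost(10, 0.00012)   # the FLUX.1-dev example: 10 s at $0.00012/s
print(f"${cost:.4f} per image")    # $0.0012 per image

# At that rate, the free tier's ~$0.10/month covers roughly 83 such images:
print(int(0.10 // cost))           # 83
```

This is also why the free credits are best treated as a trial budget: one moderately heavy session can consume the whole month's allowance.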
Quickstart

The API works as a drop-in OpenAI replacement:

from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key="hf_YOUR_TOKEN",
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(completion.choices[0].message.content)

Also available via the native huggingface_hub Python/JS SDK, direct HTTP/cURL, and the Inference Playground for browser-based testing.
Caveats

• $0.10 goes fast: a few LLM chat completions or a single image generation can exhaust your monthly free credits. Treat it as a trial, not a production budget
• Limits have been tightened: multiple community reports confirm that free-tier limits were reduced in late 2024/early 2025. Users who previously ran hundreds of requests now hit limits much sooner
• Bring your own key: if you already have accounts with Groq (free tier), Together AI, or SambaNova, you can use their API keys through Hugging Face without consuming HF credits
• The Team plan ($20/user/month) and Enterprise plans (from $50/user/month) provide $2/seat in shared credits plus centralized billing
• No SLA on the free tier: there is no uptime or latency guarantee for free users
• The built-in HF Inference provider focuses mostly on CPU inference as of mid-2025; for GPU-accelerated LLMs, requests are routed to external providers such as Groq, SambaNova, or Together AI
• The OpenAI-compatible endpoint supports only chat completions; for image generation, embeddings, or speech tasks, use the native HF SDK
Sources:
• Hugging Face Inference Providers Pricing
• Hugging Face Pricing Page
• Hugging Face Inference Providers Documentation
• Hugging Face Supported Models
• Hugging Face PRO Plan
• ZeroGPU Spaces Documentation
• Inference Providers Blog Post
• Hugging Face Forum: Free Monthly Limit Reached
• Hugging Face Forum: API Inference Limit Changed