Kluster.ai Free Batch Inference Tier
Source: https://www.kluster.ai/
Description
Kluster.ai is an AI cloud platform specializing in batch inference for open-source models like DeepSeek-R1, Llama 4 Maverick/Scout, and Qwen3. The free tier supports up to 1,000 batch requests per file with a 100MB max file size. The platform also offers adaptive real-time inference with sub-second latency. Kluster.ai uses an OpenAI-compatible API (api.kluster.ai/v1), so you can use the standard OpenAI Python SDK. Batch inference pricing starts as low as $0.10/M tokens for smaller models with 72-hour completion windows — up to 50% cheaper than competitors.
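Because the endpoint is OpenAI-compatible, no custom SDK is required; with the `openai` package you would simply construct `OpenAI(base_url="https://api.kluster.ai/v1", api_key=...)`. The sketch below shows the same wire format using only the standard library, so nothing extra needs to be installed. The model identifier and the `KLUSTER_API_KEY` environment variable are illustrative assumptions, not confirmed names.

```python
import json
import os
import urllib.request

# kluster.ai exposes an OpenAI-compatible API; the request below targets
# the standard /v1/chat/completions route on its base URL.
BASE_URL = "https://api.kluster.ai/v1"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for kluster.ai."""
    payload = {
        "model": model,  # example identifier; check the supported-models list
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('KLUSTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("deepseek-ai/DeepSeek-R1", "Say hello")
```

Sending the request is then `urllib.request.urlopen(req)` with a valid API key, or the equivalent `client.chat.completions.create(...)` call in the OpenAI SDK.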
Getting started:

1. Go to kluster.ai and click Sign Up
2. Create your account
3. Navigate to the API section to generate your API key
4. Set your base URL to https://api.kluster.ai/v1
5. Use the OpenAI Python SDK with your kluster.ai API key
6. Prepare your batch request as a JSONL file (one request per line)
7. Submit your batch job via the API

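The JSONL file in step 6 carries exactly one request per line. A sketch, assuming the per-line shape mirrors the OpenAI Batch API (`custom_id`/`method`/`url`/`body`); the default model identifier here is an example, not a confirmed name:

```python
import json

def write_batch_file(path, prompts, model="klusterai/Meta-Llama-3.1-8B-Instruct-Turbo"):
    """Write one chat-completion request per line, respecting the free-tier cap."""
    if len(prompts) > 1000:
        raise ValueError("free tier allows at most 1,000 requests per batch file")
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts):
            request = {
                "custom_id": f"req-{i}",  # your own ID, echoed back in the results
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": model,  # example ID; check the supported-models list
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(request) + "\n")

write_batch_file("batch.jsonl", ["Summarize document A", "Summarize document B"])
```

Submission (step 7) then follows the OpenAI SDK's batch flow: upload the file with `client.files.create(file=open("batch.jsonl", "rb"), purpose="batch")` and start the job with `client.batches.create(...)`, assuming kluster.ai mirrors those endpoints as its docs describe.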
Important:
• Free tier has a hard limit of 1,000 requests per batch file
• Max file size is 100MB per batch file (applies to all tiers)
• The API is OpenAI-compatible — use `from openai import OpenAI` and change the base URL
• Batch jobs are processed asynchronously — you submit and poll for results

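Because batch jobs finish asynchronously, the usual pattern is submit-then-poll. A minimal sketch with the status lookup injected as a callable so the loop can run without a live API call; in practice `get_status` would wrap a batch-retrieve call (e.g. `client.batches.retrieve(batch_id).status` in the OpenAI SDK) for your job ID:

```python
import time

def wait_for_batch(get_status, interval_s=30.0, max_polls=1000):
    """Poll get_status() until the job reaches a terminal state."""
    terminal = {"completed", "failed", "expired", "cancelled"}
    for _ in range(max_polls):
        status = get_status()
        if status in terminal:
            return status
        time.sleep(interval_s)
    raise TimeoutError("batch did not reach a terminal state")

# Simulated job lifecycle: "validating" -> "in_progress" -> "completed"
states = iter(["validating", "in_progress", "completed"])
print(wait_for_batch(lambda: next(states), interval_s=0))  # prints "completed"
```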
Supported models:

| Model | Category | Notes |
| --- | --- | --- |
| DeepSeek-R1 | Reasoning | 671B parameter reasoning model |
| Llama 4 Maverick | Chat | Meta's latest MoE model |
| Llama 4 Scout | Chat | Meta's efficient MoE model |
| Llama 3.3 70B Instruct | Chat | Meta's instruction-following model |
| Llama 3.1 405B Instruct Turbo | Chat | Meta's largest Llama model |
| Llama 3.1 8B | Chat | Fast, lightweight model |
| Qwen3-235B-A22B | Chat | Alibaba's flagship MoE |
| Gemma 3 | Chat | Google's open model |

Free tier vs. standard tier:

| Feature | Free Tier | Standard Tier |
| --- | --- | --- |
| Max batch requests per file | 1,000 | Unlimited |
| Max file size | 100MB | 100MB |
| Models available | All supported | All supported |
| Completion windows | 24h, 48h, 72h | 24h, 48h, 72h |
| Real-time inference | Limited | Yes |

Batch pricing (per 1M tokens):

| Model | 24-Hour | 48-Hour | 72-Hour |
| --- | --- | --- | --- |
| DeepSeek-R1 | $3.50 | $3.00 | $2.50 |
| Llama 4 Scout 17Bx16E | $0.15 | $0.12 | $0.10 |
| Llama 3.1 405B Turbo | Higher | Mid | Lower |

Longer completion windows = lower cost. Choose 72h for maximum savings on non-urgent workloads.

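The window/cost trade-off is simple arithmetic. Using the DeepSeek-R1 per-1M-token prices from the table and a hypothetical 40M-token job:

```python
# DeepSeek-R1 batch prices per 1M tokens, taken from the pricing table.
PRICES_PER_M = {"24h": 3.50, "48h": 3.00, "72h": 2.50}

def batch_cost(total_tokens: int, window: str) -> float:
    """Cost in dollars for a batch job at a given completion window."""
    return total_tokens / 1_000_000 * PRICES_PER_M[window]

tokens = 40_000_000  # hypothetical 40M-token workload
for window in ("24h", "48h", "72h"):
    print(f"{window}: ${batch_cost(tokens, window):.2f}")
# 24h: $140.00, 48h: $120.00, 72h: $100.00 -- the 72h window saves $40 here
```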
Inference modes:

| Mode | Latency | Best For |
| --- | --- | --- |
| Real-time | Sub-second | Interactive apps, chat |
| Asynchronous | Minutes | Flexible timing, moderate volume |
| Batch | 24-72 hours | High-volume, bulk processing |

Tips:

• Batch inference is the killer feature — if you have large-scale processing jobs (data labeling, content generation, analysis), batch mode at 72h completion can be 50%+ cheaper than real-time
• JSONL format — each line in your batch file is a separate request in JSON format. Make sure your file is valid JSONL before submitting
• Use with Bespoke Curator — kluster.ai integrates with Bespoke Labs' data curation tool for efficient large-scale inference pipelines
• OpenAI SDK compatible — no custom SDK needed. Just `pip install openai`, set the base URL to api.kluster.ai/v1, and use your API key
• Free tier is per-file, not per-month — you can submit multiple batch files with up to 1,000 requests each
• Promotional credits — kluster.ai has periodically offered $100 in free credits (e.g., for DeepSeek-R1 usage). Watch their blog for announcements
• Adaptive inference — the platform dynamically adjusts computing resources based on workload, which helps keep costs low
• Compare with alternatives — for batch inference, also consider the OpenAI Batch API (50% discount), Together.ai, and Fireworks.ai

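A quick pre-flight check covers the JSONL-validity tip and the free-tier limits in one pass: every non-empty line must parse as JSON, and the file must stay within 1,000 requests and 100MB. A minimal sketch:

```python
import json

def validate_jsonl(path, max_requests=1000, max_bytes=100 * 1024 * 1024):
    """Return a list of problems; an empty list means the file looks submittable."""
    errors, size, count = [], 0, 0
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            size += len(raw)
            if not raw.strip():
                continue  # tolerate blank lines, though strict JSONL forbids them
            count += 1
            try:
                json.loads(raw)
            except json.JSONDecodeError as e:
                errors.append(f"line {lineno}: {e}")
    if count > max_requests:
        errors.append(f"{count} requests exceeds the {max_requests}-request limit")
    if size > max_bytes:
        errors.append("file exceeds the 100MB limit")
    return errors

# Demo file with a deliberately broken second line.
with open("sample.jsonl", "w") as f:
    f.write('{"custom_id": "req-0"}\n')
    f.write('not json\n')

problems = validate_jsonl("sample.jsonl")  # one error: line 2 is not JSON
```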
Sources:
• Kluster.ai
• Kluster.ai Documentation
• Kluster.ai Supported Models
• Kluster.ai Batch Inference Guide
• Kluster.ai Adaptive Inference Blog
• Kluster.ai on Artificial Analysis
• Using kluster.ai with Bespoke Curator