Kluster.ai Free Batch Inference Tier

AI API Free Tiers | Amount: up to 1,000 batch requests per file, 100 MB max file size; promotional credits periodically available | Signup: instant, no credit card or approval required
2026-02-26

Source: https://www.kluster.ai/

Description


Kluster.ai is an AI cloud platform specializing in batch inference for open-source models such as DeepSeek-R1, Llama 4 Maverick/Scout, and Qwen3. The free tier supports up to 1,000 batch requests per file with a 100 MB maximum file size. The platform also offers adaptive real-time inference with sub-second latency. Kluster.ai exposes an OpenAI-compatible API (api.kluster.ai/v1), so the standard OpenAI Python SDK works unchanged. Batch inference pricing starts as low as $0.10 per 1M tokens for smaller models on 72-hour completion windows, up to 50% cheaper than competitors.

Registration (Step-by-Step)

1. Go to kluster.ai and click Sign Up
2. Create your account
3. Navigate to the API section to generate your API key
4. Set your base URL to https://api.kluster.ai/v1
5. Use the OpenAI Python SDK with your kluster.ai API key
6. Prepare your batch request as a JSONL file (one request per line)
7. Submit your batch job via the API
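Step 6 can be sketched as follows. The per-line request shape follows the OpenAI Batch API format that kluster.ai's compatibility implies; the model ID and prompts here are illustrative assumptions, so check the model catalog for exact names:

```python
import json

# Illustrative prompts; the model ID is an assumption based on
# kluster.ai's catalog naming -- verify it in the dashboard.
MODEL = "klusterai/Meta-Llama-3.1-8B-Instruct-Turbo"
prompts = [
    "Summarize the French Revolution in one paragraph.",
    "Explain photosynthesis to a ten-year-old.",
]

with open("batch_input.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",   # used to match results to requests
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": MODEL,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        f.write(json.dumps(request) + "\n")   # one JSON object per line
```

Each line must be a complete, standalone JSON object; a pretty-printed (multi-line) JSON file is not valid JSONL.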


Important:
• Free tier has a hard limit of 1,000 requests per batch file
• Max file size is 100 MB per batch file (applies to all tiers)
• The API is OpenAI-compatible: use from openai import OpenAI and change the base URL
• Batch jobs are processed asynchronously: you submit and poll for results


Available AI Models

Model | Category | Notes
DeepSeek-R1 | Reasoning | 671B-parameter reasoning model
Llama 4 Maverick | Chat | Meta's latest MoE model
Llama 4 Scout | Chat | Meta's efficient MoE model
Llama 3.3 70B Instruct | Chat | Meta's instruction-following model
Llama 3.1 405B Instruct Turbo | Chat | Meta's largest Llama model
Llama 3.1 8B | Chat | Fast, lightweight model
Qwen3-235B-A22B | Chat | Alibaba's flagship MoE
Gemma 3 | Chat | Google's open model


Free Tier Limits

Feature | Free Tier | Standard Tier
Max batch requests per file | 1,000 | Unlimited
Max file size | 100 MB | 100 MB
Models available | All supported | All supported
Completion windows | 24h, 48h, 72h | 24h, 48h, 72h
Real-time inference | Limited | Yes


Batch Inference Pricing (per 1M tokens)

Model | 24-Hour | 48-Hour | 72-Hour
DeepSeek-R1 | $3.50 | $3.00 | $2.50
Llama 4 Scout 17Bx16E | $0.15 | $0.12 | $0.10
Llama 3.1 405B Turbo | Higher | Mid | Lower

Longer completion windows mean lower cost. Choose 72h for maximum savings on non-urgent workloads.
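The savings from a longer window are easy to quantify. A rough estimate using the Llama 4 Scout rates quoted above (the per-request token count is an assumption for illustration):

```python
# Per-1M-token batch rates for Llama 4 Scout, from the table above.
SCOUT_RATES = {"24h": 0.15, "48h": 0.12, "72h": 0.10}


def batch_cost(total_tokens: int, window: str) -> float:
    """Dollar cost of a batch at the given completion window's rate."""
    return total_tokens / 1_000_000 * SCOUT_RATES[window]


# A full free-tier file: 1,000 requests averaging 2,000 tokens = 2M tokens.
cost_72h = batch_cost(2_000_000, "72h")  # $0.20
cost_24h = batch_cost(2_000_000, "24h")  # $0.30
print(f"72h: ${cost_72h:.2f}, 24h: ${cost_24h:.2f}")
```

At this model's rates, waiting the extra two days cuts the bill by a third.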


Inference Modes

Mode | Latency | Best For
Real-time | Sub-second | Interactive apps, chat
Asynchronous | Minutes | Flexible timing, moderate volume
Batch | 24-72 hours | High-volume, bulk processing


Additional Tips

• Batch inference is the killer feature: for large-scale processing jobs (data labeling, content generation, analysis), batch mode with a 72h completion window can be 50%+ cheaper than real-time
• JSONL format: each line in your batch file is a separate request in JSON format. Make sure your file is valid JSONL before submitting
• Use with Bespoke Curator: kluster.ai integrates with Bespoke Labs' data curation tool for efficient large-scale inference pipelines
• OpenAI SDK compatible: no custom SDK needed. Just pip install openai, set the base URL to api.kluster.ai/v1, and use your API key
• Free tier is per-file, not per-month: you can submit multiple batch files with up to 1,000 requests each
• Promotional credits: kluster.ai has periodically offered $100 in free credits (e.g., for DeepSeek-R1 usage). Watch their blog for announcements
• Adaptive inference: the platform dynamically adjusts computing resources based on workload, which helps keep costs low
• Compare with alternatives: for batch inference, also consider OpenAI Batch API (50% discount), Together.ai, and Fireworks.ai
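For comparison with batch mode, a real-time call needs nothing beyond the base-URL swap described in the tips above. A minimal sketch, where the default model ID is an assumption to verify against the catalog:

```python
def ask(
    prompt: str,
    api_key: str,
    model: str = "klusterai/Meta-Llama-3.1-8B-Instruct-Turbo",
) -> str:
    """One real-time chat completion via the OpenAI SDK pointed at kluster.ai.

    The default model ID is an assumption -- check the catalog for exact names.
    """
    from openai import OpenAI  # pip install openai

    client = OpenAI(api_key=api_key, base_url="https://api.kluster.ai/v1")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

The same code runs against OpenAI, Together.ai, or Fireworks.ai by changing only base_url and the model ID, which makes side-by-side price comparisons cheap to set up.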


Sources:
Kluster.ai (https://www.kluster.ai/)
