Z.AI: GLM-4.5-Flash, GLM-4.7-Flash and GLM-4.6V-Flash 100% free via API

AI API Free Tiers | Amount: Unlimited usage of GLM-4.5-Flash, GLM-4.7-Flash, and GLM-4.6V-Flash at $0/1M tokens for input, cached input, cached storage, and output. Concurrency-limited (not RPM/TPM-limited). | AI-generated | 1/5 | Instant signup (sign up and get credits instantly; no credit card, no approval) | Active
2026-05-09
Score: -1

Source: https://docs.z.ai/guides/overview/pricing

Description

Z.AI (the international platform of Zhipu AI, the company behind ChatGLM) exposes three "Flash" models priced at $0 for input, cached input, cached storage, and output tokens. There is no monthly cap, no credit balance to top up, and no card required to use them. The endpoint is OpenAI-compatible, so you can drop these models into any existing OpenAI SDK, LangChain, or OpenRouter-style client by changing the base_url and api_key.

The three free models cover the most common use cases for indie builders: a fast general chat model (GLM-4.5-Flash), the latest-generation general chat model (GLM-4.7-Flash), and a vision-language model that can read images (GLM-4.6V-Flash). All three sit on the international api.z.ai endpoint, so no China-mainland phone number is required.

Registration (Step-by-Step)

1. Go to z.ai/model-api (the international Open Platform, not bigmodel.cn, which is the China-mainland version with different rules).
2. Click Sign Up and register with email + password (Google/GitHub SSO is also offered). The international platform requires no phone verification for basic signup.
3. Verify your email via the link Z.AI sends.
4. Open the API Keys page at z.ai/manage-apikey/apikey-list and click Create API Key. Copy the key right away; Z.AI does not show it again.
5. Point your OpenAI client at the Z.AI base URL:
   Base URL: https://api.z.ai/api/paas/v4/
   Auth header: Authorization: Bearer <YOUR_API_KEY>
6. Set the model parameter to one of glm-4.5-flash, glm-4.7-flash, or glm-4.6v-flash and call chat/completions as you would with OpenAI.
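
Steps 5 and 6 can also be sketched without any SDK. The following is a minimal illustration using only the Python standard library; the helper names build_chat_request and send are hypothetical, and the response shape is assumed to follow the OpenAI chat/completions format rather than confirmed here.

```python
import json
import urllib.request

BASE_URL = "https://api.z.ai/api/paas/v4/"  # base URL from step 5


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat/completions request."""
    # Join carefully so the trailing slash on BASE_URL does not produce "//".
    url = BASE_URL.rstrip("/") + "/chat/completions"
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # auth header from step 5
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (this performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request is purely local; call send(req) with a real key to execute it.
req = build_chat_request("YOUR_API_KEY", "glm-4.5-flash", "Hello")
print(req.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before spending a concurrency slot on a real call.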

Important:
• No credit card is required for the free Flash models. You only need to add a payment method if you want to use the paid models (GLM-4.7, GLM-4.6V, GLM-5.1, etc.).
• The international platform (z.ai) and the China-mainland platform (open.bigmodel.cn) are separate accounts with separate keys; pick the one closest to your users for latency.
• Quickstart example:

    curl https://api.z.ai/api/paas/v4/chat/completions \
      -H 'Authorization: Bearer YOUR_API_KEY' \
      -H 'Content-Type: application/json' \
      -d '{"model":"glm-4.5-flash","messages":[{"role":"user","content":"Hello"}]}'

Free Models: What You Actually Get

| Model | Type | Input | Cached input | Output | Notes |
|---|---|---|---|---|---|
| GLM-4.5-Flash | Text chat | $0 | $0 | $0 | Lightweight general-purpose; 128K context |
| GLM-4.7-Flash | Text chat | $0 | $0 | $0 | Newer-gen lightweight model in the GLM-4.7 family |
| GLM-4.6V-Flash | Vision-language | $0 | $0 | $0 | Multimodal; accepts image inputs alongside text |

All three are billed at $0 / 1M tokens across the board (input, cached input, cached storage, output). This is the official Free tier on the pricing page, not a promo.

For reference, the paid counterparts are not free:
• GLM-4.7-Flash (free) vs. GLM-4.7 ≈ paid pricing
• GLM-4.6V-Flash (free) vs. GLM-4.6V ≈ $0.30 / $0.90 per 1M tokens
• GLM-4.7 ≈ paid; GLM-5.1 ≈ paid
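
To make the free-versus-paid gap concrete, here is a rough cost sketch. It assumes the "$0.30 / $0.90" figure for GLM-4.6V reads as input price / output price per 1M tokens, which the listing does not state explicitly; the function name estimate_cost and the token counts are illustrative only.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output).
# Assumption: "$0.30 / $0.90" means input / output for GLM-4.6V.
PRICES = {
    "glm-4.6v-flash": (Decimal("0"), Decimal("0")),
    "glm-4.6v": (Decimal("0.30"), Decimal("0.90")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload from per-1M-token prices."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


# Hypothetical workload: 2M input tokens and 500K output tokens.
paid = estimate_cost("glm-4.6v", 2_000_000, 500_000)
free = estimate_cost("glm-4.6v-flash", 2_000_000, 500_000)
print(f"GLM-4.6V: ${paid}  GLM-4.6V-Flash: ${free}")
```

Under these assumptions the hypothetical workload costs about $1.05 on the paid model and nothing on the Flash model; check the pricing page for current figures before relying on either number.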

Source of model list and prices: docs.z.ai/guides/overview/pricing.

OpenAI Compatibility

Z.AI's chat/completions endpoint is a near drop-in for the OpenAI SDK. In Python:

    from openai import OpenAI

    # Same client class as for OpenAI; only the key and base URL change.
    client = OpenAI(
        api_key="YOUR_ZAI_KEY",
        base_url="https://api.z.ai/api/paas/v4/",
    )

    resp = client.chat.completions.create(
        model="glm-4.5-flash",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)

For GLM-4.6V-Flash (vision), pass image content using the standard OpenAI multimodal content array (type: "image_url"). Streaming, tool calling, and JSON-mode responses are supported on the Flash models.
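
As a sketch of the multimodal content array mentioned above, the snippet below builds a vision message in the OpenAI format. The helper names image_message and to_data_url are hypothetical, the example image URL is a placeholder, and whether Z.AI accepts base64 data URLs in addition to public HTTPS URLs is an assumption to verify against the Z.AI docs.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL (acceptance by Z.AI is an assumption)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(prompt: str, image_url: str) -> dict:
    """Build one user message combining a text part and an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder URL; swap in a real, publicly reachable image.
message = image_message("Describe this image.", "https://example.com/cat.png")
request_kwargs = {"model": "glm-4.6v-flash", "messages": [message]}
```

With an OpenAI-style client configured for Z.AI, these arguments would be passed as client.chat.completions.create(**request_kwargs).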

Rate Limits & The Real Catch

Z.AI does not rate-limit by RPM/TPM the way OpenAI does. Instead, free accounts are limited by concurrency (the number of in-flight requests).

Key caveats from the official rate-limits page:
• Free-trial accounts have lower concurrency than balance-funded accounts.
• For Flash-class models, requests with context length over 8K tokens are throttled to roughly 1% of the standard concurrency limit during periods of platform stress. In practice this means small prompts fly, while a single 100K-context request can stall behind paid traffic.
• Concurrency limits are dynamic and per-model; Z.AI does not publish exact numbers, and they may change without notice.
• Adding a small balance (or subscribing to the Coding Plan) raises concurrency caps even though Flash usage stays $0.

If you need predictable throughput for a production workload, treat the free Flash models as dev/eval/burst infrastructure, not as the load-bearing backbone of a paying-customer product.
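
Because the limit is on in-flight requests, one client-side approach is to cap concurrency yourself and back off on failures. The sketch below uses asyncio with a semaphore; MAX_IN_FLIGHT is a made-up value (Z.AI does not publish exact numbers), the send callable stands in for whatever actually calls the API, and catching every Exception is a simplification since the listing does not say which error signals a concurrency rejection.

```python
import asyncio
import random

MAX_IN_FLIGHT = 2  # hypothetical cap; tune it against what your account tolerates


async def call_with_limit(semaphore, send, prompt, retries=3, base_delay=0.5):
    """Run send(prompt) while holding a semaphore slot; retry with backoff on failure."""
    async with semaphore:
        for attempt in range(retries + 1):
            try:
                return await send(prompt)
            except Exception:  # simplification: narrow this to the real rejection error
                if attempt == retries:
                    raise
                # Exponential backoff with a little jitter, still holding the slot.
                await asyncio.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


async def run_all(send, prompts, max_in_flight=MAX_IN_FLIGHT):
    """Send all prompts with at most max_in_flight requests running at once."""
    semaphore = asyncio.Semaphore(max_in_flight)
    tasks = [call_with_limit(semaphore, send, p) for p in prompts]
    return await asyncio.gather(*tasks)


async def _demo():
    """Exercise the limiter with a fake sender that records peak concurrency."""
    in_flight = 0
    peak = 0

    async def fake_send(prompt):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for network latency
        in_flight -= 1
        return prompt.upper()

    results = await run_all(fake_send, ["a", "b", "c", "d", "e"])
    return results, peak


results, peak = asyncio.run(_demo())
print(results, peak)
```

Holding the semaphore slot during backoff is a deliberate choice here: a request that was just rejected keeps its place instead of letting another request pile onto an already stressed limit.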

Who This Is For (and Who Should Skip It)

Great fit:
• Indie builders prototyping LLM features without burning a Stripe card
• Coding agents/scripts where you want a free fallback after hitting OpenAI/Anthropic limits
• Vision experiments where 25 free Stability credits or one image/day on other free vision tiers aren't enough
• Multi-language apps: GLM models are particularly strong on Chinese, but English is fully supported
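
The free-fallback use case in the list above can be sketched as a simple provider chain. Everything here is illustrative: complete_with_fallback is a hypothetical helper, the two providers are stubs rather than real API calls, and treating any exception as a reason to fall through is a simplification.

```python
from typing import Callable, Sequence

# A provider is a (name, callable) pair; the callable maps a prompt to a reply.
Provider = tuple[str, Callable[[str], str]]


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # simplification: real code would match limit errors only
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def primary_stub(prompt: str) -> str:
    """Stand-in for a paid provider that has hit its rate limit."""
    raise RuntimeError("rate limit exceeded")


def zai_flash_stub(prompt: str) -> str:
    """Stand-in for a call to a free Flash model."""
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "Hello",
    [("primary", primary_stub), ("zai-flash", zai_flash_stub)],
)
print(used, reply)
```

In a real script each stub would wrap an actual client call, with the Flash model placed last so it only absorbs traffic the primary provider rejects.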

Skip if:
• You need GPT-4-class quality on hard reasoning: GLM-4.5-Flash is a small fast model, not a frontier model
• You need strict data residency in the US/EU: Z.AI infrastructure is operated from China by Zhipu AI; check your compliance posture
• You need guaranteed throughput SLAs on the free tier (you won't get them; see rate limits above)

Additional Tips

GLM Coding Plan (optional, paid): Z.AI also sells a Claude-Code-style subscription (Lite ~$10/mo, Pro ~$30/mo, Max ~$80/mo, billed quarterly). The promotional $3/mo tier was discontinued 2026-02-11. The Coding Plan is unrelated to the free Flash models; you can use the free models without ever subscribing.

OpenRouter alternative: The same GLM-4.5/4.7/4.6V models are also available via OpenRouter, which can be useful if you already have an OpenRouter key. Free routing through OpenRouter applies only to providers OpenRouter marks as free.

Two platforms, don't mix them up:
• International: z.ai + api.z.ai (covered by this listing).
• Mainland China: bigmodel.cn + open.bigmodel.cn (separate accounts, separate pricing, China phone number required).

Watch for model deprecations: GLM-4-Flash (the older one without a version number) is being phased out. The currently supported free models are the three listed above. Re-check the pricing page if a model stops responding.

Region/latency: API origin is in Asia. Expect ~150–400 ms of extra round-trip time from the US/EU compared to a domestic provider; that is fine for chat but can hurt token-by-token streaming UX in some cases.

Sources:
• Z.AI Rate Limits page
• Z.AI API Key management
