The pitch
You shipped an AI app. You have N end-users hitting Claude / GPT / Sora through your backend. Now you need:
- per-user usage tracking
- per-user spending caps
- monthly invoicing
- failed-payment handling
- one bill from one provider, not five
We mint one qlk_live_… key per user when they sign
up to your app. We meter every request to that key, enforce a hard cap, and
report per-user spend you can pipe straight into Stripe.
What you get
Per-user keys
POST /v1/keys returns a qlk_live_… you store with that user. Optional
max_spend_usd is enforced gateway-side on every request.
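A minimal sketch of minting a key at signup. Only POST /v1/keys, the qlk_live_… prefix, and max_spend_usd come from the description above; the base URL, Bearer-auth admin key, metadata field, and response field name are assumptions:

```ts
// Sketch: mint a capped per-user key at signup.
// The base URL, admin-key auth, "metadata", and the "key" response field
// are illustrative assumptions, not confirmed API shapes.
const QLAUD_API = "https://api.qlaud.example"; // placeholder base URL
const QLAUD_ADMIN_KEY = process.env.QLAUD_ADMIN_KEY!;

async function mintUserKey(userId: string): Promise<string> {
  const res = await fetch(`${QLAUD_API}/v1/keys`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${QLAUD_ADMIN_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      metadata: { user_id: userId }, // hypothetical: tie the key to your user row
      max_spend_usd: 20,             // hard cap enforced gateway-side
    }),
  });
  if (!res.ok) throw new Error(`key mint failed: ${res.status}`);
  const { key } = await res.json(); // "qlk_live_…"
  return key;                        // store alongside the user record
}
```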
Per-user usage
GET /v1/usage returns spend, requests, and tokens broken down by every
key you’ve minted. Pipe it into Stripe at month-end.
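Roughly what month-end reconciliation could look like. Only GET /v1/usage and the spend / requests / tokens breakdown are stated above; the base URL and the exact JSON shape below are guesses:

```ts
// Sketch: pull per-key usage at month-end and turn it into invoice lines.
// The response shape (keys[], spend_usd, requests, tokens) is assumed.
interface KeyUsage {
  key: string;       // the qlk_live_… key you minted for a user
  spend_usd: number;
  requests: number;
  tokens: number;
}

async function monthlyUsage(): Promise<KeyUsage[]> {
  const res = await fetch("https://api.qlaud.example/v1/usage", {
    headers: { Authorization: `Bearer ${process.env.QLAUD_ADMIN_KEY}` },
  });
  if (!res.ok) throw new Error(`usage fetch failed: ${res.status}`);
  const { keys } = (await res.json()) as { keys: KeyUsage[] };
  return keys;
}

// Map each key's spend back to your user and push it into your billing
// system (e.g. a Stripe invoice line) with whatever margin you charge.
for (const u of await monthlyUsage()) {
  console.log(`${u.key}: $${u.spend_usd.toFixed(2)} (${u.requests} requests, ${u.tokens} tokens)`);
}
```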
Frontier models
Claude Opus 4.7, GPT-5.4, Sora 2, Eleven, Whisper, Deepgram, Perplexity —
all behind one key. No per-provider integration.
Anthropic + OpenAI shape
Native /v1/messages AND /v1/chat/completions — drop-in for Claude Code,
Cursor, Cline, openai-py, LangChain, Vercel AI SDK.
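Because the shapes are native, pointing an existing SDK at qlaud is a base-URL swap. A sketch with the official openai Node SDK; the qlaud base URL is a placeholder and the model ID is illustrative:

```ts
import OpenAI from "openai";

// Each end-user gets their own client, authed with the qlk_live_… key you
// minted for them, so metering and the spend cap apply per user.
const userQlaudKey = process.env.DEMO_USER_KEY!; // in practice: load from your users table

const client = new OpenAI({
  apiKey: userQlaudKey,
  baseURL: "https://api.qlaud.example/v1", // placeholder base URL
});

const completion = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Summarize my last meeting." }],
});
console.log(completion.choices[0].message.content);
```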
Who it’s for
If you’re building a product that wraps an AI model and sells it to end-users, qlaud removes the entire billing-infrastructure layer.
- Building an AI writing tool? Mint a key per writer.
- Coding agent for teams? Mint a key per developer seat.
- Voice agent SaaS? Mint a key per phone number.
- Image-gen for designers? Mint a key per designer.
The 30-second demo
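End to end it is three calls, sketched here under the same assumptions as above (placeholder base URL, guessed field names and model IDs): mint a capped key, call a model as that user, read the spend back.

```ts
// 1. Mint a capped key for a new user (POST /v1/keys, max_spend_usd from above).
const mint = await fetch("https://api.qlaud.example/v1/keys", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.QLAUD_ADMIN_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ max_spend_usd: 5 }),
});
const { key: userKey } = await mint.json(); // qlk_live_…

// 2. Call a model as that user via the OpenAI-shaped /v1/chat/completions.
const chat = await fetch("https://api.qlaud.example/v1/chat/completions", {
  method: "POST",
  headers: { Authorization: `Bearer ${userKey}`, "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "claude-opus-4.7", // illustrative model ID
    messages: [{ role: "user", content: "Hello from a metered end-user." }],
  }),
});
console.log((await chat.json()).choices[0].message.content);

// 3. Read per-key spend back (GET /v1/usage) whenever you invoice.
const usage = await fetch("https://api.qlaud.example/v1/usage", {
  headers: { Authorization: `Bearer ${process.env.QLAUD_ADMIN_KEY}` },
});
console.log(await usage.json());
```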
Beyond billing — the app substrate
Once your end-users are minted as keys, qlaud manages the rest of the AI app stack so you don’t have to:
Threads
Conversation memory primitive. Send just the new turn — qlaud loads
history server-side, persists both sides, returns the assistant
response. Kills the messages table.
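What a thread turn might look like. Only the behavior (send just the new turn, history loaded and persisted server-side) is stated above; the thread_id field, base URL, and model ID are assumptions:

```ts
// Sketch: one turn in a thread. qlaud loads prior turns server-side,
// persists this user message plus the assistant reply, and returns the reply.
// "thread_id" is a hypothetical field name, not a confirmed parameter.
async function sendTurn(userKey: string, threadId: string, text: string) {
  const res = await fetch("https://api.qlaud.example/v1/messages", {
    method: "POST",
    headers: { Authorization: `Bearer ${userKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "claude-opus-4.7",                     // illustrative model ID
      max_tokens: 1024,
      thread_id: threadId,                          // hypothetical
      messages: [{ role: "user", content: text }],  // just the new turn
    }),
  });
  return res.json(); // assistant response; history already persisted server-side
}
```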
Tools
Register a webhook URL once. When the assistant emits tool_use,
qlaud calls your endpoint, awaits the result, re-calls the model.
Cross-provider — same shape for Claude or GPT.
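Your side of that loop is one HTTP handler. A sketch of what it might look like; the payload and reply shapes are guesses, only the register-once, call-on-tool_use, re-call-the-model flow comes from the description above:

```ts
// Sketch: the webhook qlaud calls when the assistant emits tool_use.
// The "name"/"input" payload and the { result } reply are assumed shapes.
import { createServer } from "node:http";

createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    const { name, input } = JSON.parse(body); // hypothetical fields
    // Run the tool and return its result; qlaud feeds it back to the model.
    const result =
      name === "get_weather" ? { tempC: 21, city: input.city } : { error: "unknown tool" };
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ result }));
  });
}).listen(8787);
```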
Search
Every turn auto-embedded into Cloudflare Vectorize. Query with plain
text, get tenant-isolated semantic hits. No vector DB to provision.
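Querying could be as small as this. Only plain-text queries and tenant-isolated hits are stated above; the /v1/search path, base URL, and response fields are assumptions:

```ts
// Sketch: semantic search over a user's auto-embedded turns.
// "/v1/search" and the "hits" field are hypothetical names.
async function searchTurns(userKey: string, query: string) {
  const res = await fetch("https://api.qlaud.example/v1/search", {
    method: "POST",
    headers: { Authorization: `Bearer ${userKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({ query }),   // plain text in
  });
  const { hits } = await res.json();   // hits scoped to this key's tenant
  return hits;
}
```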
Jobs
Async submit + polled retrieval for long-running batch work. Same
request body as the synchronous endpoints, wrapped in
/v1/jobs.
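A submit-then-poll sketch. The same-body-as-sync wrapping and /v1/jobs come from the description above; the job id and status fields, the GET-by-id path, and the base URL are assumptions:

```ts
// Sketch: submit a long-running request, then poll until it finishes.
// Field names ("id", "status", "result") and the GET path are guesses.
async function runJob(userKey: string, body: unknown) {
  const headers = { Authorization: `Bearer ${userKey}`, "Content-Type": "application/json" };

  // Same request body as the synchronous endpoint, wrapped in /v1/jobs.
  const submit = await fetch("https://api.qlaud.example/v1/jobs", {
    method: "POST",
    headers,
    body: JSON.stringify(body),
  });
  const { id } = await submit.json();

  // Poll until the job completes or fails.
  while (true) {
    const poll = await fetch(`https://api.qlaud.example/v1/jobs/${id}`, { headers });
    const job = await poll.json();
    if (job.status === "completed") return job.result;
    if (job.status === "failed") throw new Error("job failed");
    await new Promise((r) => setTimeout(r, 2000));
  }
}
```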