Wallet & spending caps

qlaud has two billing layers, both checked before every upstream call:

Account wallet — your prepaid balance. Top up via Stripe Checkout from the dashboard. Shared across every key you own.
Per-key cap — optional max_spend_usd set when you mint a key. Hard ceiling on what that key can spend, regardless of wallet balance.

A request is allowed when both checks pass:

wallet.balance > 0   AND   key.spend_so_far < key.max_spend

If either fails → 402 Payment Required with a clear error message.
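The rule above can be sketched as a small pure function. This is an illustration only, not qlaud's implementation: `KeyState` and `preflight_status` are hypothetical names, amounts are assumed to be held in micro-dollars, and a key minted without max_spend_usd is assumed to have no per-key cap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyState:
    spend_so_far_micros: int
    # Assumption: None models a key minted without max_spend_usd (no cap).
    max_spend_micros: Optional[int]


def preflight_status(wallet_balance_micros: int, key: KeyState) -> int:
    """Return the HTTP status the pre-flight check would produce."""
    wallet_ok = wallet_balance_micros > 0
    key_ok = (
        key.max_spend_micros is None
        or key.spend_so_far_micros < key.max_spend_micros
    )
    # Both layers must pass; either failure maps to 402 Payment Required.
    return 200 if (wallet_ok and key_ok) else 402
```

Note that both comparisons are strict: a wallet at exactly zero, or a key whose spend equals its cap, is already blocked.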

The wallet

Source of truth: a Cloudflare Durable Object holding your balance in micro-dollars.
Updated on Stripe webhook (checkout.session.completed) and after every upstream call (debit).
Atomic — concurrent requests can’t double-spend.
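To make the atomicity point concrete, here is a minimal in-process sketch of a wallet that stores an integer number of micro-dollars. It is not the Durable Object itself: the `Wallet` class is hypothetical, and a `threading.Lock` stands in for whatever serialization the real storage layer provides.

```python
import threading

MICROS_PER_USD = 1_000_000


class Wallet:
    """Toy wallet: integer micro-dollars, every mutation under one lock."""

    def __init__(self, balance_micros: int = 0) -> None:
        self._balance_micros = balance_micros
        self._lock = threading.Lock()

    def credit(self, amount_micros: int) -> int:
        # Models a top-up (e.g. after a completed checkout).
        with self._lock:
            self._balance_micros += amount_micros
            return self._balance_micros

    def debit(self, cost_micros: int) -> int:
        # Models the post-call debit; the balance may go negative.
        with self._lock:
            self._balance_micros -= cost_micros
            return self._balance_micros

    @property
    def balance_micros(self) -> int:
        with self._lock:
            return self._balance_micros
```

Because each read-modify-write happens while the lock is held, 100 concurrent debits of 1,000 micros against a $10 balance always leave exactly 9,900,000 micros. Integer micro-dollars also avoid floating-point drift on small charges.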

Per-key caps

Set when you mint a key:

curl https://api.qlaud.ai/v1/keys \
  -H "x-api-key: $QLAUD_MASTER_KEY" \
  -H "content-type: application/json" \
  -d '{"name":"user_42","max_spend_usd":5}'

Internally we store this as max_spend_micros = 5_000_000 (micro-dollars). On every request, we check SUM(usage_events.cost_micros WHERE key_id = ?) against this cap. The sum is cached in KV for 60s, so at scale this costs one D1 read per minute per active key.
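The conversion and the 60-second cache can be sketched as follows. Everything here is illustrative: `usd_to_micros` and `CachedSpend` are hypothetical names, a plain dict stands in for KV, and the `load_sum` callable stands in for the D1 query.

```python
import time
from decimal import Decimal
from typing import Callable, Dict, Tuple, Union

MICROS_PER_USD = 1_000_000


def usd_to_micros(usd: Union[int, float, str]) -> int:
    # Go through Decimal(str(...)) to avoid binary float error;
    # anything smaller than one micro-dollar is truncated.
    return int(Decimal(str(usd)) * MICROS_PER_USD)


class CachedSpend:
    """Caches a per-key spend total for ttl_s seconds."""

    def __init__(
        self,
        load_sum: Callable[[str], int],
        ttl_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_sum = load_sum
        self._ttl_s = ttl_s
        self._clock = clock
        self._cache: Dict[str, Tuple[float, int]] = {}

    def get(self, key_id: str) -> int:
        now = self._clock()
        hit = self._cache.get(key_id)
        if hit is not None and now - hit[0] < self._ttl_s:
            return hit[1]  # fresh enough: no backing-store read
        total = self._load_sum(key_id)
        self._cache[key_id] = (now, total)
        return total
```

In this sketch a cached total can be up to `ttl_s` seconds stale, which is the trade made for fewer reads of the backing store.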

What “cap” actually means

It’s a lifetime cap on the key, not a monthly one. We chose lifetime semantics for v1 simplicity. If you need something else:

Want monthly resets? Rotate keys monthly (POST /v1/keys once a month per user, store the new one).
Want a hard limit per period? Roll your own logic on top — pull /v1/usage with from_ms/to_ms and decide whether to revoke + re-mint.
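For the roll-your-own route, one possible sketch computes the from_ms/to_ms window for the current UTC calendar month and decides whether a key should be rotated. The helper names are hypothetical, the response shape of /v1/usage is not assumed here (summing the period's spend is left to the caller), and whether the window bounds are inclusive or exclusive on the server side is also an assumption to verify.

```python
from datetime import datetime, timezone
from typing import Tuple


def month_window_ms(now: datetime) -> Tuple[int, int]:
    """Return (from_ms, to_ms) for the UTC month containing `now`.

    `now` must be timezone-aware. to_ms is the first instant of the
    next month, i.e. the window is treated as half-open here.
    """
    now = now.astimezone(timezone.utc)
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def should_rotate(period_spend_micros: int, period_cap_micros: int) -> bool:
    # Revoke and re-mint once the period's spend reaches your own limit.
    return period_spend_micros >= period_cap_micros
```

A scheduled job could call `month_window_ms(datetime.now(timezone.utc))`, pass the two values as from_ms and to_ms, total the returned costs, and act on `should_rotate`.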

Period-based caps + auto-rotation are on the roadmap.

The overdraft policy

Pre-flight is cheap (DO read + KV read). The actual upstream cost is only known after the response streams back. So a request that takes the balance from $0.01 → -$0.49 is allowed to complete; the next request is blocked at 402. This is the same model as OpenAI’s prepaid credits and lets the request finish without abandoning a half-streamed response. Worst case: a few cents of overdraft per stolen-key incident, bounded by the per-key cap.
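The overdraft behaviour can be reproduced with a few lines of arithmetic. This is a toy simulation, not the production flow: `simulate` is a hypothetical helper, it models the wallet check only, and the $0.50 request cost is chosen to match the example figures.

```python
from typing import List, Tuple


def simulate(
    balance_micros: int, costs_micros: List[int]
) -> Tuple[List[int], int]:
    """Run requests in order; return (statuses, final balance)."""
    statuses = []
    for cost in costs_micros:
        # Pre-flight only sees the current balance, not the upcoming cost.
        if balance_micros <= 0:
            statuses.append(402)
            continue
        statuses.append(200)
        # The debit happens after the response, so it may overdraw.
        balance_micros -= cost
    return statuses, balance_micros


# $0.01 balance, two requests costing $0.50 each
statuses, final = simulate(10_000, [500_000, 500_000])
```

Here the first request completes and leaves the wallet at -490,000 micros (-$0.49); the second is rejected with 402 and costs nothing.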

What happens at the cap

Customer-facing response when a key is over its cap:

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "this API key has reached its spending cap. Mint a fresh key or raise the cap."
  }
}

HTTP status 402. Forward this to your end-user as “your AI usage limit is reached, [click to upgrade]” or similar.
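On the client side, forwarding the error might look like the sketch below. It assumes the JSON body shown above; `user_message_for` is a hypothetical helper, the user-facing strings are placeholders, and matching on the message text is a fragile heuristic used only because no machine-readable code is documented here. The body returned when the wallet (rather than the key) is exhausted is not shown on this page, so that case falls through to a generic message.

```python
import json
from typing import Optional


def user_message_for(status: int, body: str) -> Optional[str]:
    """Map a qlaud 402 response to text for the end-user, else None."""
    if status != 402:
        return None
    try:
        detail = json.loads(body).get("error", {}).get("message", "")
    except (ValueError, AttributeError):
        detail = ""  # body was not the expected JSON object
    if not isinstance(detail, str):
        detail = ""
    if "spending cap" in detail:
        return "Your AI usage limit is reached. Upgrade to continue."
    # Assumption: any other 402 means the shared wallet is empty.
    return "AI features are temporarily unavailable."
```

Non-402 responses return `None` so the caller can keep its normal error handling for everything else.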
