qlaud has two billing layers, both checked before every upstream call:
  1. Account wallet — your prepaid balance. Top up via Stripe Checkout from the dashboard. Shared across every key you own.
  2. Per-key cap — optional max_spend_usd set when you mint a key. Hard ceiling on what that key can spend, regardless of wallet balance.
A request is allowed when both checks pass:
wallet.balance > 0   AND   key.spend_so_far < key.max_spend
If either check fails, the request is rejected with 402 Payment Required and a clear error message.
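The two-part check above can be sketched in a few lines. This is only an illustration of the rule as stated, not qlaud's actual code: the function name `is_allowed` and the convention that a key with no cap is represented by `None` are assumptions.

```python
from typing import Optional


def is_allowed(balance_micros: int, spend_micros: int, cap_micros: Optional[int]) -> bool:
    """Return True when both billing checks pass (hypothetical helper)."""
    # Layer 1: the account wallet must hold a positive balance.
    wallet_ok = balance_micros > 0
    # Layer 2: the per-key cap is optional; None (an assumed convention)
    # means the key was minted without max_spend_usd.
    key_ok = cap_micros is None or spend_micros < cap_micros
    return wallet_ok and key_ok
```

A caller would return 402 whenever `is_allowed(...)` is false and forward the request upstream otherwise.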

The wallet

  • Source of truth: a Cloudflare Durable Object holding your balance in micro-dollars.
  • Updated on Stripe webhook (checkout.session.completed) and after every upstream call (debit).
  • Atomic — concurrent requests can’t double-spend.
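To make the atomicity property concrete, here is a minimal in-memory model of a wallet whose credits and debits are serialized. It is a stand-in for illustration only: the real wallet lives in a Durable Object, and the `Wallet` class, its method names, and the use of a lock are assumptions of this sketch.

```python
import threading


class Wallet:
    """In-memory stand-in for the wallet; the balance is held in micro-dollars."""

    def __init__(self, balance_micros: int = 0) -> None:
        self._balance_micros = balance_micros
        self._lock = threading.Lock()

    def credit(self, amount_micros: int) -> int:
        """Apply a top-up (e.g. after a completed checkout) and return the new balance."""
        with self._lock:
            self._balance_micros += amount_micros
            return self._balance_micros

    def debit(self, amount_micros: int) -> int:
        """Charge the cost of one upstream call and return the new balance."""
        # The lock makes read-modify-write a single step, so two concurrent
        # debits can never both act on the same stale balance.
        with self._lock:
            self._balance_micros -= amount_micros
            return self._balance_micros

    @property
    def balance_micros(self) -> int:
        with self._lock:
            return self._balance_micros
```

With this model, 200 concurrent debits of 1,000 micro-dollars against a 1,000,000 balance always leave exactly 800,000.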

Per-key caps

Set when you mint a key:
curl https://api.qlaud.ai/v1/keys \
  -H "x-api-key: $QLAUD_MASTER_KEY" \
  -H "content-type: application/json" \
  -d '{"name":"user_42","max_spend_usd":5}'
Internally we store this as max_spend_micros = 5_000_000 (1 USD = 1,000,000 micro-dollars). On every request, we check the key's total spend, SUM(cost_micros) over usage_events WHERE key_id = ?, against this cap. The sum is cached in KV for 60s, so at scale this is at most one D1 read per minute per active key.
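The unit conversion and the 60-second cache can be modeled as follows. This is a rough sketch under stated assumptions: `usd_to_micros`, `SpendCache`, and the `load_sum` callback (standing in for the D1 query) are hypothetical names, and a plain dict stands in for KV.

```python
import time
from typing import Callable, Dict, Tuple

MICROS_PER_USD = 1_000_000


def usd_to_micros(usd: float) -> int:
    """Convert a dollar amount to integer micro-dollars."""
    return round(usd * MICROS_PER_USD)


class SpendCache:
    """Caches each key's summed spend for ttl_s seconds (dict stands in for KV)."""

    def __init__(
        self,
        load_sum: Callable[[str], int],
        ttl_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_sum = load_sum  # hypothetical stand-in for the D1 SUM query
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[float, int]] = {}

    def get(self, key_id: str) -> int:
        now = self._clock()
        hit = self._entries.get(key_id)
        if hit is not None and now - hit[0] < self._ttl_s:
            return hit[1]  # still fresh: no database read
        value = self._load_sum(key_id)
        self._entries[key_id] = (now, value)
        return value
```

Injecting the clock keeps the expiry logic testable without sleeping.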

What “cap” actually means

It’s a lifetime cap on the key, not monthly. We chose lifetime semantics for v1 simplicity:
  • Want monthly resets? Rotate keys monthly (POST /v1/keys once a month per user, store the new one).
  • Want a hard limit per period? Roll your own logic on top — pull /v1/usage with from_ms/to_ms and decide whether to revoke + re-mint.
Period-based caps + auto-rotation are on the roadmap.
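If you roll your own period limit, the main pieces are the period boundaries in milliseconds and a comparison against your limit. The sketch below computes a calendar-month window in UTC suitable for from_ms/to_ms. Whether to_ms is inclusive or exclusive is not stated here, so treat the upper bound as an assumption; `month_window_ms` and `over_period_limit` are hypothetical helper names, and fetching the usage total is left to your own client code.

```python
from datetime import datetime, timezone
from typing import Tuple


def month_window_ms(year: int, month: int) -> Tuple[int, int]:
    """Return (from_ms, to_ms) for a calendar month in UTC.

    to_ms is the first millisecond of the following month.
    """
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def over_period_limit(period_cost_micros: int, limit_micros: int) -> bool:
    """Decide whether a key should be revoked and re-minted for this period."""
    return period_cost_micros >= limit_micros
```

Run the check on a schedule per user; when `over_period_limit` is true, revoke the key and mint a fresh one at the start of the next period.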

The overdraft policy

Pre-flight is cheap (a Durable Object read plus a KV read). The actual upstream cost is only known after the response streams back. So a request that takes the balance from $0.01 to -$0.49 is allowed to complete; the next request is blocked with a 402. This is the same model as OpenAI’s prepaid credits, and it lets the request finish without abandoning a half-streamed response. Worst case: a few cents of overdraft per stolen-key incident, bounded by the per-key cap.
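The overdraft behavior is easy to see in a small simulation. This is a toy model of the policy as described, not the real request path; `run_requests` and the "allowed"/"blocked" labels are assumptions, and the per-key cap is left out to keep the wallet arithmetic visible.

```python
from typing import List, Tuple


def run_requests(balance_micros: int, costs_micros: List[int]) -> Tuple[int, List[str]]:
    """Apply requests in order: check before the call, debit after it."""
    outcomes: List[str] = []
    for cost in costs_micros:
        # Pre-flight only asks whether the balance is positive; the real
        # cost is unknown until the response has streamed back.
        if balance_micros <= 0:
            outcomes.append("blocked")  # the caller would see a 402
            continue
        balance_micros -= cost  # debit after completion, may go negative
        outcomes.append("allowed")
    return balance_micros, outcomes


# $0.01 in the wallet, two requests costing $0.50 each.
final_balance, outcomes = run_requests(10_000, [500_000, 500_000])
# final_balance == -490_000 (i.e. -$0.49), outcomes == ["allowed", "blocked"]
```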

What happens at the cap

Customer-facing response when a key is over its cap:
{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "this API key has reached its spending cap. Mint a fresh key or raise the cap."
  }
}
HTTP status 402. Forward this to your end-user as “your AI usage limit is reached, [click to upgrade]” or similar.
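On the client side, handling this comes down to branching on the status code and keeping the detailed message for your own logs. A minimal sketch, assuming you already have the status and raw body from whatever HTTP client you use; `end_user_message`, `error_detail`, and the generic message text are illustrative names, not part of the API.

```python
import json
from typing import Optional

GENERIC_LIMIT_MESSAGE = "Your AI usage limit is reached. Upgrade to continue."


def end_user_message(status: int) -> Optional[str]:
    """Return text to show the end-user, or None if this is not a billing block."""
    return GENERIC_LIMIT_MESSAGE if status == 402 else None


def error_detail(body: str) -> str:
    """Extract error.message from the response body for logging; '' if absent."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return ""
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return ""
    message = error.get("message", "")
    return message if isinstance(message, str) else ""
```

Keeping the end-user text separate from the API's message avoids telling your users to "mint a fresh key", which is an instruction meant for you.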