- Account wallet: your prepaid balance. Top up via Stripe Checkout from the dashboard. Shared across every key you own.
- Per-key cap: optional `max_spend_usd` set when you mint a key. Hard ceiling on what that key can spend, regardless of wallet balance.
Exceeding either limit returns `402 Payment Required` with a clear error message.
The wallet
- Source of truth: a Cloudflare Durable Object holding your balance in micro-dollars.
- Updated on Stripe webhook (`checkout.session.completed`) and after every upstream call (debit).
- Atomic: concurrent requests can’t double-spend.
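To make the micro-dollar arithmetic and the atomicity guarantee concrete, here is a minimal in-memory sketch. It is not the Durable Object implementation: the `Wallet` class and its method names are hypothetical, and a thread lock stands in for the serialization the real balance holder provides.

```python
import threading

MICROS_PER_USD = 1_000_000  # 1 USD = 1,000,000 micro-dollars


class Wallet:
    """Hypothetical stand-in for the balance holder, for illustration only."""

    def __init__(self, balance_micros: int = 0) -> None:
        self._balance_micros = balance_micros
        self._lock = threading.Lock()

    def credit(self, amount_micros: int) -> int:
        """Apply a top-up (what a completed checkout would trigger)."""
        with self._lock:
            self._balance_micros += amount_micros
            return self._balance_micros

    def debit(self, cost_micros: int) -> int:
        """Charge the cost of one upstream call."""
        with self._lock:
            self._balance_micros -= cost_micros
            return self._balance_micros

    def balance(self) -> int:
        with self._lock:
            return self._balance_micros


wallet = Wallet()
wallet.credit(10 * MICROS_PER_USD)  # a 10 USD top-up

# 200 concurrent debits of 1,500 micro-dollars each.
threads = [threading.Thread(target=wallet.debit, args=(1_500,)) for _ in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(wallet.balance())  # 10_000_000 - 200 * 1_500 = 9_700_000
```

Keeping the balance as an integer number of micro-dollars avoids floating-point rounding, and serializing every read-modify-write is what prevents two concurrent requests from spending the same funds.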
Per-key caps
Set when you mint a key: `max_spend_micros = 5_000_000` (that is, $5.00).
On every request, we check `SUM(usage_events.cost_micros WHERE key_id = ?)` against this cap. The sum is cached in KV for 60s, so at scale this is one D1 read per minute per active key.
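The caching behaviour can be sketched as a small time-to-live cache in front of the sum. This is an illustration under assumptions, not the service's code: `SpendCache`, `is_over_cap`, and the injected `load_sum` callable (standing in for the D1 query) are hypothetical names, and treating "spent equals cap" as over the cap is also an assumption.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 60


class SpendCache:
    """Caches each key's total spend for CACHE_TTL_SECONDS (hypothetical helper)."""

    def __init__(self, load_sum: Callable[[str], int],
                 now: Callable[[], float] = time.monotonic) -> None:
        self._load_sum = load_sum  # stands in for the database sum query
        self._now = now
        self._entries: Dict[str, Tuple[float, int]] = {}

    def spent_micros(self, key_id: str) -> int:
        now = self._now()
        entry = self._entries.get(key_id)
        if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
            return entry[1]  # fresh enough: no database read
        total = self._load_sum(key_id)
        self._entries[key_id] = (now, total)
        return total


def is_over_cap(cache: SpendCache, key_id: str, max_spend_micros: int) -> bool:
    # Assumption: reaching the cap exactly counts as being at the limit.
    return cache.spent_micros(key_id) >= max_spend_micros


# Demo with a fake clock and a fake database that counts its reads.
reads = {"count": 0}


def fake_sum(key_id: str) -> int:
    reads["count"] += 1
    return 4_200_000


clock = [0.0]
cache = SpendCache(fake_sum, now=lambda: clock[0])
results = []
for t in (0.0, 30.0, 61.0):  # miss, hit, expired
    clock[0] = t
    results.append(is_over_cap(cache, "key_123", 5_000_000))

print(results, reads["count"])  # [False, False, False] 2
```

Three checks over 61 seconds cost two reads: one on the first miss and one after the entry expires.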
What “cap” actually means
It’s a lifetime cap on the key, not monthly. We chose lifetime semantics for v1 simplicity:
- Want monthly resets? Rotate keys monthly (`POST /v1/keys` once a month per user, store the new one).
- Want a hard limit per period? Roll your own logic on top: pull `/v1/usage` with `from_ms`/`to_ms` and decide whether to revoke + re-mint.
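One way to build a per-period limit on top of the lifetime cap is sketched below. It only covers the decision logic: computing a `from_ms`/`to_ms` window for the current calendar month and summing usage inside it. The shape of the `/v1/usage` response is not documented here, so the `cost_micros` field on each row, the half-open window, and both function names are assumptions.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def month_window_ms(now: datetime) -> Tuple[int, int]:
    """Return (from_ms, to_ms) for the UTC calendar month containing `now`.

    Assumption: to_ms is treated as an exclusive upper bound.
    """
    start = datetime(now.year, now.month, 1, tzinfo=timezone.utc)
    if now.month == 12:
        end = datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def should_rotate(usage_rows: List[Dict[str, int]], period_limit_micros: int) -> bool:
    """Decide whether to revoke and re-mint, given usage rows for the window.

    Assumption: each row carries a `cost_micros` integer.
    """
    spent = sum(row["cost_micros"] for row in usage_rows)
    return spent >= period_limit_micros


from_ms, to_ms = month_window_ms(datetime(2024, 3, 15, tzinfo=timezone.utc))

# Rows as they might come back from a usage query for that window (made-up data).
rows = [{"cost_micros": 1_200_000}, {"cost_micros": 900_000}]
print(should_rotate(rows, 2_000_000))  # True: 2_100_000 >= 2_000_000
print(should_rotate(rows, 5_000_000))  # False
```

Your own scheduler would call the usage endpoint with the computed window, feed the rows into `should_rotate`, and mint a replacement key when it returns true.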
The overdraft policy
Pre-flight is cheap (DO read + KV read). The actual upstream cost is only known after the response streams back. So a request that takes the balance below zero is allowed to complete; the next request is blocked with 402. This is the same model as OpenAI’s prepaid credits and lets the request finish without abandoning a half-streamed response. Worst case: a few cents of overdraft per stolen-key incident, bounded by the per-key cap.
What happens at the cap
Customer-facing response when a key is over its cap: `402`. Forward this to your end-user as “your AI usage limit is reached, [click to upgrade]” or similar.
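The overdraft policy and the resulting 402 can be sketched together as a toy simulation. This is not the service's code: the `Gateway` class, the rule that pre-flight passes only while the balance is above zero, and the exact wording of the end-user message are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    """Toy model of the pre-flight check followed by a post-response debit."""

    balance_micros: int

    def handle(self, actual_cost_micros: int) -> int:
        """Return an HTTP-style status code for one request."""
        # Pre-flight: only the current balance is known, not this request's cost.
        # Assumption: a balance of zero or less is blocked.
        if self.balance_micros <= 0:
            return 402
        # The upstream call runs to completion; its cost is debited afterwards,
        # which may push the balance below zero.
        self.balance_micros -= actual_cost_micros
        return 200


def end_user_message(status: int) -> str:
    """Translate a gateway status into something to show your end-user."""
    if status == 402:
        return "Your AI usage limit is reached. Click to upgrade."
    return ""


gateway = Gateway(balance_micros=200_000)  # 0.20 USD left
first = gateway.handle(250_000)   # costs 0.25 USD, allowed to finish
second = gateway.handle(250_000)  # blocked before any upstream work

print(first, second, gateway.balance_micros)  # 200 402 -50000
print(end_user_message(second))
```

The first request overdraws the wallet by five cents and still completes; the second is rejected at pre-flight, so the overdraft cannot grow further.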