qlaud has two HTTP surfaces. Same auth, same wallet, same per-user
billing. Different power-vs-simplicity tradeoff. The one you pick
determines what your AI app gets out of the box.
TL;DR
| You want… | Use this |
|---|---|
| Just route to providers, bill per end-user | POST /v1/messages (or /v1/chat/completions) |
| Build a chatbot, agent, or anything with end-user identity | POST /v1/threads/:id/messages |
Side-by-side
| | /v1/messages | /v1/threads/:id/messages |
|---|---|---|
| What it is | Anthropic-shape passthrough (also /v1/chat/completions for OpenAI-shape) | qlaud’s stateful Threads API |
| Conversation memory | You manage it. Pass the full messages array every turn. | Auto-loaded from prior turns. Just send the new user content. |
| Persistence | Nothing persisted (qlaud doesn’t store your messages array). | Each turn persisted in qlaud’s D1; deletable via API. |
| Tool dispatch | Forwarded as-is. You parse tool_use blocks and call your own tools. | Built-in dispatch loop runs end-to-end inside qlaud — you get the final assistant message. |
| Catalog connectors (105 vendors) | Not available — you’d have to register each as an explicit tool. | Auto-discovered via tools_mode: "dynamic" (the default). End-users authorize via hosted URL. |
| Semantic search | Not available. | GET /v1/search?q=… indexes every persisted message. |
| Per-user billing | Yes — keys + spend caps + usage rollup. | Yes — same. |
| Provider routing + fallback | Yes. | Yes. |
| Streaming | Yes (Anthropic SSE shape). | Yes (Anthropic SSE shape, with extra qlaud.tool_dispatch_* events multiplexed in). |
| Easiest migration | One-line URL swap from your existing Anthropic / OpenAI SDK. | New API surface — small code change in your chat route. |
When to use /v1/messages
- Your existing app already manages conversation history (Postgres,
Supabase, in-memory, whatever).
- You don’t want a managed connector layer.
- You’re already shipping and just want billing + multi-provider
routing without rewriting anything.
- One-line migration: ANTHROPIC_BASE_URL=https://api.qlaud.ai
What you get: routing, fallback, per-user keys with spend caps,
per-user usage rollup. What you don’t: connectors, threads,
semantic search.
When to use /v1/threads/:id/messages
- You’re building a chatbot, agent, or AI feature with end-user
identity.
- You want to give the model access to Linear, GitHub, Notion,
Stripe, ClickUp, and the other 100+ vendors in the catalog —
without writing a per-vendor integration.
- You want conversation history without running a database for it.
- You want semantic search across past chats without running a
vector index.
What you get: everything from /v1/messages PLUS auto-managed
threads, 105 catalog connectors auto-discoverable per end-user,
semantic search, and the built-in tool dispatch loop. Send the
next user turn; qlaud handles the rest and returns the assistant
reply.
/v1/threads/:id/messages accepts tools_mode in the body. It
controls whether the model gets the meta-tools (auto-discovery)
or only the explicit tools you list:
| tools_mode | tools array | Behavior |
|---|---|---|
| "dynamic" | (omit) | Default when no tools. 4 meta-tools injected. Model discovers + invokes anything in catalog + your registered tools. |
| "explicit" | ["tool_id_1", ...] | Default when tools array IS provided. Only those tool IDs visible. No meta-tools. |
| "dynamic" | provided | Rejected with 400: incompatible combination. |
| "explicit" | (omit) | Empty toolset. |
For most chatbots, the default behavior is the right one — pass no
tools, get dynamic discovery, the model self-serves.
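The four combinations in the table reduce to a couple of rules, which can be captured in a small client-side validator. This is a hypothetical helper for illustration, not part of any qlaud SDK; it just mirrors the table before a request is sent:

```typescript
type ToolsMode = 'dynamic' | 'explicit';

interface ThreadMessageBody {
  tools_mode?: ToolsMode;
  tools?: string[];
}

// Resolve the effective mode per the table: "dynamic" when no tools
// array is given, "explicit" when one is, and a rejection for the one
// combination the API 400s on (tools_mode "dynamic" + a tools array).
function resolveToolsMode(body: ThreadMessageBody): ToolsMode {
  const hasTools = body.tools !== undefined;
  if (body.tools_mode === 'dynamic' && hasTools) {
    throw new Error(
      '400: tools_mode "dynamic" is incompatible with an explicit tools array',
    );
  }
  if (body.tools_mode) return body.tools_mode;
  return hasTools ? 'explicit' : 'dynamic';
}
```

So an empty body resolves to "dynamic", and merely supplying a tools array flips the default to "explicit" without setting tools_mode at all.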
Disabling specific catalog vendors
If you want to suppress Linear (or any catalog vendor) from your
end-users’ discovery without disabling all of them:
```shell
curl -X POST https://api.qlaud.ai/v1/mcp-catalog/disable \
  -H "Authorization: Bearer $QLAUD_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"catalog_slug":"qlaud-mcp/linear"}'
```
Reversible via /v1/mcp-catalog/enable. List currently disabled via
GET /v1/mcp-catalog/disabled.
Both surfaces support custom tools you register yourself. They appear
alongside catalog tools in dynamic-mode discovery.
- Custom MCP server:
POST /v1/mcp-servers with server_url +
optional auth_headers. Any HTTPS-reachable MCP server (your
own, or a long-tail vendor not in our catalog).
- Custom webhook tool:
POST /v1/tools with webhook_url +
input_schema. qlaud HMAC-signs the dispatch; your endpoint
returns the result.
See /api-reference/mcp and
/api-reference/tools for details.
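Since qlaud HMAC-signs each webhook dispatch, your endpoint should verify the signature over the raw body before trusting the payload. A minimal sketch, assuming HMAC-SHA256 and hex encoding; the signature header name and exact scheme are assumptions, so check /api-reference/tools for the real ones:

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Verify an HMAC-SHA256 signature over the raw request body.
// The hex encoding is an assumption; consult the API reference.
function verifyDispatch(
  rawBody: string,
  signatureHex: string,
  secret: string,
): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so check length first,
  // and use it (rather than ===) to avoid a timing side channel.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Verify against the raw bytes you received, not a re-serialized JSON.parse/stringify round-trip, since key order and whitespace differences would change the digest.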
Migration shape
```typescript
// /v1/messages — existing Anthropic SDK code, one-line URL change
import Anthropic from '@anthropic-ai/sdk';

const claude = new Anthropic({
  baseURL: 'https://api.qlaud.ai',
  apiKey: user.qlaudKey,
});

await claude.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [/* you manage history */],
});
```
```typescript
// /v1/threads/:id/messages — managed backend
const r = await fetch(
  `https://api.qlaud.ai/v1/threads/${threadId}/messages`,
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${user.qlaudKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      content: 'next user turn', // qlaud loads prior history
      stream: true,
    }),
  },
);
// qlaud persists, dispatches tools, indexes for search, returns SSE
```
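With stream: true on the Threads surface, the SSE stream interleaves qlaud.tool_dispatch_* events with the standard Anthropic ones, so a consumer needs to route on the event name. A sketch of the frame-splitting side only; the event names come from this page, but the data payload shapes shown in the test are illustrative, not documented:

```typescript
interface SseFrame {
  event: string;
  data: string;
}

// Split a raw SSE chunk into (event, data) frames. Frames are
// separated by a blank line; each carries "event:" and "data:" fields.
function parseSseFrames(chunk: string): SseFrame[] {
  return chunk
    .split('\n\n')
    .filter((block) => block.trim().length > 0)
    .map((block) => {
      let event = 'message'; // SSE default when no event: field is present
      const dataLines: string[] = [];
      for (const line of block.split('\n')) {
        if (line.startsWith('event:')) event = line.slice(6).trim();
        else if (line.startsWith('data:')) dataLines.push(line.slice(5).trim());
      }
      return { event, data: dataLines.join('\n') };
    });
}

// qlaud's dispatch-progress events share a name prefix, so they can
// be routed separately from the standard Anthropic stream events.
function isToolDispatchEvent(frame: SseFrame): boolean {
  return frame.event.startsWith('qlaud.tool_dispatch_');
}
```

In a real consumer you would buffer partial chunks until a blank-line frame boundary arrives; this sketch assumes each chunk ends on one.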
Common questions
Can I mix the two on the same wallet? Yes. They share auth, billing, and
key scopes. Pick per-route.
If I switch from /v1/messages to Threads, what migrates? Nothing
historical — your prior conversations live in your existing DB.
New conversations start fresh in qlaud.
Does the model see prior messages on Threads? Yes —
qlaud auto-loads them. Long threads use a sliding-window strategy;
override by passing an explicit messages array if you need control.
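Overriding the sliding window is the same Threads call, with an explicit messages array in place of content. A sketch of building that body; the messages entries follow the Anthropic message shape, and everything beyond the fields this page names (model, max_tokens, messages) is an assumption:

```typescript
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Build a Threads request body that pins the context window explicitly
// instead of relying on qlaud's sliding-window auto-load.
function threadBodyWithExplicitHistory(
  history: ChatMessage[],
  nextTurn: string,
): string {
  return JSON.stringify({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    // An explicit messages array overrides the auto-loaded history.
    messages: [...history, { role: 'user', content: nextTurn }],
  });
}
```

Note the trade: once you pass messages yourself, you are back to managing context selection for that turn, though the turn is still persisted and searchable like any other.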