
Documentation Index

Fetch the complete documentation index at: https://docs.qlaud.ai/llms.txt

Use this file to discover all available pages before exploring further.

qlaud has two HTTP surfaces. Same auth, same wallet, same per-user billing. Different power-vs-simplicity tradeoff. The one you pick determines what your AI app gets out of the box.

TL;DR

| You want… | Use this |
| --- | --- |
| Just route to providers, bill per end-user | `POST /v1/messages` (or `/v1/chat/completions`) |
| Build a chatbot, agent, or anything with end-user identity | `POST /v1/threads/:id/messages` |

Side-by-side

| | `/v1/messages` | `/v1/threads/:id/messages` |
| --- | --- | --- |
| What it is | Anthropic-shape passthrough (also `/v1/chat/completions` for OpenAI shape) | qlaud’s stateful Threads API |
| Conversation memory | You manage it. Pass the full `messages` array every turn. | Auto-loaded from prior turns. Just send the new user content. |
| Persistence | Nothing persisted (qlaud doesn’t store your `messages` array). | Each turn persisted in qlaud’s D1; deletable via API. |
| Tool dispatch | Forwarded as-is. You parse `tool_use` blocks and call your own tools. | Built-in dispatch loop runs end-to-end inside qlaud; you get the final assistant message. |
| Catalog connectors (105 vendors) | Not available; you’d have to register each as an explicit tool. | Auto-discovered via `tools_mode: "dynamic"` (the default). End-users authorize via hosted URL. |
| Semantic search | Not available. | `GET /v1/search?q=…` indexes every persisted message. |
| Per-user billing | Yes: keys + spend caps + usage rollup. | Yes, same. |
| Provider routing + fallback | Yes. | Yes. |
| Streaming | Yes (Anthropic SSE shape). | Yes (Anthropic SSE shape, with extra `qlaud.tool_dispatch_*` events multiplexed in). |
| Easiest migration | One-line URL swap from your existing Anthropic / OpenAI SDK. | New API surface; small code change in your chat route. |

When to use /v1/messages

  • Your existing app already manages conversation history (Postgres, Supabase, in-memory, whatever).
  • You don’t want a managed connector layer.
  • You’re already shipping and just want billing + multi-provider routing without rewriting anything.
  • One-line migration: ANTHROPIC_BASE_URL=https://api.qlaud.ai.

What you get: routing, fallback, per-user keys with spend caps, and per-user usage rollup. What you don’t: connectors, threads, semantic search.

When to use /v1/threads/:id/messages

  • You’re building a chatbot, agent, or AI feature with end-user identity.
  • You want to give the model access to Linear, GitHub, Notion, Stripe, ClickUp, and the other 100+ vendors in the catalog — without writing a per-vendor integration.
  • You want conversation history without running a database for it.
  • You want semantic search across past chats without running a vector index.

What you get: everything from `/v1/messages` plus auto-managed threads, 105 catalog connectors auto-discoverable per end-user, semantic search, and the built-in tool dispatch loop. Send the next user turn; qlaud handles the rest and returns the assistant reply.
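Semantic search is a plain authenticated GET. A hedged sketch in TypeScript: only the `q` query parameter appears on this page, and the helper names plus the assumption that the response body is JSON are mine.

```typescript
// Build the search URL; only the `q` parameter is documented on this page.
function buildSearchUrl(query: string): string {
  const url = new URL('https://api.qlaud.ai/v1/search');
  url.searchParams.set('q', query);
  return url.toString();
}

// Sketch: the JSON response shape is not specified here, so the
// result is returned as `unknown` for the caller to inspect.
async function searchThreads(qlaudKey: string, query: string): Promise<unknown> {
  const res = await fetch(buildSearchUrl(query), {
    headers: { Authorization: `Bearer ${qlaudKey}` },
  });
  if (!res.ok) throw new Error(`search failed: ${res.status}`);
  return res.json();
}
```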

The tools_mode flag

/v1/threads/:id/messages accepts tools_mode in the body. It controls whether the model gets the meta-tools (auto-discovery) or only the explicit tools you list:
| `tools_mode` | `tools` array | Behavior |
| --- | --- | --- |
| `"dynamic"` | (omit) | Default when no `tools`. 4 meta-tools injected. Model discovers and invokes anything in the catalog plus your registered tools. |
| `"explicit"` | `["tool_id_1", ...]` | Default when a `tools` array IS provided. Only those tool IDs are visible. No meta-tools. |
| `"dynamic"` | provided | Rejected with 400: incompatible. |
| `"explicit"` | (omit) | Empty toolset. |

For most chatbots, the default behavior is the right one: pass no tools, get dynamic discovery, the model self-serves.
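The table above can be sketched as a request-body builder. This uses only fields named on this page (`model`, `max_tokens`, `content`, `tools_mode`, `tools`); the helper name is hypothetical.

```typescript
// Request-body shapes for the two tools_mode settings.
type ThreadBody = {
  model: string;
  max_tokens: number;
  content: string;
  tools_mode?: 'dynamic' | 'explicit';
  tools?: string[];
};

function buildThreadBody(content: string, toolIds?: string[]): ThreadBody {
  const base = { model: 'claude-sonnet-4-6', max_tokens: 1024, content };
  // No tools array: omit tools_mode and let the "dynamic" default apply.
  if (!toolIds) return base;
  // With a tools array, "explicit" is already the default; never combine
  // a tools array with "dynamic" (that combination is rejected with a 400).
  return { ...base, tools_mode: 'explicit', tools: toolIds };
}
```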

Disabling specific catalog vendors

If you want to suppress Linear (or any catalog vendor) from your end-users’ discovery without disabling all of them:
```bash
curl -X POST https://api.qlaud.ai/v1/mcp-catalog/disable \
  -H "Authorization: Bearer $QLAUD_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"catalog_slug":"qlaud-mcp/linear"}'
```

Reversible via `/v1/mcp-catalog/enable`. List currently disabled via `GET /v1/mcp-catalog/disabled`.
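The same toggle from TypeScript, as a sketch: the endpoint paths and `catalog_slug` field come from the curl example above, while the helper names and error handling are mine (response bodies for these endpoints are not documented on this page).

```typescript
// Pick the catalog endpoint for the desired state.
function catalogEndpoint(enabled: boolean): string {
  return `https://api.qlaud.ai/v1/mcp-catalog/${enabled ? 'enable' : 'disable'}`;
}

// Sketch: only the HTTP status is checked, since the response
// body shape is unspecified here.
async function setVendorEnabled(masterKey: string, slug: string, enabled: boolean): Promise<void> {
  const res = await fetch(catalogEndpoint(enabled), {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${masterKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ catalog_slug: slug }),
  });
  if (!res.ok) throw new Error(`catalog toggle failed: ${res.status}`);
}
```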

Bringing your own MCP server or webhook tool

Both surfaces support custom tools you register yourself. They appear alongside catalog tools in dynamic-mode discovery.
  • Custom MCP server: POST /v1/mcp-servers with server_url + optional auth_headers. Any HTTPS-reachable MCP server (your own, or a long-tail vendor not in our catalog).
  • Custom webhook tool: POST /v1/tools with webhook_url + input_schema. qlaud HMAC-signs the dispatch; your endpoint returns the result.
See /api-reference/mcp and /api-reference/tools for details.
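On the receiving side of a webhook tool, you will want to verify qlaud’s HMAC signature before acting on the dispatch. A minimal sketch follows; the header name (`x-qlaud-signature`), algorithm (hex-encoded HMAC-SHA256), and signing payload (the raw request body) are all assumptions — consult /api-reference/tools for the actual scheme.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Verify an HMAC signature on an incoming webhook-tool dispatch.
// ASSUMED scheme: hex HMAC-SHA256 over the raw body; see
// /api-reference/tools for the real header name and algorithm.
function verifyDispatchSignature(
  rawBody: string,
  signature: string,
  secret: string,
): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest('hex');
  const a = Buffer.from(expected);
  const b = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so guard first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```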

Migration shape

```typescript
// /v1/messages — existing Anthropic SDK code, one-line URL change
import Anthropic from '@anthropic-ai/sdk';

const claude = new Anthropic({
  baseURL: 'https://api.qlaud.ai',
  apiKey: user.qlaudKey,
});

await claude.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [/* you manage history */],
});
```

```typescript
// /v1/threads/:id/messages — managed backend
const r = await fetch(
  `https://api.qlaud.ai/v1/threads/${threadId}/messages`,
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${user.qlaudKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      content: 'next user turn',  // qlaud loads prior history
      stream: true,
    }),
  },
);
// qlaud persists, dispatches tools, indexes for search, returns SSE
```

Common questions

Can I mix the two on the same wallet? Yes. They share auth, billing, and key scopes. Pick per-route.

If I switch from /v1/messages to Threads, what migrates? Nothing historical: your prior conversations live in your existing DB. New conversations start fresh in qlaud.

Does the model see prior messages on Threads? Yes, qlaud auto-loads them. Long threads use a sliding-window strategy; override by passing an explicit messages array if you need control.
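The explicit-`messages` override mentioned in the last answer might look like the sketch below. The Anthropic-shape message objects are an assumption based on the /v1/messages passthrough surface, and the helper name is hypothetical.

```typescript
// Build a thread turn that replaces qlaud's auto-loaded sliding
// window with an explicit history (message shape assumed Anthropic-style).
type ChatMessage = { role: 'user' | 'assistant'; content: string };

function buildOverrideBody(history: ChatMessage[], nextTurn: string) {
  return {
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    // An explicit messages array overrides the auto-loaded history.
    messages: [...history, { role: 'user' as const, content: nextTurn }],
  };
}
// POST this to /v1/threads/:id/messages in place of the `content` field.
```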