Threads are qlaud’s conversation primitive. Create a thread once, then send
just the new user turn each call — qlaud loads the prior history, calls the
upstream model, persists both turns, and returns the assistant message in
standard Anthropic Messages shape.
This removes the per-app messages table, the context-window loader, and the
“how do I switch models mid-conversation” question: every endpoint that
follows uses the same Anthropic shape regardless of the underlying model.
POST /v1/threads — Create
curl https://api.qlaud.ai/v1/threads \
-H "x-api-key: $QLAUD_API_KEY" \
-H "content-type: application/json" \
-d '{
"end_user_id": "user_42",
"metadata": {"plan": "pro", "feature": "/refunds"}
}'
Body
| Field | Type | Required | Description |
|---|---|---|---|
| end_user_id | string | no | Opaque id for YOUR end-user (distinct from your qlaud account). Used to filter /v1/threads listings + /v1/search results. |
| metadata | object | no | Arbitrary JSON. Stored verbatim, surfaced on read paths. |
Response (201)
{
"id": "2f1d0c7f-e2a1-40e4-8e21-182cf27deeb7",
"object": "thread",
"end_user_id": "user_42",
"metadata": {"plan": "pro", "feature": "/refunds"},
"created_at": 1777262997717,
"last_active_at": 1777262997717
}
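For callers not using curl, the same request can be assembled with Python's standard library. This is a minimal sketch, not an official client: the helper name `build_create_thread_request` is hypothetical, while the URL and header names are copied from the curl example above. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

API_BASE = "https://api.qlaud.ai"  # base URL taken from the curl example


def build_create_thread_request(api_key, end_user_id=None, metadata=None):
    """Build (but do not send) a POST /v1/threads request.

    Hypothetical helper: both body fields are documented as optional,
    so each one is included only when the caller provides it.
    """
    body = {}
    if end_user_id is not None:
        body["end_user_id"] = end_user_id
    if metadata is not None:
        body["metadata"] = metadata
    return urllib.request.Request(
        f"{API_BASE}/v1/threads",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )
```

To send it, something like `with urllib.request.urlopen(req) as resp: thread = json.load(resp)` should work; the `id` in the response is the thread id used by the endpoints that follow.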
POST /v1/threads/:id/messages — Send a turn
This is the core endpoint. You send just the new user content; qlaud loads
the thread history, runs the upstream call, persists both turns, and returns
the assistant response.
curl https://api.qlaud.ai/v1/threads/$THREAD_ID/messages \
-H "x-api-key: $QLAUD_API_KEY" \
-H "content-type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"content": "What did we just discuss?"
}'
Body
Standard Anthropic Messages fields PLUS:
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Any catalog model id. |
| max_tokens | number | yes | Response cap. |
| content | string or content blocks | yes | The NEW user turn (NOT a messages array). |
| stream | boolean | no | When true, returns Anthropic SSE. Not supported with tools (yet). |
| tools | string[] | no | Array of registered tool IDs (see /v1/tools). qlaud handles the dispatch loop. |
| system, tool_choice, temperature, top_p, stop_sequences | - | no | Passed through to upstream verbatim. |
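The body rules above can be sketched as a small payload builder. This is an illustrative sketch only: `build_turn_payload` is a hypothetical name, and the client-side check simply mirrors the documented restriction that streaming and tools cannot be combined.

```python
def build_turn_payload(model, max_tokens, content, *, stream=False,
                       tools=None, **passthrough):
    """Assemble the JSON body for POST /v1/threads/:id/messages.

    Hypothetical helper. `content` is only the NEW user turn (a string or
    a list of content blocks), never a full messages array.
    """
    if stream and tools:
        # Documented as a 400 from the API, so fail before sending.
        raise ValueError("stream and tools cannot be combined")
    payload = {"model": model, "max_tokens": max_tokens, "content": content}
    if stream:
        payload["stream"] = True
    if tools:
        payload["tools"] = list(tools)
    # system, temperature, top_p, etc. are forwarded upstream verbatim.
    payload.update(passthrough)
    return payload
```

For example, `build_turn_payload("claude-sonnet-4-6", 1024, "What did we just discuss?", temperature=0.2)` produces the curl body above plus a `temperature` field.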
Response
Standard Anthropic Messages response with three extra fields attached (thread_id, seq, cost_micros):
{
"id": "msg_xxx",
"type": "message",
"role": "assistant",
"content": [{"type": "text", "text": "..."}],
"stop_reason": "end_turn",
"usage": { "input_tokens": 12, "output_tokens": 18 },
"thread_id": "2f1d0c7f-...",
"seq": 4,
"cost_micros": 465
}
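A caller will usually want the assistant text and the qlaud-specific fields separately. The sketch below shows one way to split them, assuming the response has already been parsed from JSON into a dict; `TurnResult` and `parse_turn_response` are hypothetical names, and `cost_micros` is carried through as-is because its unit is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnResult:
    text: str
    thread_id: str
    seq: int
    cost_micros: int


def parse_turn_response(body):
    """Split a non-streaming response into its text and the qlaud extras."""
    # Concatenate only the text blocks; other block types are skipped.
    text = "".join(
        block.get("text", "")
        for block in body["content"]
        if block.get("type") == "text"
    )
    return TurnResult(text, body["thread_id"], body["seq"], body["cost_micros"])
```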
When stream: true, the response is text/event-stream instead and the
thread/seq attribution lands in headers:
content-type: text/event-stream
x-qlaud-thread-id: 2f1d0c7f-...
x-qlaud-assistant-seq: 4
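Since a streamed response carries no JSON envelope, the attribution has to be read from the headers. A minimal sketch, assuming the headers are available as a name-to-value mapping (`stream_attribution` is a hypothetical helper):

```python
def stream_attribution(headers):
    """Return (thread_id, assistant_seq) from streaming response headers.

    `headers` is any mapping of header name to value. Names are matched
    case-insensitively because HTTP header names are case-insensitive.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["x-qlaud-thread-id"], int(lowered["x-qlaud-assistant-seq"])
```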
Cross-shape works: pass model: "gpt-5.4" to a thread of Claude turns and
qlaud translates transparently. The conversation history persists; the
underlying model can change per request.
GET /v1/threads — List
curl 'https://api.qlaud.ai/v1/threads?end_user_id=user_42&limit=20' \
-H "x-api-key: $QLAUD_API_KEY"
Query
| Param | Default | Description |
|---|---|---|
| limit | 20 (max 100) | Page size. |
| end_user_id | - | Narrow to one of your end-users. |
Response
{
"object": "list",
"data": [
{
"id": "2f1d0c7f-...",
"object": "thread",
"end_user_id": "user_42",
"metadata": {"plan": "pro"},
"created_at": 1777262997717,
"last_active_at": 1777263012890
}
]
}
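The query string for this endpoint can be built as follows. This is a sketch under assumptions: `list_threads_url` is a hypothetical helper, the upper clamp uses the documented maximum of 100, and the lower bound of 1 is an assumption that this page does not state.

```python
from urllib.parse import urlencode

MAX_THREAD_PAGE = 100  # documented maximum for `limit`


def list_threads_url(end_user_id=None, limit=20):
    """Build the GET /v1/threads URL with a clamped page size."""
    # Lower bound of 1 is an assumption; only the max of 100 is documented.
    params = {"limit": max(1, min(limit, MAX_THREAD_PAGE))}
    if end_user_id is not None:
        params["end_user_id"] = end_user_id
    return "https://api.qlaud.ai/v1/threads?" + urlencode(params)
```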
GET /v1/threads/:id — Get one
Returns the same shape as a list entry. 404 if you don’t own the thread or
it’s been soft-deleted.
GET /v1/threads/:id/messages — List turns
curl "https://api.qlaud.ai/v1/threads/$THREAD_ID/messages?limit=50" \
-H "x-api-key: $QLAUD_API_KEY"
Query
| Param | Default | Description |
|---|---|---|
| limit | 50 (max 200) | Page size. |
| after_seq | - | Cursor for forward paging. |
Response
{
"object": "list",
"data": [
{
"seq": 1,
"role": "user",
"content": "My name is Bob.",
"request_id": null,
"created_at": 1777262997900
},
{
"seq": 2,
"role": "assistant",
"content": [{"type": "text", "text": "Got it, Bob!"}],
"request_id": "msg_xxx",
"created_at": 1777262998765
}
],
"has_more": false,
"next_after_seq": 2
}
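Forward paging over a long thread can be sketched as a generator. The assumption here, implied by the field names but not spelled out above, is that `next_after_seq` from one page is passed as `after_seq` on the next request. `iter_turns` and the caller-supplied `fetch_page` function are hypothetical.

```python
def iter_turns(fetch_page, limit=50):
    """Yield every turn in a thread, following the after_seq cursor.

    `fetch_page(after_seq, limit)` is a caller-supplied function that
    performs GET /v1/threads/:id/messages and returns the parsed JSON.
    """
    after_seq = None  # first page: no cursor
    while True:
        page = fetch_page(after_seq, limit)
        yield from page["data"]
        if not page.get("has_more"):
            return
        after_seq = page["next_after_seq"]
```

Keeping the HTTP call behind `fetch_page` lets the paging logic be exercised with canned pages, without a network connection.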
DELETE /v1/threads/:id — Soft delete
curl -X DELETE https://api.qlaud.ai/v1/threads/$THREAD_ID \
-H "x-api-key: $QLAUD_API_KEY"
This is a soft delete: the row stays for audit, and a hard-delete cron job
sweeps it later. Subsequent GETs return 404.
Errors
| Status | Meaning |
|---|---|
| 400 | Invalid body (missing model/max_tokens/content); streaming + tools combo |
| 401 | Bad / revoked qlk key |
| 402 | Wallet exhausted OR per-key cap exceeded |
| 404 | Thread not found OR not owned by caller |
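One way a client might surface these statuses is as typed exceptions. The class names and `raise_for_status` below are hypothetical; only the status codes and their meanings come from the table above.

```python
class QlaudError(Exception):
    """Base class for the hypothetical error types below."""


class InvalidRequest(QlaudError):
    """400: invalid body, or streaming combined with tools."""


class AuthError(QlaudError):
    """401: bad or revoked key."""


class PaymentRequired(QlaudError):
    """402: wallet exhausted or per-key cap exceeded."""


class ThreadNotFound(QlaudError):
    """404: thread missing or not owned by the caller."""


_ERRORS_BY_STATUS = {
    400: InvalidRequest,
    401: AuthError,
    402: PaymentRequired,
    404: ThreadNotFound,
}


def raise_for_status(status, detail=""):
    """Raise a typed error for documented statuses; let 2xx pass through."""
    if 200 <= status < 300:
        return
    # Statuses not listed in the table fall back to the base class.
    raise _ERRORS_BY_STATUS.get(status, QlaudError)(f"HTTP {status}: {detail}")
```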
Limits (v1)
- History capped at last 50 turns when loading for the upstream call. Token-aware truncation comes later.
- Streaming + tools combo not yet supported. Use one or the other.
- Each turn embeds asynchronously into /v1/search — search becomes available within a few seconds of the turn persisting.
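To make the history cap concrete, the sketch below models it as a simple tail slice. It illustrates the documented behavior and is not qlaud's server code: `history_window` is a hypothetical name, and it assumes a "turn" corresponds to one persisted entry with its own seq.

```python
HISTORY_CAP = 50  # documented v1 cap on turns loaded for the upstream call


def history_window(turns, cap=HISTORY_CAP):
    """Return the most recent `cap` turns, oldest first."""
    return turns[-cap:] if cap > 0 else []
```

Under that assumption, a thread with 60 persisted turns would send turns 11 through 60 upstream, so anything the model needs from the first ten has to be restated or carried in the system prompt.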