/v1/threads

Threads are qlaud’s conversation primitive. Create a thread once, then send only the new user turn on each call: qlaud loads the prior history, calls the upstream model, persists both turns, and returns the assistant message in the standard Anthropic Messages shape. This removes the need for a per-app messages table and a context-window loader, and it settles the “how do I switch models mid-conversation” question, because every endpoint below uses the same Anthropic shape regardless of the underlying model.

POST /v1/threads — Create

curl https://api.qlaud.ai/v1/threads \
  -H "x-api-key: $QLAUD_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "end_user_id": "user_42",
    "metadata": {"plan": "pro", "feature": "/refunds"}
  }'

Body

Field	Type	Required	Description
`end_user_id`	string	no	Opaque id for YOUR end-user (distinct from your qlaud account). Used to filter `/v1/threads` listings + `/v1/search` results.
`metadata`	object	no	Arbitrary JSON. Stored verbatim, surfaced on read paths.

Response (201)

{
  "id": "2f1d0c7f-e2a1-40e4-8e21-182cf27deeb7",
  "object": "thread",
  "end_user_id": "user_42",
  "metadata": {"plan": "pro", "feature": "/refunds"},
  "created_at": 1777262997717,
  "last_active_at": 1777262997717
}
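The same call can be made from Python with only the standard library. The following is a minimal sketch, not an official client: the helper names `build_create_thread_request` and `create_thread` are hypothetical, while the base URL and the `x-api-key` header are taken from the curl example above. Building the request is kept separate from sending it so the payload can be inspected without touching the network.

```python
import json
import urllib.request

BASE_URL = "https://api.qlaud.ai"  # taken from the curl example above


def build_create_thread_request(api_key, end_user_id=None, metadata=None):
    """Build (but do not send) a POST /v1/threads request."""
    body = {}
    # Both fields are optional, so include only the ones that were provided.
    if end_user_id is not None:
        body["end_user_id"] = end_user_id
    if metadata is not None:
        body["metadata"] = metadata
    return urllib.request.Request(
        f"{BASE_URL}/v1/threads",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


def create_thread(api_key, end_user_id=None, metadata=None):
    """Send the request and return the parsed thread object."""
    request = build_create_thread_request(api_key, end_user_id, metadata)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A caller would typically keep the returned `id` and reuse it for every later turn in that conversation.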

POST /v1/threads/:id/messages — Send a turn

This is the core endpoint. The caller sends only the new user content; qlaud loads the thread history, runs the upstream call, persists both turns, and returns the assistant response.

curl https://api.qlaud.ai/v1/threads/$THREAD_ID/messages \
  -H "x-api-key: $QLAUD_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "content": "What did we just discuss?"
  }'

Body

Standard Anthropic Messages fields, with `content` in place of the `messages` array, plus:

Field	Type	Required	Description
`model`	string	yes	Any catalog model id.
`max_tokens`	number	yes	Response cap.
`content`	string \| content blocks	yes	The NEW user turn (NOT a `messages` array).
`stream`	boolean	no	When `true`, returns Anthropic SSE. Not supported with `tools` (yet).
`tools`	string[]	no	Array of registered tool IDs (see /v1/tools). qlaud handles the dispatch loop.
`system`, `tool_choice`, `temperature`, `top_p`, `stop_sequences`	—	no	Passed through to upstream verbatim.

Response

Standard Anthropic Messages response with three extra fields attached (`thread_id`, `seq`, and `cost_micros`):

{
  "id": "msg_xxx",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "..."}],
  "stop_reason": "end_turn",
  "usage": { "input_tokens": 12, "output_tokens": 18 },
  "thread_id": "2f1d0c7f-...",
  "seq": 4,
  "cost_micros": 465
}
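As a rough Python counterpart to the curl call, the sketch below builds the request body and pulls the reply text plus the qlaud-specific fields out of a response. The function names `build_turn_request` and `summarize_turn` are hypothetical. The client-side check for `stream` combined with `tools` simply mirrors the 400 case listed under Errors; it is a convenience, not something the API requires.

```python
import json
import urllib.request

BASE_URL = "https://api.qlaud.ai"  # taken from the curl example above


def build_turn_request(api_key, thread_id, model, max_tokens, content, **extra):
    """Build (but do not send) a POST /v1/threads/:id/messages request."""
    # Streaming combined with tools is documented as a 400, so fail early.
    if extra.get("stream") and extra.get("tools"):
        raise ValueError("stream and tools cannot be combined")
    # `content` is only the new user turn; qlaud supplies the history.
    body = {"model": model, "max_tokens": max_tokens, "content": content, **extra}
    return urllib.request.Request(
        f"{BASE_URL}/v1/threads/{thread_id}/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


def summarize_turn(message):
    """Extract the reply text and the qlaud-specific fields from a response."""
    text = "".join(
        block["text"] for block in message["content"] if block.get("type") == "text"
    )
    return {
        "text": text,
        "thread_id": message["thread_id"],
        "seq": message["seq"],
        "cost_micros": message["cost_micros"],
    }
```

Pass-through fields such as `system` or `temperature` go in as keyword arguments and end up in the JSON body unchanged.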

When `stream` is `true`, the response is `text/event-stream` instead, and the thread and seq attribution arrives in response headers:

content-type: text/event-stream
x-qlaud-thread-id: 2f1d0c7f-...
x-qlaud-assistant-seq: 4
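One way to consume the stream is sketched below. It assumes qlaud's events follow the public Anthropic streaming format, where `content_block_delta` events carry `text_delta` payloads; this page says "Anthropic SSE" but does not list the event types, so treat that as an assumption. The helper names `iter_text_deltas` and `read_attribution` are hypothetical.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        # Only "data:" lines carry JSON payloads; skip "event:" lines and blanks.
        if not line.startswith("data:"):
            continue
        payload = json.loads(line[len("data:"):])
        if payload.get("type") != "content_block_delta":
            continue
        delta = payload.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta["text"]


def read_attribution(headers):
    """Read the thread id and assistant seq from the response headers."""
    return {
        "thread_id": headers["x-qlaud-thread-id"],
        "seq": int(headers["x-qlaud-assistant-seq"]),
    }
```

With `urllib.request.urlopen`, the response object can be iterated line by line and passed straight to `iter_text_deltas`, and `resp.headers` can be passed to `read_attribution`.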

Cross-model threads work: pass `"model": "gpt-5.4"` on a thread of Claude turns and qlaud translates between formats transparently. The conversation history persists; the underlying model can change per request.

GET /v1/threads — List

curl 'https://api.qlaud.ai/v1/threads?end_user_id=user_42&limit=20' \
  -H "x-api-key: $QLAUD_API_KEY"

Query

Param	Default	Description
`limit`	`20` (max `100`)	Page size.
`end_user_id`	—	Narrow to one of your end-users.

Response

{
  "object": "list",
  "data": [
    {
      "id": "2f1d0c7f-...",
      "object": "thread",
      "end_user_id": "user_42",
      "metadata": {"plan": "pro"},
      "created_at": 1777262997717,
      "last_active_at": 1777263012890
    }
  ]
}
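Building the query string by hand is easy to get wrong once values need escaping, so a small helper can do it. This is a sketch with a hypothetical function name, `list_threads_url`. It clamps `limit` to the documented maximum on the client side, because this page does not say how the server treats larger values.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.qlaud.ai"  # taken from the curl example above
MAX_LIMIT = 100  # documented maximum page size for this endpoint


def list_threads_url(end_user_id=None, limit=20):
    """Return the GET /v1/threads URL for the given filters."""
    # Clamp client-side; server handling of out-of-range values is not documented.
    params = {"limit": max(1, min(limit, MAX_LIMIT))}
    if end_user_id is not None:
        params["end_user_id"] = end_user_id
    return f"{BASE_URL}/v1/threads?{urlencode(params)}"
```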

GET /v1/threads/:id — Get one

Returns the same shape as a list entry. 404 if you don’t own the thread or it’s been soft-deleted.

GET /v1/threads/:id/messages — List turns

curl "https://api.qlaud.ai/v1/threads/$THREAD_ID/messages?limit=50" \
  -H "x-api-key: $QLAUD_API_KEY"

Query

Param	Default	Description
`limit`	`50` (max `200`)	Page size.
`after_seq`	—	Cursor for forward paging.

Response

{
  "object": "list",
  "data": [
    {
      "seq": 1,
      "role": "user",
      "content": "My name is Bob.",
      "request_id": null,
      "created_at": 1777262997900
    },
    {
      "seq": 2,
      "role": "assistant",
      "content": [{"type": "text", "text": "Got it, Bob!"}],
      "request_id": "msg_xxx",
      "created_at": 1777262998765
    }
  ],
  "has_more": false,
  "next_after_seq": 2
}
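Forward paging can be wrapped in a generator. The sketch below assumes that `next_after_seq` from one page is the value to send as `after_seq` on the next request, which the field names suggest but this page does not state outright. `fetch_page` is a hypothetical callable that performs the GET with the given query parameters and returns the parsed JSON.

```python
def iter_thread_messages(fetch_page, limit=50):
    """Yield every turn in a thread by following the after_seq cursor."""
    after_seq = None
    while True:
        params = {"limit": limit}
        # The first request has no cursor; later ones resume after the last seq seen.
        if after_seq is not None:
            params["after_seq"] = after_seq
        page = fetch_page(params)
        yield from page["data"]
        if not page.get("has_more"):
            return
        after_seq = page["next_after_seq"]
```

Because the transport is injected, the same loop works with `urllib`, a third-party HTTP client, or a stub in tests.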

DELETE /v1/threads/:id — Soft delete

curl -X DELETE https://api.qlaud.ai/v1/threads/$THREAD_ID \
  -H "x-api-key: $QLAUD_API_KEY"

The delete is soft: the row stays for audit, and a hard-delete cron job sweeps it later. Subsequent GETs return 404.

Errors

Status	Meaning
400	Invalid body (missing `model`/`max_tokens`/`content`); streaming + tools combo
401	Bad / revoked qlk key
402	Wallet exhausted OR per-key cap exceeded
404	Thread not found OR not owned by caller
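A client can turn these statuses into a single exception type. The following is a sketch under stated assumptions: `QlaudError` and `send` are hypothetical names, the meanings are copied from the table above, and the error response body is ignored because its format is not documented here.

```python
import json
import urllib.error
import urllib.request

# Meanings copied from the table above.
STATUS_MEANINGS = {
    400: "invalid body, or stream combined with tools",
    401: "bad or revoked API key",
    402: "wallet exhausted or per-key cap exceeded",
    404: "thread not found or not owned by caller",
}


class QlaudError(Exception):
    """Hypothetical exception type carrying the HTTP status."""

    def __init__(self, status):
        self.status = status
        self.meaning = STATUS_MEANINGS.get(status, "undocumented status")
        super().__init__(f"{status}: {self.meaning}")


def send(request):
    """Send a prepared urllib request and return the parsed JSON body."""
    try:
        with urllib.request.urlopen(request) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # The error body format is not documented here, so only the status is used.
        raise QlaudError(err.code) from err
```

Callers can then branch on `err.status`, for example prompting for a top-up on 402 and treating 404 as "thread gone".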

Limits (v1)

History capped at last 50 turns when loading for the upstream call. Token-aware truncation comes later.
Streaming + tools combo not yet supported. Use one or the other.
Each turn embeds asynchronously into /v1/search — search becomes available within a few seconds of the turn persisting.
