Tools are functions the assistant can call mid-conversation. Three flavours:
- Webhook tools — you host an HTTP endpoint, qlaud signs + POSTs
to it, your code runs the business logic. Documented below.
- Built-in tools — pick a handler from qlaud’s curated catalog
(web search, image gen, send email, Slack/Linear/Zendesk/GitHub/Notion
actions, code execution), supply your provider API key, no webhook
to host. See /v1/builtins.
- MCP servers — connect any Model Context Protocol
server URL (Linear, Stripe, Atlassian, Sentry, your own) and we
surface every tool it exposes. Zero wrappers to write — vendors
already wrote them. See /v1/mcp-servers.
All three register into the same per-account name namespace and look
identical to the model at dispatch time. Pick built-in for the curated
common path, MCP for full vendor coverage, webhooks for custom
business logic no public tool can express.
When the assistant emits a tool_use block, qlaud dispatches it
(webhook POST or in-process handler), awaits the result, appends a
tool_result, re-calls the assistant, and loops until the assistant
returns a turn with no tool_use. The behaviour is the same for every flavour.
This removes the per-app tool-call state machine: you write one HTTP
handler per custom tool, and qlaud owns the rest.
All /v1/tools endpoints are master-key only. Tool registration is
control plane (set up by you, the developer) — not data plane.
Per-user qlk_live_… keys you’ve minted for your end-users will get
403 here. This is by design: a leaked per-user key shouldn’t let an
end-user point a tool’s webhook_url at attacker.com (qlaud signs
the dispatch payload, which contains user input + thread id +
end_user_id), squat on tool names, or revoke your registered tools.
POST /v1/tools — Register
curl https://api.qlaud.ai/v1/tools \
-H "x-api-key: $QLAUD_API_KEY" \
-H "content-type: application/json" \
-d '{
"name": "get_weather",
"description": "Get current weather for a location",
"input_schema": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
},
"webhook_url": "https://my-app.example/qlaud/tools/weather",
"timeout_ms": 15000
}'
Body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | yes | - | The function name passed to the LLM. Per-account unique among non-revoked tools. |
| description | string | yes | - | Forwarded to the LLM as the tool description. |
| input_schema | JSON Schema | yes | - | Forwarded as input_schema. |
| webhook_url | string | yes | - | Must be https://. qlaud POSTs the tool_use payload here. |
| timeout_ms | number | no | 30000 | Per-tool override of the 30 s default. Max 120000. |
Response (201)
{
"id": "tool_f5d6afcef39743bcbb3114ce3b9c8e66",
"object": "tool",
"name": "get_weather",
"description": "Get current weather for a location",
"input_schema": { "type": "object", "...": "..." },
"webhook_url": "https://my-app.example/qlaud/tools/weather",
"timeout_ms": 15000,
"secret": "wsk_AbCdEf...XyZ",
"created_at": 1777262997717
}
secret is returned once. You use it to verify the HMAC-SHA256 signature
on every webhook delivery. Lose it and you’ll need to revoke + re-register
the tool to get a new one.
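For callers who prefer Python to curl, the registration call above could be sketched roughly as follows. This is a minimal sketch using only the standard library: the helper names `build_register_request` and `register_tool`, and the idea of writing the secret to a local file at `secret_path`, are illustrative assumptions and not part of qlaud's API. The client-side checks mirror the documented constraints (https-only webhook_url, timeout_ms capped at 120000).

```python
import json
import urllib.request

API_BASE = "https://api.qlaud.ai"  # base URL taken from the curl example


def build_register_request(api_key: str, name: str, description: str,
                           input_schema: dict, webhook_url: str,
                           timeout_ms: int = 30000) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/tools request."""
    if not webhook_url.startswith("https://"):
        raise ValueError("webhook_url must be https://")
    # The upper bound is documented; rejecting values <= 0 is an assumption.
    if not 0 < timeout_ms <= 120000:
        raise ValueError("timeout_ms must be between 1 and 120000")
    body = {
        "name": name,
        "description": description,
        "input_schema": input_schema,
        "webhook_url": webhook_url,
        "timeout_ms": timeout_ms,
    }
    return urllib.request.Request(
        f"{API_BASE}/v1/tools",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


def register_tool(request: urllib.request.Request, secret_path: str) -> str:
    """Send the request and persist the one-time secret.

    Writing to a local file is a hypothetical storage choice; a real
    deployment would more likely use a secrets manager.
    """
    with urllib.request.urlopen(request) as resp:
        tool = json.load(resp)
    with open(secret_path, "w", encoding="utf-8") as fh:
        fh.write(tool["secret"])
    return tool["id"]
```

Because the secret is only returned once, the sketch stores it in the same step that reads the response rather than leaving that to the caller.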
curl https://api.qlaud.ai/v1/tools -H "x-api-key: $QLAUD_API_KEY"
GET /v1/tools returns every non-revoked tool you’ve registered. The secret field is not included.
curl -X DELETE https://api.qlaud.ai/v1/tools/$TOOL_ID \
-H "x-api-key: $QLAUD_API_KEY"
DELETE /v1/tools/:id is a soft revoke. New thread messages can’t reference the tool by id; existing
thread audits still resolve cleanly.
How registered webhooks reach the model
Once you’ve registered a webhook tool with POST /v1/tools, getting it
in front of the model takes one of two shapes — pick based on whether
you want auto-discovery or explicit listing.
Recommended: dynamic discovery (default)
Send your thread message with no tools array. tools_mode
defaults to "dynamic" and qlaud injects 4 meta-tools. The model then
calls qlaud_search_tools(intent: "...") and every webhook you’ve
registered appears in the results — alongside built-ins and catalog
MCP connectors. The search is over the full tools table for your
account; there’s no kind filter.
# tools_mode defaults to "dynamic" because no `tools` array is passed.
# The model auto-discovers your webhook via qlaud_search_tools.
curl https://api.qlaud.ai/v1/threads/$THREAD_ID/messages \
-H "x-api-key: $USER_KEY" \
-H "content-type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"content": "What is the weather in Tokyo right now?",
"stream": true
}'
The model will call qlaud_search_tools({intent: "current weather"}),
your weather webhook will be in the results, and the model will
invoke it through qlaud_multi_execute. qlaud POSTs the tool input to
your webhook_url, your endpoint returns JSON, and qlaud streams the
result back over the same SSE connection.
Explicit: pin a fixed list
If you want only specific tools available (no auto-discovery), pass
tools: ["tool_xxx", ...] with the tool IDs from
POST /v1/tools. tools_mode defaults to "explicit" in this case;
no meta-tools are injected.
curl https://api.qlaud.ai/v1/threads/$THREAD_ID/messages \
-H "x-api-key: $USER_KEY" \
-H "content-type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"content": "What is the weather in Tokyo right now?",
"tools": ["tool_a3b4c5d6e7f8..."]
}'
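The difference between the two request shapes comes down to whether a tools array is present. A small sketch of a body builder, assuming a hypothetical helper name `build_message_body`; treating an empty list the same as "no tools" is a choice made here, since how the API handles an empty array is not stated.

```python
import json
from typing import Optional, Sequence


def build_message_body(content: str, model: str, max_tokens: int = 1024,
                       tool_ids: Optional[Sequence[str]] = None,
                       stream: bool = False) -> str:
    """Return the JSON body for POST /v1/threads/:id/messages.

    With no tool_ids the `tools` key is omitted, so tools_mode falls back
    to "dynamic". With tool_ids the list is pinned ("explicit").
    """
    body = {"model": model, "max_tokens": max_tokens, "content": content}
    if tool_ids:  # assumption: an empty list is treated as "not pinned"
        body["tools"] = list(tool_ids)
    if stream:
        body["stream"] = True
    return json.dumps(body)
```

Calling it with `tool_ids=None` reproduces the dynamic-discovery request; passing the ids returned by POST /v1/tools reproduces the explicit one.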
Streaming works for either
stream: true flows the dispatch loop through a single SSE
connection regardless of tool kind — webhook, builtin, or MCP all
emit the same qlaud.tool_dispatch_start / qlaud.tool_dispatch_done
events. See the
Streaming section in /api-reference/threads
for the full event vocabulary and a worked example.
Model support
| Path | Today |
|---|---|
| Non-streaming dispatch loop (stream: false, tools registered) | ✅ Every model in the catalog (Claude, GPT, Gemini, DeepSeek, Mistral, Qwen, Grok, Groq, etc.) |
| Streaming dispatch loop (stream: true, tools registered) | ✅ Every Anthropic-shape AND OpenAI-shape host; the client sees the same content_block_delta event vocabulary regardless of upstream |
| Streaming without tools (stream: true, no tools) | ✅ Every model |
The cross-shape SSE bridge translates each upstream’s native
streaming format (Anthropic content_block_delta passthrough,
OpenAI delta.tool_calls[].function.arguments chunks → equivalent
Anthropic events) so the qlaud.tool_dispatch_* event vocabulary
fires identically across providers. Vertex’s native Gemini SSE
shape is a small follow-up; until then, route Gemini through the
AI Studio OpenAI-compat endpoint (the default) for full streaming support.
Webhook contract
When the assistant emits a tool_use block, qlaud POSTs the following
payload to your webhook_url:
Headers
X-Qlaud-Timestamp: 1777262997717
X-Qlaud-Signature: <hex hmac-sha256 of "{timestamp}.{body}">
X-Qlaud-Tool-Id: tool_f5d6afcef...
X-Qlaud-Request-Id: msg_xxx
content-type: application/json
Body
{
"tool_id": "tool_f5d6afcef...",
"tool_use_id": "toolu_xxx",
"name": "get_weather",
"input": { "location": "San Francisco" },
"request_id": "msg_xxx",
"thread_id": "2f1d0c7f-..."
}
Expected response
{ "output": "It is 72°F sunny in San Francisco" }
output can be a string or any JSON value (objects/arrays get stringified
before going back to the assistant). To signal a non-fatal error so the
model can decide what to do:
{ "output": "rate limit hit", "is_error": true }
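To make the request and response shapes concrete, here is a minimal sketch of a webhook endpoint built on Python's standard `http.server`. Everything named here (`lookup_weather`, `handle_tool_call`, `ToolHandler`, `serve`, the port) is hypothetical, and the sketch assumes TLS is terminated by a proxy in front of it, since webhook_url must be https://. Returning HTTP 200 together with is_error for a bad input is also an assumption about how best to use the non-fatal error shape.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def lookup_weather(location: str) -> str:
    """Hypothetical stand-in for real business logic."""
    return f"It is 72°F sunny in {location}"


def handle_tool_call(payload: dict) -> dict:
    """Map a qlaud dispatch payload to the documented response shape."""
    try:
        location = payload["input"]["location"]
    except (KeyError, TypeError):
        # Non-fatal error: let the model decide what to do next.
        return {"output": "missing required input: location", "is_error": True}
    return {"output": lookup_weather(location)}


class ToolHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("content-length", 0))
        raw = self.rfile.read(length)
        # Signature verification belongs here, on the raw bytes,
        # before the body is parsed.
        result = handle_tool_call(json.loads(raw))
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("content-type", "application/json")
        self.send_header("content-length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the handler locally; assumed to sit behind a TLS-terminating proxy."""
    HTTPServer(("127.0.0.1", port), ToolHandler).serve_forever()
```

Keeping `handle_tool_call` free of HTTP details makes it straightforward to unit-test against the payload shown above.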
Verifying the signature
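import hashlib
import hmac

def verify(headers, body_bytes, secret):
    """Return True if X-Qlaud-Signature matches the raw request body."""
    ts = headers["X-Qlaud-Timestamp"]
    sig = headers["X-Qlaud-Signature"]
    # The signed payload is "{timestamp}.{body}". Build it from the raw
    # bytes so a non-UTF-8 body cannot raise during decoding.
    payload = ts.encode() + b"." + body_bytes
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature through timing.
    return hmac.compare_digest(sig, expected)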
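For local testing it can help to produce signatures yourself and to reject stale deliveries. The following is a sketch under two assumptions that the text above does not confirm: that X-Qlaud-Timestamp is in Unix milliseconds (the 13-digit example value suggests so), and that a five-minute tolerance is acceptable. The names `sign`, `verify_fresh`, and `max_skew_ms` are illustrative only.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: str, timestamp_ms: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of "{timestamp}.{body}"."""
    payload = timestamp_ms.encode() + b"." + body
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


def verify_fresh(secret: str, timestamp_ms: str, signature: str, body: bytes,
                 max_skew_ms: int = 300_000,
                 now_ms: Optional[int] = None) -> bool:
    """Check the signature and reject timestamps outside the tolerance.

    max_skew_ms is an assumed tolerance, not a documented qlaud value.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    try:
        skew = abs(now_ms - int(timestamp_ms))
    except ValueError:
        return False  # timestamp header was not an integer
    if skew > max_skew_ms:
        return False
    expected = sign(secret, timestamp_ms, body)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The `now_ms` parameter exists so tests can pin the clock; production callers would leave it unset.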
Loop semantics
- Iteration cap: 8 by default. Hitting it returns a partial conversation with stop_reason: "tool_loop_limit".
- Parallel dispatch: when the model emits multiple tool_use blocks in one turn, qlaud dispatches all of them in parallel via Promise.all. Total latency = max(per-webhook), not sum.
- Retries: 3 attempts with exponential backoff (250 ms / 1 s / 4 s) on 5xx + network errors. 4xx terminates immediately and the result is sent back to the assistant as is_error: true so it can decide how to proceed.
- Webhook timeout: 30 s default, timeout_ms per-tool override.
- Cross-provider: same Anthropic-shape tool_use/tool_result semantics whether the underlying model is Claude or GPT or DeepSeek. You write one handler.
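One practical consequence of the retry rule above is that a handler with side effects may be invoked more than once for the same call. A minimal sketch of a de-duplicating wrapper follows; it assumes that retried deliveries carry the same tool_use_id, which the text implies but does not state, and the class name `IdempotentDispatcher` is hypothetical.

```python
from typing import Callable, Dict


class IdempotentDispatcher:
    """Run a tool handler at most once per tool_use_id.

    Assumption: a retried delivery reuses the tool_use_id of the original.
    Results live in process memory here; a shared store with expiry would
    be needed across multiple workers.
    """

    def __init__(self, handler: Callable[[dict], dict]) -> None:
        self._handler = handler
        self._results: Dict[str, dict] = {}

    def dispatch(self, payload: dict) -> dict:
        key = payload["tool_use_id"]
        if key not in self._results:
            # If the handler raises, nothing is cached, so a retry
            # runs the handler again.
            self._results[key] = self._handler(payload)
        return self._results[key]
```

A handler that only reads data, such as a weather lookup, does not need this; it matters for tools that send email, create tickets, or charge cards.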
POST /v1/threads/:id/messages supports stream: true together with
tools. qlaud opens one upstream call per iteration of the dispatch
loop, tees the response, pipes the customer-facing branch through
verbatim, and inspects the other branch for tool_use blocks. Between
iterations qlaud injects extra SSE events so the UI can render tool
progress inline.
SSE events seen by the customer:
Standard Anthropic events flow as-is during each iteration
(message_start, content_block_start, content_block_delta,
content_block_stop, message_delta, message_stop). Block indexes
reset on every message_start — match tool dispatches by
tool_use_id, not by index.
qlaud-injected events around each tool dispatch:
data: {"type":"qlaud.tool_dispatch_start","tool_use_id":"toolu_xxx",
"name":"web_search","iteration":1}
data: {"type":"qlaud.tool_dispatch_done","tool_use_id":"toolu_xxx",
"name":"web_search","iteration":1,"is_error":false,
"output":{"query":"…","results":[…]}}
Iteration boundary (only emitted for iteration 2 onward):
data: {"type":"qlaud.iteration_start","iteration":2}
Terminal events:
data: {"type":"qlaud.done","iterations":3,"hit_max_iterations":false}
# OR, on mid-stream failure (we've already sent 200, can't change status):
data: {"type":"qlaud.error","message":"…","status":502,"iteration":2}
Provider support: streaming + tools currently requires an
Anthropic-native passthrough host (Anthropic API, Bedrock-Anthropic,
Vertex-Anthropic). Other providers return 503 if you combine
stream: true with tools — drop stream: true to use the
non-streaming dispatch loop.
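On the client side, the qlaud-specific events described above can be folded into a per-tool status map. The sketch below assumes each event's JSON arrives on a single `data:` line (the examples above are wrapped only for display) and ignores the standard Anthropic events; `iter_sse_events` and `track_dispatches` are hypothetical helper names.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON object from each `data:` line; skip everything else."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # tolerate keep-alives or non-JSON payloads
        if isinstance(event, dict):
            yield event


def track_dispatches(events: Iterable[dict]) -> dict:
    """Summarise tool activity, keyed by tool_use_id rather than block index."""
    tools: Dict[str, dict] = {}
    summary = {"tools": tools, "done": None, "error": None}
    for event in events:
        kind = event.get("type")
        if kind == "qlaud.tool_dispatch_start":
            tools[event["tool_use_id"]] = {"name": event["name"], "status": "running"}
        elif kind == "qlaud.tool_dispatch_done":
            entry = tools.setdefault(event["tool_use_id"], {"name": event.get("name")})
            entry["status"] = "error" if event.get("is_error") else "ok"
            entry["output"] = event.get("output")
        elif kind == "qlaud.done":
            summary["done"] = event
        elif kind == "qlaud.error":
            summary["error"] = event
    return summary
```

A UI could render each entry in `summary["tools"]` as an inline progress row and treat a non-empty `summary["error"]` as the terminal failure, since the HTTP status is already 200 by then.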
Errors
| Status | Meaning |
|---|---|
| 400 | Invalid body (missing required field, non-https:// URL, schema not an object) |
| 409 | A non-revoked tool with the same name already exists |
| 401 | Bad / revoked qlk key |
| 403 | Caller used a per-user (standard-scope) key. All /v1/tools endpoints require a master (admin-scope) key. |
| 404 | Tool not found OR not owned by caller |
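A client wrapper might translate these statuses into actionable hints. The mapping below is a sketch that only restates the table; the suggested remedies and the name `explain_tools_error` are illustrative assumptions.

```python
# Hints keyed by the HTTP statuses listed in the table above.
TOOLS_ERROR_HINTS = {
    400: "invalid body: check required fields, https:// webhook_url, object schema",
    401: "bad or revoked qlk key",
    403: "per-user key used: /v1/tools endpoints require a master key",
    404: "tool not found, or not owned by the caller",
    409: "a non-revoked tool with this name already exists",
}


def explain_tools_error(status: int) -> str:
    """Return a short hint for a /v1/tools error status."""
    return TOOLS_ERROR_HINTS.get(status, f"unexpected status {status}")
```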