Chat
Chat requests use an OpenAI-style messages array plus explicit tenant and user fields. In normal browser use, the frontend fills those fields and the gateway derives trusted tenant/user headers from the authenticated JWT.
Request Shape
{
  "messages": [
    {"role": "user", "content": "Summarize this document"}
  ],
  "agent_id": "default",
  "session_id": "session-123",
  "tenant_id": "default",
  "user_id": "alice",
  "model": "openai/gpt-5.4",
  "effort": "medium",
  "routing_mode": "auto",
  "privacy_mode": false
}
content can be a string or a list of OpenAI-style content parts for multimodal turns.
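As a rough illustration of the request shape above, the following Python sketch assembles a request body with either a plain string or a list of content parts. The helper name `build_chat_request` is hypothetical, and treating `model`, `effort`, and `routing_mode` as omittable is an assumption; the source only shows them populated.

```python
from typing import Any, Optional, Union

# A message's content is either a plain string or a list of
# OpenAI-style content parts (for multimodal turns).
Content = Union[str, list[dict[str, Any]]]


def build_chat_request(
    content: Content,
    *,
    session_id: str,
    user_id: str,
    tenant_id: str = "default",
    agent_id: str = "default",
    model: Optional[str] = None,
    effort: Optional[str] = None,
    routing_mode: Optional[str] = None,
    privacy_mode: bool = False,
) -> dict[str, Any]:
    """Build a chat request body (hypothetical helper, not part of the API)."""
    if not isinstance(content, (str, list)):
        raise TypeError("content must be a string or a list of content parts")

    body: dict[str, Any] = {
        "messages": [{"role": "user", "content": content}],
        "agent_id": agent_id,
        "session_id": session_id,
        "tenant_id": tenant_id,
        "user_id": user_id,
        "privacy_mode": privacy_mode,
    }
    # Assumption: these fields may be left out; include them only when set.
    for key, value in (
        ("model", model),
        ("effort", effort),
        ("routing_mode", routing_mode),
    ):
        if value is not None:
            body[key] = value
    return body


# Text-only turn.
text_request = build_chat_request(
    "Summarize this document", session_id="session-123", user_id="alice"
)

# Multimodal turn using OpenAI-style content parts (example URL is made up).
multimodal_request = build_chat_request(
    [
        {"type": "text", "text": "Describe this page"},
        {"type": "image_url", "image_url": {"url": "https://example.com/page.png"}},
    ],
    session_id="session-123",
    user_id="alice",
    model="openai/gpt-5.4",
)
```

Since the gateway derives the trusted tenant/user headers from the JWT, the `tenant_id` and `user_id` values a client puts in the body are best read as what the frontend fills in, not as the source of trust.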
Response Shape
{
  "message": {
    "role": "assistant",
    "content": "Here is the summary...",
    "tool_calls": [],
    "processing_trace": {},
    "receipt": {
      "turn_id": "01J...",
      "status": "pending",
      "receipt_url": "/v1/receipts/01J...",
      "proof_url": "/v1/receipts/01J.../proof"
    }
  },
  "session_id": "session-123",
  "agent_id": "default",
  "receipt": {
    "turn_id": "01J...",
    "status": "pending",
    "receipt_url": "/v1/receipts/01J...",
    "proof_url": "/v1/receipts/01J.../proof"
  }
}
The receipt starts as pending because audit sealing happens after terminal turn events arrive. The frontend polls /v1/receipts/{turn_id} until the receipt is sealed.
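The polling behavior described above might look roughly like the sketch below. The function name, the injected `fetch_receipt` callable, the polling interval, and the attempt limit are all assumptions made for illustration. Because the source only shows the "pending" status value, the loop stops on any status other than "pending" and does not assume the exact string used for a sealed receipt.

```python
import time
from typing import Any, Callable


def wait_for_sealed_receipt(
    turn_id: str,
    fetch_receipt: Callable[[str], dict[str, Any]],
    *,
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll /v1/receipts/{turn_id} until the receipt leaves "pending".

    fetch_receipt is a caller-supplied function that performs the
    authenticated GET for the given path and returns the decoded JSON.
    """
    path = f"/v1/receipts/{turn_id}"
    for attempt in range(max_attempts):
        receipt = fetch_receipt(path)
        # Assumption: any non-pending status means sealing has finished.
        if receipt.get("status") != "pending":
            return receipt
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(
        f"receipt for turn {turn_id} still pending after {max_attempts} attempts"
    )
```

Passing the HTTP call in as `fetch_receipt` keeps the sketch independent of any particular HTTP client and makes the loop easy to exercise with a stub.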
Streaming
Streaming uses typed SSE events, not the older untyped delta/done payload.
event: token
data: {"content":"The"}

event: tool_call
data: {"tool":"knowledge_search","arguments":"...","call_id":"call_1"}

event: done
data: {"session_id":"session-123","agent_id":"default","receipt":{"turn_id":"01J...","status":"pending"}}
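A client consuming this stream could be sketched as follows. This is a minimal parser for the three event types shown above, written under the assumption that each `data` payload is a single JSON object and that events are separated by blank lines, as the server-sent events format specifies. The function names `iter_sse_events` and `collect_stream` are hypothetical.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event_name = "message"  # SSE default when no "event:" field is sent
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
    # Lenient choice: flush a trailing event that lacks a final blank line.
    if data_lines:
        yield event_name, json.loads("\n".join(data_lines))


def collect_stream(lines: Iterable[str]) -> tuple[str, list[dict], Any]:
    """Accumulate token text and tool calls until the done event arrives."""
    text_parts: list[str] = []
    tool_calls: list[dict] = []
    done = None
    for name, payload in iter_sse_events(lines):
        if name == "token":
            text_parts.append(payload["content"])
        elif name == "tool_call":
            tool_calls.append(payload)
        elif name == "done":
            done = payload
            break
    return "".join(text_parts), tool_calls, done
```

The `done` payload carries the receipt with its `turn_id`, which is what a client would then use to poll for the sealed receipt.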
Agent Selection
agent_id: "default" runs the adaptive main agent. DB-backed custom agents can also be selected. Subagent types such as coder and researcher are not top-level chat targets; the main agent invokes them through Delegate.
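A client could guard against sending a subagent type as a top-level target with a check along these lines. This is only a sketch: the helper name is hypothetical, the set contains just the two subagent types named above and is likely incomplete, and the IDs of DB-backed custom agents are not known to the client, so everything else is passed through unchanged.

```python
# Subagent types named in this document; assumed to be a partial list.
SUBAGENT_TYPES = frozenset({"coder", "researcher"})


def validate_agent_id(agent_id: str) -> str:
    """Reject known subagent types; pass every other agent_id through."""
    if agent_id in SUBAGENT_TYPES:
        raise ValueError(
            f"{agent_id!r} is a subagent type, not a top-level chat target; "
            "the main agent invokes it through Delegate"
        )
    return agent_id
```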
Webhooks
Agent-runtime has a direct /webhook endpoint. The gateway does not currently register /v1/webhook, so do not document webhook traffic as gateway-routed unless that route is added.