Chat
Send an authenticated chat turn through the UI, understand the streaming events you see in the timeline, and retrieve the audit receipt for the completed turn.
Prerequisites
- Quickstart completed and the stack running (make health reports OK).
- Signed in to http://localhost as admin or testuser.
Steps
1. Start a conversation
Open http://localhost and send a message. Responses stream as server-sent events, so the UI renders tokens, tool activity, artifacts, guardrail decisions, and the final answer as they arrive.
The current /v1/chat request schema still requires tenant_id and
user_id. Treat this as a schema wart: gateway traffic derives both from the
signed principal and overwrites the body values before execution, so the
fields must be present but cannot expand caller scope.
{
  "messages": [
    {"role": "user", "content": "Give me a short platform overview."}
  ],
  "agent_id": "default",
  "session_id": "optional-session",
  "tenant_id": "default",
  "user_id": "admin"
}
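The overwrite described above can be pictured with a minimal Python sketch. It is an illustration only, not the gateway's code: Principal and apply_principal_scope are hypothetical names, and the sketch assumes the signed principal carries a tenant and a user identifier and that missing fields are rejected before the overwrite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Hypothetical stand-in for the identity carried by the signed token."""
    tenant_id: str
    user_id: str


def apply_principal_scope(body: dict, principal: Principal) -> dict:
    """Return a copy of the body with scope fields taken from the principal.

    Both fields must be present (the schema requires them), but whatever
    values the caller supplied are replaced.
    """
    for field in ("tenant_id", "user_id"):
        if field not in body:
            raise ValueError(f"missing required field: {field}")
    scoped = dict(body)
    scoped["tenant_id"] = principal.tenant_id
    scoped["user_id"] = principal.user_id
    return scoped


request_body = {
    "messages": [{"role": "user", "content": "Hello."}],
    "agent_id": "default",
    "tenant_id": "someone-elses-tenant",  # caller tries to widen scope
    "user_id": "root",
}
scoped = apply_principal_scope(request_body, Principal("default", "admin"))
print(scoped["tenant_id"], scoped["user_id"])  # default admin
```

The body values act as placeholders: sending the wrong tenant does not grant access to it, and omitting the field fails validation.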
Direct API calls go through the gateway and require a Keycloak access token:
TOKEN="<paste-from-browser-devtools>"
curl http://localhost:8080/v1/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello."}],
    "agent_id": "default",
    "tenant_id": "default",
    "user_id": "admin"
  }'
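For scripted calls, the same request can be assembled in Python with only the standard library. This is a sketch under the same assumptions as the curl call (gateway on localhost:8080, a valid Keycloak access token); build_chat_request is a hypothetical helper name, and the request is built but not sent so the example runs without a live stack.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080/v1/chat"  # same endpoint as the curl call


def build_chat_request(token: str, content: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat with a bearer token."""
    body = {
        "messages": [{"role": "user", "content": content}],
        "agent_id": "default",
        # Required by the schema; the gateway replaces both from the principal.
        "tenant_id": "default",
        "user_id": "admin",
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("example-token", "Hello.")
print(req.get_method(), req.full_url)
# To send it against a running stack: urllib.request.urlopen(req)
```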
2. Watch tool activity
Ask a question that benefits from tools:
Search the internal knowledge base for deployment runbook notes.
The stream emits typed events such as tool_call, tool_result,
trace_update, artifact, blocked, token, and done. The trace UI
summarizes model-visible progress; hidden reasoning is never rendered as raw
text.
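A client that consumes the stream directly needs to split it into events and branch on the type. The sketch below parses standard server-sent-event framing and collects token text until done arrives. The event names come from the list above; everything else is assumed for illustration, including that the type travels in the SSE event: field, that payloads are JSON, and the name and text payload keys. The Chat reference is the authority on the actual contract.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, object]]:
    """Yield (event_type, payload) pairs from server-sent-event lines."""
    event_type, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates one event
            if data_lines:
                yield event_type, json.loads("\n".join(data_lines))
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)


# Made-up stream for illustration; payload shapes are assumptions.
sample = [
    'event: tool_call\n', 'data: {"name": "knowledge_search"}\n', '\n',
    'event: token\n', 'data: {"text": "Found "}\n', '\n',
    'event: token\n', 'data: {"text": "two notes."}\n', '\n',
    'event: done\n', 'data: {}\n', '\n',
]

seen, answer = [], []
for event_type, payload in parse_sse(sample):
    seen.append(event_type)
    if event_type == "token":
        answer.append(payload["text"])
    elif event_type == "done":
        break  # the turn is complete

print(seen)
print("".join(answer))
```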
3. Quote part of a response
Select any text inside an assistant message — a Quote to agent bubble appears above the selection. Click it and the excerpt is staged as a collapsed chip above the composer (remove it with the chip's ✕). Type your follow-up and send: the agent receives the quoted excerpt as a referenced snippet, so it knows exactly which part of its earlier reply you mean. The chip stays attached to your message in the transcript and survives a reload; clicking it scrolls back to — and briefly highlights — the original passage. With privacy mode on, the quoted text is de-identified on the same boundary as the rest of your input.
4. Upload a file
Use the paperclip control in the chat input, then ask a question about the
file. The platform indexes the document through the knowledge service and the
agent can call knowledge_search when answering. The full ingestion walkthrough
is in Upload Documents.
5. Start a new session
Use New Chat to start a fresh session. Session messages, durable memory, and knowledge are three separate stores:
- Session messages belong to that conversation only.
- Memory stores durable preferences, corrections, and project facts. See Memory.
- Knowledge stores uploaded documents and wiki pages.
6. Retrieve the receipt
Completed turns return receipt metadata when turn sealing is enabled. The
audit service exposes proof material under /v1/receipts/*:
curl "http://localhost:8080/v1/receipts/{turn_id}" \
  -H "Authorization: Bearer $TOKEN"
curl "http://localhost:8080/v1/receipts/{turn_id}/proof" \
  -H "Authorization: Bearer $TOKEN"
Tenant scoping is enforced from the principal — no X-Tenant-ID header is
required.
Verify
- A response renders end-to-end in the chat surface, including a final done event in the stream.
- The session appears in the sidebar after the first turn.
- The receipt endpoint above returns a JSON body containing a turn_id and a proof summary (with merkle_root and audit_seq).
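The receipt check can be automated with a few lines of Python. This sketch assumes the proof summary is a JSON object under a proof key that holds merkle_root and audit_seq; the exact nesting is not confirmed here, and check_receipt and the sample values are hypothetical.

```python
import json


def check_receipt(body: str, expected_turn_id: str) -> dict:
    """Parse a receipt response and confirm the fields named in Verify."""
    receipt = json.loads(body)
    if receipt.get("turn_id") != expected_turn_id:
        raise ValueError("receipt does not match the requested turn")
    proof = receipt.get("proof")
    if not isinstance(proof, dict):
        raise ValueError("receipt has no proof summary")
    missing = [k for k in ("merkle_root", "audit_seq") if k not in proof]
    if missing:
        raise ValueError("proof summary is missing: " + ", ".join(missing))
    return proof


# Made-up response body; real values come from the audit service.
example = json.dumps({
    "turn_id": "turn-123",
    "proof": {"merkle_root": "abc123", "audit_seq": 42},
})
proof = check_receipt(example, "turn-123")
print(proof["audit_seq"])
```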
Troubleshooting
- 401 from /v1/chat: your bearer token expired. Sign in again and copy a fresh Authorization header from browser devtools.
- Inference error in the response: the chat surface shows friendly, audience-aware error copy instead of a raw router string. Signed-in admins get an error banner with a link into the Inference workspace (/admin/inference) to fix the root cause: add a missing provider API key for provider_key_missing, enable or repair a provider for backend_unavailable, or assign a model to a role for role_not_configured. For a bundled local vLLM, confirm the container is ready with make health.
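The three inference error codes map one-to-one onto fixes, which a support script or runbook could encode as a lookup. The codes and the workspace path are taken from the list above; remediation_hint is a hypothetical helper, and how a code reaches client code is not specified here.

```python
INFERENCE_WORKSPACE = "/admin/inference"

# Error code -> fix, as described in the troubleshooting list.
REMEDIATION = {
    "provider_key_missing": "Add the missing provider API key.",
    "backend_unavailable": "Enable or repair the provider.",
    "role_not_configured": "Assign a model to the role.",
}


def remediation_hint(error_code: str) -> str:
    """Return a one-line fix for a known inference error code."""
    action = REMEDIATION.get(error_code)
    if action is None:
        return f"No remediation listed for {error_code!r}."
    return f"{action} Open the Inference workspace ({INFERENCE_WORKSPACE})."


print(remediation_hint("role_not_configured"))
```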
Next
- Chat reference — full request/response contract and stream event types.
- Agents reference — managed agents vs. the default main-agent runtime.
Verified against commit 7f571493 (2026-06-11) · sources 0db2bd2775d0.