API reference
External callers hit the gateway at http://localhost:8080 (or the deployment's public URL). The gateway terminates JWT auth, strips spoofable identity headers, mints signed X-Aibox-* headers, applies the admin path guard plus per-tenant rate limiting, and proxies to the owning service over gRPC or HTTP.
Routing table
Routes are mounted in services/gateway/main.go and services/gateway/internal/proxy/proxy.go. The Go http.ServeMux longest-pattern match decides which handler wins, so service-specific paths (/v1/auth/installer-handoff, /v1/admin/inference/*, /v1/chat/agui/stream, /v1/oauth/.../callback) preempt the catch-all /v1/ proxy.
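To illustrate why the service-specific paths win, here is a minimal Python sketch of longest-pattern precedence. It is a simplified model rather than Go's actual ServeMux algorithm: a pattern ending in `/` is assumed to match its whole subtree, and any other pattern only the exact path. The `ROUTES` table and the `resolve` helper are hypothetical names, and the backend labels are illustrative.

```python
from typing import Dict, Optional

# Illustrative subset of patterns; backend labels are for demonstration only.
ROUTES: Dict[str, str] = {
    "/v1/": "catch-all proxy",
    "/v1/auth/installer-handoff": "agent-runtime",
    "/v1/admin/inference/": "inference-router",
    "/v1/chat/agui/stream": "gateway AG-UI translator",
}


def resolve(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the backend of the longest pattern that matches `path`."""
    best = None
    for pattern in routes:
        if pattern.endswith("/"):
            matched = path.startswith(pattern)  # subtree pattern
        else:
            matched = path == pattern           # exact-path pattern
        if matched and (best is None or len(pattern) > len(best)):
            best = pattern
    return routes[best] if best is not None else None


print(resolve("/v1/chat/agui/stream"))           # specific handler wins
print(resolve("/v1/admin/inference/providers"))  # longer subtree beats /v1/
print(resolve("/v1/memory"))                     # falls through to the catch-all
```

Because the most specific pattern always wins, adding a new carve-out does not require reordering existing registrations.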
Public auth surface
| Method | Path | Auth | Backend |
|---|---|---|---|
| GET | /v1/auth/config | none | Gateway (returns {passwordLoginEnabled, …}). |
| GET | /v1/auth/idps | none | Gateway (sanitized Keycloak IdP list). |
| POST | /v1/auth/login | none | Gateway → Keycloak ROPC; routes through mobile-push 2FA when enforced. |
| POST | /v1/auth/login/2fa | challenge token | Gateway → auth /login-challenge/*. |
| POST | /v1/auth/refresh | refresh token | Gateway → Keycloak. |
| POST | /v1/auth/installer-handoff | one-shot UUID | Agent Runtime. |
| GET | /v1/auth/verify | bearer | Gateway (sentinel for protected upstream surfaces). |
Agent runtime
| Method | Path | Notes |
|---|---|---|
| POST | /v1/chat | Synchronous chat turn. |
| POST | /v1/chat/stream | Typed SSE stream. |
| POST | /v1/chat/agui/stream | AG-UI protocol SSE stream (gateway translates upstream events). |
| any | /v1/agents, /v1/agents/* | Managed-agent catalog + CRUD. |
| any | /v1/sessions, /v1/sessions/* | Chat sessions, branching, titles, folder moves. |
| any | /v1/folders, /v1/folders/* | Chat folder CRUD. |
| any | /v1/conversations, /v1/conversations/* | Durable Conversations + Items + Runs (gated on AGENT_RUNTIME_CONVERSATIONS_ENABLED). |
| GET / POST | /v1/notifications, /v1/notifications/* | Durable per-user notification inbox: GET /v1/notifications (list, limit/cursor), GET /v1/notifications/unread-count, POST /v1/notifications/{id}/read, POST /v1/notifications/read-all. Self-scoped to the caller; retention capped per user (NOTIFICATIONS_MAX_PER_USER, evicting read notifications before unread). Consumed by the in-app notification viewer in the web UI: a header bell with an unread badge and a dropdown inbox (mark-read / mark-all-read), kept live by polling unread-count and refreshing the list on open; high-urgency notifications also surface as toasts. |
| POST | /v1/internal/notifications | Service-auth-only producer entrypoint: enqueue a notification for a target (tenant_id, user_id). Caller must be in the (fail-closed) producer allow-list; payload is bounded (≤ 8 KiB, ≤ 8 levels deep). Idempotent: a fresh enqueue returns 201; a retry carrying a previously-seen idempotency_key returns 200 with the original notification (no duplicate row or audit entry). Not part of the user surface. |
| any | /v1/skills, /v1/skills/* | User-editable Skills (tenant + personal). |
| any | /v1/connectors, /v1/connectors/* | Mounted by the gateway, but no agent-runtime route is mounted for /v1/connectors in the current app; use /v1/admin/context/connectors* for connector administration. |
| GET | /v1/overview | Workspace overview (mail providers, agents, recent activity, inference providers). |
| any | /v1/context/* | Per-turn context surfaces / trace. |
| mixed | POST /v1/oauth/{provider_slug}/start, GET /v1/oauth/{provider_slug}/callback, POST /v1/oauth/{provider_slug}/poll, GET /v1/oauth/{provider_slug}/status, DELETE /v1/oauth/{provider_slug} | Per-user OAuth flows (callback bypasses Bearer chain). |
| any | /v1/user/* | User-scoped: secrets, profile, MCP servers, plugins, approvals, integrations status. |
| any | /v1/admin/mission-control, /v1/admin/users, /v1/admin/users/*, /v1/admin/roles, /v1/admin/onboarding/*, /v1/admin/policies/*, /v1/admin/mcp-servers/*, /v1/admin/plugins/*, /v1/admin/tenant-secrets/*, /v1/admin/oauth-providers/*, /v1/admin/connector-capabilities/*, /v1/admin/agentops/*, /v1/admin/approvals, /v1/admin/approvals/, /v1/admin/context/* | Admin surfaces. /v1/admin/approvals is list-only; approve/reject are user routes. |
| POST | /v1/webhook, /v1/webhook/* | HMAC-authenticated; bypasses Bearer chain. |
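The producer rules for /v1/internal/notifications in the table above (bounded payload, idempotent retries) can be sketched as follows. This is a hedged illustration, not the service's code: `NotificationStore` is a hypothetical in-memory stand-in, measuring size as serialized JSON bytes and counting each dict or list as one nesting level are assumptions, and so is scoping the idempotency key to (tenant_id, user_id).

```python
import json

MAX_PAYLOAD_BYTES = 8 * 1024  # "≤ 8 KiB" from the table
MAX_DEPTH = 8                 # "≤ 8 levels deep" from the table


def depth(value) -> int:
    """Nesting depth: scalars count as 0, each dict or list adds one level."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


def payload_ok(payload: dict) -> bool:
    """Assumed bound check: serialized size and nesting depth."""
    size = len(json.dumps(payload).encode("utf-8"))
    return size <= MAX_PAYLOAD_BYTES and depth(payload) <= MAX_DEPTH


class NotificationStore:
    """Hypothetical in-memory inbox keyed by idempotency key."""

    def __init__(self):
        self._by_key = {}

    def enqueue(self, tenant_id, user_id, payload, idempotency_key):
        if not payload_ok(payload):
            raise ValueError("payload exceeds size or depth bound")
        key = (tenant_id, user_id, idempotency_key)
        if key in self._by_key:
            # Retry: hand back the original row, create nothing new.
            return 200, self._by_key[key]
        notification = {
            "id": len(self._by_key) + 1,
            "tenant_id": tenant_id,
            "user_id": user_id,
            "payload": payload,
        }
        self._by_key[key] = notification
        return 201, notification


store = NotificationStore()
first_status, first = store.enqueue("default", "alice", {"title": "hi"}, "key-1")
retry_status, retry = store.enqueue("default", "alice", {"title": "hi"}, "key-1")
print(first_status, retry_status, retry is first)
```

Returning the stored notification on a retry lets a producer treat 200 and 201 identically, so a timeout followed by a resend is safe.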
Connector administration routes exposed through agent-runtime:
| Method | Path |
|---|---|
| GET / POST | /v1/admin/context/connectors |
| POST | /v1/admin/context/sync |
| GET | /v1/admin/context/connectors/{connector_id} |
| POST | /v1/admin/context/connectors/{connector_id}/check |
| GET / POST | /v1/admin/context/connectors/{connector_id}/surfaces |
Exact user-scoped gateway mux patterns from proxy.go:
| Pattern | Surface |
|---|---|
| /v1/user/secrets, /v1/user/secrets/ | Per-user secret CRUD and validation. |
| /v1/user/me | Current verified user profile summary. |
| /v1/user/profile, /v1/user/profile/ | Profile read/update, draft, and dismissal routes. |
| /v1/user/integrations/status | Connected integration status summary. |
| /v1/user/mcp-servers, /v1/user/mcp-servers/ | User MCP server list, opt-out, and health. |
| /v1/user/plugins, /v1/user/plugins/ | User plugin visibility preferences. |
| /v1/user/approvals, /v1/user/approvals/ | Human-in-the-loop approval inbox and decisions. |
/v1/workflows is mounted by the gateway, but no agent-runtime route is mounted
for it in the current agent_runtime.main app. Treat it as a reserved or stale
gateway pattern until a backing route is implemented.
Other services
| Method | Path | Backend |
|---|---|---|
| any | /v1/memory, /v1/memory/* | Memory. |
| any | /v1/knowledge/* | Knowledge (documents, wiki, audiences, artifacts, items). |
| any | /v1/guard/* | Guardrail. |
| any | /v1/sandbox/* | Code Sandbox (prefix stripped before forwarding). |
| GET | /v1/models, /v1/routes | Inference Router (always HTTP). |
| POST | /v1/audit | Audit append. |
| any | /v1/admin/audit, /v1/admin/audit/* | Audit admin / query. |
| GET | /v1/receipts, /v1/receipts/{turn_id}, /v1/receipts/{turn_id}/proof, /v1/receipts/{turn_id}/bundle | Audit receipts and offline proof bundles. |
| any | /v1/audit/turns/* | Audit forensic detail; POST …/replay/live carved out to Agent Runtime. |
| any | /v1/admin/observability/* | Observability. |
| any | /v1/account/devices, /v1/account/devices/* | auth (mobile-push device management; handlers moved from egauth in the egauth→auth split). |
Gateway-owned admin
The gateway hosts an operator surface backed by docker-compose. All paths require an admin-role JWT (or an AIBOX_ADMIN_EMAIL_ALLOWLIST match that promotes the caller). Mounted at:
/v1/admin/{status,topology,stack,services,secrets,backups,deploy,update,jobs,auth/egauth,inference,egress,sso/idps}
/v1/admin/inference/* is a reverse-proxy to the inference-router's internal surface (/v1/admin/inference/<rest> → /v1/internal/<rest>): the provider CRUD (/v1/internal/providers, /v1/internal/providers/{id}, /v1/internal/providers/{id}/models) plus the runtime model-role map (/v1/internal/roles, /v1/internal/roles/{role}), where admins assign which model serves each role alias. /v1/admin/egress/* reverse-proxies the egress control plane's runtime allowlist (/v1/internal/egress/*) that governs whether inference may leave the box. /v1/admin/sso/idps is only mounted when KEYCLOAK_ADMIN_* credentials are present.
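The two prefix rewrites described above can be sketched as a small path-mapping function. This is an illustration of the mapping only, not the gateway's Go reverse-proxy code; `REWRITES` and `rewrite_admin_path` are hypothetical names, and the `chat` role and `rules` sub-path in the usage lines are made up for the example.

```python
from typing import Optional

# Public admin prefix -> internal prefix, as described in the paragraph above.
REWRITES = [
    ("/v1/admin/inference/", "/v1/internal/"),
    ("/v1/admin/egress/", "/v1/internal/egress/"),
]


def rewrite_admin_path(path: str) -> Optional[str]:
    """Map a gateway admin path to its upstream path, or None if not proxied."""
    for public, internal in REWRITES:
        if path.startswith(public):
            return internal + path[len(public):]
    return None


print(rewrite_admin_path("/v1/admin/inference/providers"))
print(rewrite_admin_path("/v1/admin/inference/roles/chat"))  # "chat" is a made-up role
print(rewrite_admin_path("/v1/admin/egress/rules"))          # "rules" is a made-up sub-path
```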
Identity headers
The gateway emits these to every downstream call. All inbound copies are stripped first (services/gateway/internal/middleware/auth.go).
| Header | Direction | Source |
|---|---|---|
| Authorization: Bearer ... | client → gateway | Required when auth is enabled. Swapped for a Keycloak service token on internal proxy hops. |
| X-Correlation-ID | gateway → downstream | Generated if absent. |
| X-Tenant-ID | gateway → downstream | Pinned to tenancy.single_tenant_id. The gateway ignores the inbound JWT tenant claim and rejects a mismatched inbound X-Tenant-ID with 403. |
| X-User-ID, X-User-Email, X-User-Roles | gateway → downstream | Verified JWT claims. |
| X-Aibox-Principal | gateway → downstream | HMAC-signed canonical Principal (see services/shared-identity/aibox_identity/principal.py). |
| X-Aibox-Turn-Id | gateway → downstream | Gateway-minted turn id. |
| X-Aibox-Cap-Token | gateway → downstream | HMAC capability token bound to turn id + principal + scopes (services/shared-identity/aibox_identity/turnctx.py). |
| X-User-Timezone | client → gateway → agent-runtime | Optional; seeds user profile timezone on first contact. |
gRPC metadata uses lowercase keys without the x- prefix: aibox-turn-id, aibox-cap-token (see turnctx.py).
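A minimal sketch of the header hygiene described in this section, written in Python for illustration (the gateway itself is Go). The exact set of stripped headers is an assumption drawn from the table above, and all three function names are hypothetical.

```python
from typing import Dict

# Assumed strip list, taken from the gateway -> downstream rows of the table.
IDENTITY_HEADERS = {
    "x-tenant-id", "x-user-id", "x-user-email", "x-user-roles",
    "x-aibox-principal", "x-aibox-turn-id", "x-aibox-cap-token",
}


def tenant_mismatch(headers: Dict[str, str], single_tenant_id: str) -> bool:
    """True when an inbound X-Tenant-ID disagrees with the pinned tenant (403 case)."""
    for name, value in headers.items():
        if name.lower() == "x-tenant-id":
            return value != single_tenant_id
    return False


def strip_identity_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Drop caller-supplied identity headers before the gateway mints its own."""
    return {k: v for k, v in headers.items() if k.lower() not in IDENTITY_HEADERS}


def to_grpc_metadata_key(header: str) -> str:
    """X-Aibox-Turn-Id -> aibox-turn-id: lowercase, without the x- prefix."""
    key = header.lower()
    return key[2:] if key.startswith("x-") else key


inbound = {"X-User-ID": "mallory", "Content-Type": "application/json"}
print(strip_identity_headers(inbound))
print(to_grpc_metadata_key("X-Aibox-Cap-Token"))
```

Header names are compared case-insensitively because HTTP header names are case-insensitive, so `x-user-id` and `X-User-ID` must be treated as the same spoofing attempt.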
Chat
Synchronous
services/agent-runtime/src/agent_runtime/routes/chat.py — POST /chat:
curl http://localhost:8080/v1/chat \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "What is 2 + 2?"}],
"session_id": "s1",
"tenant_id": "default",
"user_id": "alice",
"agent_id": "default"
}'
tenant_id and user_id in the body are overridden by the gateway-derived principal — they are kept for direct dev-service calls. Response shape mirrors agent_runtime.types.ChatResponse:
{
"message": {
"role": "assistant",
"content": "4",
"processing_trace": { "...": "..." },
"receipt": {
"id": "01J...",
"turn_id": "01J...",
"tier": "attestable",
"status": "pending",
"sealed": false,
"receipt_url": "/v1/receipts/01J...",
"proof_url": "/v1/receipts/01J.../proof",
"bundle_url": "/v1/receipts/01J.../bundle"
}
},
"session_id": "s1",
"agent_id": "default",
"receipt": { "...": "as above" }
}
status starts as pending because the turn envelope is sealed asynchronously by audit after the terminal events arrive. The frontend polls /v1/receipts/{turn_id} until sealed=true.
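The polling step can be sketched as a small loop. This is a hedged example, not the frontend's code: `wait_for_seal` is a hypothetical helper, and `fetch_receipt` is an injected callable assumed to perform GET /v1/receipts/{turn_id} and return the decoded JSON body as a dict. The attempt count and delay are arbitrary defaults.

```python
import time
from typing import Callable


def wait_for_seal(
    fetch_receipt: Callable[[str], dict],
    turn_id: str,
    attempts: int = 10,
    delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the receipt reports sealed=true, or give up after `attempts`."""
    for attempt in range(attempts):
        receipt = fetch_receipt(turn_id)
        if receipt.get("sealed"):
            return receipt
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next poll
    raise TimeoutError(f"receipt for turn {turn_id} not sealed after {attempts} attempts")


# Usage with a canned fetcher standing in for the HTTP call.
canned = iter([{"sealed": False}, {"sealed": False}, {"sealed": True}])
result = wait_for_seal(lambda turn_id: next(canned), "01J...", sleep=lambda s: None)
print(result)
```

Injecting `fetch_receipt` and `sleep` keeps the loop independent of any HTTP client and makes it easy to exercise without a running gateway.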
Streaming SSE
POST /v1/chat/stream returns text/event-stream with one event: <type>\ndata: <json>\n\n record per event (services/agent-runtime/src/agent_runtime/streaming.py):
| Event | Payload |
|---|---|
| session_created | { "session_id": "..." } |
| conversation_created | { "conversation_id": "..." } |
| thinking | { "content": "..." } (reasoning text). |
| token | { "content": "..." } |
| tool_call | { "tool", "arguments", "call_id"? } |
| tool_result | { "tool", "result", "call_id"?, "child_run_id"? } (truncated to 2000 chars). |
| artifact | Sandbox/agent artifact metadata, keyed by call_id. |
| approval_required | { "session_id", "agent_id", "approvals": [...], "content"? } |
| trace_update | { "patch": {...} } (late processing-trace patch). |
| blocked | { "reason": "..." } (guardrail block). |
| retract | { "reason", "original_length" } (late output-guardrail retraction). |
| user_redaction | { "display_content", "categories_scrubbed", "entities_redacted" } |
| mcp_failures | { "failures": [{ "server_id", "server_name", "error" }] } |
| done | { "session_id", "agent_id", "routing"?, "receipt"?, "verified_answer"?, "assistant_message_id"?, "user_message_id"? } |
| error | { "message": "...", "code"?: "..." }; code is an optional stable identifier (examples: provider_key_missing, role_not_configured, backend_unavailable) so clients can render friendly copy without string-matching prose. |
curl -N http://localhost:8080/v1/chat/stream \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Explain transformers"}],
"tenant_id": "default",
"user_id": "alice",
"session_id": "s1"
}'
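On the client side, the `event: <type>` / `data: <json>` framing can be consumed with a small line-based parser. The sketch below is an assumption-laden illustration: `parse_sse` is a hypothetical helper that takes an iterable of already-decoded text lines (for example from an HTTP client's line iterator) and follows the general SSE rule that a blank line terminates a record.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, payload) for each complete SSE record."""
    event_type, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: the record is complete, emit it if it carried data.
            if data_parts:
                yield event_type or "message", json.loads("\n".join(data_parts))
            event_type, data_parts = None, []
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].lstrip())


# Usage: reassemble the assistant text from token events.
stream = [
    "event: token", 'data: {"content": "Hel"}', "",
    "event: token", 'data: {"content": "lo"}', "",
    "event: done", 'data: {"session_id": "s1", "agent_id": "default"}', "",
]
text = "".join(p["content"] for kind, p in parse_sse(stream) if kind == "token")
print(text)  # Hello
```

A real client would also branch on `blocked`, `retract`, and `error` events instead of only concatenating `token` payloads.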
AG-UI streaming
/v1/chat/agui/stream is an AG-UI protocol SSE endpoint backed by services/gateway/internal/agui/. The gateway forwards the request to agent-runtime's /v1/chat/stream and translates the legacy events into AG-UI's canonical vocabulary: RUN_STARTED, TEXT_MESSAGE_{START,CONTENT,END}, TOOL_CALL_{START,ARGS,END,RESULT}, REASONING_*, RUN_FINISHED, RUN_ERROR, CUSTOM.
Wire format:
- Content-Type: text/event-stream
- One SSE record per event: data: <JSON>\n\n (no event: line; type is in the JSON body).
- Field names are camelCase (messageId, toolCallId, threadId, runId).
- type values are UPPER_SNAKE_CASE.
curl -N http://localhost:8080/v1/chat/agui/stream \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "What is 2 + 2?"}],
"session_id": "s1",
"tenant_id": "default",
"user_id": "admin"
}'
Sample plain-text turn:
data: {"type":"RUN_STARTED","threadId":"s1","runId":"a3f2c1..."}
data: {"type":"TEXT_MESSAGE_START","messageId":"m-1","role":"assistant"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"m-1","delta":"4"}
data: {"type":"TEXT_MESSAGE_END","messageId":"m-1"}
data: {"type":"RUN_FINISHED","threadId":"s1","runId":"a3f2c1..."}
Events without a first-class AG-UI counterpart (HITL approvals, late retractions, MCP failures, attached artifacts, processing-trace patches, conversation-id resolution) are surfaced as CUSTOM:
data: {"type":"CUSTOM","name":"approval_required","value":{"approvals":[...]}}
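A client for the AG-UI stream dispatches on the `type` field inside each JSON body rather than on an `event:` line. The sketch below is a minimal, hypothetical consumer (`collect_agui` is not part of any library): it assumes each record's JSON fits on a single `data:` line, as in the samples above, and handles only the text-message, CUSTOM, and RUN_FINISHED events shown there.

```python
import json
from typing import Dict, Iterable, List, Tuple


def collect_agui(lines: Iterable[str]) -> Tuple[Dict[str, str], List[tuple], bool]:
    """Reassemble text per messageId and gather CUSTOM events."""
    messages: Dict[str, str] = {}
    custom: List[tuple] = []
    finished = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators between records
        event = json.loads(line[len("data:"):].strip())
        kind = event["type"]
        if kind == "TEXT_MESSAGE_START":
            messages[event["messageId"]] = ""
        elif kind == "TEXT_MESSAGE_CONTENT":
            mid = event["messageId"]
            messages[mid] = messages.get(mid, "") + event["delta"]
        elif kind == "CUSTOM":
            custom.append((event["name"], event["value"]))
        elif kind == "RUN_FINISHED":
            finished = True
    return messages, custom, finished


sample = [
    'data: {"type":"RUN_STARTED","threadId":"s1","runId":"r1"}', "",
    'data: {"type":"TEXT_MESSAGE_START","messageId":"m-1","role":"assistant"}', "",
    'data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"m-1","delta":"4"}', "",
    'data: {"type":"TEXT_MESSAGE_END","messageId":"m-1"}', "",
    'data: {"type":"RUN_FINISHED","threadId":"s1","runId":"r1"}', "",
]
print(collect_agui(sample))
```

Unknown event types are ignored rather than rejected, which keeps the consumer tolerant of vocabulary it does not handle yet, such as TOOL_CALL_* or REASONING_*.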
The legacy /v1/chat/stream endpoint is unchanged. New frontend code should target AG-UI.
Receipts
curl "http://localhost:8080/v1/receipts/$TURN_ID" \
-H "Authorization: Bearer $TOKEN"
curl "http://localhost:8080/v1/receipts/$TURN_ID/proof" \
-H "Authorization: Bearer $TOKEN"
curl "http://localhost:8080/v1/receipts/$TURN_ID/bundle" \
-H "Authorization: Bearer $TOKEN"
Receipt detail, proof, and bundle are only available after the turn envelope has sealed. See Compliance Audit Trail.
Errors
The gateway returns:
- 401: missing/invalid bearer (auth enabled).
- 403: admin path guard rejected the caller.
- 413: request body over services.gateway.max_request_body_bytes.
- 429: rate limit exceeded (per-tenant + per-IP limiters).
- 502 {"error":"service unavailable"}: downstream proxy error (services/gateway/internal/proxy/proxy.go).
When no inference provider is reachable, agent-runtime does not return a 502 body; the streaming loop surfaces it as an in-stream error event with code: "backend_unavailable". The code is assigned by classify_model_failure (services/agent-runtime/src/agent_runtime/chat_handler/errors.py), which trusts the inference-router's own code when present and otherwise derives one from the error text.
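The "trust the router's code, otherwise derive one" rule can be sketched as below. This is not the real classify_model_failure: the `TEXT_HINTS` markers are invented for illustration (the actual text rules are not documented here), and `classify` and `error_event` are hypothetical names. Only the three code values come from the event table above.

```python
from typing import Optional, Tuple

# Invented substring markers; the real derivation rules may differ entirely.
TEXT_HINTS = [
    ("api key", "provider_key_missing"),
    ("no model for role", "role_not_configured"),
    ("connection refused", "backend_unavailable"),
]


def classify(router_code: Optional[str], error_text: str) -> Optional[str]:
    """Prefer the inference-router's own code; otherwise derive one from the text."""
    if router_code:
        return router_code
    lowered = error_text.lower()
    for marker, code in TEXT_HINTS:
        if marker in lowered:
            return code
    return None  # no stable code; the event carries only a message


def error_event(router_code: Optional[str], error_text: str) -> Tuple[str, dict]:
    """Build the in-stream error event; `code` is included only when known."""
    payload = {"message": error_text}
    code = classify(router_code, error_text)
    if code is not None:
        payload["code"] = code
    return "error", payload


print(error_event(None, "Connection refused by upstream"))
print(error_event("role_not_configured", "anything"))
```

Keeping `code` optional matches the event table: clients should branch on it when present and fall back to displaying `message` otherwise.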
Verified against commit 5842cdb1 (2026-06-18) · sources 5962499b26b1.