Architecture
AI-in-a-Box is a compose-first platform with a React frontend, Go gateway, Python and Go services, shared identity libraries, model routing, audit receipts, and optional integrations.
Request Path
Core Services
| Service | Port | Role |
|---|---|---|
| Frontend | 80 | Chat UI and admin UI. |
| Gateway | 8080 | Auth, routing, admin guard, tenant/principal stamping, CapTokens, turn context. |
| Agent Runtime | 8001 / 9001 | Chat orchestration, tools, skills, MCP, approvals, custom agents, sessions. |
| Guardrail | 8002 / 9002 | Input/output safety checks and guardrail turn events. |
| Memory | 8003 / 9003 | Typed memories with structured scope and memory turn events. |
| Inference Router | 8004 | OpenAI-compatible route to vLLM/OpenRouter, usage metadata, observability emission. |
| Code Sandbox | 8006 / 9006 | Docker/E2B code execution and artifacts. |
| Knowledge | 8007 / 9007 | Document RAG, wiki, visibility policy, RAG turn events. |
| Audit | 8008 / 9008 | Hash-chain audit log, turn events, envelopes, receipts, proofs. |
| Observability | 8009 | First-party generation usage, prices, budgets, traces API. |
| egauth | 8010 | Optional NTLM/EWS credential verification and JWT minting. |
| egmail | 8009 (within its compose profile) | Hosted email MCP support. |
| Teams MCP | 8011 | Microsoft 365/Teams MCP integration. |
| Docs Site | 3100 in dev override | Docusaurus docs site. |
Supporting infrastructure includes PostgreSQL, Redis, Qdrant, MinIO, ClickHouse/Langfuse, Firecrawl, SearXNG, Dify, and optional Airbyte.
Transport Split
USE_GRPC=true is the compose default. With it set, the gateway uses gRPC for routes whose proto contracts are full-fidelity and falls back to HTTP for richer or proof-heavy routes:
| Area | Default behavior |
|---|---|
| Chat streaming | HTTP passthrough for multimodal/SSE support. |
| Memory writes | gRPC for simple writes; REST for search/list/update details. |
| Guardrail input/output | gRPC. |
| Knowledge wiki read/write | gRPC; search stays REST for full response fields. |
| Audit event append | gRPC; admin listing, receipts, and proofs stay REST. |
| Inference router | Always HTTP. |
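The table above amounts to a per-route lookup. The following is a minimal sketch of that idea, not the gateway's actual code (the gateway is written in Go). The route keys such as `memory.write` are hypothetical names invented for illustration, and the behavior with `use_grpc=False` (everything over HTTP) is an assumption.

```python
from enum import Enum


class Transport(Enum):
    GRPC = "grpc"
    HTTP = "http"


# Hypothetical route keys; the real gateway's route names are not given here.
# Only the areas the table lists as gRPC by default are included.
GRPC_CAPABLE = {
    "memory.write",
    "guardrail.input",
    "guardrail.output",
    "knowledge.wiki.read",
    "knowledge.wiki.write",
    "audit.event.append",
}


def pick_transport(route: str, use_grpc: bool = True) -> Transport:
    """Return gRPC only when the flag is on and the route is gRPC-capable.

    Every other route (chat streaming, search, receipts, proofs,
    inference router) falls back to HTTP.
    """
    if use_grpc and route in GRPC_CAPABLE:
        return Transport.GRPC
    return Transport.HTTP
```

Keeping the gRPC-capable set explicit means a new route defaults to HTTP until its proto contract is judged full-fidelity.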
Identity Layer
The gateway validates user JWTs, strips inbound identity headers, and stamps trusted downstream identity:
- X-Tenant-ID from JWT tenant or sub.
- X-User-ID, X-User-Email, X-User-Roles.
- HMAC-signed X-Aibox-Principal.
- X-Aibox-Turn-Id and X-Aibox-Cap-Token.
Most internal service calls also use short-lived Keycloak service JWTs. Inference-router model/route endpoints are a current exception.
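The strip-then-stamp pattern for identity headers can be sketched as follows. This is an illustration under stated assumptions, not the gateway's implementation: the principal wire format (base64 JSON plus a hex HMAC-SHA256 signature), the claim names `email` and `roles`, the mapping of X-User-ID to `sub`, and the function names are all hypothetical. X-Aibox-Cap-Token is omitted because its format is not described here.

```python
import base64
import hashlib
import hmac
import json

# Headers a client must never be allowed to set directly.
IDENTITY_HEADERS = {
    "x-tenant-id", "x-user-id", "x-user-email", "x-user-roles",
    "x-aibox-principal", "x-aibox-turn-id", "x-aibox-cap-token",
}


def stamp_identity(inbound: dict, claims: dict, key: bytes, turn_id: str) -> dict:
    """Strip inbound identity headers, then stamp trusted ones from JWT claims."""
    headers = {k: v for k, v in inbound.items() if k.lower() not in IDENTITY_HEADERS}
    tenant = claims.get("tenant") or claims["sub"]  # tenant falls back to sub
    roles = claims.get("roles", [])
    principal = {"tenant": tenant, "sub": claims["sub"], "roles": roles}
    # Assumed wire format: urlsafe-base64(JSON) + "." + hex(HMAC-SHA256).
    payload = base64.urlsafe_b64encode(
        json.dumps(principal, sort_keys=True).encode()
    ).decode()
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    headers.update({
        "X-Tenant-ID": tenant,
        "X-User-ID": claims["sub"],
        "X-User-Email": claims.get("email", ""),
        "X-User-Roles": ",".join(roles),
        "X-Aibox-Principal": f"{payload}.{sig}",
        "X-Aibox-Turn-Id": turn_id,
    })
    return headers


def verify_principal(value: str, key: bytes) -> dict:
    """Downstream check: recompute the HMAC and compare in constant time."""
    payload, sig = value.rsplit(".", 1)
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad principal signature")
    return json.loads(base64.urlsafe_b64decode(payload))
```

Stripping before stamping matters: a client-supplied X-Tenant-ID is discarded even when the JWT is valid, so downstream services only ever see values derived from the validated token.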
Turn Events and Receipts
Every gateway-created request gets a turn ID. Participating services publish TurnEvent records to audit over gRPC. Audit stores events, seals terminal turns into signed envelopes, and exposes:
- Gateway-facing receipt APIs: /v1/receipts, /v1/receipts/{turn_id}, and /v1/receipts/{turn_id}/proof.
- Direct audit-service development APIs: /v1/turns/{turn_id} and /v1/turns/{turn_id}/proof.
Receipt verification checks the turn-event Merkle root, envelope signature, audit-chain anchor, and exported suffix.
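The first of those checks, recomputing the Merkle root over a turn's events, can be illustrated with a generic construction. The audit service's real tree layout is not specified here, so this sketch assumes SHA-256, domain-separated leaf and node hashes, and duplication of the last node on odd-sized levels. The function names are hypothetical, and the signature, chain-anchor, and suffix checks are not covered.

```python
import hashlib
import hmac


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(events: list[bytes]) -> bytes:
    """Compute a Merkle root over serialized events (assumed construction)."""
    if not events:
        return _h(b"")
    # Prefix leaves with 0x00 and inner nodes with 0x01 so a leaf can
    # never be confused with an inner node.
    level = [_h(b"\x00" + e) for e in events]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def root_matches(events: list[bytes], claimed_root_hex: str) -> bool:
    """Compare the recomputed root with the root claimed in a receipt."""
    return hmac.compare_digest(merkle_root(events).hex(), claimed_root_hex)
```

Under this construction, changing any single event changes the root, which is what lets a receipt commit to the full set of turn events with one hash.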
Data Stores
| Store | Used by |
|---|---|
| PostgreSQL | Audit, observability, Keycloak, Dify, service metadata. |
| Redis | Sessions, cache, rate limiting. |
| Qdrant | Memory vectors, knowledge vectors, and extracted document chunk/text payloads. |
| MinIO | Wiki/object artifacts, Dify storage, Langfuse buckets, and general object storage. |
| ClickHouse | Langfuse analytics. |
Optional Integrations
| Integration | Notes |
|---|---|
| Dify | Visual workflows. Configure providers through inference-router where possible. |
| Firecrawl/SearXNG | Web search and scraping tools. |
| Airbyte | Optional data connector profile. |
| E2B | Optional sandbox backend. |
| Cloudflare Tunnel | Optional public ingress profile. |
Deployment
Use make up for the local dev stack, make up-gpu for local vLLM, and make up-prod for production-mode compose. The docs assume gateway-routed APIs unless they explicitly call out a dev-only direct service port.