Architecture

AI-in-a-Box is a compose-first platform with a React frontend, Go gateway, Python and Go services, shared identity libraries, model routing, audit receipts, and optional integrations.

Request Path

Core Services

| Service | Port | Role |
| --- | --- | --- |
| Frontend | 80 | Chat UI and admin UI. |
| Gateway | 8080 | Auth, routing, admin guard, tenant/principal stamping, CapTokens, turn context. |
| Agent Runtime | 8001 / 9001 | Chat orchestration, tools, skills, MCP, approvals, custom agents, sessions. |
| Guardrail | 8002 / 9002 | Input/output safety checks and guardrail turn events. |
| Memory | 8003 / 9003 | Typed memories with structured scope and memory turn events. |
| Inference Router | 8004 | OpenAI-compatible route to vLLM/OpenRouter, usage metadata, observability emission. |
| Code Sandbox | 8006 / 9006 | Docker/E2B code execution and artifacts. |
| Knowledge | 8007 / 9007 | Document RAG, wiki, visibility policy, RAG turn events. |
| Audit | 8008 / 9008 | Hash-chain audit log, turn events, envelopes, receipts, proofs. |
| Observability | 8009 | First-party generation usage, prices, budgets, traces API. |
| egauth | 8010 | Optional NTLM/EWS credential verification and JWT minting. |
| egmail | 8009 in compose profile context | Hosted email MCP support. |
| Teams MCP | 8011 | Microsoft 365/Teams MCP integration. |
| Docs Site | 3100 in dev override | Docusaurus docs site. |

Supporting infrastructure includes PostgreSQL, Redis, Qdrant, MinIO, ClickHouse/Langfuse, Firecrawl, SearXNG, Dify, and optional Airbyte.

Transport Split

USE_GRPC=true is the compose default. With it set, the gateway uses gRPC for routes whose proto contracts are full-fidelity and falls back to HTTP for richer or proof-heavy routes:

| Area | Default behavior |
| --- | --- |
| Chat streaming | HTTP passthrough for multimodal/SSE support. |
| Memory writes | gRPC for simple writes; REST for search/list/update details. |
| Guardrail input/output | gRPC. |
| Knowledge wiki read/write | gRPC; search stays REST for full response fields. |
| Audit event append | gRPC; admin listing, receipts, and proofs stay REST. |
| Inference router | Always HTTP. |
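
The split in the table above can be read as a small lookup: a route goes over gRPC only when USE_GRPC is enabled and the route is on the gRPC list, and everything else stays on HTTP. The Python sketch below illustrates that decision only. It is not the gateway's code (the gateway is written in Go), and the area/operation names, the `pick_transport` helper, and the way the `USE_GRPC` value is parsed are all assumptions made for illustration.

```python
from enum import Enum
from typing import Mapping


class Transport(str, Enum):
    GRPC = "grpc"
    HTTP = "http"


# Hypothetical (area, operation) labels for the rows the table lists as gRPC.
GRPC_ROUTES = {
    ("memory", "write"),          # simple writes only
    ("guardrail", "input"),
    ("guardrail", "output"),
    ("knowledge", "wiki_read"),
    ("knowledge", "wiki_write"),
    ("audit", "append"),
}


def grpc_enabled(env: Mapping[str, str]) -> bool:
    # Assumption: a missing variable means the compose default (true).
    return env.get("USE_GRPC", "true").strip().lower() == "true"


def pick_transport(area: str, operation: str, use_grpc: bool = True) -> Transport:
    """Return gRPC only for listed routes; everything else falls back to HTTP."""
    if use_grpc and (area, operation) in GRPC_ROUTES:
        return Transport.GRPC
    return Transport.HTTP
```

Under these assumptions, `pick_transport("knowledge", "search")` and any inference-router call resolve to HTTP regardless of the flag, which matches the "search stays REST" and "Always HTTP" rows.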

Identity Layer

The gateway validates user JWTs, strips inbound identity headers, and stamps trusted downstream identity:

  • X-Tenant-ID from JWT tenant or sub.
  • X-User-ID, X-User-Email, X-User-Roles.
  • HMAC-signed X-Aibox-Principal.
  • X-Aibox-Turn-Id and X-Aibox-Cap-Token.

Most internal service-to-service calls also carry short-lived Keycloak service JWTs. The inference-router model/route endpoints are currently an exception.
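
As a rough illustration of the strip-then-stamp step described above, the following Python sketch drops client-supplied identity headers, derives the tenant from the JWT `tenant` claim or `sub`, and attaches an HMAC-signed principal. The principal encoding (base64 JSON plus a hex HMAC-SHA256 suffix), the claim names other than `tenant` and `sub`, and the function names are hypothetical. The turn ID and CapToken headers are stripped but not stamped here because their formats are not described in this page.

```python
import base64
import hashlib
import hmac
import json

# Header names from the list above, lowercased for case-insensitive matching.
IDENTITY_HEADERS = {
    "x-tenant-id", "x-user-id", "x-user-email", "x-user-roles",
    "x-aibox-principal", "x-aibox-turn-id", "x-aibox-cap-token",
}


def stamp_identity(inbound: dict, claims: dict, secret: bytes) -> dict:
    """Strip inbound identity headers, then stamp values from validated claims."""
    headers = {k: v for k, v in inbound.items() if k.lower() not in IDENTITY_HEADERS}
    tenant = claims.get("tenant") or claims["sub"]
    roles = sorted(claims.get("roles", []))  # 'roles' claim name is an assumption
    principal = {"tenant": tenant, "sub": claims["sub"], "roles": roles}
    # Hypothetical wire format: urlsafe-base64(JSON) + "." + hex HMAC-SHA256.
    body = json.dumps(principal, sort_keys=True, separators=(",", ":")).encode()
    payload = base64.urlsafe_b64encode(body).decode()
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    headers["X-Tenant-ID"] = tenant
    headers["X-User-ID"] = claims["sub"]
    headers["X-User-Email"] = claims.get("email", "")
    headers["X-User-Roles"] = ",".join(roles)
    headers["X-Aibox-Principal"] = f"{payload}.{sig}"
    return headers


def verify_principal(value: str, secret: bytes) -> dict:
    """Downstream check: recompute the HMAC and compare in constant time."""
    payload, _, sig = value.rpartition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("principal signature mismatch")
    return json.loads(base64.urlsafe_b64decode(payload))
```

The point of the sketch is the ordering: inbound identity headers are discarded before anything is stamped, so a client cannot pre-populate a header the gateway later trusts.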

Turn Events and Receipts

Every gateway-created request gets a turn ID. Participating services publish TurnEvent records to audit over gRPC. Audit stores events, seals terminal turns into signed envelopes, and exposes:

  • Gateway-facing receipt APIs: /v1/receipts, /v1/receipts/{turn_id}, and /v1/receipts/{turn_id}/proof.
  • Direct audit-service development APIs: /v1/turns/{turn_id} and /v1/turns/{turn_id}/proof.

Receipt verification checks the turn-event Merkle root, envelope signature, audit-chain anchor, and exported suffix.
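
To make the first of those checks concrete, here is a minimal Python sketch of recomputing a Merkle root over a turn's events and comparing it with the root recorded in a receipt. The hashing scheme is an assumption: SHA-256 over canonical JSON, 0x00/0x01 prefixes to separate leaves from interior nodes, and duplication of the last node on odd levels. The audit service's actual tree construction, and the signature, anchor, and suffix checks, are not shown.

```python
import hashlib
import json


def leaf_hash(event: dict) -> bytes:
    # Assumption: events are hashed as canonical (sorted-key, compact) JSON.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(b"\x00" + canonical).digest()


def merkle_root(leaves: list) -> bytes:
    """Fold leaf hashes pairwise until a single root remains."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def root_matches(events: list, expected_root_hex: str) -> bool:
    """True when the recomputed root equals the root claimed by the receipt."""
    return merkle_root([leaf_hash(e) for e in events]).hex() == expected_root_hex
```

In this sketch, changing, reordering, or dropping any event changes the recomputed root, so the comparison fails before the envelope signature is even considered.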

Data Stores

| Store | Used by |
| --- | --- |
| PostgreSQL | Audit, observability, Keycloak, Dify, service metadata. |
| Redis | Sessions, cache, rate limiting. |
| Qdrant | Memory vectors, knowledge vectors, and extracted document chunk/text payloads. |
| MinIO | Wiki/object artifacts, Dify storage, Langfuse buckets, and general object storage. |
| ClickHouse | Langfuse analytics. |

Optional Integrations

| Integration | Notes |
| --- | --- |
| Dify | Visual workflows. Configure providers through inference-router where possible. |
| Firecrawl/SearXNG | Web search and scraping tools. |
| Airbyte | Optional data connector profile. |
| E2B | Optional sandbox backend. |
| Cloudflare Tunnel | Optional public ingress profile. |

Deployment

Use make up for the local dev stack, make up-gpu for local vLLM, and make up-prod for production-mode compose. The docs assume gateway-routed APIs unless they explicitly call out a dev-only direct service port.