AIBox

AIBox runs a self-contained AI platform from the repository's Compose files: a React frontend, a Go gateway, a set of Python services (agent runtime, guardrail, memory, knowledge, audit, sandbox), an inference router, and the backing stores (Keycloak, PostgreSQL, Redis, Qdrant, MinIO). Together they deliver chat, agent tools, RAG, memory, code execution, audit receipts, and model routing from one local deployment.

git clone --recurse-submodules https://github.com/egroup-labs/aibox.git
cd aibox
make ensure-secrets
make up

Open http://localhost. The admin console lives at http://localhost/admin. Default-realm seed accounts (admin, testuser) and their generated passwords are documented in the Quickstart.
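Once the stack is starting, a quick reachability check can save a browser round trip. The sketch below is an illustration, not part of the repository: `probe` is a hypothetical helper, and it only reports whether each URL answers at all. The admin console may well answer with a redirect to a login page, which still counts as a response here.

```shell
# Hypothetical smoke check; the two URLs are the ones named above.
probe() {
  # Print the HTTP status code for "$1"; curl exits non-zero if nothing answers.
  curl -sS -o /dev/null --max-time 5 -w '%{http_code}\n' "$1"
}

for url in http://localhost http://localhost/admin; do
  if code=$(probe "$url" 2>/dev/null); then
    echo "$url -> HTTP $code"
  else
    echo "$url -> no response (is the stack still starting?)"
  fi
done
```

Any status code means the frontend path is being served. Whether that code should be 200 or a redirect depends on the deployment, so the check deliberately does not assert one.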

What runs

make up brings up the dev compose stack defined in deploy/docker-compose.yml plus deploy/docker-compose.dev.yml. make up GPU=single adds local vLLM. For a lighter laptop footprint, use make up-lite; use make dev-select to choose which services run locally (real/stub/off), driven by config/dev-bundles.toml. Everything else is opt-in through profiles and compose overlays.

Core capabilities

| Area | What it does |
| --- | --- |
| Chat and agents | One adaptive main agent per session, with focused subagents spawned natively by Codex when the main agent delegates. |
| Tools | Search, scraping, knowledge, memory, code execution, MCP-backed tools, and user-approved actions. |
| Model routing | OpenAI-compatible router with OpenRouter (default) and local vLLM backends, usage capture, route metadata, and observability. |
| Memory | Typed memories scoped by tenant, user, agent, and session. Runtime prefetch injects relevant notes before the model runs. |
| Knowledge | Contextual document RAG and a markdown wiki with tenant, owner, audience, role, and visibility checks. |
| Guardrails | Input/output scanners, optional Constitutional AI, audit logging, and turn-event emission. |
| Code sandbox | Per-session execution containers behind /v1/sandbox/*, with a Docker backend. |
| Identity | Keycloak OIDC with PKCE, mobile-push 2FA, signed X-Aibox-Principal, CapTokens, and service-to-service JWTs. |
| Audit receipts | Hash-chained audit log plus signed turn envelopes, proof exports, /v1/receipts/*, and offline verification. |
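The audit receipts entry mentions a hash-chained log. As a rough illustration of the general idea only, each link in such a chain is a digest of the previous link plus the current entry, so editing any earlier entry changes every later link. The sketch below is not AIBox's receipt format: the `genesis` seed, the newline-joined layout, and the helper names `sha256` and `chain_head` are all assumptions made for the example.

```shell
# Hypothetical hash-chain sketch; not the actual AIBox audit format.

# Use whichever SHA-256 tool is installed (GNU coreutils or the Perl shasum).
sha256() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# Read one log entry per line on stdin and print the final chain link.
# Each link is SHA-256(previous_link + newline + entry); the subshell body
# keeps the loop variables out of the caller's environment.
chain_head() (
  prev="genesis"
  while IFS= read -r entry; do
    prev=$(printf '%s\n%s' "$prev" "$entry" | sha256)
  done
  printf '%s\n' "$prev"
)

original=$(printf 'user login\ntool call\nmodel reply\n' | chain_head)
tampered=$(printf 'user login\ntool call (edited)\nmodel reply\n' | chain_head)

if [ "$original" != "$tampered" ]; then
  echo "tampering detected: chain heads differ"
fi
```

Offline verification in this style only needs the entries and the expected head: recompute the chain and compare. Signatures, as in the signed turn envelopes mentioned above, are a separate layer that this sketch does not attempt.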

Inference modes

The default make up quickstart routes models through OpenRouter, so it needs no local GPU, only a valid key in deploy/secrets/openrouter_api_key. For local inference, run a GPU profile and route models to vLLM:

make up GPU=single # single local vLLM container
make up GPU=multi # Gemma + Qwen behind the inference router
make up GPU=vision # adds a local vision model (dev only)

No data leaves your environment unless you configure an external model provider or other outbound integration.
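For the default OpenRouter path, a small pre-flight check can catch a missing key file before the stack starts. This is a sketch under assumptions: `key_ready` is a hypothetical helper, not a repository script, and it only confirms that the file exists and is non-empty. It cannot tell whether OpenRouter will accept the key.

```shell
# Hypothetical pre-flight helper; the default path is the one named above.
key_ready() {
  key_file="${1:-deploy/secrets/openrouter_api_key}"
  # -s is true only when the file exists and has a size greater than zero.
  if [ -s "$key_file" ]; then
    echo "ok: $key_file is present and non-empty"
  else
    echo "missing or empty: $key_file" >&2
    return 1
  fi
}

# Run from the repository root, before make up.
key_ready || echo "add an OpenRouter key, or use a local GPU profile instead"
```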

Verified against commit 5187b91e (2026-06-11) · sources d38d8ad498a4.