Deployment overview

The repository currently supports two delivery paths: a Docker Compose stack and an alpha Helm chart (data-plane only, not yet at parity). Compose pulls its topology from the active profile under config/profiles/ and secret files under deploy/secrets/. Helm uses Kubernetes values and Secret objects instead; it does not consume host files from deploy/secrets/.

When to use which

Path	Use when	Trade-off
Docker Compose	Single host, on-prem appliance, dev laptop, CI demo	One machine, no horizontal scale; the only path that boots end-to-end today.
Helm chart (core slice)	You already run Kubernetes and want login + chat with an external LLM provider	Memory, knowledge, audit, code-sandbox, MCPs, observability, firecrawl, and vllm are not templated yet — chat works, web search/RAG/code execution/audit trails do not. See Helm chart.

The Compose stack is the only surface validated by make pentest-stack, make smoke-gateway, and tests/smoke/test_compose_dev_lite.py. The Helm chart is at 0.2.0 in helm/aibox/Chart.yaml: its core slice (identity + chat with an external provider) deploys end-to-end, but the chart is not yet at Compose parity.

Topology

The gateway uses Redis for distributed rate limiting only when REDIS_URL is set; the shipped Compose profiles leave it unset, so the gateway falls back to an in-process limiter (services/gateway/main.go).
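The selection rule above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the code in services/gateway/main.go (which is Go). The names limiter_backend and InProcessLimiter, the fixed-window algorithm, and the choice to treat an empty REDIS_URL as unset are all assumptions.

```python
import time
from typing import Mapping, Optional


def limiter_backend(env: Mapping[str, str]) -> str:
    """Pick the limiter backend: Redis only when REDIS_URL is set.

    Assumption: an empty or whitespace-only value counts as unset.
    """
    return "redis" if env.get("REDIS_URL", "").strip() else "in-process"


class InProcessLimiter:
    """Hypothetical fixed-window limiter held in process memory.

    Counters live in this process only, so each gateway replica
    would enforce its own independent limit.
    """

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._windows: dict[str, tuple[float, int]] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_s:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True
```

Because the in-process counters are not shared, a limit applied this way is per gateway process rather than global, which is acceptable on the single-host Compose path.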

In the prod posture (with or without GPU support), every service except frontend:80 is closed off from the host network and isolated on an internal data network; see Compose modes and Single-port edge.

Outbound internet traffic can additionally be funneled through a single allowlist-enforced egress gateway (opt-in via AIBOX_COMPOSE_EGRESS=1, or the egress / egress_enforce inputs on the GitHub Actions deploy). When enabled, the gateway is observe-only by default; see Egress gateway.

Codex chat turns can optionally run inside a Firecracker microVM (opt-in via AIBOX_COMPOSE_CODEX_FIRECRACKER=1, the codex_firecracker overlay key in deploy/envs/config.<env>.toml, or aibox-ctl deploy --codex-firecracker). It is off by default and safe to enable — the backend falls back to the in-process Codex path when the host is unprovisioned or a guest boot fails.
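The fallback behaviour described above might look roughly like the following sketch. Every name here (HostUnprovisioned, GuestBootFailed, run_codex_turn, and the two runner callables) is hypothetical and chosen for illustration; the real backend may detect and handle these conditions differently.

```python
from typing import Callable


class HostUnprovisioned(Exception):
    """Hypothetical: the host lacks what a microVM needs to start."""


class GuestBootFailed(Exception):
    """Hypothetical: the microVM guest did not boot."""


def run_codex_turn(
    prompt: str,
    *,
    firecracker_enabled: bool,
    run_in_microvm: Callable[[str], str],
    run_in_process: Callable[[str], str],
) -> str:
    """Try the microVM path when enabled, else use the in-process path."""
    if firecracker_enabled:
        try:
            return run_in_microvm(prompt)
        except (HostUnprovisioned, GuestBootFailed):
            # Fall through to the in-process path instead of failing the turn.
            pass
    return run_in_process(prompt)
```

Only the two documented failure conditions trigger the fallback in this sketch; any other exception propagates to the caller.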

Supported compose modes

scripts/compose.sh is the source of truth for posture mode names. Posture is orthogonal to the GPU axis: pick a posture, then select GPU support separately via AIBOX_GPU (off|single|multi|vision). The supported posture modes are:

Family	Modes
Development	`dev`, `dev-lite`
Production	`prod`
Bootstrap/internal	`base`, `install`
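To make the two independent axes concrete, here is a small Python sketch that validates a posture and a GPU value separately. It is not part of scripts/compose.sh; the function names are hypothetical, and it assumes every posture can be paired with every AIBOX_GPU value, which the text implies but does not state outright.

```python
from itertools import product

# Posture modes as listed in the table above, grouped by family.
POSTURES = {
    "Development": ("dev", "dev-lite"),
    "Production": ("prod",),
    "Bootstrap/internal": ("base", "install"),
}

# Documented values for AIBOX_GPU.
GPU_AXIS = ("off", "single", "multi", "vision")


def resolve(posture: str, gpu: str) -> tuple[str, str]:
    """Validate each axis on its own and return the pair."""
    known = sorted(mode for modes in POSTURES.values() for mode in modes)
    if posture not in known:
        raise ValueError(f"unknown posture {posture!r}; expected one of {known}")
    if gpu not in GPU_AXIS:
        raise ValueError(f"unknown GPU value {gpu!r}; expected one of {list(GPU_AXIS)}")
    return posture, gpu


def all_combinations() -> list[tuple[str, str]]:
    """Enumerate every posture/GPU pair (assumes all pairs are valid)."""
    modes = [mode for group in POSTURES.values() for mode in group]
    return list(product(modes, GPU_AXIS))
```

Keeping the axes separate means a posture name never encodes GPU state, so adding a GPU value does not multiply the list of mode names.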

The dev-lite posture is configurable per service: make dev-select (or make up-lite) picks which services run as real (built locally), stub (lightweight stand-in), or off, driven by config/dev-bundles.toml groups and bundles. See Configurable local stack.

Profile matrix

Profiles live in config/profiles/*.toml and the active one is selected by writing its slug into config/profiles/.active (or via scripts/use-env.sh <profile>). make render-compose-env regenerates deploy/.compose.env from the active profile + secrets manifest.
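As a rough illustration of the pointer mechanism, the sketch below resolves the active profile file from config/profiles/.active. It is not the logic of scripts/use-env.sh or scripts/render-compose-env.py; the function name is hypothetical, and stripping surrounding whitespace from the slug is an assumption.

```python
from pathlib import Path


def active_profile_path(repo_root: Path) -> Path:
    """Return the TOML file that config/profiles/.active points at."""
    profiles = repo_root / "config" / "profiles"
    # Assumption: the pointer holds a bare slug, possibly with a trailing newline.
    slug = (profiles / ".active").read_text(encoding="utf-8").strip()
    if not slug:
        raise ValueError("config/profiles/.active is empty")
    path = profiles / f"{slug}.toml"
    if not path.is_file():
        raise FileNotFoundError(f"no profile file for slug {slug!r}: {path}")
    return path
```

Failing loudly on an empty pointer or a missing profile file avoids rendering deploy/.compose.env from a profile that does not exist.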

Only one profile ships under config/profiles/: single-tenant.

Profile	Tenancy	Inference egress	Vision model
`single-tenant`	One pinned tenant (`tenancy.single_tenant_id`, default `default`)	Local-only out of the box — on first boot the inference-router seeds every role to the bundled vLLM model, and inference only leaves the box if an admin registers an external provider and allowlists it on the egress gateway	`qwen3-vl` (requires `make up GPU=vision`)

Customer/site-specific inputs live in deploy/envs/config.<env>.toml for the dev / demo / eg-prod environments and are consumed by deployment/CI automation. scripts/render-compose-env.py does not merge those files into deploy/.compose.env; it renders from the active profile plus secret files. The legacy production target was renamed to eg-prod in commit cfd3286d.

Configuration surface

Surface	File	Purpose
Compose env	`deploy/.compose.env`	Generated from active profile + secret listings by `make render-compose-env`. Do not edit by hand.
Active profile pointer	`config/profiles/.active`	Plain text file containing the active profile slug.
Profile TOML	`config/profiles/single-tenant.toml`	Topology + feature flags.
Deploy env input	`deploy/envs/config.{dev,demo,eg-prod}.toml`	Environment-specific deployment input; not merged by `make render-compose-env`.
Secrets manifest	`deploy/secrets.manifest.toml`	Declares the 52 secret files and how to generate them.
Secrets directory	`deploy/secrets/`	One file per secret, mode `0644`, dir `0700`.
Compose dispatcher	`scripts/compose.sh`	Single entry point that picks the right `-f` overlay set per mode.
Stack record	`deploy/aibox-stack.toml`	Written by `make stack-record`; used by `make down`/`make status` to recover the active mode.
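The permission modes listed for the secrets directory can be spot-checked with a short script. This is a hypothetical helper, not something shipped with the repository or run by make ensure-secrets; its defaults are taken from the table above and can be overridden.

```python
import os
import stat
from pathlib import Path


def check_secret_modes(
    secrets_dir: Path, file_mode: int = 0o644, dir_mode: int = 0o700
) -> list[str]:
    """Return one message per path whose permission bits differ from the expected mode."""
    problems = []
    actual_dir = stat.S_IMODE(os.stat(secrets_dir).st_mode)
    if actual_dir != dir_mode:
        problems.append(f"{secrets_dir}: mode {actual_dir:04o}, expected {dir_mode:04o}")
    for entry in sorted(secrets_dir.iterdir()):
        if not entry.is_file():
            continue  # only regular files are treated as secrets here
        actual = stat.S_IMODE(entry.stat().st_mode)
        if actual != file_mode:
            problems.append(f"{entry}: mode {actual:04o}, expected {file_mode:04o}")
    return problems
```

Calling check_secret_modes(Path("deploy/secrets")) from the repository root returns an empty list when every path matches; the check is meaningful on POSIX hosts only.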

Quick links

Compose modes — every up-* target, every overlay file.
Helm chart — what works today, what's missing.
Single-port edge — how the prod posture collapses everything behind one TLS port.
Secrets — make ensure-secrets, escrow, rotation, sealing.
Multi-vLLM router — Gemma + Qwen behind OpenResty on one GPU.

Verify

make plan         # dry-run the resolved compose mode (no containers touched)
make status       # compare the running stack against its recorded mode
make health       # readiness + liveness probes for every service

The recorded mode lives in deploy/aibox-stack.toml (printed by make stack-show). If make status reports drift, re-run the matching make up-* target — never docker compose up directly.

Verified against commit 5187b91e (2026-06-11) · sources 181b3bddde2d.
