# Quickstart

Get AI-in-a-Box running locally and send your first chat message.
## Prerequisites
| Requirement | Details |
|---|---|
| Git | Clone with submodules, or run `make up`, which initializes them during startup. |
| Docker | Docker Engine 24+ with Compose V2. |
| Make + Python 3 | Required for the supported `make bootstrap` and `make up` path. |
| OpenSSL | Required by `make bootstrap` to generate local secrets. |
| Disk and network | First startup pulls and builds several service images. |
| OpenRouter key | Required for the default no-GPU model path. Local GPU mode can avoid external model calls. |
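Before cloning, you can confirm the tools above are on your `PATH`. A minimal sketch (the tool names come from the table; the output format is illustrative):

```shell
# Report any prerequisite missing from PATH.
check_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
  done
}

check_tools git docker make python3 openssl

# If Docker is present, confirm the Compose V2 plugin is available.
if command -v docker >/dev/null 2>&1; then
  docker compose version || echo "Compose V2 plugin not found"
fi
```

If anything prints as missing, install it before continuing.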
## 1. Clone the repository

```bash
git clone --recurse-submodules https://github.com/Mapika/ai-in-a-box.git
cd ai-in-a-box
```

If you already cloned without submodules:

```bash
git submodule update --init --recursive
```
## 2. Create deploy/.env

Use the bootstrap script so required signing keys and service secrets are not left at placeholder values:

```bash
make bootstrap
```

Then edit `deploy/.env`. For the default no-GPU path, set:

```bash
OPENROUTER_API_KEY=sk-or-v1-your-key-here
DEFAULT_MODEL=openai/gpt-5.4
```

For local GPU inference, set `HUGGING_FACE_HUB_TOKEN` if the selected `VLLM_MODEL` is gated, then use `make up-gpu`.
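To catch a key that was never filled in, you can check `deploy/.env` before starting. A minimal sketch (the key name comes from this step; the placeholder string checked against is the one shown above):

```shell
# Succeed only if the key exists in the env file and is not empty
# or still the placeholder value.
env_key_set() {
  file=$1; key=$2
  value=$(grep -E "^${key}=" "$file" 2>/dev/null | head -n1 | cut -d= -f2-)
  [ -n "$value" ] && [ "$value" != "sk-or-v1-your-key-here" ]
}

env_key_set deploy/.env OPENROUTER_API_KEY || echo "OPENROUTER_API_KEY still needs a real value"
```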
## 3. Start the platform

```bash
make up
```

`make up` starts the dev compose stack, initializes required submodules, and waits for service readiness. Use `VERBOSE=1 make build` or `make logs` when you need raw container output.
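`make up` already blocks on readiness, but if you need a manual probe (for example in CI), a polling loop along these lines works. The URL and retry budget here are assumptions, not project defaults:

```shell
# Retry a URL until it answers or the attempt budget runs out.
wait_for_url() {
  url=$1; attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    curl -fsS "$url" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}

wait_for_url http://localhost 5 || echo "UI not reachable yet"
```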
## 4. Open the UI

Open http://localhost.

Default development credentials are seeded from `deploy/config/keycloak/realm-aibox.json` and the generated `deploy/.env`:
| Account | Use |
|---|---|
| `admin` | Platform admin user |
| `testuser` | Regular user |
The `KEYCLOAK_ADMIN_PASSWORD` value in `deploy/.env` sets the `admin` password for the Keycloak admin console only. The realm user's initial password is defined in the realm import for development and should be changed before deploying to any shared environment.
## 5. Send a chat message

Ask:

> What can you help me with?

The chat response streams token-by-token. Tool calls, reasoning output, routing metadata, and receipt status appear in the message timeline when the turn uses those features.
## 6. Check observability

For the dev override, direct service ports are bound to localhost. Langfuse is available at http://localhost:13000.

The first-party observability API is exposed through the gateway under `/v1/admin/observability/*` and stores generation events from the inference router.
## Local vs External Inference

The platform is sovereign by deployment design, but the default quickstart uses OpenRouter so a machine without a GPU can run immediately. Use `make up-gpu` with a local `VLLM_MODEL` when inference must stay inside your environment.