
Quickstart

Get AI-in-a-Box running locally and send your first chat message.

Prerequisites

| Requirement | Details |
| --- | --- |
| Git | Clone with submodules, or run make up, which initializes them. |
| Docker | Docker Engine 24+ with Compose V2. |
| Make + Python 3 | Required for the supported make bootstrap and make up path. |
| OpenSSL | Required by make bootstrap to generate local secrets. |
| Disk and network | First startup pulls and builds several service images. |
| OpenRouter key | Required for the default no-GPU model path. Local GPU mode can avoid external model calls. |
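
To confirm the prerequisites are installed before continuing, standard version checks are enough; only Docker Engine 24+ with Compose V2 has an explicit minimum version above.

# Each command should print a version string; Docker must report Engine 24+ and Compose V2
git --version
docker --version
docker compose version
make --version
python3 --version
openssl version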

1. Clone the repository

git clone --recurse-submodules https://github.com/Mapika/ai-in-a-box.git
cd ai-in-a-box

If you already cloned without submodules:

git submodule update --init --recursive
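
To confirm the submodules are in place, git submodule status lists a commit for each path; an entry prefixed with "-" has not been initialized yet.

# Uninitialized submodules show a leading "-" in the output
git submodule status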

2. Create deploy/.env

Use the bootstrap script so required signing keys and service secrets are not left at placeholder values:

make bootstrap

Then edit deploy/.env.

For the default no-GPU path, set:

OPENROUTER_API_KEY=sk-or-v1-your-key-here
DEFAULT_MODEL=openai/gpt-5.4

For local GPU inference, set HUGGING_FACE_HUB_TOKEN if the selected VLLM_MODEL is gated, then use make up-gpu.
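
As a rough sketch of the GPU-related variables in deploy/.env: the model name below is only an example, and the token is needed only when the chosen model is gated on Hugging Face.

# Example model name, not a recommendation; use whichever model vLLM should serve
VLLM_MODEL=meta-llama/Llama-3.1-8B-Instruct
# Only required when the model above is gated
HUGGING_FACE_HUB_TOKEN=hf_your-token-here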

3. Start the platform

make up

make up starts the dev compose stack, initializes required submodules, and waits for service readiness. Use VERBOSE=1 make build or make logs when you need raw container output.
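
To check that the containers actually came up, make logs (described above) tails the stack output, and plain docker ps shows container status regardless of how the stack was started.

# List running containers and their status
docker ps --format 'table {{.Names}}\t{{.Status}}'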

4. Open the UI

Open http://localhost.

Default development credentials are seeded from deploy/config/keycloak/realm-aibox.json and the generated deploy/.env:

| Account | Use |
| --- | --- |
| admin | Platform admin user |
| testuser | Regular user |

KEYCLOAK_ADMIN_PASSWORD in deploy/.env applies only to the Keycloak admin console. Each realm user's initial password is defined in the realm import for development and should be changed before use in any shared environment.
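
To look up the generated admin-console password without opening the whole file, a simple grep against the variable named above is enough.

# Prints the Keycloak admin console password generated by make bootstrap
grep '^KEYCLOAK_ADMIN_PASSWORD=' deploy/.env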

5. Send a chat message

Ask:

What can you help me with?

The chat response streams token-by-token. Tool calls, reasoning output, routing metadata, and receipt status appear in the message timeline when the turn uses those features.

6. Check observability

With the dev override, direct service ports are bound to localhost. Langfuse is available at:

http://localhost:13000

The first-party observability API is exposed through the gateway under /v1/admin/observability/* and stores generation events from the inference router.
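
As a sketch of calling that API: the sub-path and the way you obtain an admin bearer token are assumptions here, since only the /v1/admin/observability/* prefix is documented above.

# Hypothetical endpoint under the documented prefix; replace with a real sub-path and a valid admin token
curl -H "Authorization: Bearer $ADMIN_TOKEN" http://localhost/v1/admin/observability/generations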

Local vs External Inference

The platform is designed for sovereign deployment, but the default quickstart uses OpenRouter so a machine without a GPU can get running immediately. Use make up-gpu with a local VLLM_MODEL when inference must stay inside your environment.

What to Try Next