
Quickstart

Get AI-in-a-Box running locally and send your first chat message.

Prerequisites

| Requirement | Details |
| --- | --- |
| Git | Clone with submodules, or run make up, which initializes them. |
| Docker | Docker Engine 24+ with Compose V2. |
| Make + Python 3 | Required for the supported make bootstrap and make up path. |
| OpenSSL | Required by make bootstrap to generate local secrets. |
| Disk and network | First startup pulls and builds several service images. |
| OpenRouter key | Required for the default no-GPU model path. Local GPU mode can avoid external model calls. |
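
To confirm the prerequisites are installed before continuing, standard version checks are enough; only Docker Engine 24+ with Compose V2 has an explicit minimum version above.

# Each command should print a version string; Docker must report Engine 24+ and Compose V2
git --version
docker --version
docker compose version
make --version
python3 --version
openssl version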

1. Clone the repository

git clone --recurse-submodules https://github.com/Mapika/ai-in-a-box.git
cd ai-in-a-box

If you already cloned without submodules:

git submodule update --init --recursive
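
To confirm the submodules are in place, git submodule status lists a commit for each path; an entry prefixed with "-" has not been initialized yet.

# Uninitialized submodules show a leading "-" in the output
git submodule status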

2. Create deploy/.env

Use the bootstrap script so required signing keys and service secrets are not left at placeholder values:

make bootstrap

Then edit deploy/.env.

For the default no-GPU path, set:

OPENROUTER_API_KEY=sk-or-v1-your-key-here
DEFAULT_MODEL=openai/gpt-5.4

For local GPU inference, set HUGGING_FACE_HUB_TOKEN if the selected VLLM_MODEL is gated, then use make up-gpu.
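
As a rough sketch of the GPU-related variables in deploy/.env: the model name below is only an example, and the token is needed only when the chosen model is gated on Hugging Face.

# Example model name, not a recommendation; use whichever model vLLM should serve
VLLM_MODEL=meta-llama/Llama-3.1-8B-Instruct
# Only required when the model above is gated
HUGGING_FACE_HUB_TOKEN=hf_your-token-here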

3. Start the platform

make up

make up starts the dev compose stack, initializes required submodules, and waits for service readiness. Use VERBOSE=1 make build or make logs when you need raw container output.
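
To check that the containers actually came up, make logs (described above) tails the stack output, and plain docker ps shows container status regardless of how the stack was started.

# List running containers and their status
docker ps --format 'table {{.Names}}\t{{.Status}}'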

4. Open the UI

Open http://localhost.

Default development credentials are seeded from deploy/config/keycloak/realm-aibox.json and the generated deploy/.env:

| Account | Use |
| --- | --- |
| admin | Platform admin user |
| testuser | Regular user |

KEYCLOAK_ADMIN_PASSWORD in deploy/.env applies only to the Keycloak admin console. Each realm user's initial password is defined in the realm import for development and should be changed before use in any shared environment.
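
To look up the generated admin-console password without opening the whole file, a simple grep against the variable named above is enough.

# Prints the Keycloak admin console password generated by make bootstrap
grep '^KEYCLOAK_ADMIN_PASSWORD=' deploy/.env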

5. Send a chat message

Ask:

What can you help me with?

The chat response streams token-by-token. Tool calls, reasoning output, routing metadata, and receipt status appear in the message timeline when the turn uses those features.

6. Check observability

With the dev override, direct service ports are bound to localhost. Langfuse is available at:

http://localhost:13000

The first-party observability API is exposed through the gateway under /v1/admin/observability/* and stores generation events from the inference router.
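
As a sketch of calling that API: the sub-path and the way you obtain an admin bearer token are assumptions here, since only the /v1/admin/observability/* prefix is documented above.

# Hypothetical endpoint under the documented prefix; replace with a real sub-path and a valid admin token
curl -H "Authorization: Bearer $ADMIN_TOKEN" http://localhost/v1/admin/observability/generations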

Local vs External Inference

The platform is designed for sovereign deployment, but the default quickstart uses OpenRouter so a machine without a GPU can get running immediately. Use make up-gpu with a local VLLM_MODEL when inference must stay inside your environment.

What to Try Next