Compliance audit trail
The audit service stores two related integrity records:
- An append-only, per-tenant audit log with hash-chain linkage, HMAC signatures, and a separate append-only chain-head checkpoint table.
- Turn envelopes that seal one chat turn into a Merkle root, sign it, and anchor that signed envelope back into the audit log itself.
Receipts are the user-facing view of sealed envelopes; proofs let an offline verifier reconstruct the chain suffix without contacting the running platform.
Audit write flow
Authoritative implementation:
- services/audit/src/audit_service/db.py: schema (audit_log, audit_chain_head, turn_events, turn_envelopes, turn_artifacts) and the immutability INSTEAD OF UPDATE/DELETE DO NOTHING rules.
- services/audit/src/audit_service/store.py: insert_event_on_connection, key resolution, verify_chain.
- services/audit/src/audit_service/turn_sealing.py: RFC 8785 canonicalization, Merkle root (0x00 leaf domain, 0x01 internal domain, duplicate-last-on-odd).
- services/audit/src/audit_service/turn_events.py: TurnEventStore, sealing logic, receipt proof export.
- services/audit/src/audit_service/turn_sealer.py: background watermark sealer.
- services/audit/src/audit_service/receipt_routes.py: gateway-facing receipt API.
Audit log row
Every row carries:
| Column | Purpose |
|---|---|
id | Serial PK (Postgres). |
seq | Per-tenant monotonic sequence number. Included in the signed payload and UNIQUE per tenant — gaps and reordering are detectable independently of the hash chain. |
timestamp | Application-side datetime.now(UTC) — included in the signature (not DB NOW()). |
tenant_id, user_id, action, resource_type, resource_id | Event identity. |
detail_type, detail | Typed JSON payload. |
previous_hash | SHA256(json({id, previous_hash, signature})) of the previous row in this tenant's chain; genesis is SHA256(json({type: "genesis", tenant_id})). |
signed_payload | Canonical JSON (sorted keys) of all signed fields including seq and previous_hash. |
signature | HMAC-SHA256(current_key, signed_payload). |
key_version | Label of the key used to sign this row. |
agent_id, agent_chain, delegated_by | Agent provenance. |
A second insert into audit_chain_head happens in the same transaction, recording {tenant_id, seq, audit_log_id, signature, previous_hash, chain_hash, key_version}. Truncating audit_log alone leaves the head checkpoint behind, which verify_chain flags.
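Both tables carry Postgres rules that turn UPDATE and DELETE into no-ops (DO INSTEAD NOTHING), so existing rows cannot be changed through ordinary DML. A privileged operator could still DROP RULE, but that DDL statement is itself auditable at the Postgres level.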
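The linkage between previous_hash, signed_payload, and signature described in the table can be sketched in a few lines. This is a minimal illustration, not the code in store.py: the helper names (canonical_json, genesis_hash, row_hash, sign_row, append_event), the compact JSON separators, the hex encoding, and the reduced set of signed fields are all assumptions of the sketch.

```python
import hashlib
import hmac
import json


def canonical_json(obj: dict) -> bytes:
    # Sorted keys give a stable byte encoding; the separators are an assumption.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def genesis_hash(tenant_id: str) -> str:
    # previous_hash of the first row in a tenant's chain.
    return hashlib.sha256(canonical_json({"type": "genesis", "tenant_id": tenant_id})).hexdigest()


def row_hash(row: dict) -> str:
    # Value the next row stores as its previous_hash.
    link = {"id": row["id"], "previous_hash": row["previous_hash"], "signature": row["signature"]}
    return hashlib.sha256(canonical_json(link)).hexdigest()


def sign_row(key: bytes, fields: dict) -> dict:
    # signed_payload is the canonical JSON of the signed fields; signature is its HMAC-SHA256.
    signed_payload = canonical_json(fields).decode("utf-8")
    signature = hmac.new(key, signed_payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {**fields, "signed_payload": signed_payload, "signature": signature}


def append_event(chain: list, key: bytes, tenant_id: str, action: str) -> dict:
    previous = row_hash(chain[-1]) if chain else genesis_hash(tenant_id)
    fields = {"tenant_id": tenant_id, "action": action, "seq": len(chain) + 1, "previous_hash": previous}
    row = sign_row(key, fields)
    row["id"] = len(chain) + 1  # stand-in for the serial primary key
    chain.append(row)
    return row
```

Because seq and previous_hash sit inside the signed payload, changing either one after the fact invalidates the row's signature as well as the link stored by the following row.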
POST /v1/audit
Routes go through the gateway. Every event is recorded under the pinned
single_tenant_id() regardless of what the caller put in event.tenant_id,
so there is no cross-tenant write path that could fabricate audit history
under another tenant. A Principal missing a tenant_id is rejected with 400.
curl http://localhost:8080/v1/audit \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"tenant_id": "default",
"user_id": "alice",
"action": "knowledge.document.ingested",
"resource_type": "document",
"resource_id": "doc-123",
"detail": {"title": "Runbook"},
"result": "success"
}'
If the Postgres insert fails, the event is queued to a write-ahead log (audit_service/wal.py) and drained in the background instead of being lost.
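The fallback can be pictured as a try-insert-else-queue wrapper plus a drain pass. The sketch below is hypothetical and does not reflect the actual layout of audit_service/wal.py: the JSON-lines file format, the function names record_event and drain, and the insert callback are assumptions made for illustration.

```python
import json
import os


def record_event(event: dict, insert, wal_path: str) -> str:
    """Try the database insert; on failure append the event to a local WAL file."""
    try:
        insert(event)
        return "stored"
    except Exception:
        with open(wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps(event, sort_keys=True) + "\n")
            wal.flush()
            os.fsync(wal.fileno())  # make the queued event durable before returning
        return "queued"


def drain(wal_path: str, insert) -> int:
    """Replay queued events in order; keep any that still fail for the next pass."""
    if not os.path.exists(wal_path):
        return 0
    with open(wal_path, encoding="utf-8") as wal:
        pending = [json.loads(line) for line in wal if line.strip()]
    remaining, drained = [], 0
    for event in pending:
        try:
            insert(event)
            drained += 1
        except Exception:
            remaining.append(event)
    with open(wal_path, "w", encoding="utf-8") as wal:
        for event in remaining:
            wal.write(json.dumps(event, sort_keys=True) + "\n")
    return drained
```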
Query events
curl "http://localhost:8080/v1/admin/audit?tenant_id=default&limit=20&offset=0" \
-H "Authorization: Bearer $TOKEN"
Filters: user_id, action, limit, offset. Listing requires the admin role and always targets the pinned single_tenant_id(); the tenant_id query parameter is ignored and there is no cross-tenant listing path.
Verify a tenant chain
curl "http://localhost:8080/v1/admin/audit/verify?tenant_id=default" \
-H "Authorization: Bearer $TOKEN"
tenant_id is required. Omit limit for a full chain walk; pass limit=N for a fast spot-check. The verifier runs five checks:
- Chain continuity: each row's previous_hash matches SHA256(json({id, previous_hash, signature})) of the prior row.
- HMAC validity: the stored signed_payload re-HMACs to the stored signature under the key matching key_version. Keys are resolved fresh per verify run so a newly rotated previous key is accepted.
- Sequence integrity: per-tenant seq values form a 1..N run.
- Chain-head cross-check: every audit_log row has a matching audit_chain_head row with the same signature and previous hash.
- Cross-table truncation: max(seq) and COUNT(*) must match between audit_log and audit_chain_head.
Turn events
Services emit typed protobuf TurnEvent records:
| Payload | Emitter |
|---|---|
turn_started, turn_sealed | Gateway (TurnContextMiddleware) |
prompt_generated, tool_called, tool_returned, turn_failed | Agent runtime |
model_invoked, model_response | Agent runtime |
guardrail_verdict | Guardrail |
memory_op | Memory |
rag_chunks_retrieved | Knowledge |
cap_token_issued and cap_token_rejected exist in the protobuf/audit
decoder, but the gateway does not emit them today.
Every event carries turn_id, event_id, tenant_id, principal_id, emitter_service, emitter_instance (optional), occurred_at, sequence_in_service, payload_type, payload_json. The turn_events primary key is (tenant_id, turn_id, event_id), so re-delivery is idempotent and event_id collisions across tenants are impossible.
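The idempotency that follows from the composite primary key can be demonstrated with an in-memory SQLite table. This is only a sketch: the production table lives in Postgres (where the equivalent clause is ON CONFLICT DO NOTHING), the column list is trimmed, and append_turn_event is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE turn_events (
        tenant_id    TEXT NOT NULL,
        turn_id      TEXT NOT NULL,
        event_id     TEXT NOT NULL,
        payload_type TEXT NOT NULL,
        payload_json TEXT NOT NULL,
        PRIMARY KEY (tenant_id, turn_id, event_id)
    )
    """
)


def append_turn_event(tenant_id, turn_id, event_id, payload_type, payload_json) -> bool:
    """Insert one event; return False when the same key was already stored."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO turn_events VALUES (?, ?, ?, ?, ?)",
        (tenant_id, turn_id, event_id, payload_type, payload_json),
    )
    return cursor.rowcount == 1  # 0 rows changed means the duplicate was ignored
```

A re-delivered event returns False and leaves the table unchanged, while the same event_id under a different tenant_id is stored as a separate row.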
Turn envelopes
When a terminal event (turn_sealed or turn_failed) is appended, or when the background sealer detects a stalled turn older than TURN_SEAL_WATERMARK_SECONDS (default 300), the audit service:
- Canonicalizes each event with RFC 8785 (rfc8785-v1).
- Hashes each leaf: SHA256(0x00 || canonical_event_bytes).
- Builds the Merkle root: internal nodes use SHA256(0x01 || left || right), odd final node duplicates itself.
- Builds the envelope payload {detail_type: "turn_envelope_v1", tenant_id, turn_id, status, seal_reason, canonicalization_version, event_count, event_ids, leaf_hashes, merkle_root}.
- Canonicalizes the envelope payload, HMAC-signs it under the current audit key, and inserts a turn.envelope.sealed row into audit_log.
- Inserts the turn_envelopes row anchoring the envelope to that audit-log row (audit_log_id, audit_seq, audit_previous_hash, audit_signature, audit_chain_hash).
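The leaf and root construction in the steps above can be sketched as follows. The canonicalize function only approximates RFC 8785 with the standard json module, which is adequate for simple string and integer payloads but is not a full JCS implementation; rejecting an empty event list is also an assumption of this sketch.

```python
import hashlib
import json


def canonicalize(event: dict) -> bytes:
    # Approximation of RFC 8785: sorted keys, no insignificant whitespace, UTF-8 output.
    # Full JCS additionally pins number formatting and sorts keys by UTF-16 code units.
    return json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def leaf_hash(canonical_event: bytes) -> bytes:
    # 0x00 prefix: leaf domain
    return hashlib.sha256(b"\x00" + canonical_event).digest()


def merkle_root(leaves: list) -> bytes:
    if not leaves:
        raise ValueError("cannot build a root over zero events")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd final node is paired with a copy of itself
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()  # 0x01 prefix: internal domain
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

The distinct 0x00 and 0x01 prefixes keep a leaf hash from ever being reinterpreted as an internal node, which is the usual reason for domain separation in Merkle trees.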
status is completed if the events include turn_sealed, otherwise failed. seal_reason is terminal_event, watermark_timeout, or manual. Late events for an already-sealed turn are rejected; duplicate event_id on the same turn is a no-op.
Background sealing parameters:
| Env var | Default | Purpose |
|---|---|---|
TURN_SEAL_INTERVAL_SECONDS | 10 | Polling interval. |
TURN_SEAL_WATERMARK_SECONDS | 300 | Stalled-turn cutoff. |
TURN_SEAL_BATCH_LIMIT | 100 | Max turns sealed per pass. |
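How the three parameters interact in one sealer pass can be sketched like this. The function pick_stalled_turns and the open_turns mapping are hypothetical, and measuring staleness from the last event seen for a turn is an assumption; the real sealer in turn_sealer.py queries the database instead.

```python
import os
from datetime import datetime, timedelta, timezone

# Defaults mirror the table above; values are read from the environment when set.
INTERVAL = int(os.environ.get("TURN_SEAL_INTERVAL_SECONDS", "10"))
WATERMARK = int(os.environ.get("TURN_SEAL_WATERMARK_SECONDS", "300"))
BATCH_LIMIT = int(os.environ.get("TURN_SEAL_BATCH_LIMIT", "100"))


def pick_stalled_turns(open_turns: dict, now: datetime,
                       watermark: int = WATERMARK, batch_limit: int = BATCH_LIMIT) -> list:
    """open_turns maps turn_id -> time of the last event for that unsealed turn."""
    cutoff = now - timedelta(seconds=watermark)
    stalled = [(last_seen, turn_id) for turn_id, last_seen in open_turns.items() if last_seen < cutoff]
    stalled.sort()  # oldest first
    return [turn_id for _, turn_id in stalled[:batch_limit]]
```

A polling loop would call this every INTERVAL seconds with datetime.now(timezone.utc) and seal each returned turn with seal_reason set to watermark_timeout.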
Receipt and turn APIs
Gateway-facing (auditProxy routes /v1/receipts):
| Endpoint | Purpose |
|---|---|
GET /v1/receipts | List sealed receipts for the caller's tenant. |
GET /v1/receipts/{turn_id} | Receipt metadata + envelope + ordered events. |
GET /v1/receipts/{turn_id}/proof | Receipt-shaped proof export. |
GET /v1/receipts/{turn_id}/bundle | Detail + proof in one response. |
Forensic replay endpoints are proxied under /v1/audit/turns/*:
| Endpoint | Purpose |
|---|---|
GET /v1/audit/turns/{turn_id}/replay?tenant_id=... | Ordered replay events plus artifact metadata. |
GET /v1/audit/turns/{turn_id}/artifacts/{event_id}/{kind}?tenant_id=... | Stream one captured artifact body. |
POST /v1/audit/turns/{turn_id}/replay/live | Live replay, carved out by the gateway to agent-runtime. |
POST /v1/audit/artifacts is a direct audit-service route used by internal
writers to store captured prompt/tool/model bodies before replay. It is not
part of the gateway carve-out in services/gateway/internal/proxy/proxy.go;
the public gateway route for reading those bodies is
GET /v1/audit/turns/{turn_id}/artifacts/{event_id}/{kind}.
Direct audit-service turn routes also exist:
| Endpoint | Purpose |
|---|---|
GET /v1/turns/{turn_id} | Direct service read of one sealed turn envelope and events. |
GET /v1/turns/{turn_id}/proof | Direct service proof export for one sealed turn. |
Receipts require tenant context (the Principal's tenant_id). Until a turn is sealed, the receipt detail and proof endpoints return 404.
Receipt statuses:
| Status | Meaning |
|---|---|
pending | UI has a turn id but no sealed envelope yet. |
sealed | Audit has produced a signed envelope. |
verified | Client/UI has verified the proof. |
curl "http://localhost:8080/v1/receipts/$TURN_ID" \
-H "Authorization: Bearer $TOKEN"
Network egress events
The single egress gateway (Squid forward proxy) records every outbound request
as an audit event. The egress-shipper
(services/egress-shipper/src/egress_shipper/parser.py) tails the gateway's
Squid access log, parses each line, and writes it through the normal audit
write flow under the service principal service:egress.
Two actions are emitted:
| Action | Meaning |
|---|---|
egress.allow | Squid permitted the request (no DENIED code and HTTP status not 403/407). |
egress.deny | Squid blocked the request (DENIED in the Squid result code, or HTTP 403/407). |
Each event's resource_type is egress_destination, resource_id is the
host:port destination, and detail_type is squid_access_log. The detail
JSONB carries:
| Key | Meaning |
|---|---|
service | Calling container, derived from the proxy client's rDNS name (falls back to client_ip). |
destination | host:port the request was bound for. |
path | strict for the default allowlist port, allow-all for the firecrawl loopback port (3129). |
verdict | allow or deny (also encoded in action). |
bytes | Bytes returned to the client (nullable if unparseable). |
method | HTTP method. |
client_ip | Proxy client IP. |
squid_code | Squid result code (the part before /). |
http_status | HTTP status (the part after /). |
local_port | Squid listening port that served the request (%lp). |
hierarchy | Squid hierarchy/peer field. |
squid_ts | Original Squid log timestamp (ts.ms). |
username | Proxy auth username, or null when the log field is -. |
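The allow/deny rule and the squid_code / http_status split can be sketched as below. This is not the parser in egress_shipper/parser.py: classify and path_for_port are hypothetical helpers, and treating every port other than 3129 as the strict path is an assumption, since the default allowlist port is not stated here.

```python
def classify(result_field: str) -> dict:
    """Split a Squid 'CODE/STATUS' field and derive the audit action."""
    squid_code, _, status_text = result_field.partition("/")
    http_status = int(status_text) if status_text.isdigit() else None
    # Deny when Squid reports DENIED or the HTTP status is 403/407; otherwise allow.
    denied = "DENIED" in squid_code or http_status in (403, 407)
    verdict = "deny" if denied else "allow"
    return {
        "action": f"egress.{verdict}",
        "verdict": verdict,
        "squid_code": squid_code,
        "http_status": http_status,
    }


def path_for_port(local_port: int) -> str:
    # 3129 is the firecrawl loopback port per the table above.
    return "allow-all" if local_port == 3129 else "strict"
```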
GET /v1/admin/audit/egress/stats
The admin console's Network egress tab
(docs-site/docs/admin/activity.md) reads server-side aggregates over these
events:
curl "http://localhost:8080/v1/admin/audit/egress/stats?group_by=destination" \
-H "Authorization: Bearer $TOKEN"
group_by is one of destination, service, path, verdict, hour, or
day. Optional from_date / to_date bound the window; limit (1–500,
default 50) caps the returned buckets. The response carries per-bucket
hits/allow/deny/bytes plus a summary (total, allow, deny, deny rate, distinct
destinations, bytes) computed over the full per-tenant egress set. The route
requires the admin role and is served directly by the audit service
(services/audit/src/audit_service/routes.py get_egress_stats).
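The aggregation behind this route can be approximated in memory as follows. The sketch is illustrative only: the response field names, the ordering of buckets by hit count, and the egress_stats function are assumptions, and it covers only group_by keys that exist directly in the detail payload (destination, service, path, verdict), not the hour and day groupings.

```python
from collections import defaultdict


def egress_stats(events: list, group_by: str = "destination", limit: int = 50) -> dict:
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    buckets = defaultdict(lambda: {"hits": 0, "allow": 0, "deny": 0, "bytes": 0})
    for event in events:
        bucket = buckets[event[group_by]]
        bucket["hits"] += 1
        bucket[event["verdict"]] += 1
        bucket["bytes"] += event.get("bytes") or 0  # bytes may be null
    total = len(events)
    deny = sum(1 for event in events if event["verdict"] == "deny")
    ranked = sorted(buckets.items(), key=lambda item: item[1]["hits"], reverse=True)[:limit]
    return {
        "buckets": [{"key": key, **counts} for key, counts in ranked],
        # The summary covers the full event set, not just the returned buckets.
        "summary": {
            "total": total,
            "allow": total - deny,
            "deny": deny,
            "deny_rate": deny / total if total else 0.0,
            "distinct_destinations": len({event["destination"] for event in events}),
            "bytes": sum(event.get("bytes") or 0 for event in events),
        },
    }
```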
Proof shape
/proof returns the envelope plus an audit_chain.suffix: every chain row from the sealing seq forward, with signed_payload, signature, previous_hash, key_version, and chain_hash. An offline verifier can replay the chain from the sealing point to whatever head is supplied.
Offline verification
scripts/aibox-verify verifies exported receipt/proof JSON without contacting the running platform:
scripts/aibox-verify \
--receipt receipt.json \
--proof proof.json \
--key-version v1 \
--key "$AUDIT_SIGNING_KEY"
It checks Merkle leaves, the envelope HMAC, the audit row HMAC, and chain-suffix continuity from the receipt row to the exported head.
This is not public non-repudiation: the result holds only relative to the exported proof, the exported chain suffix, and the keys supplied by the verifier. Publishing chain heads to an external transparency log is outside the current implementation.
Signing keys
| Variable | Purpose |
|---|---|
AUDIT_SIGNING_KEY | Active HMAC key. Required, non-placeholder, at least 32 characters. |
AUDIT_KEY_VERSION | Version label stored with new rows (default v1). |
AUDIT_SIGNING_KEY_PREVIOUS | Previous key for verification during rotation. |
AUDIT_KEY_VERSION_PREVIOUS | Version label for the previous key. |
The active key is resolved at insert/seal time, so rotation can take effect without restarting the audit pod: new rows are signed with the new key, and old rows still verify against the previous version label.
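Resolving keys at call time rather than at import time is what lets a rotation take effect without a restart. A minimal sketch, assuming a hypothetical resolve_keys helper and an env mapping that defaults to os.environ; placeholder detection is left out because the rejected placeholder values are not listed here.

```python
import os


def resolve_keys(env=None) -> dict:
    """Read the key variables on every call so a rotated key is picked up immediately."""
    env = os.environ if env is None else env
    active = env.get("AUDIT_SIGNING_KEY", "")
    if len(active) < 32:
        raise ValueError("AUDIT_SIGNING_KEY must be at least 32 characters")
    active_version = env.get("AUDIT_KEY_VERSION", "v1")
    keys = {active_version: active.encode("utf-8")}
    previous = env.get("AUDIT_SIGNING_KEY_PREVIOUS")
    previous_version = env.get("AUDIT_KEY_VERSION_PREVIOUS")
    if previous and previous_version:
        # Never let the previous key shadow the active one under the same label.
        keys.setdefault(previous_version, previous.encode("utf-8"))
    return {"active_version": active_version, "keys": keys}
```

New rows would be signed with keys[active_version], while verification looks up keys[row.key_version], so rows written before the rotation still resolve to the previous key.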
Compliance and threat-model notes
Integrity checks enforced today:
- Any modification to a stored row invalidates the HMAC.
- Any reordering or deletion within a tenant chain produces a seq gap and breaks previous_hash linkage.
- Truncating audit_log while leaving audit_chain_head (or vice versa) is detected by the cross-table truncation check.
- A sealed turn envelope binds every event's canonical bytes to a single Merkle root, signed and anchored into the tenant's hash chain, so tampering with an event invalidates the leaf, the root, the envelope HMAC, and the audit row HMAC simultaneously.
- Receipts cannot cross tenants because (tenant_id, turn_id) is the primary key on turn_envelopes and the leading prefix of the turn_events primary key (tenant_id, turn_id, event_id), and receipt queries are tenant-scoped.
What it does not guarantee:
- Non-repudiation to a third party. There is no external transparency-log publishing. A platform operator with the signing key could in principle re-sign a fabricated chain on a fresh database.
- Resistance to DDL-level tampering by a Postgres superuser (dropping the immutability rules). The chain-head + cross-table checks detect this after the fact but cannot prevent it.
- Per-event encryption. Forensic artifact bodies are sealed via the keystore (ArtifactStore), but audit_log.detail is plaintext JSON; do not write secrets into it.
Related
- Multi-Tenant Isolation
- Guardrails
- Observability
- Auth
- Repo doc: docs/audit-compliance.md (operational compliance posture).
Verified against commit 701f8b2e (2026-06-11) · sources a2337b3517aa.