Security Hardening Notes from the Early Audit
Historical post: This post records an early hardening sprint. Some implementation details have changed since it was written. Use the current Security guide, Authentication reference, and Audit Trail reference as the source of truth.
Before shipping AI-in-a-Box to production, we ran a comprehensive security audit across all services. We found 27 vulnerabilities: 5 critical, 8 important, and 14 medium. This post captures the findings and the intended remediation work from that point in time.
The audit process
We reviewed every service for the OWASP Top 10, then added AI-specific checks: prompt injection pathways, guardrail bypass scenarios, and agent tool abuse vectors. Each finding was categorized by severity and given a specific remediation.
Critical fixes
These were stop-shipping issues that could lead to data exposure or remote code execution.
SSRF via URL parameters. The web scraping tool accepted arbitrary URLs from the agent. An attacker could craft a prompt that made the agent scrape http://169.254.169.254/latest/meta-data/ (the AWS metadata endpoint) or internal service URLs. Fix: we added URL validation with an allowlist of schemes (http/https only) and a denylist of internal IP ranges (RFC 1918, link-local, loopback). The Firecrawl integration now rejects requests to non-routable addresses.
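A minimal sketch of the scheme allowlist plus internal-range denylist described above, using only the Python standard library. The function name `is_url_allowed` and the exact set of rejected ranges are assumptions for illustration, not the platform's actual validator. A production check would also need to resolve hostnames and validate every resolved address, which this sketch leaves out.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str) -> bool:
    """Return True for http(s) URLs that do not point at an internal IP literal.

    Hypothetical helper: hostnames are not resolved here, so DNS-based
    bypasses are out of scope for this sketch.
    """
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        # Malformed URL (for example, a broken IPv6 literal).
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Only the obvious loopback name is rejected here.
        return host.lower() != "localhost"
    # Reject RFC 1918, loopback, link-local, and other non-routable ranges.
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )
```

With this check, the metadata URL from the example above is rejected because 169.254.169.254 is a link-local address.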
Command injection in sandbox. The code sandbox passed user input directly into shell commands via string formatting. A carefully crafted filename like ; rm -rf / could escape the intended command. Fix: we replaced string interpolation with proper argument arrays and added input sanitization. The exec_run call now uses ["bash", "-c", command] with the command as a single argument, and file paths go through _validate_path before use.
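To illustrate why argument arrays help, here is a small sketch that uses the standard library's `subprocess` as a stand-in for the sandbox's exec_run call. The helper name `echo_argument` is hypothetical. It shows only the general principle: a value passed as its own argv element is never parsed by a shell, so metacharacters such as `;` stay literal.

```python
import subprocess
import sys


def echo_argument(filename: str) -> str:
    """Pass `filename` to a child process as one argv element, with no shell.

    The child is a tiny Python program that writes its first argument back,
    so we can observe exactly what it received.
    """
    argv = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        filename,  # one element, never interpolated into a command string
    ]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `echo_argument("; rm -rf /")` hands the child the literal string; nothing after the semicolon is executed as a command.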
Path traversal in file operations. The sandbox file read/write endpoints accepted paths like ../../etc/passwd without validation. Fix: we added the _validate_path static method that rejects absolute paths, null bytes, and .. components both before and after normalization. All file operations are confined to /workspace.
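The checks listed above could look roughly like the following. This is a sketch under stated assumptions, not the actual _validate_path implementation: it is written as a plain function named `validate_path`, assumes POSIX-style paths, and returns the path joined onto /workspace.

```python
import posixpath

WORKSPACE = "/workspace"


def validate_path(path: str) -> str:
    """Return a path under WORKSPACE, or raise ValueError for unsafe input."""
    if not path or "\x00" in path:
        raise ValueError("empty path or null byte")
    if path.startswith("/"):
        raise ValueError("absolute paths are not allowed")
    # Check for '..' components before normalization...
    if ".." in path.split("/"):
        raise ValueError("parent directory components are not allowed")
    normalized = posixpath.normpath(path)
    # ...and again afterwards, in case normalization exposed one.
    if normalized.startswith("/") or ".." in normalized.split("/"):
        raise ValueError("path escapes the workspace")
    return posixpath.join(WORKSPACE, normalized)
```

Rejecting `..` outright, rather than resolving it, is the stricter choice: even a path like `a/../b` that stays inside the workspace is refused.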
Guardrail fail-open. When the guardrail service was unreachable (network error, timeout, crash), the agent runtime silently continued without safety checks. Fix: we changed the guardrail client to fail-closed. If the guardrail service returns an error or times out, the request is blocked with an explicit error message. The system prompt is checked on every request, not just the first.
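The fail-closed behavior can be sketched as a thin wrapper around whatever function calls the guardrail service. The names `Verdict` and `check_fail_closed` are hypothetical, and the `check` callable stands in for the real guardrail client, whose interface is not shown in this post.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    allowed: bool
    reason: str


def check_fail_closed(check: Callable[[str], bool], text: str) -> Verdict:
    """Run a guardrail check and block the request on any failure.

    `check` is assumed to return True when the text passes policy and to
    raise on network errors or timeouts.
    """
    try:
        passed = check(text)
    except Exception as exc:
        # Fail closed: an unreachable guardrail blocks the request.
        return Verdict(False, f"guardrail unavailable: {type(exc).__name__}")
    if passed is not True:
        return Verdict(False, "blocked by guardrail policy")
    return Verdict(True, "ok")
```

The key design choice is that the only path to `allowed=True` is an explicit positive answer from the check; every error, timeout, or ambiguous result blocks.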
Audit chain race condition. The hash-chained audit log used a simple in-memory counter for chain sequencing. Under concurrent requests, two events could get the same sequence number, breaking the chain's integrity guarantee. Fix: we moved chain sequencing to a PostgreSQL sequence with a SELECT ... FOR UPDATE lock, ensuring strict ordering even under high concurrency.
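The idea of serializing sequence assignment can be illustrated in-process. In this sketch a `threading.Lock` stands in for the database-level lock, and the event layout (`seq`, `prev`, `payload`, `hash`) and the SHA-256 chaining are assumptions for illustration rather than the platform's actual schema.

```python
import hashlib
import json
import threading

GENESIS = "0" * 64


def _digest(seq: int, prev: str, payload: dict) -> str:
    body = json.dumps({"seq": seq, "prev": prev, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditChain:
    """Hash-chained log where sequence assignment happens under one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stand-in for the database lock
        self._last_hash = GENESIS
        self.events: list[dict] = []

    def append(self, payload: dict) -> dict:
        with self._lock:
            # Sequence number, previous hash, and append are one atomic step.
            seq = len(self.events) + 1
            event = {"seq": seq, "prev": self._last_hash, "payload": payload}
            event["hash"] = _digest(seq, self._last_hash, payload)
            self.events.append(event)
            self._last_hash = event["hash"]
            return event

    def verify(self) -> bool:
        prev = GENESIS
        for expected_seq, event in enumerate(self.events, start=1):
            if event["seq"] != expected_seq or event["prev"] != prev:
                return False
            if event["hash"] != _digest(event["seq"], event["prev"], event["payload"]):
                return False
            prev = event["hash"]
        return True
```

Without the lock, two concurrent callers could read the same sequence number and previous hash, which is the failure mode described above.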
Important fixes
These were significant security gaps that required attention before production use.
httpx client reuse. Several services created a new httpx.AsyncClient on every request, leaking file descriptors under load. More importantly, this meant no connection pooling and no shared TLS session cache. Fix: we moved to a single shared client per service, created at startup and closed on shutdown, with proper connection limits and timeouts.
Session eviction. There was no mechanism to invalidate active sessions. If a user's access was revoked in Keycloak, their existing JWT continued to work until expiration. Fix: we added a token revocation check to the API gateway that validates tokens against Keycloak's introspection endpoint on a configurable interval.
Missing rate limiting on auth endpoints. The login and token refresh endpoints had no rate limiting, allowing brute-force attacks. Fix: we added per-IP rate limiting with exponential backoff on the gateway's auth routes.
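One way to express per-IP limiting with exponential backoff is sketched below. The class name `AuthRateLimiter`, the threshold of five failures, and the delay values are all assumptions. The gateway's real limits are not stated in this post. Timestamps are passed in explicitly so the logic is easy to test.

```python
class AuthRateLimiter:
    """Per-IP lockout whose duration doubles with each extra failure."""

    def __init__(self, max_failures: int = 5, base_delay: float = 1.0,
                 max_delay: float = 300.0) -> None:
        self.max_failures = max_failures
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._failures: dict[str, int] = {}
        self._blocked_until: dict[str, float] = {}

    def allowed(self, ip: str, now: float) -> bool:
        return now >= self._blocked_until.get(ip, 0.0)

    def record_failure(self, ip: str, now: float) -> float:
        """Record a failed attempt and return the lockout delay in seconds."""
        count = self._failures.get(ip, 0) + 1
        self._failures[ip] = count
        if count < self.max_failures:
            return 0.0
        # Cap the exponent so the multiplication cannot overflow.
        exponent = min(count - self.max_failures, 30)
        delay = min(self.base_delay * (2 ** exponent), self.max_delay)
        self._blocked_until[ip] = now + delay
        return delay

    def record_success(self, ip: str) -> None:
        self._failures.pop(ip, None)
        self._blocked_until.pop(ip, None)
```

A real deployment would keep this state in a shared store rather than process memory, so that limits hold across gateway replicas.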
Agent tool abuse. Early delegation prototypes made recursive subagent launches too easy. The current runtime exposes a bounded Delegate tool: configured subagents return one final result to the main agent and cannot dispatch further subagents.
Webhook URL validation. The Dify integration accepted arbitrary webhook URLs for workflow callbacks without validation. Fix: we applied the same URL validation as the SSRF fix to all outbound HTTP calls from the platform.
Memory scope leakage. A bug in the memory service's query filtering allowed agents to read memories from other tenants when the tenant_id parameter was omitted from the query. Fix: we made tenant_id a required parameter at the API level, with the gateway injecting it from the JWT claims so agents cannot override it.
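The gateway-side injection can be sketched as follows. The function name `build_memory_query` and the claim key `tenant_id` are assumptions. The actual claim name in the JWT is not given here. The point of the sketch is that the tenant always comes from verified claims and any client-supplied value is discarded.

```python
def build_memory_query(claims: dict, client_params: dict) -> dict:
    """Build a memory query whose tenant_id always comes from JWT claims.

    `claims` is assumed to be the already-verified JWT payload.
    """
    tenant_id = claims.get("tenant_id")
    if not tenant_id:
        # No tenant in the token means no query, never an unscoped one.
        raise PermissionError("token carries no tenant_id claim")
    # Drop any tenant_id the caller tried to supply, then inject the real one.
    query = {k: v for k, v in client_params.items() if k != "tenant_id"}
    query["tenant_id"] = tenant_id
    return query
```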
Langfuse credentials in logs. The inference router logged full request/response bodies for debugging, which included the Langfuse API keys in headers. Fix: we added a header sanitization pass that redacts Authorization, X-Api-Key, and similar headers before logging.
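A redaction pass of this kind might look like the sketch below. The function name `redact_headers` and the header set beyond Authorization and X-Api-Key are assumptions standing in for the "similar headers" mentioned above.

```python
# Lowercase names of headers whose values must never reach the logs.
# Only the first two are named in the post; the rest are assumed examples.
SENSITIVE_HEADERS = {
    "authorization",
    "x-api-key",
    "proxy-authorization",
    "cookie",
    "set-cookie",
}


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` that is safe to log."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

Header names are compared case-insensitively because HTTP header names are not case-sensitive.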
Unencrypted inter-service communication. Services communicated over plain HTTP within the Docker network. While the Docker network provides some isolation, this is insufficient for compliance requirements. Fix: we added mTLS support via the service mesh (Istio/Linkerd) in Kubernetes deployments, and documented TLS configuration for Docker Compose setups.
Medium fixes
These improved defense-in-depth without addressing immediate exploits.
Docker healthchecks. None of the service containers had healthchecks defined. Docker Compose and Kubernetes had no way to detect a hung service. Fix: we added /health endpoints to every service and corresponding HEALTHCHECK directives in Dockerfiles.
Kubernetes security contexts. The Helm chart deployments ran containers as root by default. Fix: we added securityContext blocks to all pod specs: runAsNonRoot: true, readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and capabilities.drop: [ALL].
NetworkPolicy. The Kubernetes deployment had no NetworkPolicy resources, meaning any pod could communicate with any other pod. Fix: we added NetworkPolicy resources that restrict traffic to the minimum required paths (e.g., only the agent runtime can reach the guardrail service).
CORS configuration. The API gateway accepted * as the allowed origin in the default configuration. Fix: we changed the default to reject all cross-origin requests, requiring administrators to explicitly configure allowed origins.
Missing Content-Security-Policy headers. The frontend nginx configuration served pages without CSP headers. Fix: we added a strict CSP that restricts script sources to the application's own origin.
Redis without AUTH. The default Docker Compose configuration ran Redis without a password. Fix: we added a generated password to deploy/.env.example and configured all services to use it.
MinIO default credentials. The default MinIO deployment used minioadmin/minioadmin. Fix: we took the same approach as for Redis, adding generated credentials to deploy/.env.example.
Qdrant without API key. The vector database was accessible without authentication from any container on the Docker network. Fix: we enabled Qdrant's API key authentication and added the key to the service configuration.
Log injection. User-supplied strings were written directly into structured logs without sanitization. A user could inject fake log entries. Fix: we switched to structured JSON logging with proper escaping across all Python services.
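To show why JSON encoding defeats this attack, here is a minimal formatter built on the standard `logging` and `json` modules. The class name `JsonFormatter` and the chosen fields are assumptions, and the services may well use a logging library instead.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object on a single line."""

    def format(self, record: logging.LogRecord) -> str:
        # json.dumps escapes newlines and quotes, so user-supplied text
        # cannot terminate the entry and start a forged one.
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )
```

A message containing a newline followed by a fake entry ends up as an escaped `\n` inside the `message` string, so the output remains a single log line.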
Sandbox image pinning. The sandbox worker image tag was latest, meaning container content could change unpredictably. Fix: we pinned to a specific digest in the configuration.
Guardrail policy caching. Guardrail policies were fetched from the database on every request with no caching. Under load, this created a hot path to PostgreSQL. Fix: we added a TTL cache (default 60 seconds) for policy lookups, with cache invalidation on policy updates.
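A TTL cache with explicit invalidation can be sketched in a few lines. The class name `TTLCache` and its methods are hypothetical, and the `loader` callable stands in for the database lookup. Only the 60-second default comes from the text above.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Cache values for a fixed time and allow explicit invalidation."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh
        value = loader()  # for example, a policy lookup in PostgreSQL
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call this when a policy is updated so the next read reloads it."""
        self._entries.pop(key, None)
```

The clock is injectable so that expiry can be tested without sleeping.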
Missing request size limits. The API gateway had no maximum request body size, allowing memory exhaustion via large uploads. Fix: we added a configurable max_body_size (default 10MB) to the gateway.
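The limit is most useful when enforced while the body is still streaming in, so an oversized upload is rejected before it is fully buffered. The sketch below assumes the body arrives as an iterable of byte chunks. `read_limited` and `BodyTooLarge` are hypothetical names, and 10MB is interpreted here as 10 * 1024 * 1024 bytes.

```python
from typing import Iterable

DEFAULT_MAX_BODY_SIZE = 10 * 1024 * 1024  # assumed reading of "10MB"


class BodyTooLarge(ValueError):
    """Raised when a request body exceeds the configured limit."""


def read_limited(chunks: Iterable[bytes],
                 max_body_size: int = DEFAULT_MAX_BODY_SIZE) -> bytes:
    """Collect a streamed body, aborting as soon as the limit is exceeded."""
    received = bytearray()
    for chunk in chunks:
        if len(received) + len(chunk) > max_body_size:
            # Stop reading immediately instead of buffering the rest.
            raise BodyTooLarge(f"body exceeds {max_body_size} bytes")
        received.extend(chunk)
    return bytes(received)
```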
Dependency pinning. Python requirements.txt files used >= version specifiers. Fix: we switched to exact pins with hashes for reproducible builds.
TLS certificate validation. The external LLM gateway did not verify TLS certificates when connecting to OpenRouter. Fix: we enabled certificate verification with the system CA bundle.
Lessons learned
Three patterns stood out:
- Fail-closed by default. Every security boundary (guardrails, auth, rate limiting) should block requests when the checking mechanism is unavailable. Fail-open is the most dangerous default in a security-sensitive system.
- Multi-tenant isolation is an ongoing discipline. Adding tenant_id to every table is the easy part. The hard part is ensuring every query path, every cache key, and every error message respects the tenant boundary.
- AI-specific attack surfaces are real. SSRF via agent tool calls, prompt injection to bypass guardrails, and recursive agent spawning are not theoretical. They showed up in our audit and needed specific mitigations.