Defense-in-depth security architecture for the Masters Of AI Harness. Each layer is independent — a bypass at one layer is caught by the next.
For the comprehensive security narrative (threat model, data flow diagrams, known limitations, audit history), see security.md.
The harness runs LLM agents with tool access (shell, web, files, A2A). Threats include:
- Prompt injection: Malicious content in fetched web pages or memory influencing agent behavior
- SSRF: Agent tricked into fetching internal/private network resources
- Credential leakage: Agent exfiltrating secrets via shell, web, or A2A tools
- Lateral movement: Compromised agent accessing other agents or users
- DNS rebinding: Attacker flipping DNS after validation to reach internal hosts
- Fork bombs: Resource exhaustion via unlimited worker spawning
- IPC injection: Compromised worker injecting crafted WebSocket frames
Shell commands receive a minimal allowlist of environment variables (PATH, HOME, TERM, TZ, LANG, USER). Agent credentials are never passed to shell processes. This prevents accidental or malicious credential access via env, printenv, or $VAR expansion.
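As a minimal sketch of the allowlist approach described above: the function name `buildShellEnv` and the constant name are hypothetical, and only the six variable names come from this document.

```typescript
// Hypothetical constant: the six variable names are the ones listed in this document.
const SHELL_ENV_ALLOWLIST = ["PATH", "HOME", "TERM", "TZ", "LANG", "USER"];

// Build the environment for a shell child process by copying only allowlisted keys.
// Anything else in the parent environment (API keys, tokens) is never copied.
function buildShellEnv(
  parentEnv: Record<string, string | undefined>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of SHELL_ENV_ALLOWLIST) {
    const value = parentEnv[key];
    if (value !== undefined) env[key] = value;
  }
  return env;
}

// Example: the secret is dropped, PATH and HOME pass through.
const exampleShellEnv = buildShellEnv({
  PATH: "/usr/bin",
  HOME: "/home/agent",
  API_KEY: "secret-value",
});
```

The result would typically be passed as the `env` option when spawning the shell process, so the child starts from the allowlist rather than inheriting the parent's environment.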
All outbound HTTP requests (web_fetch, web_search, A2A) are validated against blocked IP ranges:
- RFC 1918 private ranges (10.x, 172.16-31.x, 192.168.x)
- Loopback (127.x, ::1)
- Link-local (169.254.x, fe80::)
- Cloud metadata (169.254.169.254)
Bypass hardening (Wave 4):
- IPv4-mapped IPv6: Both dotted (::ffff:127.0.0.1) and hex-short (::ffff:7f00:1) forms are normalized to IPv4 before checking. The hex-short form is what new URL() produces.
- Exotic IP encoding: Decimal (2130706433), hex (0x7f000001), and octal (0177.0.0.1) representations are parsed and normalized (defense-in-depth; new URL() normalizes most of these)
- Protocol restriction: Only http: and https: URLs are allowed; file://, data:, ftp://, etc. are rejected
- Redirect chaining: Each redirect hop re-validates the target URL against the SSRF blocklist and egress filter
- DNS rebinding / TOCTOU: Known limitation — the validate-then-fetch pattern has a narrow TOCTOU window. True DNS pinning for HTTPS requires undici dispatcher customization (tracked for future enhancement). The SSRF blocklist is the primary defense; rebinding requires attacker-controlled DNS.
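The blocklist and normalization rules above could look roughly like the following. This is a sketch under stated assumptions, not the harness's validateUrl: all function names are hypothetical, it checks only literal IP hosts (no DNS resolution, so it does not address rebinding), and it relies on `new URL()` to normalize decimal, hex, and octal IPv4 forms.

```typescript
// Parse a canonical dotted-quad IPv4 string into four octets, or return null.
function parseIpv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// Reduce IPv4-mapped IPv6 (dotted or hex-short) to dotted IPv4.
// Other hosts are returned lowercased with any brackets removed.
function normalizeHost(host: string): string {
  const h = host.replace(/^\[|\]$/g, "").toLowerCase();
  const dotted = h.match(/^::ffff:(\d{1,3}(?:\.\d{1,3}){3})$/);
  if (dotted) return dotted[1];
  const hex = h.match(/^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/);
  if (hex) {
    const hi = parseInt(hex[1], 16);
    const lo = parseInt(hex[2], 16);
    return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join(".");
  }
  return h;
}

// True when the literal address falls in one of the blocked ranges listed above.
function isBlockedAddress(host: string): boolean {
  const h = normalizeHost(host);
  const v4 = parseIpv4(h);
  if (v4) {
    const [a, b] = v4;
    return (
      a === 10 ||                          // 10.0.0.0/8
      (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
      (a === 192 && b === 168) ||          // 192.168.0.0/16
      a === 127 ||                         // loopback
      (a === 169 && b === 254)             // link-local, includes 169.254.169.254
    );
  }
  // IPv6 loopback and link-local (fe80::/10).
  return h === "::1" || /^fe[89ab][0-9a-f]:/.test(h);
}

// Protocol restriction plus blocklist check on the parsed hostname.
function isAllowedUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  return !isBlockedAddress(url.hostname);
}
```

In a redirect-following fetch loop, the same check would run again on every `Location` target before the next hop is requested.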
Agents declare which credentials they need in frontmatter. The CredentialStore enforces least-privilege:
- Strict mode (credentials declared): Only granted keys are available, scoped to specific tool domains
- Legacy mode (no credentials block): All keys available (backward compatible)
- Audit logging: Every credential resolution is logged with domain and key names
See credentials.md for the credential format reference.
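The difference between strict and legacy mode can be illustrated with a small sketch. The `CredentialGrant` shape and `resolveCredentials` function are hypothetical stand-ins; the actual frontmatter format is the one documented in credentials.md, and audit logging is omitted here.

```typescript
// Hypothetical grant shape, for illustration only.
interface CredentialGrant {
  key: string;           // credential name
  toolDomains: string[]; // tool domains allowed to read it
}

function resolveCredentials(
  allSecrets: Record<string, string>,
  grants: CredentialGrant[] | undefined,
  toolDomain: string,
): Record<string, string> {
  // Legacy mode: no credentials block declared, so every key is available.
  if (grants === undefined) return { ...allSecrets };

  // Strict mode: only granted keys, and only for the tool domains they name.
  const resolved: Record<string, string> = {};
  for (const grant of grants) {
    const value = allSecrets[grant.key];
    if (value !== undefined && grant.toolDomains.includes(toolDomain)) {
      resolved[grant.key] = value;
    }
  }
  return resolved;
}
```

Note that in this sketch an empty grants array is still strict mode and yields no credentials; only a missing block falls back to legacy behavior.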
Agents declare allowed outbound domains in frontmatter. The EgressFilter blocks requests to any domain not on the allowlist. Supports exact match and wildcard patterns (*.supabase.co). Works with both web tools and A2A tools.
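One plausible way to implement exact and wildcard matching is sketched below. The function name `isEgressAllowed` is hypothetical, and the choice that a wildcard pattern does not match the bare apex domain is an assumption, not something this document specifies.

```typescript
// Hypothetical matcher: exact hostnames plus "*.suffix" wildcard patterns.
function isEgressAllowed(hostname: string, allowlist: string[]): boolean {
  const host = hostname.toLowerCase();
  return allowlist.some((pattern) => {
    const p = pattern.toLowerCase();
    if (p.startsWith("*.")) {
      // Compare against ".suffix" so that "evilsupabase.co" cannot match "*.supabase.co".
      // Assumption: the wildcard requires at least one label before the suffix.
      const suffix = p.slice(1);
      return host.endsWith(suffix) && host.length > suffix.length;
    }
    return host === p;
  });
}
```

Matching on the dot-prefixed suffix rather than a plain `endsWith` on the domain is the detail that keeps look-alike hostnames from slipping through.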
Fetched web content is wrapped in <fetched_content> structural tags. Memory from prior sessions is wrapped in <memory_context> tags. The system prompt includes instructions for the model to treat tagged content as untrusted and never follow instructions within it.
This is a probabilistic defense — it reduces the success rate of prompt injection but doesn't eliminate it. It works alongside the deterministic layers above.
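A wrapping helper for the tagging described above might look like this. The name `wrapUntrusted` is hypothetical, and escaping an embedded closing tag is an added assumption (the document does not say whether the harness does this); it is included because otherwise fetched content could close the wrapper early.

```typescript
// Wrap untrusted text in one of the two structural tags named in this document.
function wrapUntrusted(
  tag: "fetched_content" | "memory_context",
  body: string,
): string {
  // Assumption: neutralize any embedded closing tag so the content cannot end the wrapper.
  const closing = new RegExp(`</\\s*${tag}\\s*>`, "gi");
  const safe = body.replace(closing, `&lt;/${tag}&gt;`);
  return `<${tag}>\n${safe}\n</${tag}>`;
}
```

Even with escaping, this remains a probabilistic defense: it keeps the boundary well-formed, but whether the model honors the boundary is still up to the model.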
Multi-layer tool gating:
- Global config: Tool domains enabled/disabled in config.yaml
- Agent filter: Per-agent tool allow/deny in frontmatter
- Sandbox policy: Remote sessions restrict shell access
- User deny: Per-user tool restrictions in access.yaml
- Operation allowlist: Per-agent operation-level restrictions (e.g., read-only Braintree)
- Tokens stored as SHA-256 hashes in access.yaml (never plaintext)
- Constant-time comparison prevents timing attacks
- Connected WebSocket clients store token hash, not raw token
- WS query parameter tokens deprecated (log warning) — prefer Authorization header
- Per-user rate limiting, connection limits, idle timeouts
- A2A server requires bearer token auth on JSON-RPC endpoint
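Hashed token storage with constant-time comparison can be sketched with Node's built-in crypto module. The function names are hypothetical, and storing the hash as a hex string is an assumption about the access.yaml format.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// SHA-256 of the raw token; only this hash would be persisted or kept in memory.
function hashToken(token: string): Buffer {
  return createHash("sha256").update(token, "utf8").digest();
}

// Compare a presented token against a stored hex hash in constant time.
function tokenMatches(presented: string, storedHashHex: string): boolean {
  const presentedHash = hashToken(presented);
  const stored = Buffer.from(storedHashHex, "hex");
  // timingSafeEqual throws on length mismatch, so reject malformed hashes first.
  if (stored.length !== presentedHash.length) return false;
  return timingSafeEqual(presentedHash, stored);
}
```

Comparing hashes rather than raw tokens also means the comparison inputs always have the same length, which is what `timingSafeEqual` requires.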
- Fork-per-session: Each WebSocket conversation runs in an isolated child process
- IPC protocol: Typed, validated messages between parent and worker — frame types checked against an allowlist before relay to WebSocket
- Frame sanitization: Worker frames are reconstructed with only known fields before relay — extra properties from a compromised worker are stripped
- Worker capacity: Configurable maxWorkers cap (default 20) prevents fork bombs
- Worker lifecycle: Ready timeout (30s), idle timeout, crash cleanup with pending approval rejection
- Sandbox (bubblewrap): Filesystem and network namespace isolation for CLI mode
- Remote sandbox policy: Shell disabled by default in serve mode
- Per-user directories: Workspace, memory, and logs are isolated per remote user
- External MCP: Command-based MCP servers blocked in serve mode (arbitrary command execution)
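The frame type allowlist and frame sanitization steps listed above can be illustrated as follows. The frame types and field names here are invented for the example, since the real IPC protocol's frames are not listed in this document.

```typescript
// Hypothetical frame types and their known fields.
const KNOWN_FRAME_FIELDS = new Map<string, readonly string[]>([
  ["text", ["type", "content"]],
  ["tool_request", ["type", "toolId", "toolName"]],
]);

// Rebuild a worker frame from known fields only; return null for unknown frame types.
function sanitizeFrame(raw: unknown): Record<string, unknown> | null {
  if (typeof raw !== "object" || raw === null) return null;
  const input = raw as Record<string, unknown>;
  const frameType = input["type"];
  if (typeof frameType !== "string") return null;

  // Allowlist check: unknown frame types are never relayed.
  const fields = KNOWN_FRAME_FIELDS.get(frameType);
  if (!fields) return null;

  // Reconstruction: copy only known, own properties into a fresh object.
  const clean: Record<string, unknown> = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(input, field)) {
      clean[field] = input[field];
    }
  }
  return clean;
}
```

Building a new object, rather than deleting unwanted keys from the worker's object, means nothing the worker attached can reach the WebSocket unless it is explicitly copied.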
All incoming WebSocket messages are validated against Zod schemas before processing:
- subscribe: agentId max 200 chars, sessionId max 200 chars, lastMessageId non-negative integer bounded to Number.MAX_SAFE_INTEGER
- message: content min 1, max 200,000 characters
- interrupt, ping: empty payload
- tool_approval: toolId non-empty, approved boolean
- consent_granted: policyVersion non-empty
The TypeScript WsClientMessage type is derived directly from the Zod schema via z.infer<>, making the schema the single source of truth. Schema/type divergence is structurally impossible.
Outgoing WebSocket messages are typed as WsServerMessage — the safeSend helper enforces this at compile time, catching protocol drift.
- Structured logging: JSON objects to stdout. Never logs: message content, API keys, tokens, or full tool inputs/outputs
- Health monitoring: Shallow (/health, no auth) and deep (/health/deep, admin-only, cached 30s) endpoints
- Worker pool stats: Active/max/utilization reported in deep health checks
- Bounded arrays: Prune-on-insert strategy for error/success tracking (every 1,000 insertions, trim entries > 1 hour old)
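The prune-on-insert strategy in the last bullet can be sketched as a small class. The class name and its fields are hypothetical; the defaults mirror the figures given above (prune every 1,000 insertions, drop entries older than one hour).

```typescript
// Hypothetical tracker that stores event timestamps and prunes lazily on insert.
class BoundedEventLog {
  private entries: number[] = []; // event timestamps in milliseconds
  private insertsSincePrune = 0;
  private readonly pruneEvery: number;
  private readonly maxAgeMs: number;

  constructor(pruneEvery = 1000, maxAgeMs = 60 * 60 * 1000) {
    this.pruneEvery = pruneEvery;
    this.maxAgeMs = maxAgeMs;
  }

  // Record an event; every pruneEvery insertions, drop entries older than maxAgeMs.
  record(now: number): void {
    this.entries.push(now);
    this.insertsSincePrune += 1;
    if (this.insertsSincePrune >= this.pruneEvery) {
      const cutoff = now - this.maxAgeMs;
      this.entries = this.entries.filter((t) => t >= cutoff);
      this.insertsSincePrune = 0;
    }
  }

  get size(): number {
    return this.entries.length;
  }
}
```

Pruning only every N insertions amortizes the filter cost; the trade-off is that the array can briefly hold up to N stale entries between prunes.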
User Input
→ canUseTool (tool access control)
→ CredentialStore (credential scoping)
→ EgressFilter (domain allowlist)
→ validateUrl (SSRF blocklist + DNS pinning; pinning for HTTPS is a known limitation)
→ fetch (with pinned IP where pinning applies)
→ Content tagging (<fetched_content>)
→ Model (with content boundary instructions)
Security is configured at three levels:
- Global (~/.mastersof-ai/config.yaml): Tool domain enable/disable
- Agent (frontmatter in IDENTITY.md): Credentials, egress, tool filters, operations
- User (~/.mastersof-ai/access.yaml): Token auth, agent access, tool deny, budgets
- Security Narrative — Comprehensive security documentation for audit
- Agent Security Guide — How to author secure agents
- Credentials Reference — Credential configuration format
- Sandbox — Process isolation details
- Configuration — Full config reference