Skip to content

Security audit + SEC-001 fix: shared agent loader with validation#3

Open
roybotbot wants to merge 5 commits intodisler:mainfrom
roybotbot:sec-001
Open

Security audit + SEC-001 fix: shared agent loader with validation#3
roybotbot wants to merge 5 commits intodisler:mainfrom
roybotbot:sec-001

Conversation

@roybotbot
Copy link

Summary

Security audit of the full codebase identifying 15 issues, plus a fix for the highest-priority one.

Security Audit (SECURITY_AUDIT.md)

Comprehensive analysis covering all extensions, agent definitions, configuration files, and damage-control rules:

Severity Count
🔴 Critical 4
🟠 High 4
🟡 Medium 4
🟢 Low 3

Each issue includes a description, fix plan, and behavior change analysis.

SEC-001 Fix: Shared Agent Loader

agent-team.ts, agent-chain.ts, and pi-pi.ts each had duplicate parseAgentFile() functions that loaded agent .md files and passed raw system prompts into spawn() with no validation. A malicious .md file could inject content into subprocess arguments.

Fix: New shared loader at extensions/utils/agent-loader.ts that validates:

  • Name — alphanumeric + dashes/underscores/dots, max 64 chars, rejects shell metacharacters
  • Tools — checked against a known allowlist, warns on unknowns
  • System prompt — scanned for injection patterns ($(…), pipe to shell, null bytes, eval(), etc.), capped at 50K chars

Error-severity issues reject the agent at load time. Warning-severity issues allow loading but surface warnings. All 18 existing agent .md files load with zero warnings.

Tests

42 tests covering all validation logic:

npx tsx --test tests/agent-loader.test.ts

Files Changed

  • New: extensions/utils/agent-loader.ts — shared validated loader
  • New: tests/agent-loader.test.ts — 42 tests
  • New: SECURITY_AUDIT.md — full audit document
  • Modified: extensions/agent-team.ts — uses shared loader, removed duplicate code
  • Modified: extensions/agent-chain.ts — uses shared loader, removed duplicate code
  • Modified: extensions/pi-pi.ts — uses shared loader, removed duplicate code
  • Modified: README.md — documents validation, tests, and audit

Comprehensive security analysis of all extensions, agent definitions,
configuration files, and damage-control rules. Covers:
- 4 critical (subprocess injection, env inheritance, path bypass, ReDoS)
- 4 high (env sample, session files, input validation, foreign commands)
- 4 medium (swallowed errors, global trust, race conditions, weak heuristics)
- 3 low/informational (argv coupling, no sandboxing, parent path ref)
Replace duplicate parseAgentFile/scanAgentDirs across agent-team,
agent-chain, and pi-pi with a shared agent-loader.ts that validates:
- Agent names (alphanumeric + dashes/underscores/dots, max 64 chars)
- Tools (checked against known allowlist)
- System prompts (scanned for shell injection patterns, max 50K chars)

Invalid agents (error-severity) are rejected at load time. Suspicious
content (warning-severity) allows loading but surfaces warnings.

Includes 42 tests covering all validation logic.
All 18 existing agent .md files load with zero warnings.
scanAgentDirectory now walks directories recursively using
readdirSync({ recursive: true }), enabling nested organization:

  .pi/agents/
  ├── review_agents/
  │   ├── code_reviewer.md
  │   └── security_reviewer.md
  └── build_agents/
      └── ts_builder.md

Collision detection: when duplicate agent names (case-insensitive)
are found, first-wins and a CollisionWarning is returned with both
file paths. All three extension call sites (agent-team, agent-chain,
pi-pi) surface collisions via ui.notify() at session start.

New exports from agent-loader.ts:
  - CollisionWarning: { name, duplicatePath, originalPath }
  - ScanResult: { agents: Map, collisions: CollisionWarning[] }

7 new tests covering recursive walk, deep nesting, collision
detection, mixed flat+nested, frontmatter-less .md skipping,
and collision path validity. All 49 tests pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants