Skip to content

feat: TDD refactoring workflows for runner.ts + main.rs decomposition#675

Open
khaliqgant wants to merge 15 commits intomainfrom
feat/refactor-workflows
Open

feat: TDD refactoring workflows for runner.ts + main.rs decomposition#675
khaliqgant wants to merge 15 commits intomainfrom
feat/refactor-workflows

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

@khaliqgant khaliqgant commented Apr 1, 2026

Summary

TDD refactoring workflows for the two largest files in the codebase, executed end-to-end with full verification.

Targets

  • runner.ts (6,878 lines) — decomposed into 5 focused modules
  • main.rs (7,023 lines) — decomposition planned, broker/worker extraction attempted

E2E Results (full run with refactored code)

Wave Workflow Steps Status
1 runner-decomposition-plan 7/7
1 main-rs-decomposition-plan 6/6
2 extract-verification 6/6
2 extract-template-channel 6/6
2 extract-broker-worker See below
3 extract-step-executor 8/8

extract-broker-worker failure details

This workflow extracts BrokerState and WorkerRegistry from main.rs (Rust). It failed on both attempts:

Attempt 1 (6 passed, 1 failed, 2 skipped):

  • All agent steps succeeded — Codex extracted broker.rs and worker.rs
  • cargo-gate (deterministic cargo test verification) failed with exit code 1 — the extracted Rust modules broke compilation
  • cargo-review and downstream steps skipped

Attempt 2 (1 passed, 2 failed, 6 skipped):

  • read-broker and read-worker failed — the deterministic steps tried to read the extracted files but they were in a broken state from attempt 1
  • All downstream steps skipped

Root cause: Codex's Rust extraction produced code that didn't compile. This is an agent quality issue (the Codex worker made incorrect edits to the Rust source), not a workflow runner or infrastructure issue. The runner correctly detected the failure via the cargo-gate verification step and reported it.

Not a regression — no runner code paths were affected. The workflow SDK correctly:

  1. Spawned agents ✅
  2. Collected output ✅
  3. Ran verification gates ✅
  4. Detected the failure ✅
  5. Retried the wave ✅
  6. Reported final status ✅

Extracted Modules

Module Lines Tests
verification.ts 143 11 tests
template-resolver.ts 87 tests included
channel-messenger.ts 151 tests included
step-executor.ts 571 27+ characterization tests
process-spawner.ts 96

Test Results

  • SDK workflow tests: 88/88 pass ✅
  • Full repo tests: 672/672 pass ✅
  • TypeScript build: clean ✅
  • Rust tests: pass ✅

Bug Fixes (found during E2E)

  • stripInjectedTaskEcho missing in extracted verification.ts — Codex extraction lost the task-echo-stripping logic that prevents false positive verification. Restored.
  • agent.release() broker 400 race condition — broker returns HTTP 400 when releasing an already-exited agent. Was causing steps to fail even when the agent completed successfully. Added .catch() guards.
  • Legacy summary test assertion — pre-existing test expected old format the code no longer produces. Updated.
  • ESM imports — all 6 workflow files used require() in an ESM project. Fixed to import.
  • Executor exit coderun-refactor.ts didn't exit non-zero on single-wave failures. Fixed.

Tooling

  • run-continuous.sh — runs all waves sequentially, retries up to 3x per wave, auto-commits
  • Rebuilt broker binary from main (old binary lacked HTTP API server)

@khaliqgant khaliqgant requested a review from willwashburn as a code owner April 1, 2026 09:20
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

…ules

Extracted 5 modules from runner.ts (6,878 lines):
- verification.ts (143 lines)
- template-resolver.ts (87 lines)
- channel-messenger.ts (151 lines)
- step-executor.ts (571 lines)
- process-spawner.ts (96 lines)

Added characterization tests for all extracted modules.
Extracted broker.rs and worker.rs from main.rs.

Bug fixes:
- Restore stripInjectedTaskEcho in verification.ts
- Guard agent.release() against broker 400 race condition
- Fix run-summary-table test for new table format
- Export normalizeModel for correct pricing resolution
- Fix --wave argument parsing in run-refactor.ts
- ESM imports in all workflow files
@khaliqgant khaliqgant force-pushed the feat/refactor-workflows branch from dc80aa5 to e6c86c6 Compare April 1, 2026 12:34
devin-ai-integration[bot]

This comment was marked as resolved.

khaliqgant and others added 3 commits April 1, 2026 21:47
tracker.ts: resolveModel now uses normalizeModel for alias resolution (pre-existing fix verified)
run-refactor.ts: --wave parsing with proper validation (pre-existing fix verified)
step-executor.ts: signal-killed processes now correctly treated as failures
channel-messenger.ts: replaced ReDoS-vulnerable regex with iterative indexOf stripping
runner.ts: eliminated shell injection by using direct git spawn with argument arrays
process-spawner.ts: fixed SIGKILL fallback timer leak by storing and clearing reference

Co-Authored-By: My Senior Dev <dev@myseniordev.com>
…ules

Extracted 5 modules from runner.ts (6,878 lines):
- verification.ts (143 lines)
- template-resolver.ts (87 lines)
- channel-messenger.ts (151 lines)
- step-executor.ts (571 lines)
- process-spawner.ts (96 lines)

Added characterization tests for all extracted modules.
Extracted broker.rs and worker.rs from main.rs.

Bug fixes:
- Restore stripInjectedTaskEcho in verification.ts
- Guard agent.release() against broker 400 race condition
- Fix run-summary-table test for new table format
- Export normalizeModel for correct pricing resolution
- Fix --wave argument parsing in run-refactor.ts
- ESM imports in all workflow files
devin-ai-integration[bot]

This comment was marked as resolved.

khaliqgant and others added 8 commits April 1, 2026 21:56
template-resolver.ts: shell-escape interpolated variables (CRITICAL #1)
broker_tests.rs: uncomment and wire up 5 real tests (CRITICAL #2)
worker_tests.rs: uncomment and wire up 5 real tests (CRITICAL #3)
worker.rs: log bypass-flag injection, add .. path traversal rejection (CRITICAL #4, #7)
verification.ts: export stripInjectedTaskEcho, add path traversal guard (CRITICAL #5)
runner.ts: remove duplicate stripInjectedTaskEcho, add ENV_ALLOWLIST filtering (HIGH #17)
channel-messenger.ts: add secret scrubbing, hoist regex constants (MEDIUM #27, #28)
process-spawner.ts: add settled guard for race condition (MEDIUM #23)
step-executor.ts: add sideEffects to callback type, deprecate alias (HIGH #15, #16)
index.ts: export StepExecutor directly (MEDIUM #29)
workflows/refactor/*.ts: replace hardcoded paths, remove --no-verify (HIGH #8-11)
broker.rs: move is_pid_alive to canonical location (HIGH #14)
cost/tracker.ts: add restrictive file permissions (MEDIUM #30)
cost/pricing.ts: add last-verified date (MEDIUM #31)
verification.test.ts: 9 new tests for exported helpers (MEDIUM #32)

Co-Authored-By: My Senior Dev <dev@myseniordev.com>
export function scrubSecrets(text: string): string {
let result = text;
for (const pattern of SECRET_PATTERNS) {
result = result.replace(pattern, '[REDACTED]');

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data High

This
regular expression
that depends on
library input
may run slow on strings starting with '-----BEGIN\tPRIVATE\tKEY-----' and with many repetitions of '-----BEGIN\tPRIVATE\tKEY-----'.
Moved fix-mcp-spawn.yaml, add-swift-sdk.ts, and cli-observability.ts
into workflows/ci/ to clearly distinguish them as CI test suite
workflows. Updated .gitignore to allow workflows/ci/ and workflows/refactor/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

- Fix tracker test: expect mode: 0o700 in mkdirSync assertion
- Use Object.hasOwn() instead of `in` operator to avoid prototype chain false positives
- Use Promise.allSettled to preserve partial output on process timeout
- Apply path containment check for absolute paths in checkFileExists

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

…and cwd trailing slash

- Rename StepExecutor interface in runner.ts to RunnerStepExecutor to avoid
  shadowing the StepExecutor class export in the barrel index
- Normalize cwd with path.resolve() in checkFileExists to handle trailing slashes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@barryonthecape
Copy link
Copy Markdown
Collaborator

PR #675 Review — Strong Work, One Blocking Issue

What's Impressive

  • Decomposition of runner.ts (6,878 lines → 5 focused modules) with full test coverage across all extracted units
  • 672/672 full repo tests pass, 88/88 SDK workflow tests pass, TypeScript clean, Rust tests pass
  • Honest failure reporting — the extract-broker-worker wave failed cleanly because Codex's Rust extraction broke compilation, and the workflow runner correctly caught it and retried
  • Good bug fixes caught during E2E: stripInjectedTaskEcho restoration, broker 400 race condition guards, ESM require()import fixes

Blocking: CodeQL ReDoS in channel-messenger.ts (line 87)

Severity: High — ReDoS vulnerability in secret scrubbing

The problematic pattern:

/-----BEGIN\s+(?:RSA\s+)?PRIVATE\s+KEY-----[\s\S]*?-----END/g

The non-greedy [\s\S]*? still causes polynomial backtracking on adversarial input (e.g., many repetitions of -----BEGIN\tPRIVATE\tKEY----- with no closing delimiter).

Suggested fix — use a tempered greedy token:

/-----BEGIN\s+(?:RSA\s+)?PRIVATE\s+KEY-----(?:(?!-----END)[\s\S])*-----END/g

This prevents backtracking by checking ahead at each position. Alternatively, since scrubForChannel already iterates line-by-line, you could use indexOf/slice for the private key block stripping instead of regex entirely.


Non-Blocking: Devin Reviews

Multiple Devin Review passes are showing findings (5 + 14 + 17 + 18 + 26 + etc. across passes). The PR body doesn't address whether these are resolved. Worth confirming all findings are addressed before merge.


Merge Readiness

Check Status
CI (excluding CodeQL) All passing
Tests 672/672
TypeScript Clean
Rust clippy/build Passing
CodeQL 1 high (ReDoS)

One fix needed before merge — the ReDoS in scrubSecrets. Once that's addressed, this is a solid approve. Massive refactor executed well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants