Skip to content

[DBA-290] Refactor review skills to use sub-agent behavior#73

Merged
gasparian merged 6 commits intomainfrom
andrei/DBA-290-refactor-review-skills-to-subagents
Mar 25, 2026
Merged

[DBA-290] Refactor review skills to use sub-agent behavior#73
gasparian merged 6 commits intomainfrom
andrei/DBA-290-refactor-review-skills-to-subagents

Conversation

@gasparian
Copy link
Contributor

@gasparian gasparian commented Mar 23, 2026

Summary

Refactor local-code-review and review-architecture skills to run in isolated sub-agent context using Claude Code's native context: fork mechanism. Reviews execute with zero prior conversation state, ensuring unbiased findings. Also adds a pacing rule for user confirmation between workflow phases and improves eval-skills to support multi-skill evaluation.

Changes

Use native context: fork for review skill isolation

  • Skills run directly as forked sub-agents — no manual orchestration needed
  • Add argument-hint, context: fork, agent: general-purpose frontmatter
  • Use $ARGUMENTS for scope parameter (staged, branch, files:<path>, module:<path>, full)
Files
  • .claude/skills/local-code-review/SKILL.md
  • .claude/skills/review-architecture/SKILL.md

Add pacing rule for user confirmation between phases

  • Agent pauses between plan → implement → validate → commit → PR
  • Add explicit Plan step to the workflow
  • Review findings require user approval before fixing
Files
  • CLAUDE.md

Improve eval-skills to support multi-skill evaluation

  • Accept skill names as arguments (/eval-skills local-code-review review-architecture)
  • Add argument-hint to frontmatter
  • When no arguments provided, ask the user which skills to evaluate
Files
  • .claude/skills/eval-skills/SKILL.md

Update evals and smoke tests

  • Eval assertions test review quality directly (not orchestration)
  • Remove review-guidelines.md smoke test checks (files removed)
Files
  • .claude/skills/local-code-review/evals/evals.json
  • .claude/skills/review-architecture/evals/evals.json
  • scripts/smoke-test-skills.sh

Fix flaky duckdb test

  • test_build and test_index used shared default duckdb connection causing table collision
Files
  • tests/test_build.py
  • tests/test_index.py

Test Plan

  • make check passes
  • make test passes (66/66, flaky test fixed)
  • make lint-skills passes
  • All 10 eval JSON files valid (36 total eval cases)
  • Manual test: /local-code-review runs in forked sub-agent context
  • eval-skills iteration 1 completed — review-architecture with-skill 100% vs without-skill 85%

🤖 Generated with Claude Code

Extract review logic from SKILL.md into separate review-guidelines.md
files. Skills now act as thin orchestrators that spawn sub-agents with
clean context (no prior conversation state), ensuring unbiased reviews.

- local-code-review and review-architecture accept a scope parameter
- Both skills require explicit user approval before fixing findings
- Evals updated to assert sub-agent delegation pattern
- Smoke tests verify review-guidelines.md file existence
- CLAUDE.md workflow merges review steps and notes parallel execution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gasparian gasparian requested a review from a team March 23, 2026 18:15
Andrei Gasparian and others added 4 commits March 23, 2026 19:17
Both test_build and test_index called duckdb.execute() on the shared
default in-memory connection instead of the file-scoped connection,
causing "table t1 already exists" when both tests ran in the same suite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace manual sub-agent orchestration with Claude Code's native
context:fork + agent frontmatter fields. Skills now run directly in
a forked sub-agent context, eliminating the need for separate
review-guidelines.md files and orchestrator logic.

- Add argument-hint, context:fork, agent:general-purpose to frontmatter
- Use $ARGUMENTS for scope parameter substitution
- Merge review guidelines back into SKILL.md (it IS the sub-agent now)
- Remove review-guidelines.md files (no longer needed)
- Move "ask before fixing" guardrail to CLAUDE.md workflow
- Update evals to test review quality directly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce a general pacing guideline requiring the agent to pause for
user confirmation between significant workflow phases (plan, implement,
validate, commit, PR). Add explicit Plan step to the workflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Accept skill names as arguments (e.g. /eval-skills local-code-review).
When no arguments provided, ask the user which skills to evaluate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
before or after significant changes.
You are reviewing the architecture of the Databao CLI project.
You have NO prior context about why these changes were made — review
purely on merit. Use your tools (Read, Grep, Glob, Bash) to inspect
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's necessary to specify a list of tools? I'm sure Claude Code will use them anyway. If you want to limit the tools list, use the allowed-tools header parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point regarding tools - I'll better add it to the header indeed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I've dropped these mentions, and added a separate agent specification for that at .claude/agents/, with read-only tools available only.
ptal

Replace agent: general-purpose with agent: reviewer in both review
skills. The reviewer agent restricts tools to Read, Glob, Grep, Bash,
structurally preventing file edits. Remove redundant "Do NOT edit"
instructions and replace verbose rewrite guardrail with concise
snippet guidance.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gasparian gasparian requested a review from catstrike March 25, 2026 11:08
@gasparian gasparian merged commit 9d7814d into main Mar 25, 2026
6 checks passed
@gasparian gasparian deleted the andrei/DBA-290-refactor-review-skills-to-subagents branch March 25, 2026 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants