diff --git a/plugins/agentdev/skills/debug-mode/SKILL.md b/plugins/agentdev/skills/debug-mode/SKILL.md index 0d70994..1d1c72e 100644 --- a/plugins/agentdev/skills/debug-mode/SKILL.md +++ b/plugins/agentdev/skills/debug-mode/SKILL.md @@ -1,398 +1,89 @@ --- name: debug-mode -description: | - Enable, disable, and manage debug mode for agentdev sessions. - Records all tool invocations, skill activations, hook triggers, and agent delegations to JSONL. - Use when debugging agent behavior, optimizing workflows, or analyzing session performance. +description: "Records tool invocations, skill activations, hook triggers, and agent delegations to JSONL files in claude-code-session-debug/. Manages per-project debug configuration with three capture levels and provides jq-based session analysis. Use when debugging agent behavior, profiling workflow performance, or auditing session event sequences." --- -plugin: agentdev -updated: 2026-01-20 # AgentDev Debug Mode -Debug mode captures detailed session information for analysis, debugging, and optimization. -All events are recorded to a JSONL file in `claude-code-session-debug/`. +Captures session events to append-only JSONL files for debugging, profiling, and auditing agent workflows. -## Configuration +## Workflow -Debug mode uses **per-project configuration** stored in `.claude/agentdev-debug.json`. +1. **Enable debug mode** — run `/agentdev:debug-enable` or create `.claude/agentdev-debug.json` with `{"enabled": true, "level": "standard"}`. +2. **Run agent workflow** — all events (tool calls, delegations, phase transitions, errors) are appended to `claude-code-session-debug/agentdev-{slug}-{timestamp}-{id}.jsonl`. +3. **Analyze session** — use jq queries to extract statistics, find failures, trace delegations. +4. **Adjust capture level** — switch between minimal/standard/verbose as needed. +5. **Disable or clean up** — run `/agentdev:debug-disable` or delete old JSONL files. 
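The "Analyze session" step can also be done without jq. The following is a minimal TypeScript sketch, assuming only what the skill states: each non-empty line of the session file is one JSON event with a string `type` field. The function name `countEventsByType` and the sample events are illustrative and not part of the plugin.

```typescript
// Sketch: count session events by type from JSONL text.
// Assumption: every non-empty line is a JSON object with a string `type` field.
interface DebugEvent {
  type: string;
  [key: string]: unknown;
}

// Hypothetical helper name; mirrors the jq "count events by type" query.
function countEventsByType(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines between events
    const event = JSON.parse(trimmed) as DebugEvent;
    counts[event.type] = (counts[event.type] ?? 0) + 1;
  }
  return counts;
}

// Illustrative sample data, not taken from a real session.
const sample = [
  '{"type":"session_start","data":{}}',
  '{"type":"tool_invocation","data":{"tool_name":"TodoWrite"}}',
  '{"type":"tool_invocation","data":{"tool_name":"TodoWrite"}}',
  '{"type":"session_end","data":{"success":true}}',
].join("\n");

console.log(countEventsByType(sample));
```

Because the file is append-only, a truncated final line is possible after a crash; a more defensive version would wrap `JSON.parse` in a try/catch and skip unparseable lines.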
-### Config File Format - -Location: `.claude/agentdev-debug.json` (in project root) - -```json -{ - "enabled": true, - "level": "standard", - "created_at": "2026-01-09T07:00:00Z" -} -``` - -**Fields:** -- `enabled`: boolean - Whether debug mode is active -- `level`: string - Debug level (minimal, standard, verbose) -- `created_at`: string - ISO timestamp when config was created - -## Enabling Debug Mode - -Use the command to create the config file: - -``` -/agentdev:debug-enable -``` +## Debug Levels -This creates `.claude/agentdev-debug.json` with `enabled: true`. +| Level | Captures | +|-------|----------| +| `minimal` | Phase transitions, errors, session start/end | +| `standard` | + tool invocations, agent delegations | +| `verbose` | + skill activations, hook triggers, full parameters | -Or manually create the file: +## Example: Enable and Configure ```bash +# Create config (or use /agentdev:debug-enable) mkdir -p .claude cat > .claude/agentdev-debug.json << 'EOF' -{ - "enabled": true, - "level": "standard", - "created_at": "2026-01-09T07:00:00Z" -} +{"enabled": true, "level": "standard", "created_at": "2026-01-09T07:00:00Z"} EOF -``` - -## Debug Levels - -| Level | Captured Events | -|-------|-----------------| -| `minimal` | Phase transitions, errors, session start/end | -| `standard` | All of minimal + tool invocations, agent delegations | -| `verbose` | All of standard + skill activations, hook triggers, full parameters | - -Default level is `standard`. 
- -### Changing Debug Level -Using jq: -```bash +# Change level jq '.level = "verbose"' .claude/agentdev-debug.json > tmp.json && mv tmp.json .claude/agentdev-debug.json ``` -## Output Location - -Debug sessions are saved to: - -``` -claude-code-session-debug/agentdev-{slug}-{timestamp}-{id}.jsonl -``` - -Example: -``` -claude-code-session-debug/agentdev-graphql-reviewer-20260109-063623-ba71.jsonl -``` - -## JSONL Format +**Verification:** Run `/agentdev:debug-status` to confirm debug mode is active and at the expected level. -Each line in the JSONL file is a complete JSON event object. This append-only format is: -- Crash-resilient (no data loss on unexpected termination) -- Easy to process with `jq` -- Streamable during the session - -### Event Schema (v1.0.0) - -```json -{ - "event_id": "550e8400-e29b-41d4-a716-446655440001", - "correlation_id": null, - "timestamp": "2026-01-09T06:40:00Z", - "type": "tool_invocation", - "data": { ... } -} -``` - -**Fields:** -- `event_id`: Unique UUID for this event -- `correlation_id`: Links related events (e.g., tool_invocation -> tool_result) -- `timestamp`: ISO 8601 timestamp -- `type`: Event type (see below) -- `data`: Type-specific payload - -### Event Types - -| Type | Description | -|------|-------------| -| `session_start` | Session initialization with metadata | -| `session_end` | Session completion | -| `tool_invocation` | Tool called with parameters | -| `tool_result` | Tool execution result | -| `skill_activation` | Skill loaded by agent | -| `hook_trigger` | PreToolUse/PostToolUse hook fired | -| `agent_delegation` | Task delegated to sub-agent | -| `agent_response` | Sub-agent returned result | -| `phase_transition` | Workflow phase changed | -| `user_interaction` | User approval/input requested | -| `proxy_mode_request` | External model request via Claudish | -| `proxy_mode_response` | External model response | -| `error` | Error occurred | - -## What Gets Captured - -### Session Metadata -- Session ID and path 
-- User request -- Environment (Claudish availability, plugin version) -- Start/end timestamps - -### Tool Invocations -- Tool name -- Parameters (sanitized - credentials redacted) -- Execution context (phase, agent) -- Duration and result size - -### Agent Delegations -- Target agent name -- Prompt preview (first 200 chars) -- Proxy mode model if used -- Session path - -### Proxy Mode -- Model ID -- Request/response duration -- Success/failure status - -### Phase Transitions -- From/to phase numbers and names -- Transition reason (completed, skipped, failed) -- Quality gate results - -### Errors -- Error type (tool_error, hook_error, agent_error, etc.) -- Message and stack trace -- Context (phase, agent, tool) -- Recoverability - -## Sensitive Data Protection - -Debug mode automatically sanitizes sensitive data: - -**Redacted Patterns:** -- API keys (`sk-*`, `ghp_*`, `AKIA*`, etc.) -- Tokens (bearer, access, auth) -- Passwords and secrets -- AWS credentials -- Slack tokens (`xox*`) -- Google API keys (`AIza*`) - -## Analyzing Debug Output - -### Prerequisites - -Install `jq` for JSON processing: -```bash -# macOS -brew install jq - -# Linux -apt-get install jq -``` - -### Quick Statistics +## Example: Analyze Session ```bash # Count events by type cat session.jsonl | jq -s 'group_by(.type) | map({type: .[0].type, count: length})' -``` - -### Tool Usage Analysis - -```bash -# Tool invocation counts -cat session.jsonl | jq -s ' - [.[] | select(.type == "tool_invocation") | .data.tool_name] - | group_by(.) 
- | map({tool: .[0], count: length}) - | sort_by(-.count)' -``` - -### Failed Operations -```bash # Find all errors and failed tool results cat session.jsonl | jq 'select(.type == "error" or (.type == "tool_result" and .data.success == false))' -``` - -### Timeline View - -```bash -# Chronological event summary -cat session.jsonl | jq '"\(.timestamp) [\(.type)] \(.data | keys | join(", "))"' -``` -### Event Correlation +# Tool invocation frequency +cat session.jsonl | jq -s '[.[] | select(.type == "tool_invocation") | .data.tool_name] | group_by(.) | map({tool: .[0], count: length}) | sort_by(-.count)' -```bash -# Find tool invocation and its result -INVOCATION_ID="550e8400-e29b-41d4-a716-446655440001" -cat session.jsonl | jq "select(.event_id == \"$INVOCATION_ID\" or .correlation_id == \"$INVOCATION_ID\")" +# Slowest agent delegations +cat session.jsonl | jq -s '[.[] | select(.type == "agent_response")] | sort_by(-.data.duration_ms) | .[:5] | .[] | {agent: .data.agent, duration_sec: (.data.duration_ms / 1000)}' ``` -### Phase Duration Analysis +## Event Types -```bash -# Calculate time between phase transitions -cat session.jsonl | jq -s ' - [.[] | select(.type == "phase_transition")] - | sort_by(.timestamp) - | .[] - | {phase: .data.to_name, timestamp: .timestamp}' -``` - -### Agent Delegation Timing - -```bash -# Find slowest agent delegations -cat session.jsonl | jq -s ' - [.[] | select(.type == "agent_response")] - | sort_by(-.data.duration_ms) - | .[:5] - | .[] - | {agent: .data.agent, duration_sec: (.data.duration_ms / 1000)}' -``` - -### Proxy Mode Performance - -```bash -# External model response times -cat session.jsonl | jq -s ' - [.[] | select(.type == "proxy_mode_response")] - | .[] - | {model: .data.model_id, success: .data.success, duration_sec: (.data.duration_ms / 1000)}' -``` - -## Disabling Debug Mode - -Use the command: -``` -/agentdev:debug-disable -``` - -Or manually update: -```bash -jq '.enabled = false' .claude/agentdev-debug.json > 
tmp.json && mv tmp.json .claude/agentdev-debug.json -``` - -Or delete the config file: -```bash -rm -f .claude/agentdev-debug.json -``` - -## Cleaning Up Debug Files - -### Remove All Debug Files - -```bash -rm -rf claude-code-session-debug/ -``` - -### Remove Files Older Than 7 Days - -```bash -find claude-code-session-debug/ -name "*.jsonl" -mtime +7 -delete -``` - -### Remove Files Larger Than 10MB - -```bash -find claude-code-session-debug/ -name "*.jsonl" -size +10M -delete -``` - -## File Permissions - -Debug files are created with restrictive permissions: -- Directory: `0o700` (owner only) -- Files: `0o600` (owner read/write only) - -This prevents other users from reading potentially sensitive session data. - -## Example Session Output - -```jsonl -{"event_id":"init-1736408183","timestamp":"2026-01-09T06:36:23Z","type":"session_start","data":{"schema_version":"1.0.0","session_id":"agentdev-graphql-reviewer-20260109-063623-ba71","user_request":"Create an agent that reviews GraphQL schemas","session_path":"ai-docs/sessions/agentdev-graphql-reviewer-20260109-063623-ba71","environment":{"claudish_available":true,"plugin_version":"1.4.0","jq_available":true}}} -{"event_id":"550e8400-e29b-41d4-a716-446655440001","timestamp":"2026-01-09T06:36:25Z","type":"tool_invocation","data":{"tool_name":"TodoWrite","parameters":{"todos":"[REDACTED]"},"context":{"phase":0,"agent":null}}} -{"event_id":"550e8400-e29b-41d4-a716-446655440002","correlation_id":"550e8400-e29b-41d4-a716-446655440001","timestamp":"2026-01-09T06:36:25Z","type":"tool_result","data":{"tool_name":"TodoWrite","success":true,"result_size_bytes":156,"duration_ms":12}} -{"event_id":"550e8400-e29b-41d4-a716-446655440003","timestamp":"2026-01-09T06:36:26Z","type":"phase_transition","data":{"from_phase":null,"to_phase":0,"from_name":null,"to_name":"Init","transition_reason":"completed","quality_gate_result":true}} 
-{"event_id":"550e8400-e29b-41d4-a716-446655440004","timestamp":"2026-01-09T06:36:30Z","type":"agent_delegation","data":{"target_agent":"agentdev:architect","prompt_preview":"SESSION_PATH: ai-docs/sessions/agentdev-graphql-reviewer...","prompt_length":1456,"proxy_mode":null,"session_path":"ai-docs/sessions/agentdev-graphql-reviewer-20260109-063623-ba71"}} -{"event_id":"end-1736408565","timestamp":"2026-01-09T06:42:45Z","type":"session_end","data":{"success":true}} -``` - -## Troubleshooting - -### Debug File Not Created - -1. Check if debug mode is enabled: - ```bash - /agentdev:debug-status - ``` - -2. Verify config file: - ```bash - cat .claude/agentdev-debug.json - ``` - -3. Verify the directory is writable: - ```bash - ls -la claude-code-session-debug/ - ``` - -### jq Commands Not Working - -1. Install jq: `brew install jq` or `apt-get install jq` -2. Verify JSONL format (each line should be valid JSON): - ```bash - head -1 session.jsonl | jq . - ``` - -### Large Debug Files - -Debug files can grow large in verbose mode. Use `minimal` level for lighter capture: - -Update config: -```bash -jq '.level = "minimal"' .claude/agentdev-debug.json > tmp.json && mv tmp.json .claude/agentdev-debug.json -``` - -Or clean up old files regularly: -```bash -find claude-code-session-debug/ -name "*.jsonl" -mtime +3 -delete -``` - -## Integration with Other Tools - -### Viewing in VS Code - -The JSONL format works with JSON syntax highlighting. For better viewing: -1. Install "JSON Lines" VS Code extension -2. 
Use "Format Document" on each line individually - -### Importing to Analytics +| Type | Description | +|------|-------------| +| `session_start` / `session_end` | Session lifecycle with metadata | +| `tool_invocation` / `tool_result` | Tool calls with parameters and outcomes | +| `skill_activation` | Skill loaded by agent | +| `hook_trigger` | PreToolUse/PostToolUse hook fired | +| `agent_delegation` / `agent_response` | Sub-agent task lifecycle | +| `phase_transition` | Workflow phase changes with quality gate results | +| `error` | Errors with type, context, and recoverability | -```bash -# Convert to CSV for spreadsheet import -cat session.jsonl | jq -rs ' - (.[0] | keys_unsorted) as $keys - | ($keys | @csv), - (.[] | [.[$keys[]]] | @csv)' > session.csv -``` +Events link via `correlation_id` (e.g., tool_invocation to its tool_result). -### Streaming to External Service +## Sensitive Data Protection -```bash -# Tail and send to logging service -tail -f session.jsonl | while read line; do - curl -X POST -d "$line" https://logging.example.com/ingest -done -``` +API keys (`sk-*`, `ghp_*`, `AKIA*`), tokens, passwords, and credentials are automatically redacted in captured events. 
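The diff does not show how redaction is implemented, so the following is only a hypothetical sketch of prefix-based redaction for the three patterns named above (`sk-*`, `ghp_*`, `AKIA*`). The regular expressions, the helper names `redactString` and `redact`, and the choice of `[REDACTED]` as the placeholder (borrowed from the example session output) are assumptions, not the plugin's actual sanitizer.

```typescript
// Hypothetical redaction sketch; the real sanitizer is not shown in this diff.
// Assumption: secrets are recognised by their prefix followed by key characters.
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]+/g, // keys starting with sk-
  /\bghp_[A-Za-z0-9]+/g, // tokens starting with ghp_
  /\bAKIA[A-Z0-9]+/g, // access key IDs starting with AKIA
];

// Replace every match of every pattern in a single string.
function redactString(value: string): string {
  return SECRET_PATTERNS.reduce(
    (acc, pattern) => acc.replace(pattern, "[REDACTED]"),
    value,
  );
}

// Walk strings, arrays, and plain objects; leave other values untouched.
function redact(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redact(inner);
    }
    return out;
  }
  return value;
}

console.log(redactString("Authorization uses ghp_abc123"));
// prints "Authorization uses [REDACTED]"
```

The `\b` word boundary keeps ordinary words such as "task-force" from being treated as an `sk-` key. Prefix matching of this kind cannot catch passwords or tokens without a known prefix, which would need key-name based rules instead.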
-## Commands Reference +## Commands | Command | Description | |---------|-------------| -| `/agentdev:debug-enable` | Enable debug mode (creates config file) | -| `/agentdev:debug-disable` | Disable debug mode (updates config file) | -| `/agentdev:debug-status` | Check current debug mode status | +| `/agentdev:debug-enable` | Enable debug mode | +| `/agentdev:debug-disable` | Disable debug mode | +| `/agentdev:debug-status` | Check current status | + +## Notes + +- JSONL format is crash-resilient — no data loss on unexpected termination +- Debug files use restrictive permissions (directory 0o700, files 0o600) +- Clean old files: `find claude-code-session-debug/ -name "*.jsonl" -mtime +7 -delete` +- Use `minimal` level to reduce file size in long sessions +- Process with jq, import to CSV, or stream to external logging services diff --git a/plugins/agentdev/skills/xml-standards/SKILL.md b/plugins/agentdev/skills/xml-standards/SKILL.md index 4771b34..64117ee 100644 --- a/plugins/agentdev/skills/xml-standards/SKILL.md +++ b/plugins/agentdev/skills/xml-standards/SKILL.md @@ -1,9 +1,7 @@ --- name: xml-standards -description: XML tag structure patterns for Claude Code agents and commands. Use when designing or implementing agents to ensure proper XML structure following Anthropic best practices. +description: "Specifies XML tag structure patterns (role, expertise, constraints, workflow) for Claude Code agents and commands following Anthropic best practices. Provides required and optional tag schemas with nesting rules. Use when designing agent prompts, implementing XML-structured commands, or validating agent tag compliance." 
--- -plugin: agentdev -updated: 2026-01-20 # XML Tag Standards diff --git a/plugins/agentdev/skills/yaml-agent-format/SKILL.md b/plugins/agentdev/skills/yaml-agent-format/SKILL.md index 49d1626..1546bf2 100644 --- a/plugins/agentdev/skills/yaml-agent-format/SKILL.md +++ b/plugins/agentdev/skills/yaml-agent-format/SKILL.md @@ -1,11 +1,6 @@ --- name: yaml-agent-format -description: YAML format for Claude Code agent definitions as alternative to markdown. Use when creating agents with YAML, converting markdown agents to YAML, or validating YAML agent schemas. Trigger keywords - "YAML agent", "agent YAML", "YAML format", "agent schema", "YAML definition", "convert to YAML". -version: 0.1.0 -tags: [agentdev, yaml, agent, format, schema, definition] -keywords: [yaml, agent, format, schema, definition, conversion, validation, frontmatter] -plugin: agentdev -updated: 2026-01-28 +description: "Defines the YAML schema for Claude Code agent definitions as an alternative to markdown. Covers required fields, validation rules, and markdown-to-YAML conversion patterns. Use when creating agents in YAML format, converting existing markdown agents to YAML, or validating YAML agent schemas." --- # YAML Agent Format diff --git a/plugins/autopilot/skills/linear-integration/SKILL.md b/plugins/autopilot/skills/linear-integration/SKILL.md index 6a2846d..dbe6d17 100644 --- a/plugins/autopilot/skills/linear-integration/SKILL.md +++ b/plugins/autopilot/skills/linear-integration/SKILL.md @@ -1,278 +1,137 @@ --- name: linear-integration -description: Linear API patterns and examples for autopilot. Includes authentication, webhooks, issue CRUD, state transitions, file attachments, and comment handling. -version: 0.1.0 -tags: [linear, api, webhook, integration] -keywords: [linear, api, webhook, issue, comment, state, attachment] +description: "Provides Linear API patterns for authentication, webhook handling, issue CRUD, state transitions, file attachments, and comment posting. 
Includes signature verification and SDK usage examples. Use when integrating with Linear API or handling Linear webhook events." --- -plugin: autopilot -updated: 2026-01-20 # Linear Integration -**Version:** 0.1.0 -**Purpose:** Patterns for Linear API integration in autopilot workflows -**Status:** Phase 1 +Provides patterns and examples for integrating with the Linear API in autopilot workflows, covering authentication, webhooks, issue management, and state transitions. -## When to Use +## Workflow -Use this skill when you need to: -- Authenticate with Linear API -- Set up webhook handlers for Linear events -- Create, read, update, or delete Linear issues -- Transition issue states in Linear workflows -- Attach files to Linear issues -- Add comments to Linear issues - -## Overview - -This skill provides patterns for: -- Linear API authentication -- Webhook handler setup -- Issue CRUD operations -- State transitions -- File attachments -- Comment handling +1. **Authenticate** - initialize LinearClient with API key +2. **Set up webhook handler** - configure HTTP server with signature verification +3. **Route events** - process incoming webhook payloads by action type +4. 
**Execute operations** - create/read/update issues, transition states, attach files, post comments ## Core Patterns ### Pattern 1: Authentication -**Personal API Key (MVP):** ```typescript import { LinearClient } from '@linear/sdk'; -const linear = new LinearClient({ - apiKey: process.env.LINEAR_API_KEY -}); -``` +const linear = new LinearClient({ apiKey: process.env.LINEAR_API_KEY }); -**Verification:** -```typescript -async function verifyConnection(): Promise { - try { - const me = await linear.viewer; - console.log(`Connected as: ${me.name}`); - return true; - } catch (error) { - console.error('Linear connection failed:', error); - return false; - } -} +// Verify connection +const me = await linear.viewer; +console.log(`Connected as: ${me.name}`); ``` -### Pattern 2: Webhook Handler +### Pattern 2: Webhook Handler with Signature Verification -**Bun HTTP Server:** ```typescript import { serve } from 'bun'; import { createHmac } from 'crypto'; -interface LinearWebhookPayload { - action: 'created' | 'updated' | 'deleted'; - type: 'Issue' | 'Comment' | 'Label'; - data: { - id: string; - title?: string; - description?: string; - state: { id: string; name: string }; - labels: Array<{ id: string; name: string }>; - }; -} - serve({ port: process.env.AUTOPILOT_WEBHOOK_PORT || 3001, - async fetch(req: Request): Promise { - if (req.method !== 'POST') { - return new Response('Method not allowed', { status: 405 }); - } + if (req.method !== 'POST') return new Response('Method not allowed', { status: 405 }); - // Verify signature const signature = req.headers.get('Linear-Signature'); const body = await req.text(); - if (!verifySignature(body, signature)) { + // Verify HMAC-SHA256 signature + const hmac = createHmac('sha256', process.env.LINEAR_WEBHOOK_SECRET!); + if (signature !== hmac.update(body).digest('hex')) { return new Response('Unauthorized', { status: 401 }); } - const payload: LinearWebhookPayload = JSON.parse(body); - - // Route to handler - await 
routeWebhook(payload); - + await routeWebhook(JSON.parse(body)); return new Response('OK', { status: 200 }); } }); - -function verifySignature(body: string, signature: string | null): boolean { - if (!signature) return false; - - const hmac = createHmac('sha256', process.env.LINEAR_WEBHOOK_SECRET!); - const expectedSignature = hmac.update(body).digest('hex'); - - return signature === expectedSignature; -} ``` -### Pattern 3: Issue Operations +### Pattern 3: Issue CRUD -**Create Issue:** ```typescript -async function createIssue( - teamId: string, - title: string, - description: string, - labels: string[] -): Promise { - // Note: Linear SDK uses linear.createIssue() method - const result = await linear.createIssue({ - teamId, - title, - description, - labelIds: await resolveLabelIds(labels), - assigneeId: process.env.AUTOPILOT_BOT_USER_ID, - priority: 2, - }); - - const issue = await result.issue; - return issue!.id; -} -``` - -**Query Issues:** -```typescript -async function getAutopilotTasks(teamId: string) { - const issues = await linear.issues({ - filter: { - team: { id: { eq: teamId } }, - assignee: { id: { eq: process.env.AUTOPILOT_BOT_USER_ID } }, - state: { name: { in: ['Todo', 'In Progress'] } }, - }, - }); - - return issues.nodes; -} +// Create issue +const result = await linear.createIssue({ + teamId, title, description, + labelIds: await resolveLabelIds(labels), + assigneeId: process.env.AUTOPILOT_BOT_USER_ID, + priority: 2, +}); +const issueId = (await result.issue)!.id; + +// Query autopilot tasks +const issues = await linear.issues({ + filter: { + team: { id: { eq: teamId } }, + assignee: { id: { eq: process.env.AUTOPILOT_BOT_USER_ID } }, + state: { name: { in: ['Todo', 'In Progress'] } }, + }, +}); ``` ### Pattern 4: State Transitions -**Transition State:** ```typescript -async function transitionState( - issueId: string, - newStateName: string -): Promise { - // Get workflow states for the issue's team +async function transitionState(issueId: 
string, newStateName: string): Promise<void> { const issue = await linear.issue(issueId); - const team = await issue.team; - const states = await team.states(); - + const states = await (await issue.team).states(); const targetState = states.nodes.find(s => s.name === newStateName); - - if (!targetState) { - throw new Error(`State "${newStateName}" not found`); - } - - // Note: Linear SDK uses linear.updateIssue() method - await linear.updateIssue(issueId, { - stateId: targetState.id, - }); + if (!targetState) throw new Error(`State "${newStateName}" not found`); + await linear.updateIssue(issueId, { stateId: targetState.id }); } ``` ### Pattern 5: File Attachments -**Upload and Attach:** ```typescript -async function attachFile( - issueId: string, - filePath: string, - fileName: string -): Promise<void> { - // Request upload URL - const uploadPayload = await linear.fileUpload( - getMimeType(filePath), - fileName, - getFileSize(filePath) - ); - - // Upload to storage - const fileContent = await Bun.file(filePath).arrayBuffer(); +async function attachFile(issueId: string, filePath: string, fileName: string): Promise<void> { + const uploadPayload = await linear.fileUpload(getMimeType(filePath), fileName, getFileSize(filePath)); await fetch(uploadPayload.uploadUrl, { method: 'PUT', - body: fileContent, + body: await Bun.file(filePath).arrayBuffer(), headers: { 'Content-Type': getMimeType(filePath) }, }); - - // Attach to issue - await linear.attachmentCreate({ - issueId, - url: uploadPayload.assetUrl, - title: fileName, - }); + await linear.attachmentCreate({ issueId, url: uploadPayload.assetUrl, title: fileName }); } ``` ### Pattern 6: Comments -**Add Comment:** ```typescript -async function addComment( - issueId: string, - body: string -): Promise<void> { - // Note: Linear SDK uses linear.createComment() method - await linear.createComment({ - issueId, - body, - }); -} +await linear.createComment({ issueId, body: "Implementation complete. See attached proof." 
}); ``` -## Best Practices - -- Always verify webhook signatures -- Use exponential backoff for API rate limits -- Cache team/state/label IDs to reduce API calls -- Handle webhook delivery failures gracefully -- Log all state transitions for audit - ## Examples -### Example 1: Full Issue Lifecycle - +### Full issue lifecycle ```typescript -// Create issue -const issueId = await createIssue( - teamId, - "Add user profile page", - "Implement user profile with avatar upload", - ["frontend", "feature"] -); - -// Transition to In Progress +const issueId = await createIssue(teamId, "Add user profile page", "...", ["frontend"]); await transitionState(issueId, "In Progress"); - // ... work happens ... - -// Attach proof artifacts await attachFile(issueId, "screenshot.png", "Desktop Screenshot"); - -// Add completion comment -await addComment(issueId, "Implementation complete. See attached proof."); - -// Transition to In Review +await addComment(issueId, "Implementation complete."); await transitionState(issueId, "In Review"); ``` -### Example 2: Query Autopilot Queue - +### Query autopilot queue ```typescript const tasks = await getAutopilotTasks(teamId); - -console.log(`Autopilot queue: ${tasks.length} tasks`); for (const task of tasks) { console.log(`- ${task.identifier}: ${task.title} (${task.state.name})`); } ``` + +## Verification + +When using these patterns, confirm: +- [ ] Webhook signatures are verified before processing payloads +- [ ] API rate limits are handled with exponential backoff +- [ ] Team/state/label IDs are cached to reduce API calls +- [ ] All state transitions are logged for audit diff --git a/plugins/autopilot/skills/proof-of-work/SKILL.md b/plugins/autopilot/skills/proof-of-work/SKILL.md index e5f91af..2814e75 100644 --- a/plugins/autopilot/skills/proof-of-work/SKILL.md +++ b/plugins/autopilot/skills/proof-of-work/SKILL.md @@ -1,200 +1,90 @@ --- name: proof-of-work -description: Proof artifact generation patterns for task validation. 
Covers screenshots, test results, deployments, and confidence scoring. -version: 0.1.0 -tags: [proof, validation, screenshots, tests, deployment] -keywords: [proof, artifact, screenshot, test, deployment, confidence, validation] +description: "Generates validation artifacts (screenshots, test results, coverage reports) after task completion and calculates confidence scores for auto-approval decisions. Supports bug fix, feature, and UI change proof types. Use when validating task completion or determining if work can be auto-approved." --- -plugin: autopilot -updated: 2026-01-20 # Proof-of-Work -**Version:** 0.1.0 -**Purpose:** Generate validation artifacts for autonomous task completion -**Status:** Phase 1 +Generates verifiable artifacts that demonstrate task completion and calculates confidence scores to determine whether tasks can be auto-approved or need manual review. -## When to Use +## Workflow -Use this skill when you need to: -- Generate proof artifacts after task completion -- Capture screenshots for UI verification -- Parse and report test results -- Calculate confidence scores for task validation -- Determine if a task can be auto-approved +1. **Determine proof type** based on task classification (bug fix, feature, or UI change) +2. **Collect artifacts** - run tests, capture screenshots, check coverage, verify build +3. **Calculate confidence score** using the weighted scoring algorithm +4. **Generate proof summary** in markdown format for Linear comments +5. **Return approval decision** based on confidence thresholds -## Overview +## Required Artifacts by Task Type -Proof-of-work is the mechanism that validates task completion. Every finished task must include verifiable artifacts that demonstrate the work was done correctly. 
+| Artifact | Bug Fix | Feature | UI Change | +|----------|---------|---------|-----------| +| Git diff | Required | - | - | +| Test results | Required | Required | - | +| Regression test | Required | - | - | +| Coverage report | - | Required (>=80%) | - | +| Build output | - | Required | - | +| Desktop screenshot (1920x1080) | - | - | Required | +| Mobile screenshot (375x667) | - | - | Required | +| Tablet screenshot (768x1024) | - | - | Required | +| Accessibility score | - | - | Required (>=80) | +| Deployment URL | Optional | Optional | Optional | -## Proof Types by Task - -### Bug Fix Proof - -| Artifact | Required | Purpose | -|----------|----------|---------| -| Git diff | Yes | Show minimal, focused changes | -| Test results | Yes | All tests passing | -| Regression test | Yes | Specific test for the bug | -| Error log (before/after) | Optional | Visual evidence | - -### Feature Proof - -| Artifact | Required | Purpose | -|----------|----------|---------| -| Screenshots | Yes | Visual verification | -| Test results | Yes | Functionality works | -| Coverage report | Yes | >= 80% coverage | -| Build output | Yes | Builds successfully | -| Deployment URL | Optional | Live demo | - -### UI Change Proof - -| Artifact | Required | Purpose | -|----------|----------|---------| -| Desktop screenshot | Yes | 1920x1080 view | -| Mobile screenshot | Yes | 375x667 view | -| Tablet screenshot | Yes | 768x1024 view | -| Accessibility score | Yes | >= 80 Lighthouse | -| Visual regression | Optional | BackstopJS diff | - -## Screenshot Capture - -**Playwright Pattern:** - -```typescript -import { chromium } from 'playwright'; - -async function captureScreenshots(url: string, outputDir: string) { - const browser = await chromium.launch({ headless: true }); - const context = await browser.newContext(); - const page = await context.newPage(); - - // Desktop - await page.setViewportSize({ width: 1920, height: 1080 }); - await page.goto(url); - await 
page.waitForLoadState('networkidle'); - await page.screenshot({ - path: `${outputDir}/desktop.png`, - fullPage: true, - }); - - // Mobile - await page.setViewportSize({ width: 375, height: 667 }); - await page.goto(url); - await page.waitForLoadState('networkidle'); - await page.screenshot({ - path: `${outputDir}/mobile.png`, - fullPage: true, - }); - - // Tablet - await page.setViewportSize({ width: 768, height: 1024 }); - await page.goto(url); - await page.waitForLoadState('networkidle'); - await page.screenshot({ - path: `${outputDir}/tablet.png`, - fullPage: true, - }); - - await browser.close(); -} -``` - -## Confidence Scoring - -**Algorithm:** +## Confidence Scoring Algorithm ```typescript -interface ProofArtifacts { - testResults?: { passed: number; total: number }; - buildSuccessful?: boolean; - lintErrors?: number; - screenshots?: string[]; - testCoverage?: number; - performanceScore?: number; -} - function calculateConfidence(artifacts: ProofArtifacts): number { let score = 0; - - // Tests (40 points) - if (artifacts.testResults) { - if (artifacts.testResults.passed === artifacts.testResults.total) { - score += 40; - } - } - - // Build (20 points) - if (artifacts.buildSuccessful) { - score += 20; - } - - // Coverage (20 points) - if (artifacts.testCoverage) { - if (artifacts.testCoverage >= 80) score += 20; - else if (artifacts.testCoverage >= 60) score += 15; - else if (artifacts.testCoverage >= 40) score += 10; - else score += 5; - } - - // Screenshots (10 points) - if (artifacts.screenshots) { - if (artifacts.screenshots.length >= 3) score += 10; - else if (artifacts.screenshots.length >= 1) score += 5; - } - - // Lint (10 points) - if (artifacts.lintErrors === 0) { - score += 10; - } - + // Tests: 40 points (results must be present and all must pass) + if (artifacts.testResults && artifacts.testResults.passed === artifacts.testResults.total) score += 40; + // Build: 20 points + if (artifacts.buildSuccessful) score += 20; + // Coverage: 20 points (>=80% full marks, >=60% partial) + if 
(artifacts.testCoverage >= 80) score += 20; + else if (artifacts.testCoverage >= 60) score += 15; + // Screenshots: 10 points (3+ full marks) + if (artifacts.screenshots?.length >= 3) score += 10; + else if (artifacts.screenshots?.length >= 1) score += 5; + // Lint: 10 points (zero errors) + if (artifacts.lintErrors === 0) score += 10; return score; } ``` ## Confidence Thresholds -| Confidence | Action | -|------------|--------| +| Score | Action | +|-------|--------| | >= 95% | Auto-approve (In Review -> Done) | | 80-94% | Manual review required | -| < 80% | Validation failed, iterate | - -## Proof Summary Template - -```markdown -# Proof of Work - -**Task**: {issue_id} -**Type**: {task_type} -**Confidence**: {score}% +| < 80% | Validation failed, must iterate | -## Test Results -- Total: {total} -- Passed: {passed} -- Failed: {failed} -- Coverage: {coverage}% +## Screenshot Capture Pattern -## Build -- Status: {status} -- Duration: {duration} - -## Screenshots -- Desktop: proof/desktop.png -- Mobile: proof/mobile.png -- Tablet: proof/tablet.png +```typescript +import { chromium } from 'playwright'; -## Artifacts -- test-results.txt -- coverage.json -- build-output.txt +async function captureScreenshots(url: string, outputDir: string) { + const browser = await chromium.launch({ headless: true }); + const page = await browser.newContext().then(ctx => ctx.newPage()); + + for (const [name, width, height] of [ + ['desktop', 1920, 1080], + ['mobile', 375, 667], + ['tablet', 768, 1024], + ]) { + await page.setViewportSize({ width, height }); + await page.goto(url); + await page.waitForLoadState('networkidle'); + await page.screenshot({ path: `${outputDir}/${name}.png`, fullPage: true }); + } + await browser.close(); +} ``` ## Examples -### Example 1: Feature Proof Generation - +### Full confidence proof ```typescript const proof = { testResults: { passed: 15, total: 15 }, @@ -203,32 +93,25 @@ const proof = { screenshots: ['desktop.png', 'mobile.png', 'tablet.png'], 
testCoverage: 85, }; - -const confidence = calculateConfidence(proof); -// 40 (tests) + 20 (build) + 20 (coverage) + 10 (screenshots) + 10 (lint) = 100% +calculateConfidence(proof); // 40+20+20+10+10 = 100% → auto-approve ``` -### Example 2: Partial Proof - +### Partial proof requiring iteration ```typescript const proof = { - testResults: { passed: 12, total: 15 }, // Some failing + testResults: { passed: 12, total: 15 }, // some failing buildSuccessful: true, lintErrors: 2, screenshots: ['desktop.png'], testCoverage: 65, }; - -const confidence = calculateConfidence(proof); -// 0 (tests fail) + 20 (build) + 15 (coverage) + 5 (1 screenshot) + 0 (lint errors) = 40% -// Result: Validation failed, must iterate +calculateConfidence(proof); // 0+20+15+5+0 = 40% → validation failed, iterate ``` -## Best Practices +## Verification -- Always capture screenshots for UI work -- Run full test suite, not just affected tests -- Include coverage report for features -- Build must pass before any proof is valid -- Store proofs in session directory for debugging -- Generate proof summary in markdown for Linear comments +After generating proof, confirm: +- [ ] All required artifacts for the task type are present +- [ ] Confidence score calculated correctly +- [ ] Proof summary generated in markdown format +- [ ] Approval decision matches threshold table diff --git a/plugins/autopilot/skills/state-machine/SKILL.md b/plugins/autopilot/skills/state-machine/SKILL.md index dc173e9..e7d2a83 100644 --- a/plugins/autopilot/skills/state-machine/SKILL.md +++ b/plugins/autopilot/skills/state-machine/SKILL.md @@ -1,241 +1,108 @@ --- name: state-machine -description: Task lifecycle state transitions with validation gates. Defines states, triggers, and required proofs. 
-version: 0.1.0 -tags: [state-machine, workflow, transitions, gates] -keywords: [state, transition, gate, validation, workflow, lifecycle] +description: "Manages task lifecycle state transitions (Todo, In Progress, In Review, Done, Blocked) with validation gates and iteration limits. Enforces entry conditions, confidence thresholds, and escalation rules. Use when transitioning task states or implementing validation gates." --- -plugin: autopilot -updated: 2026-01-20 # Task Lifecycle State Machine -**Version:** 0.1.0 -**Purpose:** Manage task state transitions with validation gates -**Status:** Phase 1 +Manages task state transitions with validation gates, ensuring tasks move through the lifecycle correctly with proper checks at each boundary. -## When to Use - -Use this skill when you need to: -- Understand valid state transitions for tasks -- Implement validation gates before state changes -- Handle iteration loops (In Review -> In Progress) -- Manage escalation to blocked state -- Enforce iteration limits - -## States +## States and Transitions ``` Todo ──→ In Progress ──→ In Review ──→ Done ↑ │ - └───────────┘ - (iteration) + └───────────┘ (iteration) In Progress ──→ Blocked (escalation) ``` +## Workflow + +1. **Determine current state** - read the task's current state from Linear +2. **Validate transition** - check that the target state is reachable and all gate conditions are met +3. **Execute transition** - update the state in Linear via API +4. 
**Log transition** - record the state change for audit trail + ## State Definitions -| State | Description | Entry Condition | -|-------|-------------|-----------------| -| Todo | Task queued for execution | Created with @autopilot label | -| In Progress | Task being executed | Passed start gate | -| In Review | Awaiting validation | Proof generated | -| Done | Task completed | Auto-approved or user approved | -| Blocked | Cannot proceed | Dependency issue or escalation | - -## Transition Triggers - -| From | To | Trigger | Gate | -|------|----|---------|------| -| Todo | In Progress | Label @autopilot added | Has acceptance criteria | -| In Progress | In Review | Work complete | Proof >= 80% confidence | -| In Review | Done | Confidence >= 95% | Auto-approval | -| In Review | Done | User approves | User feedback = APPROVAL | -| In Review | In Progress | Confidence < 80% | Validation failed | -| In Review | In Progress | User requests changes | Feedback = REQUESTED_CHANGES | -| In Progress | Blocked | Max iterations | Escalation | -| * | Blocked | Unresolvable blocker | Manual trigger | +| State | Entry Condition | +|-------|----------------| +| Todo | Created with `@autopilot` label | +| In Progress | Has acceptance criteria, no blocking dependencies, assigned to autopilot | +| In Review | All tests pass, build successful, no lint errors, proof artifacts exist | +| Done | Confidence >= 95% (auto-approve) OR user explicitly approves | +| Blocked | Max iterations reached, unresolvable dependency, or manual trigger | ## Validation Gates ### Gate 1: Start Work (Todo -> In Progress) - ```typescript async function canStartWork(issue: Issue): Promise<boolean> { - const checks = [ - // Has acceptance criteria + return [ extractAcceptanceCriteria(issue.description).length > 0, - - // No blocking dependencies (await getBlockingIssues(issue)).length === 0, - - // Assigned to autopilot issue.assignee?.id === AUTOPILOT_BOT_USER_ID, - ]; - - return checks.every(c => c); + ].every(c =>
c); } ``` ### Gate 2: Submit for Review (In Progress -> In Review) - ```typescript async function canSubmitForReview(proof: Proof): Promise<boolean> { - const checks = [ - // All tests pass + return [ proof.testResults.passed === proof.testResults.total, - - // Build successful proof.buildSuccessful, - - // No lint errors proof.lintErrors === 0, - - // Has proof artifacts proof.screenshots.length > 0 || proof.deploymentUrl, - ]; - - return checks.every(c => c); + ].every(c => c); } ``` ### Gate 3: Complete (In Review -> Done) - -```typescript -async function canComplete(proof: Proof): Promise<{ - canProceed: boolean; - autoApproved: boolean; -}> { - if (proof.confidence >= 95) { - return { canProceed: true, autoApproved: true }; - } - - if (proof.confidence >= 80) { - return { canProceed: false, autoApproved: false }; - // Wait for user approval - } - - return { canProceed: false, autoApproved: false }; - // Validation failed, should iterate -} -``` +- Confidence >= 95%: auto-approve, transition to Done +- Confidence 80-94%: wait for user approval +- Confidence < 80%: validation failed, iterate back to In Progress ## Iteration Limits -| Loop Type | Max Iterations | Escalation | -|-----------|----------------|------------| +| Loop Type | Max Iterations | Escalation Action | +|-----------|----------------|-------------------| | Execution retry | 2 | Block task | | Feedback rounds | 5 | Manual intervention | | Quality check fixes | 2 | Report to user | -## Implementation - -```typescript -class StateMachine { - async transition( - issueId: string, - targetState: string, - proof?: Proof - ): Promise<void> { - const issue = await linear.issue(issueId); - const currentState = issue.state.name; - - // Validate transition - const isValid = this.validateTransition(currentState, targetState, proof); - - if (!isValid) { - throw new Error(`Invalid transition: ${currentState} -> ${targetState}`); - } - - // Execute transition - await linear.issueUpdate(issueId, { - stateId: await
this.getStateId(issue.team.id, targetState), - }); - - // Log transition - await this.logTransition(issueId, currentState, targetState, proof); - } - - private validateTransition( - from: string, - to: string, - proof?: Proof - ): boolean { - const validTransitions: Record = { - 'Todo': ['In Progress', 'Blocked'], - 'In Progress': ['In Review', 'Blocked'], - 'In Review': ['Done', 'In Progress'], - 'Blocked': ['Todo', 'In Progress'], - }; - - return validTransitions[from]?.includes(to) ?? false; - } -} -``` - -## State Transition Diagram - -``` - ┌─────────────────────────────┐ - │ │ - ▼ │ -┌──────┐ ┌─────────────┐ ┌───────────┴───┐ ┌──────┐ -│ Todo │ ────► │ In Progress │ ────► │ In Review │ ────► │ Done │ -└──────┘ └─────────────┘ └───────────────┘ └──────┘ - │ │ │ - │ │ │ - │ ▼ │ - │ ┌─────────┐ │ - └────────► │ Blocked │ ◄─────────────────┘ - └─────────┘ -``` - ## Examples -### Example 1: Happy Path - +### Happy path ```typescript -// Task created -await transitionState(issueId, 'In Progress'); // Gate: Has acceptance criteria - -// Work complete, proof generated -await transitionState(issueId, 'In Review'); // Gate: Proof >= 80% - -// High confidence auto-approval -await transitionState(issueId, 'Done'); // Gate: Confidence >= 95% +await transitionState(issueId, 'In Progress'); // Gate: has acceptance criteria +await transitionState(issueId, 'In Review'); // Gate: proof >= 80% +await transitionState(issueId, 'Done'); // Gate: confidence >= 95% ``` -### Example 2: Iteration Loop - +### Iteration loop ```typescript -// First attempt await transitionState(issueId, 'In Progress'); -await transitionState(issueId, 'In Review'); // Confidence: 85% - -// User requests changes -await transitionState(issueId, 'In Progress'); // Feedback: REQUESTED_CHANGES - -// Second attempt -await transitionState(issueId, 'In Review'); // Confidence: 97% -await transitionState(issueId, 'Done'); // Auto-approved +await transitionState(issueId, 'In Review'); // Confidence: 85% +// User 
requests changes → back to In Progress +await transitionState(issueId, 'In Progress'); +await transitionState(issueId, 'In Review'); // Confidence: 97% → auto-approved +await transitionState(issueId, 'Done'); ``` -### Example 3: Escalation - +### Escalation ```typescript -// After 5 feedback rounds if (iterationCount >= MAX_FEEDBACK_ROUNDS) { await transitionState(issueId, 'Blocked'); await addComment(issueId, "Escalated: Max iterations reached"); } ``` -## Best Practices +## Verification -- Always validate before transitioning -- Log all transitions for audit trail -- Include proof artifacts when transitioning to In Review -- Enforce iteration limits to prevent infinite loops -- Escalate gracefully rather than failing silently -- Comment on Linear when state changes for visibility +After each transition, confirm: +- [ ] State was updated in Linear +- [ ] Transition was logged for audit trail +- [ ] Gate conditions were checked before transition +- [ ] Iteration count was incremented (if applicable) diff --git a/plugins/autopilot/skills/tag-command-mapping/SKILL.md b/plugins/autopilot/skills/tag-command-mapping/SKILL.md index 7d5d3c6..f44b30f 100644 --- a/plugins/autopilot/skills/tag-command-mapping/SKILL.md +++ b/plugins/autopilot/skills/tag-command-mapping/SKILL.md @@ -1,184 +1,109 @@ --- name: tag-command-mapping -description: How tag-to-command routing works in autopilot. Defines default mappings, precedence rules, and customization patterns. -version: 0.1.0 -tags: [routing, tags, commands, mapping] -keywords: [tag, command, mapping, routing, classification, precedence] +description: "Routes Linear tasks to Claude Code commands based on tag labels, applying precedence rules when multiple tags exist and falling back to text classification. Supports custom mappings via autopilot.local.md. Use when resolving which agent handles a Linear task." 
--- -plugin: autopilot -updated: 2026-01-20 # Tag-to-Command Mapping -**Version:** 0.1.0 -**Purpose:** Route Linear tasks to appropriate Claude Code commands based on tags -**Status:** Phase 1 +Routes incoming Linear tasks to the appropriate Claude Code command/agent based on tag labels, precedence rules, and text classification fallback. -## When to Use +## Workflow -Use this skill when you need to: -- Understand how Linear tags map to Claude Code commands -- Customize tag-to-command mappings for a project -- Handle tasks with multiple tags (precedence rules) -- Classify tasks based on title/description text -- Resolve the correct agent/command for a task +1. **Extract tags** - filter labels starting with `@` from the Linear issue +2. **Apply precedence** - if multiple tags, select highest-priority tag +3. **Resolve mapping** - look up command, agent, and skills for the selected tag +4. **Fallback to classification** - if no tags, classify from title/description text +5. **Return routing** - provide the resolved command, agent, and skills to the caller -## Overview +## Default Tag Mappings -Tag-to-command mapping is the core routing mechanism in autopilot. When a task arrives from Linear, its labels determine which Claude Code command/agent handles execution. 
+| Tag | Command | Agent | Skills | +|-----|---------|-------|--------| +| `@debug` | `/dev:debug` | debugger | debugging-strategies | +| `@test` | `/dev:test-architect` | test-architect | testing-strategies | +| `@ui` | `/dev:ui` | ui | ui-design-review | +| `@frontend` | `/dev:feature` | developer | react-typescript | +| `@backend` | `/dev:implement` | developer | golang, api-design | +| `@review` | `/commit-commands:commit-push-pr` | reviewer | universal-patterns | +| `@refactor` | `/dev:implement` | developer | universal-patterns | +| `@research` | `/dev:deep-research` | researcher | n/a | -## Default Mappings +## Precedence Order (highest to lowest) -| Linear Tag | Command | Agent | Skills | -|------------|---------|-------|--------| -| @frontend | /dev:feature | developer | react-typescript | -| @backend | /dev:implement | developer | golang, api-design | -| @debug | /dev:debug | debugger | debugging-strategies | -| @test | /dev:test-architect | test-architect | testing-strategies | -| @review | /commit-commands:commit-push-pr | reviewer | universal-patterns | -| @refactor | /dev:implement | developer | universal-patterns | -| @research | /dev:deep-research | researcher | n/a | -| @ui | /dev:ui | ui | ui-design-review | - -## Precedence Rules - -When multiple tags are present, apply precedence order: +`@debug` > `@test` > `@ui` > `@frontend` > `@backend` > `@review` > `@refactor` > `@research` ```typescript -const PRECEDENCE = [ - '@debug', // Bug fixing takes priority - '@test', // Tests before implementation - '@ui', // UI before generic frontend - '@frontend', // Frontend before generic - '@backend', // Backend before generic - '@review', // Review after implementation - '@refactor', // Refactoring is lower priority - '@research' // Research is lowest -]; +const PRECEDENCE = ['@debug', '@test', '@ui', '@frontend', '@backend', '@review', '@refactor', '@research']; function selectTag(labels: string[]): string { const agentTags = labels.filter(l => 
l.startsWith('@')); - - if (agentTags.length === 0) return 'default'; - if (agentTags.length === 1) return agentTags[0]; - - // Multiple tags - apply precedence + if (agentTags.length <= 1) return agentTags[0] || 'default'; for (const tag of PRECEDENCE) { if (agentTags.includes(tag)) return tag; } - return 'default'; } ``` +## Text Classification Fallback + +When no tags are present, classify from task text: + +```typescript +function classifyTask(title: string, description: string): string { + const text = `${title} ${description}`.toLowerCase(); + if (/\b(fix|bug|error|crash|broken)\b/.test(text)) return 'BUG_FIX'; // → @debug + if (/\b(add|implement|create|new|feature)\b/.test(text)) return 'FEATURE'; // → @frontend + if (/\b(refactor|clean|optimize|improve)\b/.test(text)) return 'REFACTOR'; // → @refactor + if (/\b(ui|design|component|style|visual)\b/.test(text)) return 'UI_CHANGE'; // → @ui + if (/\b(test|coverage|e2e|spec)\b/.test(text)) return 'TEST'; // → @test + return 'UNKNOWN'; // → @frontend (default) +} +``` + ## Custom Mappings -Users can define custom mappings in `.claude/autopilot.local.md`: +Define project-specific mappings in `.claude/autopilot.local.md`: ```yaml ---- tag_mappings: "@database": command: "/dev:implement" agent: "developer" skills: ["database-patterns"] - systemPrompt: "You are a database specialist." - "@performance": command: "/dev:implement" agent: "developer" skills: ["universal-patterns"] - systemPrompt: "You are a performance optimization expert." 
---- -``` - -## Task Classification - -Beyond explicit tags, classify tasks from text: - -```typescript -function classifyTask(title: string, description: string): string { - const text = `${title} ${description}`.toLowerCase(); - - // Keyword patterns - if (/\b(fix|bug|error|crash|broken)\b/.test(text)) return 'BUG_FIX'; - if (/\b(add|implement|create|new|feature)\b/.test(text)) return 'FEATURE'; - if (/\b(refactor|clean|optimize|improve)\b/.test(text)) return 'REFACTOR'; - if (/\b(ui|design|component|style|visual)\b/.test(text)) return 'UI_CHANGE'; - if (/\b(test|coverage|e2e|spec)\b/.test(text)) return 'TEST'; - if (/\b(doc|documentation|readme)\b/.test(text)) return 'DOCUMENTATION'; - - return 'UNKNOWN'; -} -``` - -## Mapping Resolution - -Complete resolution algorithm: - -```typescript -function resolveMapping(labels: string[], title: string, desc: string) { - // 1. Check explicit tags - const tag = selectTag(labels); - - if (tag !== 'default') { - return getMappingForTag(tag); - } - - // 2. Classify from text - const taskType = classifyTask(title, desc); - - // 3. 
Map task type to default tag - const typeToTag = { - 'BUG_FIX': '@debug', - 'FEATURE': '@frontend', - 'UI_CHANGE': '@ui', - 'TEST': '@test', - 'REFACTOR': '@refactor', - 'DOCUMENTATION': '@research', - }; - - return getMappingForTag(typeToTag[taskType] || '@frontend'); -} ``` ## Examples -### Example 1: Single Tag Resolution - +### Single tag resolution ```typescript -// Task with @frontend label const labels = ['@frontend', 'feature']; -const tag = selectTag(labels); // '@frontend' -const mapping = getMappingForTag(tag); -// Result: { command: '/dev:feature', agent: 'developer', skills: ['react-typescript'] } +selectTag(labels); // '@frontend' +// → { command: '/dev:feature', agent: 'developer', skills: ['react-typescript'] } ``` -### Example 2: Multiple Tag Precedence - +### Multiple tag precedence ```typescript -// Task with both @frontend and @debug const labels = ['@frontend', '@debug']; -const tag = selectTag(labels); // '@debug' (higher precedence) -const mapping = getMappingForTag(tag); -// Result: { command: '/dev:debug', agent: 'debugger', skills: ['debugging-strategies'] } +selectTag(labels); // '@debug' (higher precedence) +// → { command: '/dev:debug', agent: 'debugger', skills: ['debugging-strategies'] } ``` -### Example 3: Text Classification Fallback - +### Text classification fallback ```typescript -// Task without tags -const labels = []; -const title = "Fix login button not working"; -const mapping = resolveMapping(labels, title, ""); -// Classifies as BUG_FIX -> @debug -// Result: { command: '/dev:debug', agent: 'debugger', skills: ['debugging-strategies'] } +resolveMapping([], "Fix login button not working", ""); +// Classifies as BUG_FIX → @debug +// → { command: '/dev:debug', agent: 'debugger', skills: ['debugging-strategies'] } ``` -## Best Practices +## Verification -- Use explicit tags over relying on classification -- Create custom mappings for project-specific workflows -- Debug > Test > UI > Frontend precedence makes sense -- Review 
mapping effectiveness periodically -- Keep tag names short and descriptive (start with @) +When resolving a mapping, confirm: +- [ ] Correct tag selected based on precedence +- [ ] Mapping resolves to valid command, agent, and skills +- [ ] Custom mappings in autopilot.local.md are checked before defaults +- [ ] Fallback classification produces reasonable routing diff --git a/plugins/bun/skills/claudish-usage/SKILL.md b/plugins/bun/skills/claudish-usage/SKILL.md index 9431a88..4d67ede 100644 --- a/plugins/bun/skills/claudish-usage/SKILL.md +++ b/plugins/bun/skills/claudish-usage/SKILL.md @@ -1,9 +1,7 @@ --- name: claudish-usage -description: CRITICAL - Guide for using Claudish CLI ONLY through sub-agents to run Claude Code with OpenRouter models (Grok, GPT-5, Gemini, MiniMax). NEVER run Claudish directly in main context unless user explicitly requests it. Use when user mentions external AI models, Claudish, OpenRouter, or alternative models. Includes mandatory sub-agent delegation patterns, agent selection guide, file-based instructions, and strict rules to prevent context window pollution. +description: "Delegates Claudish CLI calls to sub-agents for running Claude Code with OpenRouter models (Grok, GPT-5, Gemini, MiniMax). Enforces sub-agent-only execution to prevent context window pollution, provides agent selection guides and file-based instruction patterns. Use when invoking external AI models, running Claudish, or setting up OpenRouter-based multi-model workflows." --- -plugin: bun -updated: 2026-01-20 # Claudish Usage Skill diff --git a/plugins/code-analysis/skills/architect-detective/SKILL.md b/plugins/code-analysis/skills/architect-detective/SKILL.md index 41e942b..025946b 100644 --- a/plugins/code-analysis/skills/architect-detective/SKILL.md +++ b/plugins/code-analysis/skills/architect-detective/SKILL.md @@ -1,484 +1,65 @@ --- name: architect-detective -description: Use when analyzing architecture and system design. 
Find design patterns, map layers, identify core abstractions via PageRank. Uses claudemem AST structural analysis for efficient architecture investigation. -updated: 2026-01-20 -keywords: architecture, design-patterns, system-design, claudemem, pagerank, layers -allowed-tools: Bash, Task, Read, AskUserQuestion +description: "Analyzes codebase architecture using claudemem AST structural analysis with PageRank. Maps layers, identifies core abstractions, traces dependency flow, and detects dead code. Use when analyzing system design, finding design patterns, mapping architectural boundaries, or auditing code structure." --- -# Architect Detective Skill +# Architect Detective -This skill uses claudemem's AST structural analysis for architecture investigation. +Software Architect perspective for deep architectural investigation using claudemem's `map`, `symbol`, `callers`, `callees`, and `dead-code` commands with PageRank centrality analysis. -## Why Claudemem Works Better for Architecture +## Workflow -| Task | claudemem | Native Tools | -|------|-----------|--------------| -| Find core abstractions | `map` with PageRank ranking | Read all files | -| Identify design patterns | Structural symbol graph | Grep patterns | -| Map dependencies | `callers`/`callees` chains | Manual tracing | -| Find architectural pillars | High-PageRank symbols | Unknown | +1. **Verify claudemem** — confirm v0.3.0+ installed and indexed. Check freshness; reindex if stale. +2. **Map the landscape** — run `claudemem --agent map` and identify high-PageRank symbols (> 0.05) as architectural pillars. +3. **Identify layers** — map presentation, business, and data layers with targeted queries. +4. **Trace dependencies** — for each pillar, run `callers` (who depends on it) and `callees` (what it depends on). +5. **Find boundaries** — search for interfaces, contracts, and dependency injection points. +6. 
**Detect dead code** (v0.4.0+) — run `dead-code`, categorize by PageRank (high = broken, low = cleanup candidate). +7. **Validate results** — confirm PageRank data is present and symbols match expected architectural patterns. -**Primary commands:** -- `claudemem --agent map "query"` - Architecture overview with PageRank -- `claudemem --agent symbol ` - Exact file:line locations - -# Architect Detective Skill - -**Version:** 3.3.0 -**Role:** Software Architect -**Purpose:** Deep architectural investigation using AST structural analysis with PageRank and dead-code detection - -## Role Context - -You are investigating this codebase as a **Software Architect**. Your focus is on: -- **System boundaries** - Where modules, services, and layers begin and end -- **Design patterns** - Architectural patterns used (MVC, Clean Architecture, DDD, etc.) -- **Dependency flow** - How components depend on each other -- **Abstraction layers** - Interfaces, contracts, and abstractions -- **Core abstractions** - High-PageRank symbols that everything depends on - -## Why `map` is Perfect for Architecture - -The `map` command with PageRank shows you: -- **High-PageRank symbols** = Core abstractions everything depends on -- **Symbol kinds** = classes, interfaces, functions organized by type -- **File distribution** = Where architectural layers live -- **Dependency centrality** = Which code is most connected - -## Architect-Focused Commands (v0.3.0) - -### Architecture Discovery (use `map`) - -```bash -# Get high-level architecture overview -claudemem --agent map "architecture layers" -# Find core abstractions (highest PageRank) -claudemem --agent map # Full map, sorted by importance - -# Map specific architectural concerns -claudemem --agent map "service layer business logic"claudemem --agent map "repository data access"claudemem --agent map "controller API endpoints"claudemem --agent map "middleware request handling"``` - -### Layer Boundary Discovery - -```bash -# Find 
interfaces/contracts (architectural boundaries) -claudemem --agent map "interface contract abstract" -# Find dependency injection points -claudemem --agent map "inject provider module" -# Find configuration/bootstrap -claudemem --agent map "config bootstrap initialize"``` - -### Pattern Discovery - -```bash -# Find factory patterns -claudemem --agent map "factory create builder" -# Find repository patterns -claudemem --agent map "repository persist query" -# Find event-driven patterns -claudemem --agent map "event emit subscribe handler"``` - -### Dependency Analysis - -```bash -# For a core abstraction, see what depends on it -claudemem --agent callers CoreService -# See what the abstraction depends on -claudemem --agent callees CoreService -# Get full dependency context -claudemem --agent context CoreService``` - -### Dead Code Detection (v0.4.0+ Required) - -```bash -# Find unused symbols for cleanup -claudemem --agent dead-code -# Only truly dead code (very low PageRank) -claudemem --agent dead-code --max-pagerank 0.005``` - -**Architectural insight**: Dead code indicates: -- Failed features that were never removed -- Over-engineering (abstractions nobody uses) -- Potential tech debt cleanup opportunities - -High PageRank + dead = Something broke recently (investigate!) 
-Low PageRank + dead = Safe to remove - -**Handling Results:** -```bash -DEAD_CODE=$(claudemem --agent dead-code) -if [ -z "$DEAD_CODE" ]; then - echo "No dead code found - architecture is well-maintained" -else - # Categorize by risk - HIGH_PAGERANK=$(echo "$DEAD_CODE" | awk '$5 > 0.01') - LOW_PAGERANK=$(echo "$DEAD_CODE" | awk '$5 <= 0.01') - - if [ -n "$HIGH_PAGERANK" ]; then - echo "WARNING: High-PageRank dead code found (possible broken references)" - echo "$HIGH_PAGERANK" - fi - - if [ -n "$LOW_PAGERANK" ]; then - echo "Cleanup candidates (low PageRank):" - echo "$LOW_PAGERANK" - fi -fi -``` - -**Limitations Note:** -Results labeled "Potentially Dead" require manual verification for: -- Dynamically imported modules -- Reflection-accessed code -- External API consumers - -## PHASE 0: MANDATORY SETUP - -### Step 1: Verify claudemem v0.3.0 - -```bash -which claudemem && claudemem --version -# Must be 0.3.0+ -``` - -### Step 2: If Not Installed → STOP - -Use AskUserQuestion (see ultrathink-detective for template) - -### Step 3: Check Index Status - -```bash -# Check claudemem installation and index -claudemem --version && ls -la .claudemem/index.db 2>/dev/null -``` - -### Step 3.5: Check Index Freshness - -Before proceeding with investigation, verify the index is current: - -```bash -# First check if index exists -if [ ! -d ".claudemem" ] || [ ! -f ".claudemem/index.db" ]; then - # Use AskUserQuestion to prompt for index creation - # Options: [1] Create index now (Recommended), [2] Cancel investigation - exit 1 -fi - -# Count files modified since last index -STALE_COUNT=$(find . 
-type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" \) \ - -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | grep -v "dist" | grep -v "build" | wc -l) -STALE_COUNT=$((STALE_COUNT + 0)) # Normalize to integer - -if [ "$STALE_COUNT" -gt 0 ]; then - # Get index time with explicit platform detection - if [[ "$OSTYPE" == "darwin"* ]]; then - INDEX_TIME=$(stat -f "%Sm" -t "%Y-%m-%d %H:%M" .claudemem/index.db 2>/dev/null) - else - INDEX_TIME=$(stat -c "%y" .claudemem/index.db 2>/dev/null | cut -d'.' -f1) - fi - INDEX_TIME=${INDEX_TIME:-"unknown time"} - - # Get sample of stale files - STALE_SAMPLE=$(find . -type f \( -name "*.ts" -o -name "*.tsx" \) \ - -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | head -5) - - # Use AskUserQuestion (see template in ultrathink-detective) -fi -``` - -### Step 4: Index if Needed +## Example: Architecture Analysis ```bash -claudemem index -``` - ---- - -## Workflow: Architecture Analysis (v0.3.0) - -### Phase 1: Map the Landscape - -```bash -# Get structural overview with PageRank +# Step 1: Structural overview claudemem --agent map -# Focus on high-PageRank symbols (> 0.01) - these are architectural pillars -``` -### Phase 2: Identify Layers +# Step 2: Layer identification +claudemem --agent map "controller handler endpoint" # Presentation +claudemem --agent map "service business logic" # Business +claudemem --agent map "repository database query" # Data -```bash -# Map each layer -claudemem --agent map "controller handler endpoint" # Presentation -claudemem --agent map "service business logic" # Business -claudemem --agent map "repository database query" # Data -``` - -### Phase 3: Trace Dependencies - -```bash -# For each high-PageRank symbol, understand its role -claudemem --agent symbol UserServiceclaudemem --agent callers UserService # Who depends on it? 
-claudemem --agent callees UserService # What does it depend on? -``` - -### Phase 4: Identify Boundaries - -```bash -# Find interfaces (architectural contracts) -claudemem --agent map "interface abstract" -# Check how implementations connect -claudemem --agent callers IUserRepository``` +# Step 3: Dependency tracing for a core abstraction +claudemem --agent symbol UserService +claudemem --agent callers UserService # What depends on it +claudemem --agent callees UserService # What it depends on -### Phase 5: Cleanup Opportunities (v0.4.0+ Required) - -```bash -# Find dead code -DEAD_CODE=$(claudemem --agent dead-code) - -if [ -z "$DEAD_CODE" ]; then - echo "No cleanup needed - codebase is well-maintained" -else - # For each dead symbol: - # - Check PageRank (low = utility, high = broken) - # - Verify not used externally (see limitations) - # - Add to cleanup backlog - - echo "Review each item for static analysis limitations:" - echo "- Dynamic imports may hide real usage" - echo "- External callers not visible to static analysis" -fi -``` - -## Output Format: Architecture Report - -### 1. Architecture Overview - -``` -┌─────────────────────────────────────────────────────────┐ -│ ARCHITECTURE ANALYSIS │ -├─────────────────────────────────────────────────────────┤ -│ Pattern: Clean Architecture / Layered │ -│ Core Abstractions (PageRank > 0.05): │ -│ - UserService (0.092) - Central business logic │ -│ - Database (0.078) - Data access foundation │ -│ - AuthMiddleware (0.056) - Security boundary │ -│ Search Method: claudemem v0.3.0 (AST + PageRank) │ -└─────────────────────────────────────────────────────────┘ +# Step 4: Dead code detection (v0.4.0+) +claudemem --agent dead-code ``` -### 2. Layer Map +**Verification:** Confirm `map` output contains PageRank values. If empty or missing PageRank, diagnose and reindex. 
-``` -┌─────────────────────────────────────────────────────────┐ -│ LAYER STRUCTURE │ -├─────────────────────────────────────────────────────────┤ -│ │ -│ PRESENTATION (src/controllers/, src/routes/) │ -│ └── UserController (0.034) │ -│ └── AuthController (0.028) │ -│ ↓ │ -│ BUSINESS (src/services/) │ -│ └── UserService (0.092) ⭐HIGH PAGERANK │ -│ └── AuthService (0.067) │ -│ ↓ │ -│ DATA (src/repositories/) │ -│ └── UserRepository (0.045) │ -│ └── Database (0.078) ⭐HIGH PAGERANK │ -│ │ -└─────────────────────────────────────────────────────────┘ -``` - -### 3. Dependency Flow - -``` -Entry → Controller → Service → Repository → Database - ↘ Middleware (cross-cutting) -``` - -## PageRank for Architecture +## PageRank Interpretation | PageRank | Architectural Role | Action | |----------|-------------------|--------| -| > 0.05 | Core abstraction | This IS the architecture - understand first | -| 0.01-0.05 | Important component | Key building block, affects many things | -| 0.001-0.01 | Standard component | Normal code, not architecturally significant | -| < 0.001 | Leaf/utility | Implementation detail, skip for arch analysis | - -## Result Validation Pattern - -After EVERY claudemem command, validate results: - -### Map Command Validation - -After `map` commands, validate architectural symbols were found: - -```bash -RESULTS=$(claudemem --agent map "service layer business logic") -EXIT_CODE=$? 
- -# Check for failure -if [ "$EXIT_CODE" -ne 0 ]; then - DIAGNOSIS=$(claudemem status 2>&1) - # Use AskUserQuestion -fi - -# Check for empty results -if [ -z "$RESULTS" ]; then - echo "WARNING: No symbols found - may be wrong query or index issue" - # Use AskUserQuestion: Reindex, Different query, or Cancel -fi - -# Check for high-PageRank symbols (> 0.01) -HIGH_PR=$(echo "$RESULTS" | grep "pagerank:" | awk -F': ' '{if ($2 > 0.01) print}' | wc -l) - -if [ "$HIGH_PR" -eq 0 ]; then - # No architectural symbols found - may be wrong query or index issue - # Use AskUserQuestion: Reindex, Broaden query, or Cancel -fi -``` - -### Symbol Validation - -```bash -SYMBOL=$(claudemem --agent symbol ArchitecturalComponent) - -if [ -z "$SYMBOL" ] || echo "$SYMBOL" | grep -qi "not found\|error"; then - # Component doesn't exist or index issue - # Use AskUserQuestion -fi -``` - ---- +| > 0.05 | Core abstraction | Analyze first — this IS the architecture | +| 0.01-0.05 | Important component | Key building block | +| 0.001-0.01 | Standard component | Not architecturally significant | +| < 0.001 | Leaf/utility | Skip for architecture analysis | -## FALLBACK PROTOCOL +## Fallback Protocol -**CRITICAL: Never use grep/find/Glob without explicit user approval.** +Never use grep/find/Glob without explicit user approval. If claudemem fails: -If claudemem fails or returns irrelevant results: - -1. **STOP** - Do not silently switch tools -2. **DIAGNOSE** - Run `claudemem status` -3. **REPORT** - Tell user what happened -4. **ASK** - Use AskUserQuestion for next steps - -```typescript -// Fallback options (in order of preference) -AskUserQuestion({ - questions: [{ - question: "claudemem map returned no architectural symbols or failed. 
How should I proceed?", - header: "Architecture Discovery Issue", - multiSelect: false, - options: [ - { label: "Reindex codebase", description: "Run claudemem index (~1-2 min)" }, - { label: "Try broader query", description: "Use different architectural terms" }, - { label: "Use grep (not recommended)", description: "Traditional search - loses PageRank ranking" }, - { label: "Cancel", description: "Stop investigation" } - ] - }] -}) -``` - -**See ultrathink-detective skill for complete Fallback Protocol documentation.** - ---- - -## Anti-Patterns - -| Anti-Pattern | Why Wrong | Correct Approach | -|--------------|-----------|------------------| -| `grep -r "class"` | No ranking, no structure | `claudemem --agent map` | -| Read all files | Token waste | Focus on high-PageRank symbols | -| Skip `map` command | Miss architecture | ALWAYS start with `map` | -| Ignore PageRank | Miss core abstractions | High PageRank = important | -| `cmd \| head/tail` | Hides high-PageRank symbols | Use full output or `--tokens` | - -### Output Truncation Warning - -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ ❌ Anti-Pattern 7: Truncating Claudemem Output ║ -║ ║ -║ FORBIDDEN (any form of output truncation): ║ -║ → BAD: claudemem --agent map "query" | head -80 ║ -║ → BAD: claudemem --agent callers X | tail -50 ║ -║ → BAD: claudemem --agent search "x" | grep -m 10 "y" ║ -║ → BAD: claudemem --agent map "q" | awk 'NR <= 50' ║ -║ → BAD: claudemem --agent callers X | sed '50q' ║ -║ → BAD: claudemem --agent search "x" | sort | head -20 ║ -║ → BAD: claudemem --agent map "q" | grep "pattern" | head -20 ║ -║ ║ -║ CORRECT (use full output or built-in limits): ║ -║ → GOOD: claudemem --agent map "query" ║ -║ → GOOD: claudemem --agent search "x" -n 10 ║ -║ → GOOD: claudemem --agent map "q" --tokens 2000 ║ -║ → GOOD: claudemem --agent search "x" --page-size 20 --page 1 ║ -║ → GOOD: claudemem --agent context Func --max-depth 3 ║ -║ ║ -║ WHY: Output is 
pre-optimized; truncation hides critical results ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ - ---- - -## Feedback Reporting (v0.8.0+) - -After completing investigation, report search feedback to improve future results. - -### When to Report - -Report feedback ONLY if you used the `search` command during investigation: - -| Result Type | Mark As | Reason | -|-------------|---------|--------| -| Read and used | Helpful | Contributed to investigation | -| Read but irrelevant | Unhelpful | False positive | -| Skipped after preview | Unhelpful | Not relevant to query | -| Never read | (Don't track) | Can't evaluate | - -### Feedback Pattern - -```bash -# Track during investigation -SEARCH_QUERY="your original query" -HELPFUL_IDS="" -UNHELPFUL_IDS="" - -# When reading a helpful result -HELPFUL_IDS="$HELPFUL_IDS,$result_id" - -# When reading an unhelpful result -UNHELPFUL_IDS="$UNHELPFUL_IDS,$result_id" - -# Report at end of investigation (v0.8.0+ only) -if claudemem feedback --help 2>&1 | grep -qi "feedback"; then - timeout 5 claudemem feedback \ - --query "$SEARCH_QUERY" \ - --helpful "${HELPFUL_IDS#,}" \ - --unhelpful "${UNHELPFUL_IDS#,}" 2>/dev/null || true -fi -``` - -### Output Update - -Include in investigation report: - -``` -Search Feedback: [X helpful, Y unhelpful] - Submitted (v0.8.0+) -``` - ---- +1. Stop — do not silently switch tools. +2. Diagnose — run `claudemem status`. +3. Ask user via AskUserQuestion (reindex, broaden query, or cancel). 
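The three steps above could be wrapped roughly as follows. This is a sketch under stated assumptions: `run_or_stop` is a hypothetical helper, and the only claudemem call it relies on is `claudemem status`, which the protocol itself names. Asking the user still happens through AskUserQuestion, outside the shell.

```bash
#!/usr/bin/env bash
# Run a command; on failure or empty output, stop and diagnose
# instead of silently switching to grep/find.
run_or_stop() {
  local output rc
  output=$("$@" 2>/dev/null)
  rc=$?
  if [ "$rc" -ne 0 ] || [ -z "$output" ]; then
    # Step 1: stop and report what happened.
    echo "STOP: '$*' failed or returned nothing (exit $rc)." >&2
    # Step 2: diagnose, but only if claudemem is actually installed.
    if command -v claudemem >/dev/null 2>&1; then
      claudemem status >&2 || true
    fi
    # Step 3: hand the decision back to the user.
    echo "Ask the user: reindex, broaden query, or cancel." >&2
    return 1
  fi
  printf '%s\n' "$output"
}

# Usage:
#   run_or_stop claudemem --agent map "service layer" || exit 1
```

Returning non-zero rather than falling through keeps the caller from continuing with an alternative tool by accident.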
## Notes -- **`map` is your primary tool** - It shows architecture through PageRank -- High-PageRank symbols ARE the architecture - they're what everything depends on -- Use `callers` to see what depends on a component (impact of changes) -- Use `callees` to see what a component depends on (its requirements) -- Works best with TypeScript, Go, Python, Rust codebases - ---- - -**Maintained by:** MadAppGang -**Plugin:** code-analysis v2.7.0 -**Last Updated:** December 2025 (v3.3.0 - Cross-platform compatibility, inline templates, improved validation) +- `map` is the primary tool — it reveals architecture through PageRank centrality +- High PageRank + dead = something recently broke (investigate immediately) +- Low PageRank + dead = safe cleanup candidate +- Never truncate claudemem output — use `--tokens` or `-n` flags for size control +- Submit search feedback via `claudemem feedback` (v0.8.0+) diff --git a/plugins/code-analysis/skills/claudemem-orchestration/SKILL.md b/plugins/code-analysis/skills/claudemem-orchestration/SKILL.md index 1d6954e..6b86cc3 100644 --- a/plugins/code-analysis/skills/claudemem-orchestration/SKILL.md +++ b/plugins/code-analysis/skills/claudemem-orchestration/SKILL.md @@ -1,10 +1,7 @@ --- -name: claudemem-orchestration -description: Use when orchestrating multi-agent code analysis with claudemem. Run claudemem once, share output across parallel agents. Enables parallel investigation, consensus analysis, and role-based command mapping. -updated: 2026-01-20 -keywords: claudemem, orchestration, multi-agent, parallel-execution, consensus +name: mem-orchestration +description: "Coordinates multiple agents using shared claudemem output for code analysis. Use when orchestrating multi-agent code analysis, running parallel investigations, performing consensus analysis, or mapping role-based commands across agents." 
allowed-tools: Bash, Task, Read, Write, AskUserQuestion -skills: orchestration:multi-model-validation --- # Claudemem Multi-Agent Orchestration diff --git a/plugins/code-analysis/skills/claudemem-search/SKILL.md b/plugins/code-analysis/skills/claudemem-search/SKILL.md index 0c812c8..4cdaed0 100644 --- a/plugins/code-analysis/skills/claudemem-search/SKILL.md +++ b/plugins/code-analysis/skills/claudemem-search/SKILL.md @@ -1,6 +1,6 @@ --- -name: claudemem-search -description: "⚡ PRIMARY TOOL for semantic code search AND structural analysis. NEW: AST tree navigation with map, symbol, callers, callees, context commands. PageRank ranking. Recommended workflow: Map structure first, then search semantically, analyze callers before modifying." +name: mem-search +description: "Provides semantic code search and AST-based structural analysis via claudemem. Use when searching codebases semantically, navigating symbol graphs with PageRank ranking, or analyzing callers and callees before modifying code." allowed-tools: Bash, Task, AskUserQuestion --- diff --git a/plugins/code-analysis/skills/code-search-selector/SKILL.md b/plugins/code-analysis/skills/code-search-selector/SKILL.md index 2c61bfc..5e4e1f6 100644 --- a/plugins/code-analysis/skills/code-search-selector/SKILL.md +++ b/plugins/code-analysis/skills/code-search-selector/SKILL.md @@ -1,16 +1,20 @@ --- name: code-search-selector -description: "💡 Tool selector for code search tasks. Helps choose between semantic search (claudemem) and native tools (Grep/Glob) based on query type. Semantic search recommended for: 'how does X work', 'find all', 'audit', 'investigate', 'architecture'." -allowed-tools: Bash, Read, AskUserQuestion +description: "Selects between semantic search (claudemem) and native tools (Grep/Glob) based on query type. Checks claudemem index status and classifies tasks as semantic or exact-match. 
Use when choosing a code search tool, before investigation tasks, or when deciding between claudemem search and grep." --- # Code Search Tool Selector -This skill helps choose the most effective search tool for your task. +Chooses the most effective search tool for each task by classifying queries and checking claudemem availability. -## When Semantic Search Works Better +## Workflow -Claudemem provides better results for conceptual queries: +1. **Check claudemem status** — run `claudemem status` (mandatory before any semantic search). +2. **Classify the task** — determine if the query is semantic (conceptual) or exact-match (literal string). +3. **Select the tool** based on classification and claudemem availability. +4. **Execute the search** using the recommended approach. + +## Classification Guide | Query Type | Example | Recommended Tool | |------------|---------|------------------| @@ -18,285 +22,52 @@ Claudemem provides better results for conceptual queries: | Find implementations | "Find all API endpoints" | `claudemem search` | | Architecture questions | "Map the service layer" | `claudemem --agent map` | | Trace data flow | "How does user data flow?" | `claudemem search` | -| Audit integrations | "Audit Prime API usage" | `claudemem search` | - -## When Native Tools Work Better - -| Query Type | Example | Recommended Tool | -|------------|---------|------------------| | Exact string match | "Find 'DEPRECATED_FLAG'" | `Grep` | | Count occurrences | "How many TODO comments?" | `Grep -c` | -| Specific symbol | "Find class UserService" | `Grep` | | File patterns | "Find all *.config.ts" | `Glob` | -## Why Semantic Search is Often More Efficient - -**Token Efficiency**: Reading 5 files costs ~5000 tokens; claudemem search costs ~500 tokens with ranked results. - -**Context Discovery**: Claudemem finds related code you didn't know to ask for. - -**Ranking**: Results sorted by relevance and PageRank, so important code comes first. 
- ## Example: Semantic Query -**User asks:** "How does authentication work?" - -**Less effective approach:** -```bash -grep -r "auth" src/ -# Result: 500 lines of noise, hard to understand -``` - -**More effective approach:** ```bash -claudemem status # Check if indexed -claudemem search "authentication login flow JWT" -# Result: Top 10 semantically relevant code chunks, ranked -``` - -## Quick Decision Guide - -### Classify the Task - -| User Request | Category | Recommended Tool | -|--------------|----------|------------------| -| "Find all X", "How does X work" | Semantic | claudemem search | -| "Audit X integration", "Map data flow" | Semantic | claudemem search | -| "Understand architecture", "Trace X" | Semantic | claudemem map | -| "Find exact string 'foo'" | Exact Match | Grep | -| "Count occurrences of X" | Exact Match | Grep | -| "Find symbol UserService" | Exact Match | Grep | - -### Step 2: Check claudemem Status (MANDATORY for Semantic) - -```bash -# ALWAYS run this before semantic search +# Step 1: Check index claudemem status -``` - -**Interpret the output:** +# Output shows chunk count (e.g., "938 chunks") → indexed -| Status | What It Means | Next Action | -|--------|---------------|-------------| -| Shows chunk count (e.g., "938 chunks") | ✅ Indexed | **USE CLAUDEMEM** (Step 3) | -| "No index found" | ❌ Not indexed | Offer to index (Step 2b) | -| "command not found" | ❌ Not installed | Fall back to Detective agent | - -### Step 2b: If Not Indexed, Offer to Index - -```typescript -AskUserQuestion({ - questions: [{ - question: "Claudemem is not indexed. 
Index now for better semantic search results?", - header: "Index?", - multiSelect: false, - options: [ - { label: "Yes, index now (Recommended)", description: "Takes 1-2 minutes, enables semantic search" }, - { label: "No, use grep instead", description: "Faster but less accurate for semantic queries" } - ] - }] -}) -``` - -If user says yes: -```bash -claudemem index -y +# Step 2: Search semantically +claudemem search "authentication login flow JWT" -n 15 +# Result: Top 15 ranked code chunks vs. grep's 500 lines of noise ``` -### Step 3: Execute the Search +**Verification:** Confirm `claudemem status` shows chunk count before proceeding with semantic search. -**IF CLAUDEMEM IS INDEXED (from Step 2):** +## Example: Exact Match ```bash -# Get role-specific guidance first -claudemem ai developer # or architect, tester, debugger - -# Then search semantically -claudemem search "authentication login JWT token validation" -n 15 -``` - -**IF CLAUDEMEM IS NOT AVAILABLE:** - -Use the detective agent: -```typescript -Task({ - subagent_type: "code-analysis:detective", - description: "Investigate [topic]", - prompt: "Use semantic search to find..." -}) -``` - -### Tool Recommendations by Use Case - -| Use Case | Less Efficient | More Efficient | -|----------|----------------|----------------| -| Semantic queries | `grep -r "pattern" src/` | `claudemem search "concept"` | -| Find implementations | `Glob → Read all` | `claudemem search "feature"` | -| Understand flow | `find . -name "*.ts" \| xargs...` | `claudemem --agent map` | - -Native tools (Grep, Glob, find) work well for exact matches but provide no semantic ranking. - ---- - -## When Hooks Redirect to Claudemem - -If a hook provides claudemem results instead of native tool output: - -1. **Use the provided results** - They're ranked by relevance -2. **For more data** - Run additional claudemem queries -3. 
**Bypass available** - Use `_bypass_claudemem: true` for native tools when needed - -The hook system provides claudemem results proactively when the index is available. - ---- - -## Task-to-Tool Mapping Reference - -| User Request | Native Approach | Semantic Approach (Recommended) | -|--------------|-----------------|--------------------------------| -| "Audit all API endpoints" | `grep -r "router\|endpoint"` | `claudemem search "API endpoint route handler"` | -| "How does auth work?" | `grep -r "auth\|login"` | `claudemem search "authentication login flow"` | -| "Find all database queries" | `grep -r "prisma\|query"` | `claudemem search "database query SQL prisma"` | -| "Map the data flow" | `grep -r "transform\|map"` | `claudemem search "data transformation pipeline"` | -| "What's the architecture?" | `ls -la src/` | `claudemem --agent map "architecture"` | -| "Find error handling" | `grep -r "catch\|error"` | `claudemem search "error handling exception"` | -| "Trace user creation" | `grep -r "createUser"` | `claudemem search "user creation registration"` | - -## When Grep IS Appropriate - -✅ **Use Grep for:** -- Finding exact string: `grep -r "DEPRECATED_FLAG" src/` -- Counting occurrences: `grep -c "import React" src/**/*.tsx` -- Finding specific symbol: `grep -r "class UserService" src/` -- Regex patterns: `grep -r "TODO:\|FIXME:" src/` - -❌ **Never use Grep for:** -- Understanding how something works -- Finding implementations by concept -- Architecture analysis -- Tracing data flow -- Auditing integrations - -## Integration with Detective Skills - -After using this skill's decision tree, invoke the appropriate detective: - -| Investigation Type | Detective Skill | -|-------------------|-----------------| -| Architecture patterns | `code-analysis:architect-detective` | -| Implementation details | `code-analysis:developer-detective` | -| Test coverage | `code-analysis:tester-detective` | -| Bug root cause | `code-analysis:debugger-detective` | -| 
Comprehensive audit | `code-analysis:ultrathink-detective` | - -## Quick Reference Card - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ CODE SEARCH QUICK REFERENCE │ -├─────────────────────────────────────────────────────────────────┤ -│ │ -│ 1. ALWAYS check first: claudemem status │ -│ │ -│ 2. If indexed: claudemem search "semantic query" │ -│ │ -│ 3. For exact matches: Grep tool (only this case!) │ -│ │ -│ 4. For deep analysis: Task(code-analysis:detective) │ -│ │ -│ ⚠️ GREP IS FOR EXACT MATCHES, NOT SEMANTIC UNDERSTANDING │ -│ │ -└─────────────────────────────────────────────────────────────────┘ +# Exact string — use native tools +grep -r "DEPRECATED_FLAG" src/ ``` -## Pre-Investigation Checklist +## Bulk Read Optimization -Before ANY code investigation task, verify: - -- [ ] Ran `claudemem status` to check index -- [ ] Classified task as SEMANTIC or EXACT MATCH -- [ ] Selected appropriate tool based on classification -- [ ] NOT using grep for semantic queries when claudemem is indexed - ---- - -## Multi-File Read Optimization - -When reading multiple files, consider if a semantic search would be more efficient: - -| Scenario | Optimization | -|----------|-------------| -| Read 3+ files in same directory | Try `claudemem search` first | -| Glob with broad patterns | Try `claudemem --agent map` | -| Sequential reads to "understand" | One semantic query may suffice | - -**Quick check before bulk reads:** -1. Is claudemem indexed? (`claudemem status`) -2. Can this be one semantic query instead of N file reads? 
- -### Interception Examples - -**❌ About to do:** -``` -Read src/services/auth/login.ts -Read src/services/auth/session.ts -Read src/services/auth/jwt.ts -Read src/services/auth/middleware.ts -Read src/services/auth/types.ts -Read src/services/auth/utils.ts -``` +Before reading 3+ files, check if one semantic query is more efficient: -**✅ Do instead:** -```bash -claudemem search "authentication login session JWT middleware" -n 15 -``` +| Planned Operation | Better Alternative | +|-------------------|-------------------| +| Read 5 auth files individually | `claudemem search "authentication login session" -n 15` | +| Glob all services then read | `claudemem search "service layer business logic"` | +| Sequential reads to understand | One semantic query (~500 vs ~5000 tokens) | -**❌ About to do:** -``` -Glob pattern: src/services/prime/**/*.ts -Then read all 12 matches sequentially -``` +## If Claudemem Not Indexed -**✅ Do instead:** ```bash -claudemem search "Prime API integration service endpoints" -n 20 -``` - -**❌ Parallelization trap:** -``` -"Let me Read these 5 files while the detective agent works..." -``` - -**✅ Do instead:** -``` -Trust the detective agent to use claudemem. -Don't duplicate work with inferior Read/Glob. +# Offer to index +claudemem index -y +# Takes 1-2 minutes, enables semantic search ``` ---- - -## Efficiency Comparison - -| Approach | Token Cost | Result Quality | -|----------|------------|----------------| -| Read 5+ files sequentially | ~5000 tokens | No ranking | -| Glob → Read all matches | ~3000+ tokens | No semantic understanding | -| `claudemem search` once | ~500 tokens | Ranked by relevance | - -**Tip:** Claudemem results include context around matches, so you often don't need to read full files. - ---- - -## Recommended Workflow - -1. **Check index**: `claudemem status` -2. **Search semantically**: `claudemem search "concept query" -n 15` -3. 
**Read specific code**: Use results to target file:line reads - -This workflow finds relevant code faster than reading files sequentially. - ---- +## Notes -**Maintained by:** MadAppGang -**Plugin:** code-analysis v2.16.0 -**Purpose:** Help choose the most efficient search tool for each task +- Semantic search finds related code even with different terminology +- Results sorted by relevance and PageRank — important code comes first +- Hook system may provide claudemem results proactively when index is available +- Use `_bypass_claudemem: true` when native tool behavior is specifically needed diff --git a/plugins/code-analysis/skills/cross-plugin-detective/SKILL.md b/plugins/code-analysis/skills/cross-plugin-detective/SKILL.md index 6fa274a..90fd9e8 100644 --- a/plugins/code-analysis/skills/cross-plugin-detective/SKILL.md +++ b/plugins/code-analysis/skills/cross-plugin-detective/SKILL.md @@ -1,266 +1,61 @@ --- name: cross-plugin-detective -description: Use when integrating detective skills across plugins. Maps agent roles to appropriate detective skills (developer → developer-detective, architect → architect-detective). Reference this to connect agents with claudemem investigation capabilities. -updated: 2026-01-20 -keywords: cross-plugin, detective, agent-mapping, claudemem, integration -allowed-tools: Bash, Task, Read, AskUserQuestion +description: "Maps agent roles to appropriate detective skills (developer-detective, architect-detective, tester-detective, debugger-detective, ultrathink-detective) for cross-plugin integration. Use when connecting agents from other plugins to claudemem investigation capabilities or deciding which detective to reference in agent frontmatter." 
--- # Cross-Plugin Detective Integration -**Version:** 1.0.0 -**Purpose:** Connect ANY agent to the appropriate detective skill based on role +Connects any agent across plugins to the appropriate detective skill based on its role, ensuring all code investigation uses claudemem indexed memory exclusively. -## ⛔ CORE PRINCIPLE: INDEXED MEMORY ONLY +## Workflow -``` -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ ALL DETECTIVE SKILLS USE claudemem (INDEXED MEMORY) EXCLUSIVELY ║ -║ ║ -║ When ANY agent references a detective skill, they MUST: ║ -║ ❌ NEVER use grep, find, rg, Glob tool, Grep tool ║ -║ ✅ ALWAYS use claudemem search "query" ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ -``` - ---- - -## Agent-to-Skill Mapping - -### Frontend Plugin Agents - -| Agent | Should Use Skill | Purpose | -|-------|-----------------|---------| -| `typescript-frontend-dev` | `code-analysis:developer-detective` | Find implementations, trace data flow | -| `frontend-architect` | `code-analysis:architect-detective` | Analyze architecture, design patterns | -| `test-architect` | `code-analysis:tester-detective` | Coverage analysis, test quality | -| `senior-code-reviewer` | `code-analysis:ultrathink-detective` | Comprehensive code review | -| `ui-developer` | `code-analysis:developer-detective` | Find UI implementations | -| `designer` | `code-analysis:architect-detective` | Understand component structure | -| `plan-reviewer` | `code-analysis:architect-detective` | Review architecture plans | - -### Bun Backend Plugin Agents - -| Agent | Should Use Skill | Purpose | -|-------|-----------------|---------| -| `backend-developer` | `code-analysis:developer-detective` | Find implementations, trace data flow | -| `api-architect` | `code-analysis:architect-detective` | API architecture analysis | -| `apidog` | `code-analysis:developer-detective` | Find API implementations | - -### Code Analysis Plugin Agents 
+1. **Identify the agent's primary role** — implementing, designing, testing, debugging, or reviewing. +2. **Select the matching detective** from the mapping table below. +3. **Add the skill reference** to the agent's frontmatter: `skills: code-analysis:`. +4. **Verify the agent uses claudemem** — never grep, find, Glob, or Grep tools for code discovery. -| Agent | Should Use Skill | Purpose | -|-------|-----------------|---------| -| `codebase-detective` | All detective skills | Full investigation capability | +## Agent-to-Detective Mapping -### Any Other Plugin +| Agent Role | Detective Skill | Primary Focus | +|------------|----------------|---------------| +| Developer agents | `code-analysis:developer-detective` | Implementation, data flow, callers/callees | +| Architect agents | `code-analysis:architect-detective` | System design, layers, PageRank analysis | +| Tester agents | `code-analysis:tester-detective` | Coverage analysis, test quality | +| Debugger agents | `code-analysis:debugger-detective` | Root cause analysis, call chains | +| Reviewer agents | `code-analysis:ultrathink-detective` | Comprehensive multi-perspective audit | -| Agent Role | Should Use Skill | -|------------|-----------------| -| Any "developer" agent | `code-analysis:developer-detective` | -| Any "architect" agent | `code-analysis:architect-detective` | -| Any "tester" agent | `code-analysis:tester-detective` | -| Any "reviewer" agent | `code-analysis:ultrathink-detective` | -| Any "debugger" agent | `code-analysis:debugger-detective` | +## Example: Adding Detective to a Frontend Agent ---- - -## How to Reference Skills in Agent Frontmatter - -### Example: Developer Agent ```yaml --- -name: my-developer-agent +name: typescript-frontend-dev description: Implements features skills: code-analysis:developer-detective --- -# My Developer Agent - -When investigating code, use the developer-detective skill. -This gives you access to indexed memory search via claudemem. 
- -## Investigation Pattern - -Before implementing: -1. Check claudemem status: `claudemem status` -2. Search for related code: `claudemem search "feature I'm implementing"` -3. Read specific files from results -4. NEVER use grep or find for discovery -``` - -### Example: Architect Agent -```yaml ---- -name: my-architect-agent -description: Designs architecture -skills: code-analysis:architect-detective ---- - -# My Architect Agent - -When analyzing architecture, use the architect-detective skill. - -## Architecture Discovery - -1. Check claudemem status: `claudemem status` -2. Search for patterns: `claudemem search "service layer architecture"` -3. Map dependencies: `claudemem search "import dependency injection"` -4. NEVER use grep or find for discovery -``` - -### Example: Multi-Skill Agent -```yaml ---- -name: comprehensive-reviewer -description: Reviews all aspects -skills: code-analysis:ultrathink-detective, code-analysis:tester-detective ---- -``` - ---- - -## Skill Selection Decision Tree - -``` -┌─────────────────────────────────────────────────────────────────────────────┐ -│ WHICH DETECTIVE SKILL TO USE? │ -├─────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ What is the agent's PRIMARY focus? 
│ -│ │ -│ ├── IMPLEMENTING code / Finding where to change │ -│ │ └── Use: developer-detective │ -│ │ │ -│ ├── DESIGNING architecture / Understanding patterns │ -│ │ └── Use: architect-detective │ -│ │ │ -│ ├── TESTING / Coverage analysis / Quality │ -│ │ └── Use: tester-detective │ -│ │ │ -│ ├── DEBUGGING / Finding root cause │ -│ │ └── Use: debugger-detective │ -│ │ │ -│ └── COMPREHENSIVE analysis / Technical debt / Audit │ -│ └── Use: ultrathink-detective │ -│ │ -└─────────────────────────────────────────────────────────────────────────────┘ -``` - ---- - -## Integration Examples - -### Example 1: Frontend Developer Agent Needing to Find Code - -```typescript -// In frontend plugin's typescript-frontend-dev agent: - -// ❌ WRONG - Never do this -Grep({ pattern: "UserService", type: "ts" }); -Glob({ pattern: "**/user*.ts" }); - -// ✅ CORRECT - Use indexed memory via developer-detective skill -// The skill teaches the agent to use: -claudemem search "UserService implementation methods" +# Investigation Pattern +# 1. claudemem status +# 2. claudemem search "feature I'm implementing" +# 3. Read specific file:line from results +# 4. NEVER use grep or find for discovery ``` -### Example 2: Backend Architect Analyzing API Structure - -```typescript -// In bun plugin's api-architect agent: - -// ❌ WRONG - Never do this -find . -name "*.controller.ts" -grep -r "router\." . 
--include="*.ts" - -// ✅ CORRECT - Use indexed memory via architect-detective skill -claudemem search "API controller endpoint handler" -claudemem search "router pattern REST GraphQL" -``` - -### Example 3: Test Architect Finding Coverage Gaps - -```typescript -// In frontend plugin's test-architect agent: - -// ❌ WRONG - Never do this -Glob({ pattern: "**/*.test.ts" }); -Grep({ pattern: "describe" }); - -// ✅ CORRECT - Use indexed memory via tester-detective skill -claudemem search "test coverage describe spec" -claudemem search "mock stub test assertion" -``` - ---- - -## Skill Inheritance Pattern +**Verification:** Confirm the agent's frontmatter includes the correct `skills:` reference and that investigation code uses `claudemem` commands. -When an agent needs code investigation, it should: +## Plugin Dependency -1. **Reference the appropriate detective skill in frontmatter** -2. **Follow the skill's INDEXED MEMORY ONLY requirement** -3. **Use claudemem for ALL code discovery** -4. **NEVER fall back to grep/find/Glob/Grep tools** - -```yaml ---- -name: any-agent-that-needs-investigation -skills: code-analysis:developer-detective # or architect/tester/debugger/ultrathink ---- - -# This agent inherits: -# - INDEXED MEMORY requirement (claudemem only) -# - Role-specific search patterns -# - Output format guidance -# - FORBIDDEN: grep, find, Glob, Grep tools -``` - ---- - -## Plugin Dependencies - -If your plugin has agents that need code investigation, add this dependency: +Add to your plugin's `plugin.json` to ensure detective skills are available: ```json { - "name": "your-plugin", "dependencies": { "code-analysis@mag-claude-plugins": "^1.6.0" } } ``` -This ensures: -- claudemem skills are available -- Detective skills are accessible via `code-analysis:*` prefix -- Agents can reference skills in frontmatter - ---- - -## Summary: The Golden Rule - -``` -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ ANY AGENT + CODE 
INVESTIGATION = claudemem ONLY ║ -║ ║ -║ Developer agents → code-analysis:developer-detective ║ -║ Architect agents → code-analysis:architect-detective ║ -║ Tester agents → code-analysis:tester-detective ║ -║ Debugger agents → code-analysis:debugger-detective ║ -║ Reviewer agents → code-analysis:ultrathink-detective ║ -║ ║ -║ grep/find/Glob/Grep = FORBIDDEN (always, everywhere, no exceptions) ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ -``` - ---- +## Notes -**Maintained by:** MadAppGang -**Plugin:** code-analysis -**Last Updated:** December 2025 +- All detective skills require claudemem indexed memory — grep/find/Glob/Grep are forbidden for semantic queries +- Multi-skill agents can reference multiple detectives: `skills: code-analysis:ultrathink-detective, code-analysis:tester-detective` +- Direct detective usage and /analyze command remain unchanged diff --git a/plugins/code-analysis/skills/deep-analysis/SKILL.md b/plugins/code-analysis/skills/deep-analysis/SKILL.md index da6597a..d7a915c 100644 --- a/plugins/code-analysis/skills/deep-analysis/SKILL.md +++ b/plugins/code-analysis/skills/deep-analysis/SKILL.md @@ -1,368 +1,69 @@ --- name: deep-analysis -description: "⚡ PRIMARY SKILL for: 'how does X work', 'investigate', 'analyze architecture', 'trace flow', 'find implementations'. PREREQUISITE: code-search-selector must validate tool choice. Launches codebase-detective with claudemem INDEXED MEMORY." -allowed-tools: Task -prerequisites: - - code-search-selector # Must run before this skill -dependencies: - - claudemem must be indexed (claudemem status) +description: "Launches codebase-detective with claudemem indexed memory for comprehensive code investigation. Traces code flow, maps architecture, and locates implementations with semantic search. Use when asked 'how does X work', 'investigate', 'analyze architecture', 'trace flow', or 'find implementations'." 
--- # Deep Code Analysis -This Skill provides comprehensive codebase investigation capabilities using the codebase-detective agent with semantic search and pattern matching. +Provides comprehensive codebase investigation by launching the codebase-detective agent with claudemem semantic search and AST-aware pattern matching. -## Prerequisites (MANDATORY) +## Prerequisites -``` -╔══════════════════════════════════════════════════════════════════════════════╗ -║ BEFORE INVOKING THIS SKILL ║ -╠══════════════════════════════════════════════════════════════════════════════╣ -║ ║ -║ 1. INVOKE code-search-selector skill FIRST ║ -║ → Validates tool selection (claudemem vs grep) ║ -║ → Checks if claudemem is indexed ║ -║ → Prevents tool familiarity bias ║ -║ ║ -║ 2. VERIFY claudemem status ║ -║ → Run: claudemem status ║ -║ → If not indexed: claudemem index -y ║ -║ ║ -║ 3. DO NOT start with Read/Glob ║ -║ → Even if file paths are mentioned in the prompt ║ -║ → Semantic search first, Read specific lines after ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ -``` - -## When to use this Skill - -Claude should invoke this Skill when: - -- User asks "how does [feature] work?" -- User wants to understand code architecture or patterns -- User is debugging and needs to trace code flow -- User asks "where is [functionality] implemented?" -- User needs to find all usages of a component/service -- User wants to understand dependencies between files -- User mentions: "investigate", "analyze", "find", "trace", "understand" -- User is exploring an unfamiliar codebase -- User needs to understand complex multi-file functionality - -## Instructions - -### Phase 1: Determine Investigation Scope - -Understand what the user wants to investigate: - -1. **Specific Feature**: "How does user authentication work?" -2. **Find Implementation**: "Where is the payment processing logic?" -3. **Trace Flow**: "What happens when I click the submit button?" -4. 
**Debug Issue**: "Why is the profile page showing undefined?" -5. **Find Patterns**: "Where are all the API calls made?" -6. **Analyze Architecture**: "What's the structure of the data layer?" - -### Phase 2: Invoke codebase-detective Agent - -Use the Task tool to launch the codebase-detective agent with comprehensive instructions: - -``` -Use Task tool with: -- subagent_type: "code-analysis:detective" -- description: "Investigate [brief summary]" -- prompt: [Detailed investigation instructions] -``` - -**Prompt structure for codebase-detective**: - -```markdown -# Code Investigation Task - -## Investigation Target -[What needs to be investigated - be specific] - -## Context -- Working Directory: [current working directory] -- Purpose: [debugging/learning/refactoring/etc] -- User's Question: [original user question] - -## Investigation Steps - -1. **Initial Search** (CLAUDEMEM REQUIRED): - - FIRST: Check `claudemem status` - is index available? - - ALWAYS: Use `claudemem search "semantic query"` for investigation - - NEVER: Use grep/glob for semantic understanding tasks - - Search for: [concepts, functionality, patterns by meaning] - -2. **Code Location**: - - Find exact file paths and line numbers - - Identify entry points and main implementations - - Note related files and dependencies - -3. **Code Flow Analysis**: - - Trace how data/control flows through the code - - Identify key functions and their roles - - Map out component/service relationships - -4. **Pattern Recognition**: - - Identify architectural patterns used - - Note code conventions and styles - - Find similar implementations for reference - -## Deliverables - -Provide a comprehensive report including: - -1. **📍 Primary Locations**: - - Main implementation files with line numbers - - Entry points and key functions - - Configuration and setup files - -2. **🔍 Code Flow**: - - Step-by-step flow explanation - - How components interact - - Data transformation points - -3. 
**🗺️ Architecture Map**: - - High-level structure diagram - - Component relationships - - Dependency graph - -4. **📝 Code Snippets**: - - Key implementations (show important code) - - Patterns and conventions used - - Notable details or gotchas - -5. **🚀 Navigation Guide**: - - How to explore the code further - - Related files to examine - - Commands to run for testing - -6. **💡 Insights**: - - Why the code is structured this way - - Potential issues or improvements - - Best practices observed - -## Search Strategy - -### ⚠️ CRITICAL: Tool Selection - -**BEFORE ANY SEARCH, CHECK CLAUDEMEM STATUS:** -```bash -claudemem status -``` - -### ✅ PRIMARY METHOD: claudemem (Indexed Memory) - -```bash -# Index if needed -claudemem index -y - -# Semantic search (ALWAYS use this for investigation) -claudemem search "authentication login session" -n 15 -claudemem search "API endpoint handler route" -n 20 -claudemem search "data transformation pipeline" -n 10 -``` - -**Why claudemem is REQUIRED for investigation:** -- Understands code MEANING, not just text patterns -- Finds related code even with different terminology -- Returns ranked, relevant results -- AST-aware (understands code structure) - -### ❌ WHEN NOT TO USE GREP +1. Run `code-search-selector` skill first to validate tool choice. +2. Verify claudemem is indexed: `claudemem status` +3. If not indexed: `claudemem index -y` +4. Do NOT start with Read/Glob — use semantic search first, then read specific lines. -| User Request | ❌ DON'T | ✅ DO | -|-------------|----------|-------| -| "How does auth work?" | `grep -r "auth" src/` | `claudemem search "authentication flow"` | -| "Find API endpoints" | `grep -r "router" src/` | `claudemem search "API endpoint handler"` | -| "Trace data flow" | `grep -r "transform" src/` | `claudemem search "data transformation"` | -| "Audit architecture" | `ls -la src/` | `claudemem search "architecture layers"` | +## Workflow -### ⚠️ DEGRADED FALLBACK (Only if claudemem unavailable) +1. 
**Determine investigation scope** — classify the query (feature analysis, implementation search, flow tracing, debugging, pattern finding, or architecture audit). +2. **Launch codebase-detective** via the Task tool with detailed investigation instructions. +3. **Present analysis results** — executive summary, file locations with line numbers, code flow, architecture overview. +4. **Offer follow-up** — suggest deeper dives into specific areas. -**Only use grep/find if:** -1. claudemem is NOT installed, AND -2. User explicitly accepts degraded mode +## Example: Understanding Authentication -```bash -# DEGRADED MODE - inferior results expected -grep -r "pattern" src/ # Text match only, no semantic understanding -find . -name "*.ts" # File discovery only +```typescript +Task({ + subagent_type: "code-analysis:detective", + description: "Investigate user authentication and login flow", + prompt: ` + 1. Check claudemem status, index if needed + 2. claudemem search "authentication login session JWT" -n 15 + 3. Trace: login endpoint → auth service → token generation → middleware + 4. Report file locations with line numbers and data flow diagram + ` +}) ``` -**Always warn user**: "Using grep fallback - results will be less accurate than semantic search." - -## Output Format - -Structure your findings clearly with: -- File paths using backticks: `src/auth/login.ts:45` -- Code blocks for snippets -- Clear headings and sections +**Expected output:** +- Primary file locations with line numbers +- Step-by-step code flow explanation +- Architecture map showing component relationships - Actionable next steps -``` - -### Phase 3: Present Analysis Results - -After the agent completes, present results to the user: - -1. **Executive Summary** (2-3 sentences): - - What was found - - Where it's located - - Key insight - -2. **Detailed Findings**: - - Primary file locations with line numbers - - Code flow explanation - - Architecture overview - -3. 
**Visual Structure** (if complex): - ``` - EntryPoint (file:line) - ├── Validator (file:line) - ├── BusinessLogic (file:line) - │ └── DataAccess (file:line) - └── ResponseHandler (file:line) - ``` - -4. **Code Examples**: - - Show key code snippets inline - - Highlight important patterns - -5. **Next Steps**: - - Suggest follow-up investigations - - Offer to dive deeper into specific parts - - Provide commands to test/run the code - -### Phase 4: Offer Follow-up -Ask the user: -- "Would you like me to investigate any specific part in more detail?" -- "Do you want to see how [related feature] works?" -- "Should I trace [specific function] further?" - -## Example Scenarios - -### Example 1: Understanding Authentication - -``` -User: "How does login work in this app?" - -Skill invokes codebase-detective agent with: -"Investigate user authentication and login flow: -1. Find login API endpoint or form handler -2. Trace authentication logic -3. Identify token generation/storage -4. Find session management -5. Locate authentication middleware" - -Agent provides: -- src/api/auth/login.ts:34-78 (login endpoint) -- src/services/authService.ts:12-45 (JWT generation) -- src/middleware/authMiddleware.ts:23 (token validation) -- Flow: Form → API → Service → Middleware → Protected Routes -``` - -### Example 2: Debugging Undefined Error - -``` -User: "The dashboard shows 'undefined' for user name" - -Skill invokes codebase-detective agent with: -"Debug undefined user name in dashboard: -1. Find Dashboard component -2. Locate where user name is rendered -3. Trace user data fetching -4. Check data transformation/mapping -5. Identify where undefined is introduced" - -Agent provides: -- src/components/Dashboard.tsx:156 renders user.name -- src/hooks/useUser.ts:45 fetches user data -- Issue: API returns 'full_name' but code expects 'name' -- Fix: Map 'full_name' to 'name' in useUser hook -``` - -### Example 3: Finding All API Calls - -``` -User: "Where are all the API calls made?" 
- -Skill invokes codebase-detective agent with: -"Find all API call locations: -1. Search for fetch, axios, http client usage -2. Identify API client/service files -3. List all endpoints used -4. Note patterns (REST, GraphQL, etc) -5. Find error handling approach" - -Agent provides: -- 23 API calls across 8 files -- Centralized in src/services/* -- Using axios with interceptors -- Base URL in src/config/api.ts -- Error handling in src/utils/errorHandler.ts -``` - -## Success Criteria - -The Skill is successful when: - -1. ✅ User's question is comprehensively answered -2. ✅ Exact code locations provided with line numbers -3. ✅ Code relationships and flow clearly explained -4. ✅ User can navigate to code and understand it -5. ✅ Architecture patterns identified and explained -6. ✅ Follow-up questions anticipated +## Search Strategy -## Tips for Optimal Results +| Task | Tool | Example | +|------|------|---------| +| Understand how X works | `claudemem search` | `claudemem search "authentication flow"` | +| Find API endpoints | `claudemem search` | `claudemem search "API endpoint handler"` | +| Trace data flow | `claudemem search` | `claudemem search "data transformation"` | +| Exact string match only | `grep` (fallback) | `grep -r "DEPRECATED_FLAG" src/` | -1. **Be Comprehensive**: Don't just find one file, map the entire flow -2. **Provide Context**: Explain why code is structured this way -3. **Show Examples**: Include actual code snippets -4. **Think Holistically**: Connect related pieces across files -5. **Anticipate Questions**: Answer follow-up questions proactively +## Fallback Protocol -## Integration with Other Tools +If claudemem is unavailable or returns no results: -This Skill works well with: +1. **Stop** — do not silently switch to grep. +2. **Diagnose** — run `claudemem status`. +3. **Ask user** — offer reindex, different query, grep fallback (with quality warning), or cancel. 
-- **claudemem CLI**: For local semantic code search with Tree-sitter parsing -- **MCP gopls**: For Go-specific analysis -- **Standard CLI tools**: grep, ripgrep, find, git -- **Project-specific tools**: Use project's search/navigation tools +**Verification:** Confirm results include exact file paths with line numbers before presenting to user. ## Notes - The codebase-detective agent uses extended thinking for complex analysis -- **claudemem is REQUIRED** - grep/find produce inferior results -- Fallback to grep ONLY if claudemem unavailable AND user accepts degraded mode - claudemem requires OpenRouter API key (https://openrouter.ai) -- Default model: `voyage/voyage-code-3` (best code understanding) -- Run `claudemem --models` to see all options and pricing -- Results are actionable and navigable -- Great for onboarding to new codebases -- Helps prevent incorrect assumptions about code - -## Tool Selection Quick Reference - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ BEFORE ANY CODE INVESTIGATION: │ -│ │ -│ 1. INVOKE code-search-selector skill │ -│ 2. Run: claudemem status │ -│ 3. If indexed → USE claudemem search │ -│ 4. If not indexed → Index first OR ask user │ -│ 5. NEVER default to grep when claudemem available │ -│ 6. 
NEVER start with Read/Glob for semantic questions │ -│ │ -│ grep is for EXACT STRING MATCHES only, NOT semantic understanding │ -└─────────────────────────────────────────────────────────────────────┘ -``` - ---- - -**Maintained by:** MadAppGang -**Plugin:** code-analysis v2.2.0 -**Last Updated:** December 2025 +- Default model: `voyage/voyage-code-3` for code understanding +- grep is for exact string matches only, not semantic understanding diff --git a/plugins/code-analysis/skills/developer-detective/SKILL.md b/plugins/code-analysis/skills/developer-detective/SKILL.md index f94477c..6ff1452 100644 --- a/plugins/code-analysis/skills/developer-detective/SKILL.md +++ b/plugins/code-analysis/skills/developer-detective/SKILL.md @@ -1,495 +1,63 @@ --- name: developer-detective -description: "⚡ Implementation analysis skill. Best for: 'how does X work', 'find implementation of', 'trace data flow', 'where is X defined', 'find all usages'. Uses claudemem AST with callers/callees for efficient code tracing." -allowed-tools: Bash, Task, Read, AskUserQuestion +description: "Traces implementation details using claudemem AST callers/callees analysis. Locates function definitions, maps data flow, finds all usages, and assesses change impact. Use when asked 'how does X work', 'find implementation of', 'trace data flow', 'where is X defined', or 'find all usages'." --- -# Developer Detective Skill +# Developer Detective -This skill uses claudemem's callers/callees analysis for implementation investigation. +Software Developer perspective for implementation investigation using claudemem's `callers`, `callees`, `context`, `symbol`, and `impact` commands for precise code tracing. 
-## Why Claudemem Works Better for Development +## Workflow -| Task | claudemem | Native Tools | -|------|-----------|--------------| -| Find usages | `callers` shows all call sites | Grep (text match) | -| Trace dependencies | `callees` shows called functions | Manual reading | -| Understand context | `context` gives full picture | Multiple reads | -| Impact analysis | Caller chain reveals risk | Unknown | +1. **Verify claudemem** — confirm v0.3.0+ installed and indexed. Check freshness; reindex if stale. +2. **Map the area** — run `claudemem --agent map "feature area"` to get an overview. +3. **Find the entry point** — run `claudemem --agent symbol ` to locate the highest-PageRank symbol. +4. **Trace the flow** — run `callees` to see what the function calls (data flows out), then follow the chain. +5. **Understand usage** — run `callers` to see every place that calls the function (impact of changes). +6. **Check impact** (v0.4.0+) — before modifying code, run `claudemem --agent impact ` for full transitive caller tree. +7. **Read specific code** — use Read tool on exact file:line ranges from results, never whole files. -**Primary commands:** -- `claudemem --agent callers ` - What calls this code -- `claudemem --agent callees ` - What this code calls -- `claudemem --agent context ` - Full understanding - -# Developer Detective Skill - -**Version:** 3.3.0 -**Role:** Software Developer -**Purpose:** Implementation investigation using AST callers/callees and impact analysis - -## Role Context - -You are investigating this codebase as a **Software Developer**. 
Your focus is on: -- **Implementation details** - How code actually works -- **Data flow** - How data moves through the system (via callees) -- **Usage patterns** - How code is used (via callers) -- **Dependencies** - What a function needs to work -- **Impact analysis** - What breaks if you change something - -## Why callers/callees is Perfect for Development - -The `callers` and `callees` commands show you: -- **callers** = Every place that calls this code (impact of changes) -- **callees** = Every function this code calls (its dependencies) -- **Exact file:line** = Precise locations for reading/editing -- **Call kinds** = call, import, extends, implements - -## Developer-Focused Commands (v0.3.0) - -### Find Implementation +## Example: Tracing Payment Flow ```bash -# Find where a function is defined +# Step 1: Find the function claudemem --agent symbol processPayment -# Get full context with callers and callees -claudemem --agent context processPayment``` -### Trace Data Flow - -```bash -# What does this function call? (data flows OUT) +# Step 2: What does it call? (data flow) claudemem --agent callees processPayment -# Follow the chain -claudemem --agent callees validateCardclaudemem --agent callees chargeStripe``` +# Output: validateCard, getCustomer, chargeStripe, saveTransaction -### Find All Usages - -```bash -# Who calls this function? (usage patterns) +# Step 3: Who calls it? 
(usage/impact) claudemem --agent callers processPayment -# This shows EVERY place that uses this code -``` - -### Impact Analysis (v0.4.0+ Required) - -```bash -# Before modifying ANY code, check full impact -claudemem --agent impact functionToChange -# Output shows ALL transitive callers: -# direct_callers: -# - LoginController.authenticate:34 -# - SessionMiddleware.validate:12 -# transitive_callers (depth 2): -# - AppRouter.handleRequest:45 -# - TestSuite.runAuth:89 -``` - -**Why impact matters**: -- `callers` shows only direct callers (1 level) -- `impact` shows ALL transitive callers (full tree) -- Critical for refactoring decisions - -**Handling Empty Results:** -```bash -IMPACT=$(claudemem --agent impact functionToChange) -if echo "$IMPACT" | grep -q "No callers"; then - echo "No callers found. This is either:" - echo " 1. An entry point (API handler, main function) - expected" - echo " 2. Dead code - verify with: claudemem dead-code" - echo " 3. Dynamically called - check for import(), reflection" -fi -``` - -### Impact Analysis (BEFORE Modifying) - -```bash -# Quick check - direct callers only (v0.3.0) -claudemem --agent callers functionToChange -# Deep check - ALL transitive callers (v0.4.0+ Required) -IMPACT=$(claudemem --agent impact functionToChange) - -# Handle results -if [ -z "$IMPACT" ] || echo "$IMPACT" | grep -q "No callers"; then - echo "No static callers found - verify dynamic usage patterns" -else - echo "$IMPACT" - echo "" - echo "This tells you:" - echo "- Direct callers (immediate impact)" - echo "- Transitive callers (ripple effects)" - echo "- Grouped by file (for systematic updates)" -fi -``` - -### Understanding Complex Code - -```bash -# Get full picture: definition + callers + callees -claudemem --agent context complexFunction``` - -## PHASE 0: MANDATORY SETUP - -### Step 1: Verify claudemem v0.3.0 - -```bash -which claudemem && claudemem --version -# Must be 0.3.0+ -``` - -### Step 2: If Not Installed → STOP - -Use AskUserQuestion 
(see ultrathink-detective for template) - -### Step 3: Check Index Status - -```bash -# Check claudemem installation and index -claudemem --version && ls -la .claudemem/index.db 2>/dev/null -``` - -### Step 3.5: Check Index Freshness - -Before proceeding with investigation, verify the index is current: - -```bash -# First check if index exists -if [ ! -d ".claudemem" ] || [ ! -f ".claudemem/index.db" ]; then - # Use AskUserQuestion to prompt for index creation - # Options: [1] Create index now (Recommended), [2] Cancel investigation - exit 1 -fi - -# Count files modified since last index -STALE_COUNT=$(find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" \) \ - -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | grep -v "dist" | grep -v "build" | wc -l) -STALE_COUNT=$((STALE_COUNT + 0)) # Normalize to integer - -if [ "$STALE_COUNT" -gt 0 ]; then - # Get index time with explicit platform detection - if [[ "$OSTYPE" == "darwin"* ]]; then - INDEX_TIME=$(stat -f "%Sm" -t "%Y-%m-%d %H:%M" .claudemem/index.db 2>/dev/null) - else - INDEX_TIME=$(stat -c "%y" .claudemem/index.db 2>/dev/null | cut -d'.' -f1) - fi - INDEX_TIME=${INDEX_TIME:-"unknown time"} - - # Get sample of stale files - STALE_SAMPLE=$(find . 
-type f \( -name "*.ts" -o -name "*.tsx" \) \ - -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | head -5) - - # Use AskUserQuestion (see template in ultrathink-detective) -fi -``` - -### Step 4: Index if Needed - -```bash -claudemem index -``` - ---- - -## Workflow: Implementation Investigation (v0.3.0) - -### Phase 1: Map the Area - -```bash -# Get overview of the feature area -claudemem --agent map "payment processing"``` - -### Phase 2: Find the Entry Point - -```bash -# Locate the main function (highest PageRank in area) -claudemem --agent symbol PaymentService``` - -### Phase 3: Trace the Flow - -```bash -# What does PaymentService call? -claudemem --agent callees PaymentService -# For each major callee, trace further -claudemem --agent callees validatePaymentclaudemem --agent callees processChargeclaudemem --agent callees saveTransaction``` - -### Phase 4: Understand Usage - -```bash -# Who uses PaymentService? -claudemem --agent callers PaymentService -# This shows the entry points -``` - -### Phase 5: Read Specific Code - -```bash -# Now read ONLY the relevant file:line ranges from results -# DON'T read whole files -``` - -## Output Format: Implementation Report +# Output: CheckoutController.submit:45, SubscriptionService.renew:89 -### 1. Symbol Overview - -``` -┌─────────────────────────────────────────────────────────┐ -│ IMPLEMENTATION ANALYSIS │ -├─────────────────────────────────────────────────────────┤ -│ Symbol: processPayment │ -│ Location: src/services/payment.ts:45-89 │ -│ Kind: function │ -│ PageRank: 0.034 │ -│ Search Method: claudemem v0.3.0 (AST analysis) │ -└─────────────────────────────────────────────────────────┘ -``` - -### 2. 
Data Flow (Callees) - -``` -processPayment - ├── validateCard (src/validators/card.ts:12) - ├── getCustomer (src/services/customer.ts:34) - ├── chargeStripe (src/integrations/stripe.ts:56) - │ └── stripe.charges.create (external) - └── saveTransaction (src/repositories/transaction.ts:78) - └── database.insert (src/db/index.ts:23) -``` - -### 3. Usage (Callers) - -``` -processPayment is called by: - ├── CheckoutController.submit (src/controllers/checkout.ts:45) - ├── SubscriptionService.renew (src/services/subscription.ts:89) - └── RetryQueue.processPayment (src/workers/retry.ts:23) -``` - -### 4. Impact Analysis - -``` -⚠️ IMPACT: Changing processPayment will affect: - - 3 direct callers (shown above) - - Checkout flow (user-facing) - - Subscription renewals (automated) - - Payment retry logic (background) -``` - -## Scenarios - -### Scenario: "How does X work?" - -```bash -# Step 1: Find X -claudemem --agent symbol X -# Step 2: See what X does -claudemem --agent callees X -# Step 3: See how X is used -claudemem --agent callers X -# Step 4: Read the specific code -# Use Read tool on exact file:line from results -``` - -### Scenario: Refactoring - -```bash -# Step 1: Find ALL usages (callers) -claudemem --agent callers oldFunction -# Step 2: Document each caller location -# Step 3: Update each caller systematically -``` - -### Scenario: Adding to Existing Code - -```bash -# Step 1: Find where to add -claudemem --agent symbol targetModule -# Step 2: Understand dependencies -claudemem --agent callees targetModule -# Step 3: Check existing patterns -claudemem --agent callers targetModule``` - -## Result Validation Pattern - -After EVERY claudemem command, validate results: - -### Symbol/Callers Validation - -When tracing implementation: - -```bash -# Find symbol -SYMBOL=$(claudemem --agent symbol PaymentService) -EXIT_CODE=$? 
- -if [ "$EXIT_CODE" -ne 0 ] || [ -z "$SYMBOL" ] || echo "$SYMBOL" | grep -qi "not found\|error"; then - # Symbol doesn't exist, typo, or index issue - # Diagnose index health - DIAGNOSIS=$(claudemem --version && ls -la .claudemem/index.db 2>&1) - # Use AskUserQuestion with suggestions: - # [1] Reindex, [2] Try different name, [3] Cancel -fi - -# Check callers -CALLERS=$(claudemem --agent callers PaymentService) -# 0 callers is valid (entry point or unused) -# But error message is not -if echo "$CALLERS" | grep -qi "error\|failed"; then - # Use AskUserQuestion -fi -``` - -### Empty/Irrelevant Results - -```bash -RESULTS=$(claudemem --agent callees FunctionName) - -# Validate relevance -# Extract keywords from the user's investigation query -# Example: QUERY="how does auth work" → KEYWORDS="auth work authentication" -# The orchestrating agent must populate KEYWORDS before this check -MATCH_COUNT=0 -for kw in $KEYWORDS; do - if echo "$RESULTS" | grep -qi "$kw"; then - MATCH_COUNT=$((MATCH_COUNT + 1)) - fi -done - -if [ "$MATCH_COUNT" -eq 0 ]; then - # Results don't match expected dependencies - # Use AskUserQuestion: Reindex, Different query, or Cancel -fi +# Step 4: Full impact before refactoring (v0.4.0+) +claudemem --agent impact processPayment ``` ---- - -## FALLBACK PROTOCOL - -**CRITICAL: Never use grep/find/Glob without explicit user approval.** - -If claudemem fails or returns irrelevant results: +**Verification:** Confirm callers/callees output includes file:line references. If symbol returns "not found", check spelling or reindex. -1. **STOP** - Do not silently switch tools -2. **DIAGNOSE** - Run `claudemem status` -3. **REPORT** - Tell user what happened -4. **ASK** - Use AskUserQuestion for next steps - -```typescript -// Fallback options (in order of preference) -AskUserQuestion({ - questions: [{ - question: "claudemem [command] failed or returned no relevant results. 
How should I proceed?", - header: "Investigation Issue", - multiSelect: false, - options: [ - { label: "Reindex codebase", description: "Run claudemem index (~1-2 min)" }, - { label: "Try different query", description: "Rephrase the search" }, - { label: "Use grep (not recommended)", description: "Traditional search - loses call graph analysis" }, - { label: "Cancel", description: "Stop investigation" } - ] - }] -}) -``` - -**See ultrathink-detective skill for complete Fallback Protocol documentation.** - ---- - -## Anti-Patterns - -| Anti-Pattern | Why Wrong | Correct Approach | -|--------------|-----------|------------------| -| `grep -r "function"` | No call relationships | `claudemem --agent callees func` | -| Modify without callers | Breaking changes | ALWAYS check `callers` first | -| Read whole files | Token waste | Read specific file:line from results | -| Guess dependencies | Miss connections | Use `callees` for exact deps | -| `cmd \| head/tail` | Hides callers/callees | Use full output or `--tokens` | - -### Output Truncation Warning - -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ ❌ Anti-Pattern 7: Truncating Claudemem Output ║ -║ ║ -║ FORBIDDEN (any form of output truncation): ║ -║ → BAD: claudemem --agent map "query" | head -80 ║ -║ → BAD: claudemem --agent callers X | tail -50 ║ -║ → BAD: claudemem --agent search "x" | grep -m 10 "y" ║ -║ → BAD: claudemem --agent map "q" | awk 'NR <= 50' ║ -║ → BAD: claudemem --agent callers X | sed '50q' ║ -║ → BAD: claudemem --agent search "x" | sort | head -20 ║ -║ → BAD: claudemem --agent map "q" | grep "pattern" | head -20 ║ -║ ║ -║ CORRECT (use full output or built-in limits): ║ -║ → GOOD: claudemem --agent map "query" ║ -║ → GOOD: claudemem --agent search "x" -n 10 ║ -║ → GOOD: claudemem --agent map "q" --tokens 2000 ║ -║ → GOOD: claudemem --agent search "x" --page-size 20 --page 1 ║ -║ → GOOD: claudemem --agent context Func --max-depth 3 ║ -║ ║ -║ WHY: Output is 
pre-optimized; truncation hides critical results ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ - ---- +## Common Scenarios -## Feedback Reporting (v0.8.0+) +| Scenario | Commands | +|----------|----------| +| "How does X work?" | `symbol X` → `callees X` → `callers X` | +| Refactoring | `callers oldFunction` → document each location → update systematically | +| Adding to existing code | `symbol targetModule` → `callees` (deps) → `callers` (patterns) | +| Impact assessment | `impact functionToChange` → review all transitive callers | -After completing investigation, report search feedback to improve future results. +## Fallback Protocol -### When to Report +Never use grep/find/Glob without explicit user approval. If claudemem fails: -Report feedback ONLY if you used the `search` command during investigation: - -| Result Type | Mark As | Reason | -|-------------|---------|--------| -| Read and used | Helpful | Contributed to investigation | -| Read but irrelevant | Unhelpful | False positive | -| Skipped after preview | Unhelpful | Not relevant to query | -| Never read | (Don't track) | Can't evaluate | - -### Feedback Pattern - -```bash -# Track during investigation -SEARCH_QUERY="your original query" -HELPFUL_IDS="" -UNHELPFUL_IDS="" - -# When reading a helpful result -HELPFUL_IDS="$HELPFUL_IDS,$result_id" - -# When reading an unhelpful result -UNHELPFUL_IDS="$UNHELPFUL_IDS,$result_id" - -# Report at end of investigation (v0.8.0+ only) -if claudemem feedback --help 2>&1 | grep -qi "feedback"; then - timeout 5 claudemem feedback \ - --query "$SEARCH_QUERY" \ - --helpful "${HELPFUL_IDS#,}" \ - --unhelpful "${UNHELPFUL_IDS#,}" 2>/dev/null || true -fi -``` - -### Output Update - -Include in investigation report: - -``` -Search Feedback: [X helpful, Y unhelpful] - Submitted (v0.8.0+) -``` - ---- +1. Stop — do not silently switch tools. +2. Diagnose — run `claudemem status`. +3. 
Ask user via AskUserQuestion (reindex, different query, grep fallback with warning, or cancel). ## Notes -- **`callers` is essential before any modification** - Know your impact -- **`callees` traces data flow** - Follow the execution path -- **`context` gives complete picture** - Symbol + callers + callees -- Always read specific file:line ranges, not whole files -- Works best with TypeScript, Go, Python, Rust codebases - ---- - -**Maintained by:** MadAppGang -**Plugin:** code-analysis v2.7.0 -**Last Updated:** December 2025 (v3.3.0 - Cross-platform compatibility, inline templates, improved validation) +- Always check `callers` before modifying any code — know the impact +- `callees` traces data flow — follow the execution path +- `context` gives complete picture — symbol + callers + callees combined +- Read specific file:line ranges, never whole files +- Never truncate claudemem output — use `--tokens` or `-n` flags for size control diff --git a/plugins/code-analysis/skills/investigate/SKILL.md b/plugins/code-analysis/skills/investigate/SKILL.md index cc9deed..c724c73 100644 --- a/plugins/code-analysis/skills/investigate/SKILL.md +++ b/plugins/code-analysis/skills/investigate/SKILL.md @@ -1,346 +1,67 @@ --- name: investigate -description: "Unified entry point for code investigation. Auto-routes to specialized detective based on query keywords. Use when investigation type is unclear or for general exploration." -allowed-tools: Bash, Task, AskUserQuestion +description: "Routes code investigation queries to specialized detectives (debugger, tester, architect, developer) via priority-based keyword matching and Task delegation. Use when the investigation type is unclear, for general code exploration, or to auto-select the right detective skill." 
--- # Investigate Skill -**Version:** 1.0.0 -**Purpose:** Keyword-based routing to specialized detective skills -**Pattern:** Smart delegation via Task tool - -## Overview - -This skill analyzes your investigation query and routes to the appropriate detective specialist: -- **debugger-detective** (errors, bugs, crashes) -- **tester-detective** (tests, coverage, edge cases) -- **architect-detective** (architecture, design, patterns) -- **developer-detective** (implementation, data flow - default) - -## Routing Logic - -### Priority System (Highest First) - -1. **Error/Debug** (Priority 1) - Time-critical bug fixes - - Keywords: "debug", "error", "broken", "failing", "crash" - - Route to: `debugger-detective` - -2. **Testing** (Priority 2) - Specialized test analysis - - Keywords: "test", "coverage", "edge case", "mock" - - Route to: `tester-detective` - -3. **Architecture** (Priority 3) - High-level understanding - - Keywords: "architecture", "design", "structure", "layer" - - Route to: `architect-detective` - -4. **Implementation** (Default, Priority 4) - Most common - - Keywords: "implementation", "how does", "code flow" - - Route to: `developer-detective` - -### Conflict Resolution - -When multiple keywords from different categories are detected: -- **Highest priority wins** (Priority 1 beats Priority 2, etc.) -- **No matches**: Default to developer-detective +Unified entry point for code investigation. Analyzes query keywords, selects the best detective specialist, and delegates via the Task tool. ## Workflow -### Phase 1: Extract Query - -The investigation query should be available from the task description or user input. +1. **Extract and normalize the query** from the task description or user input. +2. **Detect keywords** using the priority system below and select a detective. +3. **Show the routing decision** to the user before delegating. +4. **Delegate via the Task tool** to the chosen detective. +5. 
**Offer override** if the user disagrees with the auto-routing. -```bash -# Query comes from the Task description or user request -INVESTIGATION_QUERY="${TASK_DESCRIPTION:-$USER_QUERY}" - -# Normalize to lowercase for case-insensitive matching -QUERY_LOWER=$(echo "$INVESTIGATION_QUERY" | tr '[:upper:]' '[:lower:]') -``` - -### Phase 2: Keyword Detection +## Priority Routing -```bash -# Priority 1: Error/Debug keywords -if echo "$QUERY_LOWER" | grep -qE "debug|error|broken|failing|crash"; then - DETECTIVE="debugger-detective" - KEYWORDS="debug/error keywords" - PRIORITY=1 - RATIONALE="Bug fixes are time-critical and require call chain tracing" +| Priority | Category | Keywords | Detective | +|----------|----------|----------|-----------| +| 1 | Error/Debug | debug, error, broken, failing, crash | `debugger-detective` | +| 2 | Testing | test, coverage, edge case, mock | `tester-detective` | +| 3 | Architecture | architecture, design, structure, layer | `architect-detective` | +| 4 | Implementation (default) | implementation, how does, code flow | `developer-detective` | -# Priority 2: Testing keywords -elif echo "$QUERY_LOWER" | grep -qE "test|coverage|edge case|mock"; then - DETECTIVE="tester-detective" - KEYWORDS="test/coverage keywords" - PRIORITY=2 - RATIONALE="Test analysis is specialized and requires callers analysis" +When multiple categories match, the highest priority wins. No matches default to `developer-detective`. 
-# Priority 3: Architecture keywords -elif echo "$QUERY_LOWER" | grep -qE "architecture|design|structure|layer"; then - DETECTIVE="architect-detective" - KEYWORDS="architecture/design keywords" - PRIORITY=3 - RATIONALE="High-level understanding requires PageRank analysis" - -# Priority 4: Implementation (default) -else - DETECTIVE="developer-detective" - KEYWORDS="implementation (default)" - PRIORITY=4 - RATIONALE="Most common investigation type - data flow via callers/callees" -fi -``` - -### Phase 3: User Feedback - -Before delegating, inform the user of the routing decision: - -```bash -echo "" -echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" -echo "🔍 Investigation Routing" -echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" -echo "" -echo "Query: $INVESTIGATION_QUERY" -echo "" -echo "Detected: $KEYWORDS (Priority $PRIORITY)" -echo "Routing to: $DETECTIVE" -echo "Reason: $RATIONALE" -echo "" -echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" -echo "" -``` - -### Phase 4: Delegation via Task Tool - -Use the Task tool to delegate to the selected detective: - -```typescript -Task({ - description: INVESTIGATION_QUERY, - agent: DETECTIVE, - context: { - routing_reason: `Auto-routed based on ${KEYWORDS}`, - original_query: INVESTIGATION_QUERY, - priority: PRIORITY - } -}) -``` - -## Examples - -### Example 1: Debug Keywords +## Example **Input:** "Why is login broken?" -**Detection:** -- Keyword matched: "broken" -- Priority: 1 (Error/Debug) -- Route to: debugger-detective - -**Feedback:** -``` -🔍 Investigation Routing -Query: Why is login broken? -Detected: debug/error keywords (Priority 1) -Routing to: debugger-detective -Reason: Bug fixes are time-critical and require call chain tracing -``` - -### Example 2: Test Keywords - -**Input:** "What's the test coverage for payment?" 
- -**Detection:** -- Keywords matched: "test", "coverage" -- Priority: 2 (Testing) -- Route to: tester-detective - -**Feedback:** -``` -🔍 Investigation Routing -Query: What's the test coverage for payment? -Detected: test/coverage keywords (Priority 2) -Routing to: tester-detective -Reason: Test analysis is specialized and requires callers analysis -``` - -### Example 3: Architecture Keywords - -**Input:** "What's the architecture of the auth layer?" - -**Detection:** -- Keywords matched: "architecture", "layer" -- Priority: 3 (Architecture) -- Route to: architect-detective - -**Feedback:** -``` -🔍 Investigation Routing -Query: What's the architecture of the auth layer? -Detected: architecture/design keywords (Priority 3) -Routing to: architect-detective -Reason: High-level understanding requires PageRank analysis -``` - -### Example 4: No Keywords (Default) - -**Input:** "How does payment work?" - -**Detection:** -- No keywords matched -- Priority: 4 (Default) -- Route to: developer-detective - -**Feedback:** -``` -🔍 Investigation Routing -Query: How does payment work? 
-Detected: implementation (default) (Priority 4) -Routing to: developer-detective -Reason: Most common investigation type - data flow via callers/callees -``` - -### Example 5: Multi-Keyword Conflict - -**Input:** "Debug the test coverage" - -**Detection:** -- Keywords matched: "debug" (Priority 1) AND "test" (Priority 2) -- Priority 1 wins -- Route to: debugger-detective - -**Feedback:** -``` -🔍 Investigation Routing -Query: Debug the test coverage -Detected: debug/error keywords (Priority 1) -Routing to: debugger-detective -Reason: Bug fixes are time-critical and require call chain tracing -(Note: Also detected test keywords, but debug takes priority) -``` - -## Complete Implementation - -Here's the full workflow: - ```bash -#!/bin/bash - -# Get investigation query from task description -INVESTIGATION_QUERY="${TASK_DESCRIPTION}" - -# Normalize to lowercase +# Normalize query QUERY_LOWER=$(echo "$INVESTIGATION_QUERY" | tr '[:upper:]' '[:lower:]') -# Keyword detection with priority routing -if echo "$QUERY_LOWER" | grep -qE "debug|error|broken|failing|crash"; then - DETECTIVE="debugger-detective" - KEYWORDS="debug/error keywords" - PRIORITY=1 - RATIONALE="Bug fixes are time-critical and require call chain tracing" - -elif echo "$QUERY_LOWER" | grep -qE "test|coverage|edge case|mock"; then - DETECTIVE="tester-detective" - KEYWORDS="test/coverage keywords" - PRIORITY=2 - RATIONALE="Test analysis is specialized and requires callers analysis" - -elif echo "$QUERY_LOWER" | grep -qE "architecture|design|structure|layer"; then - DETECTIVE="architect-detective" - KEYWORDS="architecture/design keywords" - PRIORITY=3 - RATIONALE="High-level understanding requires PageRank analysis" - -else - DETECTIVE="developer-detective" - KEYWORDS="implementation (default)" - PRIORITY=4 - RATIONALE="Most common investigation type - data flow via callers/callees" -fi - -# Show routing decision -echo "" -echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" -echo "🔍 Investigation 
Routing" -echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" -echo "" -echo "Query: $INVESTIGATION_QUERY" -echo "" -echo "Detected: $KEYWORDS (Priority $PRIORITY)" -echo "Routing to: $DETECTIVE" -echo "Reason: $RATIONALE" -echo "" -echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" -echo "" +# Keyword detection — "broken" matches Priority 1 +DETECTIVE="debugger-detective" ``` -Then use the Task tool to delegate: - ```typescript +// Delegate to selected detective Task({ - description: INVESTIGATION_QUERY, - agent: DETECTIVE + description: "Why is login broken?", + agent: "debugger-detective" }) ``` -## Fallback Protocol - -If routing produces unexpected results: +**Verification:** Confirm the routing decision is shown to the user before delegation proceeds. -1. **Show routing decision** to user -2. **Ask for override** if needed via AskUserQuestion -3. **Default to developer-detective** if ambiguous +## Conflict Resolution Example -### Override Pattern - -```typescript -// If user wants to override the routing -AskUserQuestion({ - questions: [{ - question: `Auto-routing selected ${DETECTIVE}. 
Override?`, - header: "Investigation Routing", - multiSelect: false, - options: [ - { label: "Continue with auto-routing", description: `Use ${DETECTIVE}` }, - { label: "debugger-detective", description: "Root cause analysis" }, - { label: "tester-detective", description: "Test coverage analysis" }, - { label: "architect-detective", description: "Architecture patterns" }, - { label: "developer-detective", description: "Implementation details" } - ] - }] -}) -``` - -## Integration with Existing Workflow - -This skill is **additive only** and does not change existing behavior: - -- **Direct detective usage** still works (Task → specific detective) -- **/analyze command** unchanged (launches codebase-detective) -- **Parallel orchestration** patterns unchanged -- **All claudemem hooks** preserved +**Input:** "Debug the test coverage" +- "debug" matches Priority 1, "test" matches Priority 2 +- Priority 1 wins → routes to `debugger-detective` -## Use Cases +## Fallback Protocol -| When to Use Investigate Skill | When to Use Direct Detective | -|-------------------------------|------------------------------| -| Investigation type unclear | You know which specialist you need | -| General exploration | Parallel orchestration (multimodel plugin) | -| Quick routing decision | Specific workflow requirements | -| Learning/experimenting | Production automation | +1. Show the routing decision to the user. +2. Offer override via AskUserQuestion if needed. +3. Default to `developer-detective` when ambiguous. 
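
The priority table and the conflict-resolution rule can be sketched as a small shell function. This is a minimal illustration, not the skill's actual implementation: the function name `route_query` is hypothetical, and the keyword lists are copied from the table above.

```bash
#!/bin/sh
# route_query QUERY: prints the detective selected by the priority table.
# Hypothetical helper for illustration; the first matching branch wins,
# so a higher priority always beats a lower one.
route_query() {
  # Lowercase the query so matching is case-insensitive
  query_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  if printf '%s' "$query_lower" | grep -qE 'debug|error|broken|failing|crash'; then
    echo "debugger-detective"      # Priority 1: Error/Debug
  elif printf '%s' "$query_lower" | grep -qE 'test|coverage|edge case|mock'; then
    echo "tester-detective"        # Priority 2: Testing
  elif printf '%s' "$query_lower" | grep -qE 'architecture|design|structure|layer'; then
    echo "architect-detective"     # Priority 3: Architecture
  else
    echo "developer-detective"     # Priority 4: default when nothing matches
  fi
}

route_query "Debug the test coverage"
route_query "How does payment work?"
```

Because the branches are checked in priority order, "Debug the test coverage" resolves to `debugger-detective` even though "test" also matches. Note that this sketch matches substrings, so a word such as "latest" would trigger the testing branch; word-boundary matching may be preferable in practice.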
## Notes - Case-insensitive keyword matching -- Priority system resolves conflicts -- User sees routing decision before delegation -- Original query preserved in Task context -- Default to developer-detective when no keywords match +- Additive only — direct detective usage and /analyze command remain unchanged - Works with all claudemem versions (v0.3.0+) - ---- - -**Maintained by:** MadAppGang -**Plugin:** code-analysis v3.1.0 -**Last Updated:** January 2026 (v1.0.0 - Initial release) diff --git a/plugins/code-analysis/skills/search-interceptor/SKILL.md b/plugins/code-analysis/skills/search-interceptor/SKILL.md index 01e2ef7..f45609a 100644 --- a/plugins/code-analysis/skills/search-interceptor/SKILL.md +++ b/plugins/code-analysis/skills/search-interceptor/SKILL.md @@ -1,213 +1,68 @@ --- name: search-interceptor -description: "💡 Bulk file read optimizer. Suggests semantic search alternatives when reading multiple files. Helps reduce token usage by using claudemem's ranked results instead of sequential file reads." -allowed-tools: Bash, AskUserQuestion +description: "Intercepts bulk file read and glob operations, suggests semantic search alternatives via claudemem to reduce token usage by up to 90%. Use when planning to read 3+ files, using broad glob patterns, or investigating code across multiple files." --- # Search Interceptor -This skill helps optimize bulk file operations by suggesting semantic search alternatives when they would be more efficient. +Optimizes bulk file operations by redirecting to semantic search when more efficient. Checks claudemem status, evaluates planned operations, and suggests ranked alternatives. -## When Semantic Search is More Efficient +## Workflow -| Scenario | Token Cost | Alternative | -|----------|------------|-------------| -| Read 5+ files | ~5000 tokens | `claudemem search` (~500 tokens) | -| Glob all *.ts files | ~3000+ tokens | `claudemem --agent map` | -| Sequential reads to understand | Variable | One semantic query | +1. 
**Pause before bulk execution** — when about to read 3+ files or use broad globs, stop. +2. **Check claudemem status** — run `claudemem status` to verify the index is available. +3. **Evaluate the operation** against the decision matrix below. +4. **Execute the better alternative** — use one semantic query instead of N file reads. +5. **Read specific lines** from ranked results only after the semantic search. -## When to Consider Alternatives - -### Multiple File Reads - -If planning to read several files, consider: -```bash -# Instead of reading 5 files individually -claudemem search "concept from those files" -n 15 -# Gets ranked results with context -``` - -### Broad Glob Patterns - -If using patterns like `src/**/*.ts`: -```bash -# Instead of globbing and reading all matches -claudemem --agent map "what you're looking for" -# Gets structural overview with PageRank ranking -``` - -### File Paths Mentioned in Task - -Even when specific paths are mentioned, semantic search often finds additional relevant code: -```bash -claudemem search "concept related to mentioned files" -``` - ---- - -## Interception Protocol - -### Step 1: Pause Before Execution - -When you're about to execute bulk file operations, STOP and run: - -```bash -claudemem status -``` - -### Step 2: Evaluate - -**If claudemem is indexed:** - -| Your Plan | Better Alternative | -|-----------|-------------------| -| Read 5 auth files | `claudemem search "authentication login session"` | -| Glob all services | `claudemem search "service layer business logic"` | -| Read mentioned paths | `claudemem search "[concept from those paths]"` | - -**If claudemem is NOT indexed:** - -```bash -claudemem index -y -``` -Then proceed with semantic search. 
- -### Step 3: Execute Better Alternative - -```bash -# Instead of reading N files, run ONE semantic query -claudemem search "concept describing what you need" -n 15 - -# ONLY THEN read specific lines from results -``` - ---- - -## Interception Decision Matrix +## Decision Matrix | Situation | Intercept? | Action | |-----------|-----------|--------| | Read 1-2 specific files | No | Proceed with Read | -| Read 3+ files in investigation | **YES** | Convert to claudemem search | +| Read 3+ files in investigation | **Yes** | `claudemem search "concept" -n 15` | | Glob for exact filename | No | Proceed with Glob | -| Glob for pattern discovery | **YES** | Convert to claudemem search | +| Glob for pattern discovery | **Yes** | `claudemem search "concept"` | | Grep for exact string | No | Proceed with Grep | -| Grep for semantic concept | **YES** | Convert to claudemem search | -| Files mentioned in prompt | **YES** | Search semantically first | +| Grep for semantic concept | **Yes** | `claudemem search "concept"` | ---- - -## Examples of Interception - -### Example 1: Auth Investigation +## Example: Auth Investigation -**❌ Original plan:** -``` -I see the task mentions auth, let me read: -- src/services/auth/login.ts -- src/services/auth/session.ts -- src/services/auth/jwt.ts -- src/services/auth/middleware.ts -- src/services/auth/utils.ts -``` - -**✅ After interception:** ```bash -claudemem status # Check if indexed +# Instead of reading 5 auth files individually (~5000 tokens): +claudemem status claudemem search "authentication login session JWT token validation" -n 15 -# Now I have ranked, relevant chunks instead of 5 full files -``` - -### Example 2: API Integration Audit - -**❌ Original plan:** -``` -Audit mentions Prime API files: -- src/services/prime/internal_api/client.ts -- src/services/prime/api.ts -Let me just Read these directly... 
+# Result: ~500 tokens with ranked, relevant chunks ``` -**✅ After interception:** -```bash -claudemem search "Prime API integration endpoints HTTP client" -n 20 -# This finds ALL Prime-related code, ranked by relevance -# Not just the 2 files mentioned -``` +**Verification:** Compare token cost of planned reads vs. semantic search result size. -### Example 3: Pattern Discovery +## Example: Pattern Discovery -**❌ Original plan:** -``` -Glob("src/**/*.controller.ts") -Then read all 15 controllers to understand routing -``` - -**✅ After interception:** ```bash +# Instead of: Glob("src/**/*.controller.ts") then reading 15 files claudemem search "HTTP controller endpoint route handler" -n 20 -# Gets the most relevant routing code, not all controllers +# Gets the most relevant routing code, ranked by PageRank ``` ---- - -## Why Semantic Search Often Works Better +## Token Cost Comparison -| Native Tools | Semantic Search | -|--------------|-----------------| -| No ranking | Ranked by relevance + PageRank | -| No relationships | Shows code connections | -| ~5000 tokens for 5 files | ~500 tokens for ranked results | -| Only explicitly requested code | Discovers related code | +| Approach | Token Cost | Ranking | +|----------|------------|---------| +| Read 5+ files | ~5000 tokens | None | +| Glob + read all matches | ~3000+ tokens | None | +| `claudemem search` once | ~500 tokens | By relevance + PageRank | -**Tip:** For investigation tasks, try `claudemem search` first to get a ranked view of relevant code. 
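
As a rough worked example, the approximate figures in the comparison table can be turned into a savings percentage. The helper name `savings_percent` is hypothetical, and the token counts are the table's estimates rather than measurements.

```bash
#!/bin/sh
# Approximate figures from the comparison table; real costs vary with file size.
READ_TOKENS=5000     # reading 5+ files individually
SEARCH_TOKENS=500    # one claudemem search

# savings_percent NATIVE SEMANTIC: integer percentage saved by the semantic query.
savings_percent() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

savings_percent "$READ_TOKENS" "$SEARCH_TOKENS"   # prints 90
```

With these estimates, (5000 - 500) / 5000 gives the 90% reduction; smaller bulk reads yield proportionally smaller savings.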
+## Bypass Flag ---- - -## Integration with Other Skills - -This skill works with: - -| Skill | Relationship | -|-------|-------------| -| `code-search-selector` | Selector determines WHAT tool; Interceptor validates BEFORE execution | -| `claudemem-search` | Interceptor redirects to claudemem; this skill shows HOW to search | -| `deep-analysis` | Interceptor prevents bad patterns; deep-analysis uses good patterns | -| Detective skills | Interceptor prevents duplicate work by trusting detective agents | - ---- - -## Hook System Integration - -The hook system may provide claudemem results proactively when the index is available: - -- **Grep queries** → May receive claudemem search results instead -- **Bulk reads** → May receive suggestion to use semantic search -- **Broad globs** → May receive map results +When native tool behavior is specifically needed: -### Using the Bypass Flag - -When you specifically need native tool behavior: ```json { "pattern": "exact string", "_bypass_claudemem": true } ``` -This tells hooks you intentionally want native tool output. - ---- - -## Quick Reference - -Before bulk Read/Glob operations, consider: - -1. **Is claudemem indexed?** → `claudemem status` -2. **Can this be one semantic query?** → Often yes -3. **Do you need exact matches?** → Use native tools with bypass flag - -**General guideline:** For understanding/investigation, try semantic search first. For exact matches, use native tools. 
- ---- +## Notes -**Maintained by:** MadAppGang -**Plugin:** code-analysis v2.16.0 -**Purpose:** Help optimize bulk file operations with semantic search alternatives +- Works with code-search-selector (determines tool), deep-analysis (uses good patterns), and detective skills +- If claudemem is not indexed, run `claudemem index -y` first +- Hook system may proactively provide claudemem results for grep and bulk read operations diff --git a/plugins/code-analysis/skills/ultrathink-detective/SKILL.md b/plugins/code-analysis/skills/ultrathink-detective/SKILL.md index 29ce3e0..34f9a75 100644 --- a/plugins/code-analysis/skills/ultrathink-detective/SKILL.md +++ b/plugins/code-analysis/skills/ultrathink-detective/SKILL.md @@ -1,892 +1,91 @@ --- name: ultrathink-detective -description: "⚡ Comprehensive analysis skill. Best for: 'comprehensive audit', 'deep analysis', 'full codebase review', 'multi-perspective investigation', 'complex questions'. Combines all perspectives (architect+developer+tester+debugger). Uses Opus model with full claudemem AST analysis." -allowed-tools: Bash, Task, Read, AskUserQuestion -model: opus +description: "Runs comprehensive multi-perspective codebase analysis using all claudemem AST commands (map, symbol, callers, callees, context, search). Covers architecture, implementation, testing, reliability, security, performance, and code health dimensions. Use when asked for a comprehensive audit, deep analysis, full codebase review, or multi-perspective investigation." --- -# Ultrathink Detective Skill +# Ultrathink Detective -This skill uses ALL claudemem commands for comprehensive multi-perspective investigation. +Senior Principal Engineer analysis combining all detective perspectives (architect, developer, tester, debugger) using Opus model with full claudemem AST analysis across seven dimensions. 
-## Combines All Detective Perspectives +## Workflow -| Perspective | Focus | Commands Used | -|-------------|-------|---------------| -| Architect | System design, layers | `map`, `symbol` | -| Developer | Implementation, flow | `callers`, `callees` | -| Tester | Coverage, gaps | `callers` for tests | -| Debugger | Root cause, chains | `context` | +1. **Verify setup** — confirm claudemem v0.3.0+ is installed and indexed. Check index freshness; reindex if stale files detected. +2. **Architecture mapping** — run `claudemem --agent map` to identify high-PageRank symbols (> 0.05) as architectural pillars. +3. **Critical path analysis** — for each pillar, run `symbol`, `callers`, `callees`, and `context` to trace dependencies and usage. +4. **Test coverage assessment** — check callers of critical functions for test file references. High PageRank + 0 test callers = critical gap. +5. **Risk identification** — analyze security symbols, error handling chains, and external integrations. +6. **Technical debt inventory** — search for TODO/FIXME, identify god classes (> 20 callees), find orphaned code. +7. **Code health check** (v0.4.0+) — run `dead-code` and `test-gaps` commands, categorize by PageRank impact. +8. **Generate report** — produce executive summary with per-dimension scores and prioritized action items. 
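
Steps 1 and 7 depend on the installed claudemem version (v0.3.0+ for the core commands, v0.4.0+ for the code health check). A minimal sketch of that gate follows. It assumes `claudemem --version` prints a string containing a dotted `x.y.z` version; the helper `version_ge` is hypothetical and not part of claudemem.

```bash
#!/bin/sh
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Hypothetical helper; compares major, minor, and patch numerically.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if ((x[i] + 0) > (y[i] + 0)) exit 0
      if ((x[i] + 0) < (y[i] + 0)) exit 1
    }
    exit 0
  }'
}

if command -v claudemem >/dev/null 2>&1; then
  # Assumption: the version output contains an x.y.z string
  INSTALLED=$(claudemem --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if version_ge "$INSTALLED" "0.4.0"; then
    echo "All eight workflow steps available"
  elif version_ge "$INSTALLED" "0.3.0"; then
    echo "Core analysis available; skip the code health check (needs v0.4.0+)"
  else
    echo "claudemem is older than v0.3.0; stop and ask the user how to proceed"
  fi
else
  echo "claudemem not found; stop and ask the user how to proceed"
fi
```

Comparing fields numerically avoids the string-comparison pitfall where `0.10.0` would sort before `0.9.0`.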
-**Full command set:** -- `claudemem --agent map "query"` - Architecture overview -- `claudemem --agent symbol ` - Exact locations -- `claudemem --agent callers ` - Impact analysis -- `claudemem --agent callees ` - Dependency tracing -- `claudemem --agent context ` - Full call chain -- `claudemem --agent search "query"` - Semantic search +## Seven Analysis Dimensions -# Ultrathink Detective Skill +| Dimension | Primary Command | Focus | +|-----------|----------------|-------| +| Architecture | `map` | Layers, core abstractions, PageRank | +| Implementation | `callers`/`callees` | Data flow, dependencies | +| Testing | `callers` (test files) | Coverage gaps | +| Reliability | `context` | Error handling chains | +| Security | `symbol` + `callers` | Auth flow, sensitive data | +| Performance | `search` | Database patterns, async, caching | +| Code Health | `dead-code`, `test-gaps` | Cleanup candidates, coverage | -**Version:** 3.3.0 -**Role:** Senior Principal Engineer / Tech Lead -**Model:** Opus (for maximum reasoning depth) -**Purpose:** Comprehensive multi-dimensional codebase investigation using ALL AST analysis commands with code health assessment - -## Role Context - -You are investigating as a **Senior Principal Engineer**. 
Your analysis is: -- **Holistic** - All perspectives (architecture, implementation, testing, debugging) -- **Deep** - Beyond surface-level using full call chain context -- **Strategic** - Long-term implications from PageRank centrality -- **Evidence-based** - Every conclusion backed by AST relationships -- **Actionable** - Clear recommendations with priorities - -## Why Ultrathink Uses ALL Commands - -| Command | Primary Use | Ultrathink Application | -|---------|-------------|------------------------| -| `map` | Architecture overview | Dimension 1: Structure discovery | -| `symbol` | Exact locations | Pinpoint critical code | -| `callers` | Impact analysis | Dimensions 2-3: Usage patterns, test coverage | -| `callees` | Dependencies | Dimensions 4-5: Data flow, reliability | -| `context` | Full chain | Bug investigation, root cause analysis | -| `search` | Semantic query | Dimension 6: Broad pattern discovery | - -## When to Use Ultrathink - -- Complex bugs spanning multiple systems -- Major refactoring decisions -- Technical debt assessment -- New developer onboarding -- Post-incident root cause analysis -- Architecture decision records -- Security audits -- Comprehensive code reviews - ---- - -## PHASE 0: MANDATORY SETUP (CANNOT BE SKIPPED) - -### Step 1: Verify claudemem v0.3.0 - -```bash -which claudemem && claudemem --version -# Must be 0.3.0+ -``` - -### Step 2: If Not Installed → STOP - -**DO NOT FALL BACK TO GREP.** Use AskUserQuestion: - -```typescript -AskUserQuestion({ - questions: [{ - question: "claudemem v0.3.0 (AST structural analysis) is required. Grep/find are NOT acceptable alternatives. 
How proceed?", - header: "Required", - multiSelect: false, - options: [ - { label: "Install via npm (Recommended)", description: "npm install -g claude-codemem" }, - { label: "Install via Homebrew", description: "brew tap MadAppGang/claude-mem && brew install --cask claudemem" }, - { label: "Cancel", description: "I'll install manually" } - ] - }] -}) -``` - -### Step 3: Check Index Status - -```bash -# Check claudemem installation and index -claudemem --version && ls -la .claudemem/index.db 2>/dev/null -``` - -### Step 3.5: Check Index Freshness - -Before proceeding with investigation, verify the index is current: - -```bash -# First check if index exists -if [ ! -d ".claudemem" ] || [ ! -f ".claudemem/index.db" ]; then - # Use AskUserQuestion to prompt for index creation - # Options: [1] Create index now (Recommended), [2] Cancel investigation - exit 1 -fi - -# Count files modified since last index -STALE_COUNT=$(find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" \) \ - -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | grep -v "dist" | grep -v "build" | wc -l) -STALE_COUNT=$((STALE_COUNT + 0)) # Normalize to integer - -if [ "$STALE_COUNT" -gt 0 ]; then - # Get index time with explicit platform detection - if [[ "$OSTYPE" == "darwin"* ]]; then - INDEX_TIME=$(stat -f "%Sm" -t "%Y-%m-%d %H:%M" .claudemem/index.db 2>/dev/null) - else - INDEX_TIME=$(stat -c "%y" .claudemem/index.db 2>/dev/null | cut -d'.' -f1) - fi - INDEX_TIME=${INDEX_TIME:-"unknown time"} - - # Get sample of stale files - STALE_SAMPLE=$(find . 
-type f \( -name "*.ts" -o -name "*.tsx" \) \ - -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | head -5) - - # Use AskUserQuestion to ask user how to proceed - # Options: [1] Reindex now (Recommended), [2] Proceed with stale index, [3] Cancel -fi -``` - -**AskUserQuestion Template for Stale Index:** - -```typescript -AskUserQuestion({ - questions: [{ - question: `${STALE_COUNT} files have been modified since the last index (${INDEX_TIME}). The claudemem index may be outdated, which could cause missing or incorrect results. How would you like to proceed?`, - header: "Index Freshness Warning", - multiSelect: false, - options: [ - { - label: "Reindex now (Recommended)", - description: `Run claudemem index to update. Takes ~1-2 minutes. Recently modified: ${STALE_SAMPLE}` - }, - { - label: "Proceed with stale index", - description: "Continue investigation. May miss recent code changes." - }, - { - label: "Cancel investigation", - description: "I'll handle this manually." - } - ] - }] -}) -``` - -**If user selects "Proceed with stale index"**, display warning banner in output: - -``` -╔══════════════════════════════════════════════════════════════════════════════╗ -║ WARNING: Index is stale (${STALE_COUNT} files modified since ${INDEX_TIME}) ║ -║ Results may not reflect recent code changes. 
║ -╚══════════════════════════════════════════════════════════════════════════════╝ -``` - -### Step 4: Index if Needed - -```bash -claudemem index -``` - ---- - -## Multi-Dimensional Analysis Framework (v0.3.0) - -### Dimension 1: Architecture (map command) - -```bash -# Get overall structure with PageRank -claudemem --agent map -# Focus on high-PageRank symbols (> 0.05) - these ARE the architecture - -# Layer identification -claudemem --agent map "controller handler endpoint" # Presentation -claudemem --agent map "service business logic" # Business -claudemem --agent map "repository database query" # Data - -# Pattern detection -claudemem --agent map "factory create builder"claudemem --agent map "interface abstract contract"claudemem --agent map "event emit subscribe"``` - -### Dimension 2: Implementation (callers/callees) - -```bash -# For high-PageRank symbols, trace dependencies -claudemem --agent callees PaymentService -# What calls critical code? -claudemem --agent callers processPayment -# Full dependency chain -claudemem --agent context OrderController``` - -### Dimension 3: Test Coverage (callers analysis) - -```bash -# Find tests for critical functions -claudemem --agent callers authenticateUser# Look for callers from *.test.ts or *.spec.ts - -# Map test infrastructure -claudemem --agent map "test spec describe it"claudemem --agent map "mock stub spy helper" -# Coverage gaps = functions with 0 test callers -claudemem --agent callers criticalFunction# If no test file callers → coverage gap -``` - -### Dimension 4: Reliability (context command) +## Example: Full Audit ```bash -# Error handling chains -claudemem --agent context handleError -# Exception flow -claudemem --agent map "throw error exception"claudemem --agent callers CustomError -# Recovery patterns -claudemem --agent map "retry fallback circuit"``` - -### Dimension 5: Security (symbol + callers) - -```bash -# Authentication -claudemem --agent symbol authenticateclaudemem --agent callees 
authenticateclaudemem --agent callers authenticate -# Authorization -claudemem --agent map "permission role check guard" -# Sensitive data -claudemem --agent map "password hash token secret"claudemem --agent callers encrypt``` - -### Dimension 6: Performance (semantic search) - -```bash -# Database patterns -claudemem --agent search "query database batch" -# Async patterns -claudemem --agent map "async await promise parallel" -# Caching -claudemem --agent map "cache memoize store"``` - -### Dimension 6: Performance Feedback Tracking (v0.8.0+) - -Ultrathink uses `search` in the Performance dimension. Track feedback for these searches: - -```bash -# Dimension 6: Performance (semantic search) -PERF_QUERY="query database batch" -PERF_RESULTS=$(claudemem --agent search "$PERF_QUERY") - -# Initialize tracking strings (POSIX-compatible) -PERF_HELPFUL="" -PERF_UNHELPFUL="" - -# During analysis, track results: -# When you read a result and it's useful for performance analysis: -PERF_HELPFUL="$PERF_HELPFUL,abc123" - -# When you read a result and it's not relevant: -PERF_UNHELPFUL="$PERF_UNHELPFUL,def456" - -# At end of investigation, report (v0.8.0+ only): -if claudemem feedback --help 2>&1 | grep -qi "feedback"; then - timeout 5 claudemem feedback \ - --query "$PERF_QUERY" \ - --helpful "${PERF_HELPFUL#,}" \ - --unhelpful "${PERF_UNHELPFUL#,}" \ - 2>/dev/null || true -fi -``` - -### Dimension 7: Code Health (v0.4.0+ Required) - -```bash -# Dead code detection -DEAD=$(claudemem --agent dead-code) - -if [ -n "$DEAD" ]; then - # Categorize: - # - High PageRank dead = Something broke (investigate) - # - Low PageRank dead = Cleanup candidate - echo "Dead Code Analysis:" - echo "$DEAD" -else - echo "No dead code found - excellent hygiene!" 
-fi - -# Test coverage gaps -GAPS=$(claudemem --agent test-gaps) - -if [ -n "$GAPS" ]; then - # Impact analysis for high-PageRank gaps - echo "Test Gap Analysis:" - echo "$GAPS" - - # For critical gaps, show full impact - for symbol in $(echo "$GAPS" | grep "pagerank: 0.0[5-9]" | awk '{print $4}'); do - echo "Impact for critical untested: $symbol" - claudemem --agent impact "$symbol" done -else - echo "No test gaps found - excellent coverage!" -fi -``` - ---- - -## Comprehensive Analysis Workflow (v0.3.0) - -### Phase 1: Architecture Mapping (10 min) - -```bash -# Get structural overview with PageRank +# Phase 1: Architecture claudemem --agent map -# Document high-PageRank symbols (> 0.05) -# These are architectural pillars - understand first - -# Map each layer -claudemem --agent map "controller route endpoint"claudemem --agent map "service business domain"claudemem --agent map "repository data persist"``` +claudemem --agent map "controller handler endpoint" +claudemem --agent map "service business logic" -### Phase 2: Critical Path Analysis (15 min) - -```bash -# For each high-PageRank symbol: - -# 1. Get exact location +# Phase 2: Critical paths claudemem --agent symbol PaymentService -# 2. Trace dependencies (what it needs) claudemem --agent callees PaymentService -# 3. Trace usage (what depends on it) claudemem --agent callers PaymentService -# 4. 
Full context for complex ones -claudemem --agent context PaymentService``` - -### Phase 3: Test Coverage Assessment (10 min) - -```bash -# For each critical function, check callers -claudemem --agent callers processPaymentclaudemem --agent callers authenticateUserclaudemem --agent callers updateProfile -# Count: -# - Test callers (from *.test.ts, *.spec.ts) -# - Production callers - -# High PageRank + 0 test callers = CRITICAL GAP -``` - -### Phase 4: Risk Identification (10 min) - -```bash -# Security symbols -claudemem --agent map "auth session token"claudemem --agent callers validateToken -# Error handling -claudemem --agent map "error exception throw"claudemem --agent context handleFailure -# External integrations -claudemem --agent map "API external webhook"claudemem --agent callers stripeClient``` - -### Phase 5: Technical Debt Inventory (10 min) - -```bash -# Deprecated patterns -claudemem --agent search "TODO FIXME deprecated" -# Complexity indicators (high PageRank but many callees) -claudemem --agent callees LargeService# > 20 callees = potential god class - -# Orphaned code (low PageRank, 0 callers) -claudemem --agent callers unusedFunction``` - ---- - -## Output Format: Comprehensive Report (v0.3.0) - -### Executive Summary - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ CODEBASE COMPREHENSIVE ANALYSIS (v0.3.0) │ -├─────────────────────────────────────────────────────────────────┤ -│ Overall Health: 🟡 MODERATE (7.2/10) │ -│ Search Method: claudemem v0.3.0 (AST + PageRank) │ -│ │ -│ Dimensions: │ -│ ├── Architecture: 🟢 GOOD (8/10) [map analysis] │ -│ ├── Implementation: 🟡 MODERATE (7/10) [callers/callees] │ -│ ├── Testing: 🔴 POOR (5/10) [test-gaps] │ -│ ├── Reliability: 🟢 GOOD (8/10) [context tracing] │ -│ ├── Security: 🟡 MODERATE (7/10) [auth callers] │ -│ ├── Performance: 🟢 GOOD (8/10) [async patterns] │ -│ └── Code Health: 🟡 MODERATE (6/10) [dead-code + impact] │ -│ │ -│ Critical: 3 | Major: 7 | Minor: 15 │ -│ │ -│ 
Search Feedback: │ -│ └── Performance queries: 2 submitted │ -│ └── Helpful results: 5 │ -│ └── Unhelpful results: 3 │ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Dimension 1: Architecture (from map) - -``` -Core Abstractions (PageRank > 0.05): -├── UserService (0.092) - Central business logic -├── Database (0.078) - Data access foundation -├── AuthMiddleware (0.056) - Security boundary -└── EventBus (0.051) - Cross-cutting concerns - -Layer Structure: -┌─────────────────────────────────────────────────────────┐ -│ PRESENTATION (src/controllers/) │ -│ └── UserController (0.034) │ -│ └── AuthController (0.028) │ -│ ↓ │ -│ BUSINESS (src/services/) │ -│ └── UserService (0.092) ⭐HIGH PAGERANK │ -│ └── AuthService (0.067) │ -│ ↓ │ -│ DATA (src/repositories/) │ -│ └── UserRepository (0.045) │ -│ └── Database (0.078) ⭐HIGH PAGERANK │ -└─────────────────────────────────────────────────────────┘ -``` - -### Dimension 2: Implementation (from callers/callees) - -``` -Critical Data Flows: -processPayment (PageRank: 0.045) -├── CALLEES (dependencies): -│ ├── validateCard → stripeClient.validateCard -│ ├── getCustomer → Database.query -│ ├── chargeStripe → stripeClient.charge -│ └── saveTransaction → TransactionRepository.save -│ -└── CALLERS (usage): - ├── CheckoutController.submit:45 - ├── SubscriptionService.renew:89 - └── RetryQueue.processPayment:23 -``` - -### Dimension 3: Test Coverage (from callers) - -``` -| Function | Test Callers | Prod Callers | Coverage | -|---------------------|--------------|--------------|----------| -| authenticateUser | 5 | 12 | ✅ Good | -| processPayment | 3 | 8 | ✅ Good | -| calculateDiscount | 0 | 4 | ❌ None | -| sendEmail | 1 | 6 | ⚠️ Low | -| updateUserProfile | 0 | 3 | ❌ None | - -🔴 CRITICAL GAPS (high PageRank + 0 test callers): - └── calculateDiscount (PageRank: 0.034) - └── callers: 4 production, 0 tests -``` - -### Dimension 4: Reliability (from context) - -``` -Error Handling Chain: - 
-handleAuthError (context analysis): -├── Defined: src/middleware/auth.ts:45 -├── CALLERS (error sources): -│ ├── validateToken:23 → throws on invalid -│ ├── refreshSession:67 → throws on expired -│ └── checkPermission:89 → throws on denied -└── CALLEES (error handling): - ├── logError → Logger.error - ├── notifyAdmin → AlertService.send (if critical) - └── formatResponse → ErrorFormatter.toJSON -``` - -### Dimension 5: Security (from symbol + callers) - -``` -Authentication Flow: - -authenticate (PageRank: 0.067) -├── Location: src/services/auth.ts:23-67 -├── CALLEES: -│ ├── bcrypt.compare (password verification) -│ ├── jwt.sign (token generation) -│ └── SessionStore.create (session persistence) -└── CALLERS (entry points): - ├── LoginController.login:12 ✅ - ├── OAuthController.callback:45 ✅ - └── APIMiddleware.verify:23 ⚠️ (rate limiting?) -``` - -### Dimension 6: Performance (from map + callees) - -``` -Database Access Patterns: - -UserRepository.findWithRelations (PageRank: 0.028) -├── CALLEES: -│ ├── Database.query (1 call) -│ ├── RelationLoader.load (per relation) ⚠️ N+1? -│ └── Cache.get (optimization) -└── CALLERS: 8 locations - └── 3 in loops ⚠️ Potential N+1 - -Recommendation: Batch relation loading or use joins -``` - ---- - -## Action Items (Prioritized by PageRank Impact) - -``` -🔴 IMMEDIATE (This Sprint) - Affects High-PageRank Code - - 1. Add tests for calculateDiscount (PageRank: 0.034) - └── callers show: 4 production uses, 0 tests - - 2. Fix N+1 query in UserRepository.findWithRelations - └── callees show: RelationLoader called per item - - 3. Add rate limiting to APIMiddleware.verify - └── callers show: All API endpoints exposed - -🟠 SHORT-TERM (Next 2 Sprints) - - 4. Add error recovery to PaymentService - └── context shows: No retry on Stripe failures - - 5. Increase test coverage for AuthService - └── callers show: Only 2 test files cover critical code - -🟡 MEDIUM-TERM (This Quarter) - - 6. 
Refactor UserService (PageRank: 0.092) - └── callees show: 23 dependencies (god class pattern) - - 7. Add observability to EventBus - └── callers show: 15 publishers, no monitoring -``` - ---- - -## Result Validation Pattern - -After EVERY claudemem command, validate results to ensure quality: - -### Validation Per Dimension - -Each dimension MUST validate its claudemem results before proceeding: - -**Dimension 1: Architecture (map)** - -```bash -RESULTS=$(claudemem --agent map) -EXIT_CODE=$? - -# Check for command failure -if [ "$EXIT_CODE" -ne 0 ]; then - echo "ERROR: claudemem map failed" - # Diagnose and ask user (see Fallback Protocol below) - exit 1 -fi - -# Check for empty results -if [ -z "$RESULTS" ]; then - echo "WARNING: No architectural symbols found - index may be empty" - # Ask user to reindex or cancel -fi - -# Validate PageRank values present -if ! echo "$RESULTS" | grep -q "pagerank:"; then - echo "WARNING: No PageRank data - index may be corrupted or outdated" - # Ask user to reindex -fi -``` - -**Dimension 2-6: All Other Commands** - -```bash -RESULTS=$(claudemem --agent [command] [args]) -EXIT_CODE=$? 
- -# Check exit code -if [ "$EXIT_CODE" -ne 0 ]; then - # Diagnose index health - DIAGNOSIS=$(claudemem --version && ls -la .claudemem/index.db 2>&1) - # Use AskUserQuestion for recovery options -fi - -# Check for empty/irrelevant results -# Extract keywords from the user's investigation query -# Example: QUERY="how does auth work" → KEYWORDS="auth work authentication" -# The orchestrating agent must populate KEYWORDS before this check -MATCH_COUNT=0 -for kw in $KEYWORDS; do - if echo "$RESULTS" | grep -qi "$kw"; then - MATCH_COUNT=$((MATCH_COUNT + 1)) - fi -done - -if [ "$MATCH_COUNT" -eq 0 ]; then - # Results don't match query - potentially irrelevant - # Use AskUserQuestion (see Fallback Protocol) -fi -``` - -**Dimension 3: Test Coverage (callers)** - -```bash -RESULTS=$(claudemem --agent callers $FUNCTION) +# Phase 3: Test coverage +claudemem --agent callers authenticateUser +# Check: are any callers from *.test.ts or *.spec.ts? -# Even 0 callers is valid - but validate it's not an error -if echo "$RESULTS" | grep -qi "error\|not found"; then - # Actual error vs no callers - # Use AskUserQuestion -fi +# Phase 4: Code health (v0.4.0+) +claudemem --agent dead-code +claudemem --agent test-gaps ``` ---- - -## FALLBACK PROTOCOL - -**CRITICAL: Never use grep/find/Glob without explicit user approval.** - -``` -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ FALLBACK PROTOCOL (NEVER SILENT) ║ -║ ║ -║ If claudemem fails OR returns irrelevant results: ║ -║ ║ -║ 1. STOP - Do not silently switch to grep/find ║ -║ 2. DIAGNOSE - Run claudemem status to check index health ║ -║ 3. COMMUNICATE - Tell user what happened ║ -║ 4. ASK - Get explicit user permission via AskUserQuestion ║ -║ ║ -║ grep/find/Glob ARE FORBIDDEN without explicit user approval ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ -``` - -### Fallback Decision Tree - -If claudemem fails or returns unexpected results: - -1. 
**STOP** - Do not silently switch tools -2. **DIAGNOSE** - Run `claudemem status` -3. **REPORT** - Tell user what happened -4. **ASK** - Use AskUserQuestion for next steps - -```typescript -// Fallback AskUserQuestion Template -AskUserQuestion({ - questions: [{ - question: "claudemem [command] failed or returned irrelevant results. How should I proceed?", - header: "Investigation Issue", - multiSelect: false, - options: [ - { label: "Reindex codebase", description: "Run claudemem index (~1-2 min)" }, - { label: "Try different query", description: "Rephrase the search" }, - { label: "Use grep (not recommended)", description: "Traditional search - loses semantic understanding" }, - { label: "Cancel", description: "Stop investigation" } - ] - }] -}) -``` - -### Grep Fallback Warning - -If user explicitly chooses grep fallback, display this warning: - -```markdown -## WARNING: Using Fallback Search (grep) - -You have chosen to use grep as a fallback. Please understand the limitations: - -| Feature | claudemem | grep | -|---------|-----------|------| -| Semantic understanding | Yes | No | -| Call graph analysis | Yes | No | -| Symbol relationships | Yes | No | -| PageRank ranking | Yes | No | -| False positives | Low | High | - -**Recommendation:** After completing this task, run `claudemem index` to rebuild -the index for future investigations. - -Proceeding with grep... -``` - ---- - -## 🚫 FORBIDDEN: DO NOT USE - -```bash -# ❌ ALL OF THESE ARE FORBIDDEN -grep -r "pattern" . -rg "pattern" -find . 
-name "*.ts" -git grep "term" -Glob({ pattern: "**/*.ts" }) -Grep({ pattern: "function" }) -``` - -## ✅ REQUIRED: ALWAYS USE - -```bash -# ✅ claudemem v0.3.0 AST Commands -claudemem --agent map "query" # Architecture -claudemem --agent symbol # Location -claudemem --agent callers # Impact -claudemem --agent callees # Dependencies -claudemem --agent context # Full chain -claudemem --agent search "query" # Semantic -``` - ---- - -## CRITICAL: NEVER TRUNCATE CLAUDEMEM OUTPUT - -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ ⛔ OUTPUT TRUNCATION IS FORBIDDEN ║ -║ ║ -║ claudemem output is ALREADY OPTIMIZED for LLM context windows. ║ -║ Truncating it may hide the most critical results. ║ -║ ║ -║ ❌ NEVER DO THIS (any form of output truncation): ║ -║ claudemem --agent map "query" | head -80 ║ -║ claudemem --agent callers UserService | head -100 ║ -║ claudemem --agent callees Func | tail -50 ║ -║ claudemem --agent impact Svc | head -N ║ -║ claudemem --agent search "auth" | grep -m 10 "pattern" ║ -║ claudemem --agent map "q" | awk 'NR <= 50' ║ -║ claudemem --agent callers X | sed '50q' ║ -║ claudemem --agent search "x" | sort | head -20 ║ -║ claudemem --agent map "q" | grep "pattern" | head -20 ║ -║ ║ -║ WHY `tail` IS EQUALLY PROBLEMATIC: ║ -║ `tail` skips the BEGINNING of output, which often contains: ║ -║ • Summary headers showing total counts ║ -║ • Highest-ranked results (PageRank, relevance score) ║ -║ • Context that explains what follows ║ -║ ║ -║ ✅ ALWAYS DO THIS: ║ -║ claudemem --agent map "query" ║ -║ claudemem --agent callers UserService ║ -║ claudemem --agent callees Func ║ -║ claudemem --agent impact Svc ║ -║ claudemem --agent search "auth" -n 10 # Use built-in limit ║ -║ ║ -║ WHY THIS MATTERS: ║ -║ • search results are sorted by relevance - truncating loses best matches ║ -║ • map results are sorted by PageRank - truncating loses core architecture ║ -║ • callers/callees show ALL dependencies - truncating causes missed 
changes ║ -║ • impact shows full blast radius - truncating underestimates risk ║ -║ ║ -║ ═══════════════════════════════════════════════════════════════════════ ║ -║ IF OUTPUT IS TOO LARGE, USE BUILT-IN FLAGS: ║ -║ ═══════════════════════════════════════════════════════════════════════ ║ -║ ║ -║ --tokens N Token-limited output (respects LLM context) ║ -║ Example: claudemem --agent map "query" --tokens 2000 ║ -║ ║ -║ --page-size N Pagination with N results per page ║ -║ --page N Fetch specific page number ║ -║ Example: claudemem --agent search "x" --page-size 20 --page 1║ -║ ║ -║ -n N Limit result count at query level (not post-hoc) ║ -║ Example: claudemem --agent search "auth" -n 10 ║ -║ ║ -║ --max-depth N Limit traversal depth (for context, callers, impact) ║ -║ Example: claudemem --agent context Func --max-depth 3 ║ -║ ║ -║ ACCEPTABLE: Piping to file for later analysis ║ -║ claudemem --agent map "query" > /tmp/full-map.txt ║ -║ (Full output preserved, can be processed separately) ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ - -NOTE: The freshness check pattern `head -5` for sampling stale files remains valid. - This prohibition applies only to truncating claudemem COMMAND OUTPUT. - ---- - -## Feedback Reporting (v0.8.0+) - -After completing investigation, report search feedback to improve future results. 
- -### When to Report - -Report feedback ONLY if you used the `search` command during investigation: - -| Result Type | Mark As | Reason | -|-------------|---------|--------| -| Read and used | Helpful | Contributed to investigation | -| Read but irrelevant | Unhelpful | False positive | -| Skipped after preview | Unhelpful | Not relevant to query | -| Never read | (Don't track) | Can't evaluate | - -### Feedback Pattern - -```bash -# Track during investigation -SEARCH_QUERY="your original query" -HELPFUL_IDS="" -UNHELPFUL_IDS="" - -# When reading a helpful result -HELPFUL_IDS="$HELPFUL_IDS,$result_id" +**Verification:** Confirm every conclusion is backed by specific claudemem output with file:line references. -# When reading an unhelpful result -UNHELPFUL_IDS="$UNHELPFUL_IDS,$result_id" +## Output Format -# Report at end of investigation (v0.8.0+ only) -if claudemem feedback --help 2>&1 | grep -qi "feedback"; then - timeout 5 claudemem feedback \ - --query "$SEARCH_QUERY" \ - --helpful "${HELPFUL_IDS#,}" \ - --unhelpful "${UNHELPFUL_IDS#,}" 2>/dev/null || true -fi ``` +CODEBASE COMPREHENSIVE ANALYSIS +Overall Health: [score]/10 -### Output Update +Dimensions: + Architecture: [score] [map analysis] + Implementation: [score] [callers/callees] + Testing: [score] [test-gaps] + Reliability: [score] [context tracing] + Security: [score] [auth callers] + Performance: [score] [async patterns] + Code Health: [score] [dead-code + impact] -Include in investigation report: - -``` -Search Feedback: [X helpful, Y unhelpful] - Submitted (v0.8.0+) +Action Items (by PageRank impact): + IMMEDIATE: [high-PageRank critical gaps] + SHORT-TERM: [important improvements] + MEDIUM-TERM: [tech debt cleanup] ``` ---- - -## Cross-Plugin Integration +## Fallback Protocol -This skill should be used by ANY agent that needs deep analysis: +Never use grep/find/Glob without explicit user approval. 
If claudemem fails: -| Agent Type | Should Use | From Plugin | -|------------|-----------|-------------| -| `frontend-architect` | `ultrathink-detective` | frontend | -| `api-architect` | `ultrathink-detective` | bun | -| `senior-code-reviewer` | `ultrathink-detective` | frontend | -| Any architect agent | `ultrathink-detective` | any | - -**Agents reference this skill in their frontmatter:** -```yaml ---- -skills: code-analysis:ultrathink-detective ---- -``` +1. **Stop** — do not silently switch tools. +2. **Diagnose** — run `claudemem status`. +3. **Ask user** via AskUserQuestion for next steps (reindex, different query, or cancel). ---- - -## ⚠️ FINAL REMINDER - -``` -╔══════════════════════════════════════════════════════════════════════════════╗ -║ ║ -║ ULTRATHINK = ALL claudemem v0.3.0 AST COMMANDS ║ -║ ║ -║ WORKFLOW: ║ -║ 1. claudemem --agent map ← Architecture (PageRank) ║ -║ 2. claudemem --agent symbol ← Exact locations ║ -║ 3. claudemem --agent callers ← Impact analysis ║ -║ 4. claudemem --agent callees ← Dependencies ║ -║ 5. claudemem --agent context ← Full call chain ║ -║ 6. claudemem --agent search ← Semantic search ║ -║ 7. Read specific file:line (NOT whole files) ║ -║ 8. claudemem feedback ... 
← Report helpful/unhelpful (if search used) ║ -║ ║ -║ ❌ grep, find, rg, Glob, Grep tool ║ -║ ║ -║ PageRank > 0.05 = Architectural pillar = Analyze FIRST ║ -║ High PageRank + 0 test callers = CRITICAL coverage gap ║ -║ Performance dimension uses search → Track feedback for Dimension 6 ║ -║ ║ -╚══════════════════════════════════════════════════════════════════════════════╝ -``` - ---- +## Notes -**Maintained by:** MadAppGang -**Plugin:** code-analysis v2.7.0 -**Last Updated:** December 2025 (v3.4.0 - Search feedback protocol support) +- Never truncate claudemem output — use built-in flags (`-n`, `--tokens`, `--max-depth`) instead +- PageRank > 0.05 = architectural pillar, analyze first +- Submit search feedback via `claudemem feedback` (v0.8.0+) after investigation +- Works best with TypeScript, Go, Python, Rust codebases diff --git a/plugins/conductor/skills/help/SKILL.md b/plugins/conductor/skills/help/SKILL.md index 43c2d9a..ae546d6 100644 --- a/plugins/conductor/skills/help/SKILL.md +++ b/plugins/conductor/skills/help/SKILL.md @@ -1,67 +1,47 @@ --- name: help -description: Get help with Conductor - commands, usage examples, and best practices -version: 1.0.0 -tags: [conductor, help, documentation, guide] -keywords: [help, guide, usage, commands, reference] +description: "Displays Conductor commands, usage examples, directory structure, and troubleshooting tips. Provides quick-start guide and best practices for context-driven development. Use when asking how Conductor works or what commands are available." --- -plugin: conductor -updated: 2026-01-20 # Conductor Help -Conductor implements Context-Driven Development for Claude Code. +Conductor implements Context-Driven Development for Claude Code - your project context (goals, tech stack, workflow) is documented and maintained alongside your code. -## Philosophy +## Workflow -**Context as a Managed Artifact:** -Your project context (goals, tech stack, workflow) is documented and maintained alongside your code. 
This context guides all development work.
+1. **Identify user need** - determine if user wants command reference, troubleshooting help, or conceptual guidance
+2. **Present relevant section** - show the specific information requested rather than the entire reference
+3. **Suggest next action** - recommend the appropriate Conductor command to run
-**Pre-Implementation Planning:**
-Before coding, create a spec (WHAT) and plan (HOW). This ensures clear direction and traceable progress.
+## Available Commands
-**Safe Iteration:**
-Human approval gates at key points. Git-linked commits for traceability. Easy rollback when needed.
+| Command | Purpose |
+|---------|---------|
+| `conductor:setup` | Initialize Conductor - creates `conductor/` with product.md, tech-stack.md, workflow.md |
+| `conductor:new-track` | Create a development track with spec.md and hierarchical plan.md |
+| `conductor:implement` | Execute tasks from plan with TDD workflow and git commits |
+| `conductor:status` | View progress, current tasks, and blockers across all tracks |
+| `conductor:revert` | Git-aware logical undo at track, phase, or task level |
+| `conductor:help` | Show this reference |
-## Available Skills
-
-### conductor:setup
-Initialize Conductor for your project.
-- Creates conductor/ directory structure
-- Generates product.md, tech-stack.md, workflow.md
-- Interactive Q&A with resume capability
+## Quick Start
-### conductor:new-track
-Create a new development track.
-- Generates spec.md with requirements
-- Creates hierarchical plan.md (phases -> tasks)
-- Updates tracks.md index
+```bash
+# 1. Initialize project context
+conductor:setup
-### conductor:implement
-Execute tasks from your plan.
-- Status progression: [ ] -> [~] -> [x]
-- Git commits linked to track/task
-- Follows workflow.md procedures
+# 2. Plan your first feature
+conductor:new-track
-### conductor:status
-View project progress.
-- Overall completion percentage
-- Current task and blockers
-- Multi-track overview
+# 3. Start implementing
+conductor:implement
-### conductor:revert
-Git-aware logical undo.
-- Revert at Track, Phase, or Task level
-- Preview before executing
-- State validation after revert
+# 4. Check progress anytime
+conductor:status
-## Quick Start
-
-1. **Initialize:** Run `conductor:setup` to create context files
-2. **Plan:** Run `conductor:new-track` to create your first track
-3. **Implement:** Run `conductor:implement` to start working
-4. **Check:** Run `conductor:status` to see progress
-5. **Undo:** Run `conductor:revert` if you need to roll back
+# 5. Roll back if needed
+conductor:revert
+```
 ## Directory Structure
@@ -80,24 +60,16 @@ conductor/
 ## Best Practices
-1. **Keep Context Updated:** Review product.md and tech-stack.md periodically
-2. **One Task at a Time:** Focus on completing tasks fully before moving on
-3. **Commit Often:** Each task should result in at least one commit
-4. **Use Blockers:** Mark tasks as [!] blocked rather than skipping silently
-5. **Review Before Proceeding:** Use phase gates to verify quality
+1. **Keep context updated** - review product.md and tech-stack.md periodically
+2. **One task at a time** - complete tasks fully before moving on
+3. **Commit often** - each task should produce at least one commit
+4. **Use blockers** - mark tasks as `[!]` blocked rather than skipping silently
+5. **Review at phase gates** - verify quality before proceeding to next phase
 ## Troubleshooting
-**"Conductor not initialized"**
-Run `conductor:setup` to initialize the conductor/ directory.
-
-**"Track not found"**
-Check tracks.md for available tracks. Track IDs are case-sensitive.
-
-**"Revert failed"**
-Check for uncommitted changes. Commit or stash before reverting.
-
-## Getting Help
-
-Use `conductor:help` anytime for this reference.
-For issues, check the project documentation or file an issue.
+| Error | Solution |
+|-------|----------|
+| "Conductor not initialized" | Run `conductor:setup` to create the conductor/ directory |
+| "Track not found" | Check `tracks.md` for available tracks (IDs are case-sensitive) |
+| "Revert failed" | Commit or stash uncommitted changes before reverting |
diff --git a/plugins/conductor/skills/implement/SKILL.md b/plugins/conductor/skills/implement/SKILL.md
index 2aec3f7..4ff46f5 100644
--- a/plugins/conductor/skills/implement/SKILL.md
+++ b/plugins/conductor/skills/implement/SKILL.md
@@ -1,267 +1,51 @@
 ---
 name: implement
-description: Execute tasks from track plan with TDD workflow and git commit integration
-version: 1.1.0
-tags: [conductor, implement, execute, tasks, git, tdd]
-keywords: [implement, execute, task, commit, progress, workflow, tdd, phase]
+description: "Executes tasks from a track plan using TDD red/green/refactor workflow with git commit integration. Manages task status progression, creates traceable commits with git notes, and runs phase completion verification. Use when ready to start coding tasks from a plan."
 ---
-plugin: conductor
-updated: 2026-01-20
-
- Implementation Guide & Progress Tracker
-
- - Task execution and status management
- - TDD workflow (Red/Green/Refactor)
- - Git commit integration with track references
- - Git Notes for audit trail
- - Workflow.md procedure following
- - Phase Completion Verification Protocol
- - Progress tracking and reporting
-
-
- Guide systematic implementation of track tasks using TDD methodology,
- maintaining clear status visibility, creating traceable git commits
- with notes, following established workflow procedures, and executing
- the Phase Completion Protocol at phase boundaries.
-
-
+# Conductor Implement
-
-
-
- Use Tasks to mirror plan.md tasks.
- Keep Tasks and plan.md in sync.
- Mark tasks in BOTH when status changes.
-
+Guides systematic implementation of track tasks using TDD methodology, maintaining clear status visibility, creating traceable git commits, and executing phase completion verification at phase boundaries.
-
- Task status MUST follow this progression:
- - [ ] (pending) - Not started
- - [~] (in_progress) - Currently working
- - [x] (complete) - Finished
- - [!] (blocked) - Blocked by issue
+## Workflow
- Only ONE task can be [~] at a time.
-
+1. **Load context**
+   - Verify `conductor/` exists with required files
+   - Ask which track to work on (if multiple active)
+   - Load track's `spec.md`, `plan.md`, and `conductor/workflow.md`
-
- Follow Test-Driven Development for each task:
+2. **Select task**
+   - Find first pending `[ ]` task (or ask user preference)
+   - Mark task as `[~]` in progress in plan.md
+   - Only ONE task can be `[~]` at a time
- **Red Phase:**
- 1. Create test file for the feature
- 2. Write tests defining expected behavior
- 3. Run tests - confirm they FAIL
- 4. Do NOT proceed until tests fail
+3. **TDD implementation cycle**
+   - **Red:** Write failing tests for the task, run tests to confirm they FAIL
+   - **Green:** Write minimum code to pass tests, run tests to confirm they PASS
+   - **Refactor:** Improve code quality, run tests to confirm they still PASS
+   - Verify coverage meets >80% requirement
- **Green Phase:**
- 1. Write MINIMUM code to pass tests
- 2. Run tests - confirm they PASS
- 3. No refactoring yet
+4. **Commit and update**
+   - Run quality checks (lint, typecheck, test)
+   - Stage changes and commit with format: `(): `
+   - Add git note: `git notes add -m "Task: {phase}.{task} - {title}\nSummary: ...\nFiles Changed: ..."`
+   - Mark task as `[x]` in plan.md, update `metadata.json` with commit SHA
- **Refactor Phase:**
- 1. Improve code clarity and performance
- 2. Remove duplication
- 3. Run tests - confirm they still PASS
-
+5.
**Phase transition check** + - If phase incomplete, continue to next pending task + - If phase complete, execute Phase Completion Protocol (see below) - - After completing each task: - 1. Stage relevant changes - 2. Commit with proper format: - ``` - (): +## Task Status Progression - - Detail 1 - - Detail 2 +| Symbol | Status | Meaning | +|--------|--------|---------| +| `[ ]` | pending | Not started | +| `[~]` | in_progress | Currently working (only one at a time) | +| `[x]` | complete | Finished and committed | +| `[!]` | blocked | Blocked by issue (add note with reason) | - Task: {phase}.{task} - ``` - 3. Attach git note with task summary: - ```bash - git notes add -m "Task: {phase}.{task} - {title} +## Commit Message Format - Summary: {what was accomplished} - - Files Changed: - - {file1}: {description} - - Why: {business reason}" $(git log -1 --format="%H") - ``` - 4. Update metadata.json with commit SHA - - - - | Type | Use For | - |------|---------| - | feat | New feature | - | fix | Bug fix | - | docs | Documentation | - | style | Formatting | - | refactor | Code restructuring | - | test | Adding tests | - | chore | Maintenance | - | perf | Performance | - - - - ALWAYS follow procedures in conductor/workflow.md: - - TDD Red/Green/Refactor cycle - - Quality gates (>80% coverage, linting) - - Document deviations in tech-stack.md - - Phase Completion Protocol at phase end - - - - Pause and ask for user approval: - - Before starting each new phase - - When encountering blockers - - Before marking phase complete - - During Phase Completion Protocol Step 5 - - - - - - Focus on exactly one task. - Complete it fully before moving to next. - No partial implementations. - - - - Write failing tests BEFORE implementation. - This is the Red phase of TDD. - Never skip this step. - - - - Update plan.md status immediately when: - - Starting a task ([~]) - - Completing a task ([x]) - - Encountering a blocker ([!] with note) - - - - Every commit links to track/task. 
- Commit messages follow type convention. - Git notes provide audit trail. - - - - - - Check conductor/ exists with required files - Ask which track to work on (if multiple active) - Load track's spec.md and plan.md - Load conductor/workflow.md for procedures - Initialize Tasks from plan.md tasks - - - - Find first pending task (or ask user) - Mark task as [~] in_progress in plan.md - TaskUpdate to match - Read task requirements and context - - - - **Red Phase:** Write failing tests for the task - Run tests, confirm they FAIL - **Green Phase:** Write minimum code to pass - Run tests, confirm they PASS - **Refactor Phase:** Improve code quality - Run tests, confirm they still PASS - Verify coverage meets >80% requirement - - - - Run all quality checks (lint, typecheck, test) - If checks fail, fix before proceeding - Stage relevant file changes - Create commit with proper type and message - Add git note with task summary - Mark task as [x] complete in plan.md - Commit plan.md update separately - Update metadata.json with commit info - TaskUpdate to match - - - - Check if phase is complete (all tasks [x]) - If NOT complete, continue to next pending task - If phase IS complete, execute Phase Completion Protocol - - - - - **Execute when all tasks in a phase are [x]:** - - 1. **Announce Protocol Start** - Inform user: "Phase {N} complete. Starting verification protocol." - - 2. **Ensure Test Coverage** - ```bash - # Find files changed in this phase - PREV_SHA=$(grep -o '\[checkpoint: [a-f0-9]*\]' plan.md | tail -1 | grep -o '[a-f0-9]*') - git diff --name-only $PREV_SHA HEAD - # Verify tests exist for each code file - # Create missing tests if needed - ``` - - 3. **Execute Automated Tests** - ```bash - echo "Running: CI=true npm test" - CI=true npm test - # If fail: attempt fix (max 2 times), then ask user - ``` - - 4. **Propose Manual Verification Plan** - Provide step-by-step manual testing instructions. - Include specific commands and expected outcomes. - - 5. 
**Await User Confirmation** - Ask: "Does this meet your expectations? Confirm with 'yes' or provide feedback." - **PAUSE** - do not proceed without explicit yes. - - 6. **Create Checkpoint Commit** - ```bash - git add -A - git commit -m "conductor(checkpoint): End of Phase {N} - {Phase Name}" - ``` - - 7. **Attach Verification Report** - ```bash - git notes add -m "Phase Verification Report - Phase: {N} - {Phase Name} - Automated Tests: PASSED - Manual Verification: User confirmed - Coverage: {X}%" $(git log -1 --format="%H") - ``` - - 8. **Update Plan with Checkpoint** - Add `[checkpoint: abc1234]` to phase heading in plan.md. - - 9. **Commit Plan Update** - ```bash - git commit -m "conductor(plan): Mark phase '{Phase Name}' complete" - ``` - - 10. **Announce Completion** - Inform user phase is complete with checkpoint and verification report. - - - - - - | Symbol | Status | Meaning | - |--------|--------|---------| - | [ ] | pending | Not started | - | [~] | in_progress | Currently working | - | [x] | complete | Finished | - | [!] | blocked | Blocked by issue | - - - ``` (): @@ -271,194 +55,60 @@ updated: 2026-01-20 Task: {phase}.{task} ({task_title}) ``` - Example: -``` -feat(auth): Implement password hashing +Valid types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`, `perf` -- Added bcrypt dependency -- Created hashPassword utility function -- Added unit tests for hashing +## Phase Completion Protocol -Task: 2.1 (Implement password hashing) -``` - - - -``` -Task: {phase}.{task} - {task_title} - -Summary: {what was accomplished} - -Files Changed: -- {file1}: {description} -- {file2}: {description} - -Why: {business reason for this change} -``` - +Execute when all tasks in a phase are `[x]`: - - When encountering a blocker: - 1. Mark task as [!] blocked in plan.md - 2. Add note describing blocker: - ```markdown - - [!] 2.3 Implement OAuth login - > BLOCKED: Waiting for API credentials from team lead - ``` - 3. Ask user for guidance - 4. 
Either resolve or skip to different task - 5. Track blocker in metadata.json - +1. Announce: "Phase {N} complete. Starting verification protocol." +2. Check test coverage for all files changed in this phase +3. Run full test suite: `CI=true npm test` +4. Present manual verification steps to user +5. **PAUSE** - await explicit user confirmation before proceeding +6. Create checkpoint commit: `conductor(checkpoint): End of Phase {N}` +7. Attach verification report via git notes +8. Add `[checkpoint: {sha}]` to phase heading in plan.md - - If implementation differs from tech-stack.md: - 1. STOP implementation - 2. Update tech-stack.md with new design - 3. Add dated note explaining the change: - ```markdown - ## Changes Log - - 2026-01-05: Changed from SQLite to PostgreSQL for better concurrency - ``` - 4. Resume implementation - - +## Human Approval Gates - - - Start implementing the auth feature - - 1. Load feature_auth_20260105 track - 2. Read plan.md - find first pending task: 1.1 Create user table - 3. Mark 1.1 as [~] in plan.md - 4. Initialize Tasks with plan tasks +Pause and ask for user approval: +- Before starting each new phase +- When encountering blockers +- Before marking phase complete (step 5 of Protocol) - **Red Phase:** - 5. Create test file: tests/user-table.test.ts - 6. Write tests for user table schema - 7. Run tests - confirm they FAIL +## Examples - **Green Phase:** - 8. Create migration file - 9. Define user table schema - 10. Run tests - confirm they PASS - - **Refactor Phase:** - 11. Clean up migration code - 12. Run tests - confirm still PASS - - **Commit:** - 13. Run quality checks (all pass) - 14. Commit: "feat(db): Create user table schema" - 15. Add git note with task summary - 16. Commit plan.md update - 17. Mark 1.1 as [x] in plan.md - 18. Update metadata.json - 19. Move to next task 1.2 - - - - - Continue implementing auth - - 1. Complete final task of Phase 1 - 2. All Phase 1 tasks now [x] - 3. Announce: "Phase 1 complete. 
Starting verification protocol." - - **Phase Completion Protocol:** - 4. Check test coverage for all Phase 1 files - 5. Run: CI=true npm test (PASSED) - 6. Present manual verification steps to user - 7. Ask: "Does this meet your expectations?" - 8. User confirms: "yes" - 9. Create checkpoint commit - 10. Add verification report via git notes - 11. Update plan.md with [checkpoint: abc1234] - 12. Commit plan update - 13. Announce: "Phase 1 checkpoint created. Proceeding to Phase 2." - 14. Ask approval before starting Phase 2 - - - - - Continue implementing auth - - 1. Load track, find current task 2.1 - 2. Start Red Phase, encounter issue - 3. Issue: Missing database credentials - 4. Mark 2.1 as [!] blocked - 5. Add note: "> BLOCKED: Need database credentials configured" - 6. Ask user: "Task 2.1 is blocked. Options: - (A) Provide credentials to continue - (B) Skip to task 2.2 - (C) Pause implementation" - 7. User provides credentials - 8. Remove blocker, mark [~] in_progress - 9. Continue TDD cycle - - - - - - +### Complete a task with TDD ``` -## Implementation Progress - -Track: feature_auth_20260105 -Phase: 2/4 - Core Authentication -Task: 2.1/2.5 - Implement password hashing - -[==========-----] 40% complete - -Recent: -- [x] 1.1 Create user table schema (abc1234) -- [x] 1.2 Add migration scripts (def5678) -- [x] 1.3 Set up database connection (ghi9012) -- [x] 2.1 Implement password hashing (just completed) - -Next: -- [ ] 2.2 Create login endpoint +User: Start implementing the auth feature + +Agent: +1. Loads feature_auth track, finds first pending task: 1.1 Create user table +2. Marks 1.1 as [~] in plan.md +3. Red: Creates test file, writes schema tests, runs → FAIL +4. Green: Creates migration, defines schema, runs → PASS +5. Refactor: Cleans up, runs → PASS +6. Commits: "feat(db): Create user table schema" +7. 
Adds git note, marks 1.1 as [x], moves to 1.2 ``` - - - -## Task Complete - -**Track:** {track_id} -**Task:** {phase}.{task} - {task_title} -**Commit:** {short_sha} -**Type:** {feat/fix/refactor/etc.} - -**TDD Cycle:** -- Red: Tests written and failing -- Green: Implementation complete, tests passing -- Refactor: Code cleaned up - -**Quality Checks:** -- Lint: PASS -- Tests: PASS ({N} tests, {X}% coverage) -- TypeCheck: PASS -**Next Task:** {next_task_id} - {next_task_title} - -Continue to next task? [Yes/No] - - - -## Phase Complete - -**Track:** {track_id} -**Phase:** {N} - {phase_name} -**Checkpoint:** {checkpoint_sha} - -**Verification Report:** -- Automated Tests: PASSED -- Coverage: {X}% -- Manual Verification: User confirmed +### Handle a blocker +``` +User: Continue implementing auth -**Git Note:** Attached to checkpoint commit +Agent: +1. Starts task 2.1, encounters missing database credentials +2. Marks 2.1 as [!] with note: "BLOCKED: Need database credentials" +3. Asks user: "(A) Provide credentials (B) Skip to 2.2 (C) Pause" +4. User provides credentials → removes blocker, continues TDD cycle +``` -**Next Phase:** {N+1} - {next_phase_name} +## Verification -Proceed to next phase? 
[Yes/No] - - +After each task completion, confirm: +- [ ] Tests pass with >80% coverage +- [ ] Commit follows type convention with task reference +- [ ] Git note attached with task summary +- [ ] plan.md status updated to `[x]` +- [ ] metadata.json updated with commit SHA diff --git a/plugins/conductor/skills/new-track/SKILL.md b/plugins/conductor/skills/new-track/SKILL.md index 483eb95..8d37493 100644 --- a/plugins/conductor/skills/new-track/SKILL.md +++ b/plugins/conductor/skills/new-track/SKILL.md @@ -1,259 +1,113 @@ --- name: new-track -description: Create development track with spec and hierarchical plan through interactive Q&A -version: 1.0.0 -tags: [conductor, track, planning, spec, phases] -keywords: [new track, feature, bugfix, plan, spec, phases, tasks] +description: "Creates a development track by generating spec.md and hierarchical plan.md through interactive Q&A. Reads project context from product.md and tech-stack.md to inform planning. Use when planning a new feature, bugfix, or refactor." --- -plugin: conductor -updated: 2026-01-20 - - Development Planner & Spec Writer - - - Requirements elicitation and specification - - Hierarchical plan creation (phases/tasks/subtasks) - - Track lifecycle management - - Context-aware planning (reads product.md, tech-stack.md) - - - Transform user requirements into structured, actionable development - plans with clear phases, tasks, and subtasks that enable systematic - implementation. - - +# Conductor New Track - - - - You MUST use Tasks to track planning progress: - 1. Validate conductor setup exists - 2. Gather track requirements - 3. Generate track ID - 4. Create spec.md - 5. Create plan.md with phases - 6. Create metadata.json - 7. Update tracks.md index - +Transforms user requirements into structured, actionable development plans with clear phases, tasks, and subtasks that enable systematic implementation. - - FIRST check if conductor/ directory exists with required files. 
- If not, HALT and guide user to run conductor:setup first. - +## Workflow - - ALWAYS read these files before planning: - - conductor/product.md (understand project goals) - - conductor/tech-stack.md (know technical constraints) - - conductor/workflow.md (follow team processes) - +1. **Validate conductor setup** + - Check `conductor/` directory exists with `product.md`, `tech-stack.md`, `workflow.md` + - If missing, halt and guide user to run `conductor:setup` first - - Format: {shortname}_{YYYYMMDD} - Examples: - - feature_auth_20260105 - - bugfix_login_20260105 - - refactor_api_20260105 - - +2. **Load project context** + - Read `conductor/product.md` (project goals) + - Read `conductor/tech-stack.md` (technical constraints) + - Read `conductor/tracks.md` (existing tracks) - - - Always create spec.md BEFORE plan.md. - Spec defines WHAT, Plan defines HOW. - +3. **Define track type and ID** + - Ask: What type of work? (Feature / Bugfix / Refactor / Task) + - Ask: Short name for this track? (3-10 chars, lowercase) + - Generate track ID: `{type}_{shortname}_{YYYYMMDD}` - - Plans must have: - - 2-6 Phases (major milestones) - - 2-5 Tasks per phase - - 0-3 Subtasks per task (optional) - +4. **Generate specification** + - Ask: What is the goal? (1-2 sentences) + - Ask: Acceptance criteria? (3-5 items) + - Ask: Technical constraints or dependencies? + - Ask: Edge cases or error scenarios? + - Create `conductor/tracks/{track_id}/spec.md` - - Each task must be: - - Specific (clear outcome) - - Estimable (roughly 1-4 hours) - - Independent (minimal dependencies) - - +5. **Generate plan** + - Propose 2-6 phases based on spec + - Ask user to confirm or modify phases + - Generate 2-5 tasks per phase with optional subtasks + - Create `conductor/tracks/{track_id}/plan.md` - - - Check conductor/ directory exists - Check required files: product.md, tech-stack.md, workflow.md - If missing, HALT with guidance to run setup - Initialize Tasks - +6. 
**Finalize** + - Create `conductor/tracks/{track_id}/metadata.json` + - Update `conductor/tracks.md` index + - Present summary with phase/task counts - - Read conductor/product.md - Read conductor/tech-stack.md - Read conductor/tracks.md for existing tracks - +## Plan Structure - - Ask: What type of work? [Feature, Bugfix, Refactor, Task, Other] - Ask: Short name for this track? (3-10 chars, lowercase) - Generate track ID: {type}_{shortname}_{YYYYMMDD} - - - - Ask: What is the goal of this work? (1-2 sentences) - Ask: What are the acceptance criteria? (list 3-5) - Ask: Any technical constraints or dependencies? - Ask: Any edge cases or error scenarios to handle? - Generate conductor/tracks/{track_id}/spec.md - - - - Based on spec, propose 2-6 phases - Ask user to confirm or modify phases - For each phase, generate 2-5 tasks - Add subtasks where complexity warrants - Generate conductor/tracks/{track_id}/plan.md - - - - Create conductor/tracks/{track_id}/metadata.json - Update conductor/tracks.md with new track - Present summary to user - - - - - - - **Feature:** New functionality - - Larger scope, 4-6 phases typical - - Includes testing and documentation phases - - **Bugfix:** Fix existing issue - - Smaller scope, 2-3 phases typical - - Includes reproduction and verification phases - - **Refactor:** Code improvement - - Medium scope, 3-4 phases typical - - Includes before/after comparison phase - - **Task:** General work item - - Variable scope - - Flexible structure - - - ```markdown # Plan: {Track Title} Track ID: {track_id} -Type: {Feature/Bugfix/Refactor/Task} -Created: {YYYY-MM-DD} +Type: Feature +Created: 2026-01-05 Status: Active ## Phase 1: {Phase Name} - [ ] 1.1 {Task description} - [ ] 1.2 {Task description} - [ ] 1.2.1 {Subtask} - - [ ] 1.2.2 {Subtask} - [ ] 1.3 {Task description} ## Phase 2: {Phase Name} - [ ] 2.1 {Task description} -- [ ] 2.2 {Task description} - -## Phase 3: Testing & Documentation -- [ ] 3.1 Write unit tests -- [ ] 3.2 Update 
documentation ``` - - -```markdown -# Spec: {Track Title} +## Constraints -Track ID: {track_id} -Type: {Feature/Bugfix/Refactor/Task} -Created: {YYYY-MM-DD} - -## Goal -{1-2 sentence description of what this achieves} - -## Background -{Context from product.md relevant to this work} +- Always create spec.md BEFORE plan.md (spec defines WHAT, plan defines HOW) +- Plans must have 2-6 phases, 2-5 tasks per phase, 0-3 subtasks per task +- Each task should be specific, estimable (1-4 hours), and minimally dependent +- Conductor setup MUST exist before creating tracks -## Acceptance Criteria -- [ ] {Criterion 1} -- [ ] {Criterion 2} -- [ ] {Criterion 3} +## Track Types -## Technical Constraints -- {Constraint 1 from tech-stack.md} -- {Constraint 2} +| Type | Typical Phases | Scope | +|------|---------------|-------| +| Feature | 4-6 phases (includes testing/docs) | New functionality | +| Bugfix | 2-3 phases (reproduce, fix, verify) | Fix existing issue | +| Refactor | 3-4 phases (includes before/after comparison) | Code improvement | +| Task | Variable | General work item | -## Edge Cases -- {Edge case 1} -- {Edge case 2} +## Examples -## Out of Scope -- {What this track does NOT include} +### New feature track +``` +User: I want to add user authentication + +Agent: +1. Validates conductor/ exists, loads context +2. Track type: Feature, short name: "auth" +3. Generates ID: feature_auth_20260105 +4. Gathers spec: goal, criteria, constraints, edge cases +5. Proposes phases: Database, Core Auth, Sessions, Testing +6. User confirms → generates plan.md with tasks +7. Updates tracks.md index, presents summary ``` - - - - - - I want to add user authentication - - 1. Validate conductor/ exists with required files - 2. Load product.md, tech-stack.md context - 3. Ask track type - "Feature" - 4. Ask short name - "auth" - 5. Generate ID: feature_auth_20260105 - 6. Ask spec questions (goal, criteria, constraints) - 7. Generate spec.md - 8. 
Propose phases: Database, Core Auth, Sessions, Testing - 9. User confirms phases - 10. Generate plan.md with tasks - 11. Update tracks.md index - - - - - Login page keeps redirecting in a loop - - 1. Validate conductor/ exists - 2. Load context - 3. Track type: "Bugfix" - 4. Short name: "login-loop" - 5. Generate ID: bugfix_login-loop_20260105 - 6. Spec: Reproduce, root cause, fix approach - 7. Plan phases: Reproduce (1), Fix (2), Verify (3) - 8. Generate files and update index - - - - - - -## Track Created Successfully -**Track ID:** {track_id} -**Type:** {type} +### Bugfix track +``` +User: Login page keeps redirecting in a loop -**Files Created:** -- conductor/tracks/{track_id}/spec.md -- conductor/tracks/{track_id}/plan.md -- conductor/tracks/{track_id}/metadata.json +Agent: +1. Track type: Bugfix, short name: "login-loop" +2. ID: bugfix_login-loop_20260105 +3. Spec: Reproduction steps, root cause hypothesis +4. Plan: Phase 1 (Reproduce), Phase 2 (Fix), Phase 3 (Verify) +``` -**Plan Summary:** -- Phase 1: {name} ({N} tasks) -- Phase 2: {name} ({N} tasks) -- Phase 3: {name} ({N} tasks) -- Total: {X} phases, {Y} tasks +## Verification -**Next Steps:** -1. Review spec.md and plan.md -2. Adjust if needed -3. 
Run `conductor:implement` to start executing - - +After track creation, confirm: +- [ ] `conductor/tracks/{track_id}/spec.md` exists with acceptance criteria +- [ ] `conductor/tracks/{track_id}/plan.md` exists with phases and tasks +- [ ] `conductor/tracks/{track_id}/metadata.json` exists +- [ ] `conductor/tracks.md` updated with new track entry diff --git a/plugins/conductor/skills/revert/SKILL.md b/plugins/conductor/skills/revert/SKILL.md index 47c8993..9d62c6f 100644 --- a/plugins/conductor/skills/revert/SKILL.md +++ b/plugins/conductor/skills/revert/SKILL.md @@ -1,238 +1,92 @@ --- name: revert -description: Git-aware logical undo at track, phase, or task level with confirmation gates -version: 1.0.0 -tags: [conductor, revert, undo, git, rollback] -keywords: [revert, undo, rollback, git, track, phase, task] +description: "Performs git-aware logical undo at track, phase, or task level with impact preview and confirmation gates. Creates revert commits to preserve history and validates state consistency after rollback. Use when needing to undo completed work or roll back a phase." --- -plugin: conductor -updated: 2026-01-20 - - - Safe Revert Specialist - - - Git history analysis and reversal - - Logical grouping of commits by track/phase/task - - State validation after reversal - - Safe rollback with confirmation gates - - - Enable safe, logical rollback of development work at meaningful - granularity (track/phase/task) while maintaining git history integrity - and project consistency. - - - - - - - You MUST use Tasks to track the revert workflow. - - **Before starting**, create todo list with these 5 phases: - 1. Scope Selection - Identify what to revert (track/phase/task) - 2. Impact Analysis - Find commits, files, status changes - 3. User Confirmation - Present impact and get approval - 4. Execution - Create revert commits and update files - 5. 
Validation - Verify consistency and report results - - **Update continuously**: - - Mark "in_progress" when starting each phase - - Mark "completed" immediately after finishing - - Keep only ONE phase "in_progress" at a time - - - - ALWAYS require explicit user confirmation before: - - Reverting any commits - - Modifying plan.md status - - Deleting track files - - Show exactly what will be changed BEFORE doing it. - - - - Default to creating revert commits, not force-pushing. - Preserve git history unless user explicitly requests otherwise. - - - - After any revert: - 1. Verify plan.md matches git state - 2. Verify metadata.json is consistent - 3. Run project quality checks - 4. Report any inconsistencies - - - - - - Revert by logical units (track/phase/task), not raw commits. - A task might have multiple commits - revert them together. - - - - Show user exactly what will be reverted before doing it. - List commits, files, status changes. - - - - If full revert fails, offer partial revert options. - Never leave project in inconsistent state. - - - - - - Ask: What to revert? 
[Track, Phase, Task] - If Track: Ask which track - If Phase: Ask which track, which phase - If Task: Ask which track, which task - - - - Read metadata.json to find related commits - List all commits that will be reverted - List all files that will be affected - List status changes in plan.md - - - - Present impact analysis to user - Ask for explicit confirmation - If declined, abort with no changes - - - - Create revert commits for each original commit - Update plan.md statuses back to [ ] - Update metadata.json to reflect revert - Remove completed tasks from history - - - - Verify git state matches plan.md - Run project quality checks - Report final state to user - - - - - - - **Task Level:** - - Reverts single task's commits - - Updates task status to [ ] - - Preserves other tasks in phase - - **Phase Level:** - - Reverts all tasks in phase - - Updates all task statuses to [ ] - - Preserves other phases - - **Track Level:** - - Reverts entire track - - Optionally deletes track files - - Updates tracks.md index - - - - Find commits for a task using: - 1. metadata.json commit array - 2. Git log searching for "[{track_id}]" pattern - 3. Git notes with task references - - - - **Safe Revert (Default):** - - Create revert commits - - Preserves full history - - Can be undone - - **Hard Reset (Requires explicit request):** - - Reset branch to before commits - - Loses history (unless pushed) - - Cannot be easily undone - - - - - - Undo task 2.3 - - 1. Identify track with task 2.3 - 2. Find commits for task 2.3 from metadata.json - 3. Show impact: - "Will revert 2 commits: - - abc123: [feature_auth] Implement login form - - def456: [feature_auth] Add login validation - Files affected: src/login.tsx, src/auth.ts" - 4. Ask confirmation - 5. Create revert commits - 6. Update plan.md: 2.3 [x] -> [ ] - 7. Update metadata.json - 8. Validate state - - - - - Roll back Phase 2 of the auth feature - - 1. Find all tasks in Phase 2 - 2. Find all commits for those tasks - 3. 
Show impact: - "Will revert 8 commits affecting Phase 2 (5 tasks): - - 2.1 Implement password hashing (2 commits) - - 2.2 Create login endpoint (3 commits) - - 2.3 Create registration endpoint (3 commits) - Files affected: 12 files" - 4. Ask confirmation: "This will undo significant work. Proceed?" - 5. Create revert commits in reverse order - 6. Update all Phase 2 task statuses to [ ] - 7. Update metadata.json - 8. Validate state - - - - - - -## Revert Impact Analysis - -**Scope:** {Task/Phase/Track} {identifier} - -**Commits to Revert:** {N} -{#each commit} -- {short_sha}: {message} -{/each} - -**Files Affected:** {N} -{#each file} -- {filepath} -{/each} - -**Status Changes in plan.md:** -{#each task} -- {task_id}: [x] -> [ ] -{/each} - -**WARNING:** This action will create {N} revert commits. -Git history will be preserved. - -Proceed with revert? [Yes/No] - - - -## Revert Complete - -**Reverted:** {scope} {identifier} -**Commits Created:** {N} revert commits -**Tasks Reset:** {N} tasks now pending - -**Validation:** -- Plan.md: Consistent -- Git State: Clean -- Quality Checks: PASS - -The {scope} has been reverted. You can re-implement or abandon this work. - - + +# Conductor Revert + +Enables safe, logical rollback of development work at meaningful granularity (track/phase/task) while maintaining git history integrity and project consistency. + +## Workflow + +1. **Select revert scope** + - Ask user: What to revert? (Track / Phase / Task) + - For Track: ask which track + - For Phase: ask which track and which phase + - For Task: ask which track and which task + +2. **Analyze impact** + - Read `metadata.json` to find related commits + - Search git log for `[{track_id}]` pattern + - List all commits, affected files, and plan.md status changes + +3. **Present impact and confirm** + - Show exact commits to revert, files affected, and status changes + - Ask for explicit user confirmation before proceeding + - If declined, abort with no changes + +4. 
**Execute revert**
+   - Create revert commits for each original commit (preserves history)
+   - Update plan.md task statuses back to `[ ]`
+   - Update `metadata.json` to reflect revert
+
+5. **Validate consistency**
+   - Verify plan.md matches git state
+   - Verify `metadata.json` is consistent
+   - Run project quality checks
+   - Report final state to user
+
+## Revert Levels
+
+| Level | Scope | Effect |
+|-------|-------|--------|
+| Task | Single task's commits | Reverts task commits, sets status to `[ ]`, preserves other tasks |
+| Phase | All tasks in a phase | Reverts all phase commits, resets all task statuses |
+| Track | Entire track | Reverts all commits, optionally deletes track files, updates `tracks.md` |
+
+## Constraints
+
+- ALWAYS require explicit user confirmation before reverting any commits
+- Default to creating revert commits (non-destructive); only hard-reset if user explicitly requests
+- Never leave project in inconsistent state - if full revert fails, offer partial options
+- Show exact changes BEFORE executing them
+
+## Examples
+
+### Revert a single task
+```
+User: Undo task 2.3
+
+Agent:
+1. Finds commits for task 2.3 from metadata.json
+2. Shows impact:
+   "Will revert 2 commits:
+   - abc123: [feature_auth] Implement login form
+   - def456: [feature_auth] Add login validation
+   Files affected: src/login.tsx, src/auth.ts"
+3. Asks confirmation → user confirms
+4. Creates revert commits
+5. Updates plan.md: 2.3 [x] → [ ]
+6. Validates state consistency
+```
+
+### Revert an entire phase
+```
+User: Roll back Phase 2 of the auth feature
+
+Agent:
+1. Finds all tasks in Phase 2 (5 tasks, 8 commits)
+2. Shows impact with file count and task list
+3. Asks: "This will undo significant work. Proceed?"
+4. Creates revert commits in reverse order
+5. Resets all Phase 2 task statuses to [ ]
+6.
Validates state consistency +``` + +## Verification + +After every revert, confirm: +- [ ] Plan.md statuses match git state +- [ ] metadata.json is consistent +- [ ] Project quality checks pass +- [ ] No uncommitted changes remain diff --git a/plugins/conductor/skills/setup/SKILL.md b/plugins/conductor/skills/setup/SKILL.md index 356028d..1bcba06 100644 --- a/plugins/conductor/skills/setup/SKILL.md +++ b/plugins/conductor/skills/setup/SKILL.md @@ -1,240 +1,106 @@ --- name: setup -description: Initialize Conductor with product.md, tech-stack.md, and workflow.md -version: 1.1.0 -tags: [conductor, setup, initialization, context, project] -keywords: [conductor setup, initialize, project context, greenfield, brownfield] +description: "Initializes Conductor by creating product.md, tech-stack.md, and workflow.md through interactive Q&A. Detects greenfield vs brownfield projects and supports resume from interruption. Use when starting a new project or onboarding an existing codebase to Conductor." --- -plugin: conductor -updated: 2026-01-20 - - - Project Context Architect - - - Project initialization and context gathering - - Interactive Q&A for requirements elicitation - - State management and resume capability - - Greenfield vs Brownfield project handling - - - Guide users through structured project initialization, creating - comprehensive context artifacts that serve as the foundation for - all future development work. - - - - - - - You MUST use Tasks to track setup progress: - 1. Check for existing conductor/ directory - 2. Determine project type (Greenfield/Brownfield) - 3. Create product.md through Q&A - 4. Create product-guidelines.md - 5. Create tech-stack.md through Q&A - 6. Create code styleguides - 7. Copy workflow.md template - 8. Finalize setup - - - - Check for conductor/setup_state.json FIRST. - If exists with status != "complete": - 1. Load saved answers - 2. Resume from last incomplete section - 3. 
Show user what was already collected - - - - - Ask questions SEQUENTIALLY (one at a time) - - Maximum 5 questions per section - - Always include "Type your own answer" option - - Use AskUserQuestion with appropriate question types - - Save state after EACH answer (for resume) - - - - Before any operation: - 1. Check if conductor/ already exists - 2. If complete setup exists, ask: "Re-initialize or abort?" - 3. Respect .gitignore patterns - - - - - - Never ask multiple questions at once. - Wait for answer before asking next question. - - - - Save progress after each answer. - Enable resume from any interruption point. - - - - Gather enough context to be useful. - Don't overwhelm with excessive questions. - - - - - - Check if conductor/ directory exists - If exists, check setup_state.json for resume - If complete setup exists, confirm re-initialization - Initialize Tasks with setup phases - - - - Check for existing code files (src/, package.json, etc.) - Ask user: Greenfield (new) or Brownfield (existing)? - For Brownfield: Scan existing code for context - - - - Ask: What is this project about? (1-2 sentences) - Ask: Who is the target audience? - Ask: What are the 3 main goals? - Ask: Any constraints or requirements? - Generate product.md from answers - - - - Ask: Primary programming language(s)? - Ask: Key frameworks/libraries? - Ask: Database/storage preferences? - Ask: Deployment target? - Generate tech-stack.md from answers - - - - Ask: Any specific coding conventions? - Ask: Testing requirements? 
- Generate product-guidelines.md - Generate code_styleguides/general.md (always) - Generate language-specific styleguides based on tech stack: - - TypeScript/JavaScript → typescript.md, javascript.md - - Web projects → html-css.md - - Python → python.md - - Go → go.md - - - - - Copy workflow.md template - Create empty tracks.md - Mark setup_state.json as complete - Present summary to user - - - - - - - **Greenfield (New Project):** - - No existing code to analyze - - More questions needed about vision - - Focus on future architecture - - **Brownfield (Existing Project):** - - Scan existing files for context - - Infer tech stack from package.json, requirements.txt, etc. - - Focus on documenting current state - - - - **Additive (Multi-Select):** - - "Which frameworks are you using?" [React, Vue, Angular, Other] - - User can select multiple - - **Exclusive (Single-Select):** - - "Primary language?" [TypeScript, Python, Go, Other] - - User picks one - - **Open-Ended:** - - "Describe your project in 1-2 sentences" - - Free text response - - - + +# Conductor Setup + +Guides users through structured project initialization, creating context artifacts that serve as the foundation for all future development work. + +## Workflow + +1. **Validate existing state** + - Check if `conductor/` directory exists + - If `conductor/setup_state.json` exists with `status != "complete"`, load saved answers and resume from last incomplete section + - If complete setup exists, ask user: "Re-initialize or abort?" + +2. **Detect project type** + - Scan for existing code files (`src/`, `package.json`, `requirements.txt`) + - Ask user: Greenfield (new) or Brownfield (existing)? + - For Brownfield: scan existing code to infer context + +3. **Gather product context** (one question at a time, max 5 per section) + - What is this project about? (1-2 sentences) + - Who is the target audience? + - What are the 3 main goals? + - Any constraints or requirements? 
+ - Generate `conductor/product.md` from answers + +4. **Gather technical context** + - Primary programming language(s)? + - Key frameworks/libraries? + - Database/storage preferences? + - Deployment target? + - Generate `conductor/tech-stack.md` from answers + +5. **Generate guidelines** + - Ask about coding conventions and testing requirements + - Generate `conductor/product-guidelines.md` + - Generate `conductor/code_styleguides/general.md` + - Generate language-specific styleguides (TypeScript, Python, Go, etc.) + +6. **Finalize setup** + - Copy `workflow.md` template + - Create empty `tracks.md` + - Mark `setup_state.json` as complete + - Present summary to user + +## Constraints + +- Ask questions SEQUENTIALLY (one at a time, never multiple) +- Save state to `setup_state.json` after EACH answer for resume capability +- Always include "Type your own answer" option in question prompts +- Respect `.gitignore` patterns when creating files + +## State File Schema + ```json { - "status": "in_progress" | "complete", + "status": "in_progress", "startedAt": "ISO-8601", "lastUpdated": "ISO-8601", - "projectType": "greenfield" | "brownfield", - "currentSection": "product" | "tech" | "guidelines", + "projectType": "greenfield", + "currentSection": "product", "answers": { - "product": { - "description": "...", - "audience": "...", - "goals": ["...", "...", "..."] - }, - "tech": { - "languages": ["TypeScript"], - "frameworks": ["React", "Node.js"] - } + "product": { "description": "...", "audience": "...", "goals": ["..."] }, + "tech": { "languages": ["TypeScript"], "frameworks": ["React"] } } } ``` - - - - - - I want to set up Conductor for my new project - - 1. Check for existing conductor/ - not found - 2. Ask: "Is this a new project (Greenfield) or existing codebase (Brownfield)?" - 3. User: "New project" - 4. Begin product context questions (one at a time) - 5. Save each answer to setup_state.json - 6. After all sections, generate artifacts - 7. 
Present summary with next steps - - - - - Continue setting up Conductor - - 1. Check conductor/setup_state.json - found, status: "in_progress" - 2. Load previous answers from state - 3. Show: "Resuming setup. You've completed: Product Context" - 4. Continue from Technical Context section - 5. Complete remaining sections - - - - - - - - Friendly, guiding tone - - Clear progress indicators - - Explain why each question matters - - Confirm understanding before proceeding - - - -## Conductor Setup Complete - -**Project:** {project_name} -**Type:** {Greenfield/Brownfield} - -**Created Artifacts:** -- conductor/product.md - Project vision and goals -- conductor/product-guidelines.md - Standards and conventions -- conductor/tech-stack.md - Technical preferences -- conductor/workflow.md - Development workflow (comprehensive) -- conductor/tracks.md - Track index (empty) -- conductor/code_styleguides/general.md - General coding principles -- conductor/code_styleguides/{language}.md - Language-specific guides - -**Next Steps:** -1. Review generated artifacts and adjust as needed -2. Use `conductor:new-track` to plan your first feature -3. Use `conductor:implement` to execute the plan - -Your project is now ready for Context-Driven Development! - - + +## Examples + +### New project setup +``` +User: I want to set up Conductor for my new project + +Agent: +1. Checks for existing conductor/ → not found +2. Asks: "Is this a new project or existing codebase?" +3. User: "New project" +4. Begins product context questions (one at a time) +5. Saves each answer to setup_state.json +6. Generates all artifacts +7. Presents summary with next steps +``` + +### Resume interrupted setup +``` +User: Continue setting up Conductor + +Agent: +1. Finds conductor/setup_state.json with status: "in_progress" +2. Loads previous answers +3. Shows: "Resuming setup. You've completed: Product Context" +4. 
Continues from Technical Context section +``` + +## Verification + +After setup completes, confirm these files exist: +- `conductor/product.md` +- `conductor/product-guidelines.md` +- `conductor/tech-stack.md` +- `conductor/workflow.md` +- `conductor/tracks.md` +- `conductor/code_styleguides/general.md` diff --git a/plugins/conductor/skills/status/SKILL.md b/plugins/conductor/skills/status/SKILL.md index d2a728b..1cfa530 100644 --- a/plugins/conductor/skills/status/SKILL.md +++ b/plugins/conductor/skills/status/SKILL.md @@ -1,184 +1,87 @@ --- name: status -description: Show active tracks, progress, current tasks, and blockers -version: 1.0.0 -tags: [conductor, status, progress, overview] -keywords: [status, progress, tracks, overview, blockers, current] +description: "Reads conductor tracks, calculates completion percentages, identifies blockers, and recommends next actions. Parses plan.md and metadata.json across all active tracks. Use when checking project progress or asking what to work on next." --- -plugin: conductor -updated: 2026-01-20 - - - Progress Reporter & Status Analyzer - - - Plan.md parsing and analysis - - Progress calculation and visualization - - Blocker identification - - Multi-track overview - - - Provide clear, actionable status reports that help users understand - their project progress, identify next actions, and spot blockers. - - - - - - - This is a read-only skill that only displays status. - Tasks are NOT required because there are no implementation phases. - The skill performs a single atomic operation: read and present status. - - - - This skill ONLY reads files. - It does NOT modify any conductor/ files. - For modifications, use other skills. - - - - Parse ALL of: - - conductor/tracks.md (index) - - conductor/tracks/*/plan.md (all plans) - - conductor/tracks/*/metadata.json (state) - - - - - - Always end with clear "Next Action" recommendation. - Don't just report status, guide next step. - - - - Prominently display any blocked tasks. 
- Blockers need attention. - - - - - - Check conductor/ exists - Read conductor/tracks.md for track list - For each track, read plan.md and metadata.json - - - - Count tasks by status: pending, in_progress, complete, blocked - Calculate completion percentage per track - Identify current phase and task - Find any blocked tasks - - - - Display overview summary - Show per-track details - Highlight blockers if any - Recommend next action - - - - - - + +# Conductor Status + +Provides clear, actionable status reports showing project progress, blockers, and recommended next actions. This is a read-only skill that never modifies conductor files. + +## Workflow + +1. **Collect data** + - Verify `conductor/` directory exists + - Read `conductor/tracks.md` for track list + - For each track, read `plan.md` and `metadata.json` + +2. **Analyze progress** + - Count tasks by status: `[ ]` pending, `[~]` in progress, `[x]` complete, `[!]` blocked + - Calculate completion percentage per track: `(completed / total) * 100` + - Identify current phase and active task + - Find any blocked tasks + +3. **Present status report** + - Display overview summary (active tracks, total progress, blocker count) + - Show per-track details with progress bars + - Highlight blockers prominently at the top + - Recommend specific next action + +## Display Priority + +1. BLOCKED tasks (need immediate attention) +2. IN_PROGRESS tasks (current work) +3. Active tracks summary +4. Completed tracks (brief mention) + +## Constraints + +- Read-only: never modify any conductor files +- Always end with a clear "Next Action" recommendation +- Parse ALL tracks, not just the first one found + +## Examples + +### Single active track +``` +User: What's my status? 
+ +Agent displays: +## Conductor Status + +### Active Tracks: 1 + +**feature_auth_20260105** - User Authentication +- Progress: [========------] 53% (8/15 tasks) +- Phase: 2/4 - Core Authentication +- Current: 2.3 Create login endpoint +- Blockers: None + +### Next Action +Continue task 2.3 (Create login endpoint). Run: `conductor:implement` ``` -Completion % = (completed_tasks / total_tasks) * 100 - -Example: -- Total tasks: 15 -- Completed: 6 -- In Progress: 1 -- Pending: 8 -- Completion: 40% + +### Multiple tracks with blocker ``` - - - - Display order by priority: - 1. BLOCKED tasks (need attention) - 2. IN_PROGRESS tasks (current work) - 3. Active tracks summary - 4. Completed tracks (brief) - - - - - - What's my status? - - ## Conductor Status - - ### Active Tracks: 1 - - **feature_auth_20260105** - User Authentication - - Progress: [========------] 53% (8/15 tasks) - - Phase: 2/4 - Core Authentication - - Current: 2.3 Create login endpoint - - Blockers: None - - ### Next Action - Continue task 2.3 (Create login endpoint) in feature_auth track. - Run: `conductor:implement` - - - - - Show me the status - - ## Conductor Status - - ### BLOCKERS (1) - - **feature_auth_20260105** Task 2.4: "Waiting for API key from team lead" - - ### Active Tracks: 2 - - **feature_auth_20260105** - User Authentication - - Progress: [=======-------] 47% (7/15 tasks) - - Phase: 2/4 - Core Authentication - - Status: BLOCKED at 2.4 - - **bugfix_login_20260104** - Login redirect fix - - Progress: [==============] 100% (5/5 tasks) - - Status: Ready for review - - ### Next Action - Resolve blocker in feature_auth (Task 2.4) or complete review of bugfix_login. 
- - - - - - +User: Show me the status + +Agent displays: ## Conductor Status -### Overview -- Active Tracks: {N} -- Total Progress: {X}% ({completed}/{total} tasks) -- Blockers: {N} - -{#if blockers} -### BLOCKERS -{#each blocker} -- **{track_id}** Task {task_id}: "{blocker_description}" -{/each} -{/if} - -### Active Tracks -{#each active_track} -**{track_id}** - {title} -- Progress: [{progress_bar}] {percent}% ({completed}/{total}) -- Phase: {current_phase}/{total_phases} - {phase_name} -- Current: {current_task_id} {current_task_title} -{/each} - -{#if completed_tracks} -### Completed Tracks -{#each completed_track} -- {track_id} - Completed {date} -{/each} -{/if} +### BLOCKERS (1) +- **feature_auth** Task 2.4: "Waiting for API key from team lead" + +### Active Tracks: 2 + +**feature_auth_20260105** - Progress: 47% (7/15) - BLOCKED at 2.4 +**bugfix_login_20260104** - Progress: 100% (5/5) - Ready for review ### Next Action -{recommendation} - - +Resolve blocker in feature_auth (Task 2.4) or review bugfix_login. +``` + +## Verification + +Confirm the status report includes: +- [ ] Completion percentages for all tracks +- [ ] Any blocked tasks highlighted +- [ ] Clear next action recommendation diff --git a/plugins/dev/skills/backend/auth-patterns/SKILL.md b/plugins/dev/skills/backend/auth-patterns/SKILL.md index bbf8a95..5e81759 100644 --- a/plugins/dev/skills/backend/auth-patterns/SKILL.md +++ b/plugins/dev/skills/backend/auth-patterns/SKILL.md @@ -1,21 +1,6 @@ --- name: auth-patterns -version: 1.0.0 -description: Use when implementing authentication (JWT, sessions, OAuth), authorization (RBAC, ABAC), password hashing, MFA, or security best practices for backend services. 
-keywords: - - authentication - - authorization - - JWT - - sessions - - OAuth - - RBAC - - ABAC - - password hashing - - bcrypt - - MFA - - security -plugin: dev -updated: 2026-01-20 +description: "Implements JWT token generation with refresh rotation, session-based auth with Redis, OAuth 2.0 flows, RBAC/ABAC authorization, bcrypt password hashing, and TOTP multi-factor authentication. Use when adding authentication, authorization, password security, MFA, or rate limiting to backend services." --- # Authentication Patterns diff --git a/plugins/dev/skills/backend/bunjs-architecture/SKILL.md b/plugins/dev/skills/backend/bunjs-architecture/SKILL.md index 05a3dd4..0aaa2ae 100644 --- a/plugins/dev/skills/backend/bunjs-architecture/SKILL.md +++ b/plugins/dev/skills/backend/bunjs-architecture/SKILL.md @@ -1,18 +1,6 @@ --- name: bunjs-architecture -version: 1.0.0 -description: Use when implementing clean architecture (routes/controllers/services/repositories), establishing camelCase conventions, designing Prisma schemas, or planning structured workflows for Bun.js applications. See bunjs for basics, bunjs-production for deployment. -keywords: - - clean architecture - - layered architecture - - camelCase - - naming conventions - - Prisma schema - - repository pattern - - separation of concerns - - code organization -plugin: dev -updated: 2026-01-20 +description: "Implements clean layered architecture (routes/controllers/services/repositories) for Bun.js TypeScript backends, enforces camelCase conventions end-to-end, and provides Prisma schema templates with Zod validation. Use when structuring Bun.js applications, designing database schemas, or establishing coding conventions." 
--- # Bun.js Clean Architecture Patterns diff --git a/plugins/dev/skills/backend/database-patterns/SKILL.md b/plugins/dev/skills/backend/database-patterns/SKILL.md index 0198804..c24feab 100644 --- a/plugins/dev/skills/backend/database-patterns/SKILL.md +++ b/plugins/dev/skills/backend/database-patterns/SKILL.md @@ -1,20 +1,6 @@ --- name: database-patterns -version: 1.0.0 -description: Use when designing database schemas, implementing repository patterns, writing optimized queries, managing migrations, or working with indexes and transactions for SQL/NoSQL databases. -keywords: - - database design - - schema design - - repository pattern - - SQL queries - - PostgreSQL - - MySQL - - MongoDB - - indexes - - migrations - - transactions -plugin: dev -updated: 2026-01-20 +description: "Designs normalized schemas with indexing strategies, implements repository patterns with typed queries, manages migrations safely, and handles transactions with proper isolation levels for SQL and NoSQL databases. Use when designing schemas, optimizing queries, implementing pagination, or managing database migrations." --- # Database Patterns diff --git a/plugins/dev/skills/backend/error-handling/SKILL.md b/plugins/dev/skills/backend/error-handling/SKILL.md index 989900b..d2d6952 100644 --- a/plugins/dev/skills/backend/error-handling/SKILL.md +++ b/plugins/dev/skills/backend/error-handling/SKILL.md @@ -1,18 +1,6 @@ --- name: error-handling -version: 1.0.0 -description: Use when implementing custom error classes, error middleware, structured logging, retry logic, or graceful shutdown patterns in backend applications. -keywords: - - error handling - - custom errors - - error middleware - - logging - - retry logic - - graceful shutdown - - error responses - - debugging -plugin: dev -updated: 2026-01-20 +description: "Implements custom error class hierarchies, Express error middleware with structured JSON responses, retry logic with exponential backoff, and graceful shutdown handlers. 
Use when building error handling for backend services, adding structured logging, or implementing retry patterns." --- # Error Handling Patterns diff --git a/plugins/dev/skills/backend/python/SKILL.md b/plugins/dev/skills/backend/python/SKILL.md index c49796e..9476978 100644 --- a/plugins/dev/skills/backend/python/SKILL.md +++ b/plugins/dev/skills/backend/python/SKILL.md @@ -1,18 +1,6 @@ --- name: python -version: 1.0.0 -description: Use when building FastAPI applications, implementing async endpoints, setting up Pydantic schemas, working with SQLAlchemy, or writing pytest tests for Python backend services. -keywords: - - Python - - FastAPI - - Pydantic - - SQLAlchemy - - async - - pytest - - backend - - API -plugin: dev -updated: 2026-01-20 +description: "Provides FastAPI application templates with async SQLAlchemy repositories, Pydantic schema validation, dependency injection patterns, and pytest integration test fixtures. Use when building Python backends, implementing async API endpoints, designing Pydantic models, or writing pytest tests." --- # Python Backend Patterns diff --git a/plugins/dev/skills/backend/rust/SKILL.md b/plugins/dev/skills/backend/rust/SKILL.md index 9f2e02b..98b44bc 100644 --- a/plugins/dev/skills/backend/rust/SKILL.md +++ b/plugins/dev/skills/backend/rust/SKILL.md @@ -1,18 +1,6 @@ --- name: rust -version: 1.0.0 -description: Use when building Axum applications, implementing type-safe handlers, working with SQLx, setting up error handling with thiserror, or writing Rust backend services. -keywords: - - Rust - - Axum - - SQLx - - tokio - - async - - type safety - - backend - - thiserror -plugin: dev -updated: 2026-01-20 +description: "Provides Axum application templates with SQLx type-safe queries, thiserror-based error handling that implements IntoResponse, repository patterns with async Tokio runtime, and integration test setup. Use when building Rust backends, implementing Axum handlers, designing SQLx models, or writing Rust API tests." 
--- # Rust Backend Patterns diff --git a/plugins/dev/skills/context-detection/SKILL.md b/plugins/dev/skills/context-detection/SKILL.md index 564ae52..b94ba8e 100644 --- a/plugins/dev/skills/context-detection/SKILL.md +++ b/plugins/dev/skills/context-detection/SKILL.md @@ -1,19 +1,6 @@ --- name: context-detection -version: 1.1.0 -description: Use when detecting project technology stack from files/configs/directory structure, auto-loading framework-specific skills, or analyzing multi-stack fullstack projects (e.g., React + Go). -keywords: - - context detection - - stack detection - - technology stack - - project analysis - - auto-detection - - framework detection - - skill discovery -plugin: dev -updated: 2026-02-03 -used_by: stack-detector agent, all dev commands -allowed-tools: Bash(node *) +description: "Detects project technology stacks from config files, directory structure, and file extensions, then auto-loads matching framework-specific skills. Use when analyzing project structure, detecting multi-stack fullstack setups (e.g., React + Go), or mapping stacks to quality checks." --- # Context Detection Skill diff --git a/plugins/dev/skills/core/debugging-strategies/SKILL.md b/plugins/dev/skills/core/debugging-strategies/SKILL.md index 48ce920..1279853 100644 --- a/plugins/dev/skills/core/debugging-strategies/SKILL.md +++ b/plugins/dev/skills/core/debugging-strategies/SKILL.md @@ -1,18 +1,6 @@ --- name: debugging-strategies -version: 1.0.0 -description: Use when troubleshooting bugs, analyzing stack traces, using debugging tools (breakpoints, loggers), or applying systematic debugging methodology across any technology stack. 
-keywords: - - debugging - - troubleshooting - - stack traces - - breakpoints - - logging - - error analysis - - bug fixing - - debugging tools -plugin: dev -updated: 2026-01-20 +description: "Applies systematic debugging methodology (scientific method, binary search, wolf fence) with structured logging strategies, breakpoint placement patterns, and common bug pattern detection. Use when troubleshooting bugs, analyzing stack traces, tracing data flow, or diagnosing race conditions across any stack." --- # Universal Debugging Strategies diff --git a/plugins/dev/skills/core/testing-strategies/SKILL.md b/plugins/dev/skills/core/testing-strategies/SKILL.md index bc525a3..1b26113 100644 --- a/plugins/dev/skills/core/testing-strategies/SKILL.md +++ b/plugins/dev/skills/core/testing-strategies/SKILL.md @@ -1,19 +1,6 @@ --- name: testing-strategies -version: 1.0.0 -description: Use when writing tests, setting up test frameworks, implementing mocking strategies, or establishing testing best practices (unit, integration, E2E) across any technology stack. -keywords: - - testing - - unit tests - - integration tests - - E2E testing - - mocking - - test coverage - - test-driven development - - TDD - - test pyramid -plugin: dev -updated: 2026-01-20 +description: "Implements the test pyramid (unit/integration/E2E) with AAA pattern, provides mocking strategies (stubs, spies, fakes), and establishes coverage targets by code criticality. Use when writing tests, setting up test frameworks, implementing mocks, or defining testing best practices across any technology stack." 
--- # Universal Testing Strategies diff --git a/plugins/dev/skills/core/universal-patterns/SKILL.md b/plugins/dev/skills/core/universal-patterns/SKILL.md index 9292c46..8d269c7 100644 --- a/plugins/dev/skills/core/universal-patterns/SKILL.md +++ b/plugins/dev/skills/core/universal-patterns/SKILL.md @@ -1,17 +1,6 @@ --- name: universal-patterns -version: 1.0.0 -description: Use when implementing language-agnostic patterns like layered architecture, dependency injection, error handling, or code organization principles across any technology stack. -keywords: - - architecture - - design patterns - - clean code - - SOLID principles - - error handling - - best practices - - code organization -plugin: dev -updated: 2026-01-20 +description: "Applies language-agnostic architecture patterns including layered architecture, dependency injection, SOLID principles, and error handling strategies. Use when organizing code structure, implementing design patterns, or establishing best practices across any technology stack." --- # Universal Development Patterns diff --git a/plugins/dev/skills/design/design-references/SKILL.md b/plugins/dev/skills/design/design-references/SKILL.md index 11edce4..65d4843 100644 --- a/plugins/dev/skills/design/design-references/SKILL.md +++ b/plugins/dev/skills/design/design-references/SKILL.md @@ -1,10 +1,6 @@ --- name: design-references -version: 1.0.0 -description: | - Predefined design system references for UI reviews. Includes Material Design 3, - Apple Human Interface Guidelines, Tailwind UI, Ant Design, and Shadcn/ui. - Use when conducting design reviews against established design systems. +description: "Provides predefined design system references (Material Design 3, Apple HIG, Tailwind UI, Ant Design, Shadcn/ui) with color palettes, typography scales, spacing tokens, and review checklists. Use when conducting design reviews against established design systems or validating component consistency." 
--- # Design References Skill diff --git a/plugins/dev/skills/design/ui-analyse/SKILL.md b/plugins/dev/skills/design/ui-analyse/SKILL.md index 548d2b3..83fd8c5 100644 --- a/plugins/dev/skills/design/ui-analyse/SKILL.md +++ b/plugins/dev/skills/design/ui-analyse/SKILL.md @@ -1,10 +1,6 @@ --- name: ui-analyse -version: 2.0.0 -description: | - UI visual analysis patterns using Gemini 3 Pro Preview multimodal capabilities. - Analysis-only - no code changes. Use dev:ui-implement for applying improvements. - Includes provider detection, prompting patterns, and severity guidelines. +description: "Performs visual UI analysis using Gemini multimodal capabilities, detects usability issues with severity scoring, and audits WCAG accessibility compliance. Use when reviewing screenshots, conducting accessibility audits, or comparing implementation against design references." --- # UI Analysis Skill diff --git a/plugins/dev/skills/design/ui-design-review/SKILL.md b/plugins/dev/skills/design/ui-design-review/SKILL.md index 3e15e2b..1417bc5 100644 --- a/plugins/dev/skills/design/ui-design-review/SKILL.md +++ b/plugins/dev/skills/design/ui-design-review/SKILL.md @@ -1,9 +1,6 @@ --- name: ui-design-review -version: 1.0.0 -description: | - Prompting patterns and review templates for UI design analysis with Gemini multimodal capabilities. - Use when conducting design reviews, accessibility audits, or design system validation. +description: "Provides structured prompting patterns for Gemini-powered UI analysis including usability reviews, WCAG accessibility audits, design system consistency checks, and comparative design reviews with severity-ranked output. Use when reviewing screenshots, auditing accessibility, or validating implementation against design specs." 
--- # UI Design Review Skill diff --git a/plugins/dev/skills/design/ui-implement/SKILL.md b/plugins/dev/skills/design/ui-implement/SKILL.md index ad9caa9..b2f9bbf 100644 --- a/plugins/dev/skills/design/ui-implement/SKILL.md +++ b/plugins/dev/skills/design/ui-implement/SKILL.md @@ -1,10 +1,6 @@ --- name: ui-implement -version: 1.0.0 -description: | - Patterns for implementing UI improvements based on design analysis. - Works with review documents from dev:ui-analyse or /dev:ui command. - Includes Anti-AI design rules and visual verification. +description: "Transforms design review findings into code changes using 5 Anti-AI design rules (break symmetry, add texture, dramatic typography, micro-interactions, bespoke colors) with optional Gemini visual verification. Use when applying UI improvements from analysis, implementing design system changes, or converting generic layouts into distinctive designs." --- # UI Implementation Skill diff --git a/plugins/dev/skills/design/ui-style-format/SKILL.md b/plugins/dev/skills/design/ui-style-format/SKILL.md index 04b06e8..11ec341 100644 --- a/plugins/dev/skills/design/ui-style-format/SKILL.md +++ b/plugins/dev/skills/design/ui-style-format/SKILL.md @@ -1,10 +1,6 @@ --- name: ui-style-format -version: 1.0.0 -description: | - UI design style file format specification with reference image support. - Defines the schema for .claude/design-style.md and .claude/design-references/. - Use when creating, validating, or parsing project design styles. +description: "Defines the schema for .claude/design-style.md and .claude/design-references/ including brand colors, typography, spacing tokens, and reference image management with validation checklists. Use when creating project design style files, validating design tokens, or configuring style-aware UI reviews." 
--- # UI Style Format Specification diff --git a/plugins/dev/skills/documentation-standards/SKILL.md b/plugins/dev/skills/documentation-standards/SKILL.md index ba71aa4..246ebff 100644 --- a/plugins/dev/skills/documentation-standards/SKILL.md +++ b/plugins/dev/skills/documentation-standards/SKILL.md @@ -1,18 +1,6 @@ --- name: documentation-standards -version: 1.0.0 -description: Use when writing README files, API documentation, user guides, or technical documentation following industry standards from Google, Microsoft, and GitLab style guides. -keywords: - - documentation - - README - - technical writing - - API docs - - style guides - - Markdown - - documentation best practices -plugin: dev -updated: 2026-01-20 -research_source: 73+ authoritative sources with 98% factual integrity +description: "Applies 15 ranked documentation best practices from Google, Microsoft, and GitLab style guides, provides 7 ready-to-use templates (README, TSDoc, ADR, changelog), and detects 42 anti-patterns. Use when writing READMEs, API docs, troubleshooting guides, or validating documentation quality." --- # Documentation Standards diff --git a/plugins/dev/skills/frontend/css-modules/SKILL.md b/plugins/dev/skills/frontend/css-modules/SKILL.md index 7a53195..81d6820 100644 --- a/plugins/dev/skills/frontend/css-modules/SKILL.md +++ b/plugins/dev/skills/frontend/css-modules/SKILL.md @@ -1,9 +1,6 @@ --- name: css-modules -description: | - CSS Modules with Lightning CSS and PostCSS for component-scoped styling. - Covers *.module.css patterns, TypeScript integration, Vite configuration, and composition. - Use when building complex animations, styling third-party components, or migrating legacy CSS. +description: "Configures CSS Modules with Lightning CSS and Vite for component-scoped styling, provides TypeScript type generation options, and implements composition patterns with hybrid Tailwind approaches. 
Use when building complex animations, styling third-party components, configuring CSS Module TypeScript support, or migrating legacy CSS." --- # CSS Modules diff --git a/plugins/dev/skills/frontend/testing-frontend/SKILL.md b/plugins/dev/skills/frontend/testing-frontend/SKILL.md index eb88310..efa58b5 100644 --- a/plugins/dev/skills/frontend/testing-frontend/SKILL.md +++ b/plugins/dev/skills/frontend/testing-frontend/SKILL.md @@ -1,18 +1,6 @@ --- name: testing-frontend -version: 1.0.0 -description: Use when writing component tests, testing user interactions, mocking APIs, or setting up Vitest/React Testing Library/Vue Test Utils for frontend applications. -keywords: - - frontend testing - - Vitest - - React Testing Library - - Vue Test Utils - - component testing - - user event testing - - mocking - - accessibility testing -plugin: dev -updated: 2026-01-20 +description: "Implements user-centric component tests with React Testing Library and Vue Test Utils, provides API mocking patterns with Vitest, and includes accessibility testing with jest-axe. Use when writing component tests, testing async data loading, mocking APIs, or validating form interactions in frontend applications." --- # Frontend Testing Patterns diff --git a/plugins/dev/skills/optimize/SKILL.md b/plugins/dev/skills/optimize/SKILL.md index 1ca715e..54afeb8 100644 --- a/plugins/dev/skills/optimize/SKILL.md +++ b/plugins/dev/skills/optimize/SKILL.md @@ -1,11 +1,6 @@ --- name: optimize -description: On-demand performance and optimization analysis. Use when identifying bottlenecks, improving build times, reducing bundle size, or optimizing code performance. Trigger keywords - "optimize", "performance", "bottleneck", "bundle size", "build time", "speed up". 
-version: 0.1.0 -tags: [dev, optimize, performance, bottleneck, bundle] -keywords: [optimize, performance, bottleneck, bundle-size, build-time, speed, profiling] -plugin: dev -updated: 2026-01-28 +description: "Profiles build times, analyzes bundle sizes, identifies runtime bottlenecks, and generates prioritized optimization reports with before/after metrics. Use when improving performance, reducing bundle size, optimizing database queries, or diagnosing slow API endpoints." --- # Optimize Skill diff --git a/plugins/dev/skills/planning/brainstorming/SKILL.md b/plugins/dev/skills/planning/brainstorming/SKILL.md index bd2cd09..7fc8d6c 100644 --- a/plugins/dev/skills/planning/brainstorming/SKILL.md +++ b/plugins/dev/skills/planning/brainstorming/SKILL.md @@ -1,122 +1,11 @@ --- name: brainstorming -version: 2.0.0 -description: "Collaborative ideation and planning with resilient multi-model exploration, consensus scoring, and adaptive confidence-based validation" -author: "MAG Claude Plugins" -tags: - - planning - - ideation - - collaboration - - multi-model - - resilient -dependencies: - skills: - - superpowers:using-git-worktrees - - superpowers:writing-plans - tools: - - Task - - TaskCreate - - TaskUpdate - - TaskList - - TaskGet - - Read - - Write - - Edit - - Glob - models: - primary: - - anthropic/claude-opus-4-20250514 - - anthropic/claude-sonnet-4-20250514 - - anthropic/claude-haiku-3-20250514 - explorers: - fallback_chain: - - x-ai/grok-code-fast-1 - - google/gemini-2-5-pro - - deepseek/deepseek-coder - - anthropic/claude-sonnet-4-20250514 - - anthropic/claude-haiku-3-20250514 -parameters: - exploration_models: 3 - chunk_size: 250 - confidence_threshold_auto: 95 - confidence_threshold_confirm: 60 - retry_attempts: 2 - timeout_per_model_ms: 120000 -gates: - - phase: 0 - type: USER_GATE - trigger: "Problem understanding validated" - - phase: 1 - type: AUTO_GATE - trigger: "Parallel exploration consolidated" - - phase: 2 - type: AUTO_GATE - trigger: "Consensus 
scores calculated" - - phase: 3 - type: USER_GATE - trigger: "User selects approach" - - phase: 4 - type: MIXED_GATE - trigger: "Section-by-section validation" - - phase: 5 - type: USER_GATE - trigger: "Final plan approval" +description: "Generates diverse solutions via parallel multi-model exploration, scores consensus across approaches, and produces validated implementation plans with confidence-based gating. Use when planning features, brainstorming architecture, or evaluating multiple design approaches before implementation." --- -# Brainstorming v2.0: Resilient Multi-Model Planning +# Brainstorming: Resilient Multi-Model Planning -Turn ideas into validated designs through collaborative AI dialogue with resilient model execution and confidence-based validation. - -## Overview - -This skill improves upon v1.0 by addressing critical reliability gaps: - -**Key v2.0 Improvements:** -- **No AskUserQuestion dependency**: Uses Task + Tasks for structured interaction -- **Fallback chains**: 3+ models per role ensures completion even if some fail -- **Explicit parallelism**: Documented Task call patterns for parallel execution -- **Defined algorithms**: Consensus matrix and confidence scoring are mathematically specified - -## When to Use - -Use this skill BEFORE implementing any feature: -- "Design a user authentication system" -- "Brainstorm approaches for API rate limiting" -- "Plan architecture for a new dashboard feature" -- "Evaluate options for real-time data synchronization" - -## Prerequisites - -### Required Setup - -```bash -# 1. Install required skills -/plugin marketplace add MadAppGang/claude-code -skill install superpowers:using-git-worktrees -skill install superpowers:writing-plans - -# 2. Verify OpenRouter access (for multi-model) -export OPENROUTER_API_KEY=your-key - -# 3. 
Configure models in ~/.claude/settings.json -{ - "brainstorming": { - "primary_model": "anthropic/claude-opus-4-20250514", - "explorer_models": [ - "x-ai/grok-code-fast-1", - "google/gemini-2-5-pro", - "anthropic/claude-sonnet-4-20250514" - ] - } -} -``` - -### Model Requirements - -| Role | Min Context | Capabilities | -|------|-------------|--------------| -| Primary | 200K tokens | Complex reasoning, orchestration | -| Explorer | 100K tokens | Code generation, analysis | +Turn ideas into validated implementation plans through parallel multi-model exploration with consensus scoring and confidence-based gating. ## Workflow diff --git a/plugins/dev/skills/test-coverage/SKILL.md b/plugins/dev/skills/test-coverage/SKILL.md index 7eae6f7..a8877e4 100644 --- a/plugins/dev/skills/test-coverage/SKILL.md +++ b/plugins/dev/skills/test-coverage/SKILL.md @@ -1,11 +1,6 @@ --- name: test-coverage -description: On-demand test coverage analysis. Use when identifying untested code, finding test gaps, measuring coverage metrics, or improving test quality. Trigger keywords - "test coverage", "coverage report", "untested code", "test gaps", "missing tests", "coverage metrics". -version: 0.1.0 -tags: [dev, testing, coverage, quality, gaps] -keywords: [test-coverage, coverage, gaps, untested, metrics, testing, quality] -plugin: dev -updated: 2026-01-28 +description: "Measures line, branch, and function coverage metrics, identifies untested critical code paths prioritized by risk, and generates gap analysis reports with specific test recommendations. Use when measuring coverage, finding test gaps before deployment, or identifying untested security-critical code." 
 ---

 # Test Coverage Skill
diff --git a/plugins/frontend/skills/browser-debugger/SKILL.md b/plugins/frontend/skills/browser-debugger/SKILL.md
index 0a252d8..fc06640 100644
--- a/plugins/frontend/skills/browser-debugger/SKILL.md
+++ b/plugins/frontend/skills/browser-debugger/SKILL.md
@@ -1,930 +1,110 @@
 ---
 name: browser-debugger
-description: Systematically tests UI functionality, validates design fidelity with AI visual analysis, monitors console output, tracks network requests, and provides debugging reports using Chrome DevTools MCP. Use after implementing UI features, for design validation, when investigating console errors, for regression testing, or when user mentions testing, browser bugs, console errors, or UI verification.
+description: "Tests UI in a real browser via Chrome DevTools MCP — captures screenshots, monitors console errors, tracks network requests, and validates design fidelity with optional vision models through Claudish. Use after implementing UI features, for design validation, when investigating console errors, or for responsive regression testing."
 allowed-tools: Task, Bash
 ---

 # Browser Debugger

-This Skill provides comprehensive browser-based UI testing, visual analysis, and debugging capabilities using Chrome DevTools MCP server and optional external vision models via Claudish.
+Browser-based UI testing and debugging via Chrome DevTools MCP with optional vision model analysis through Claudish.

-## When to Use This Skill
+## Workflow

-Claude and agents (developer, reviewer, tester, ui-developer) should invoke this Skill when:
-
-- **Validating Own Work**: After implementing UI features, agents should verify their work in a real browser
-- **Design Fidelity Checks**: Comparing implementation screenshots against design references
-- **Visual Regression Testing**: Detecting layout shifts, styling issues, or visual bugs
-- **Console Error Investigation**: User reports console errors or warnings
-- **Form/Interaction Testing**: Verifying user interactions work correctly
-- **Pre-Commit Verification**: Before committing or deploying code
-- **Bug Reproduction**: User describes UI bugs that need investigation
+1. **Verify dependencies** — confirm Chrome DevTools MCP is available via `mcp__chrome-devtools__list_pages`. Check OpenRouter API key for vision models.
+2. **Navigate and capture** — load the target URL, resize viewport, take screenshot.
+3. **Inspect console and network** — check for errors, failed requests, and warnings.
+4. **Analyze visually** — compare screenshot against design reference using embedded Claude or external vision model.
+5. **Report findings** — categorize issues by severity, provide actionable fixes.

 ## Prerequisites

-### Required: Chrome DevTools MCP
-
-This skill requires Chrome DevTools MCP. Check availability and install if needed:
-
 ```bash
-# Check if available
-mcp__chrome-devtools__list_pages 2>/dev/null && echo "Available" || echo "Not available"
-
-# Install via claudeup (recommended)
-npm install -g claudeup@latest
+# Required: Chrome DevTools MCP
 claudeup mcp add chrome-devtools
-```

-### Optional: External Vision Models (via OpenRouter)
-
-For advanced visual analysis, use external vision-language models via Claudish:
-
-```bash
-# Check OpenRouter API key
-[[ -n "${OPENROUTER_API_KEY}" ]] && echo "OpenRouter configured" || echo "Not configured"
-
-# Install claudish
+# Optional: Vision models via Claudish + OpenRouter
 npm install -g claudish
+# Set OPENROUTER_API_KEY for external vision models
 ```

----
-
-## Visual Analysis Models (Recommended)
-
-For best visual analysis of UI screenshots, use these models via Claudish:
-
-### Tier 1: Best Quality (Recommended for Design Validation)
-
-| Model | Strengths | Cost | Best For |
-|-------|-----------|------|----------|
-| **qwen/qwen3-vl-32b-instruct** | Best OCR, spatial reasoning, GUI automation, 32+ languages | ~$0.06/1M input | Design fidelity, OCR, element detection |
-| **google/gemini-2.5-flash** | Fast, excellent price/performance, 1M context | ~$0.05/1M input | Real-time validation, large pages |
-| **openai/gpt-4o** | Most fluid multimodal, strong all-around | ~$0.15/1M input | Complex visual reasoning |
-
-### Tier 2: Fast & Affordable
+## Vision Model Selection

-| Model | Strengths | Cost | Best For |
-|-------|-----------|------|----------|
-| **qwen/qwen3-vl-30b-a3b-instruct** | Good balance, MoE architecture | ~$0.04/1M input | Quick checks, multiple iterations |
-| **google/gemini-2.5-flash-lite** | Ultrafast, very cheap | ~$0.01/1M input | High-volume testing |
+| Task | Recommended Model | Cost |
+|------|-------------------|------|
+| Design fidelity | `qwen/qwen3-vl-32b-instruct` | ~$0.06/1M |
+| Quick smoke tests | `google/gemini-2.5-flash` | ~$0.05/1M |
+| Complex layouts | `openai/gpt-4o` | ~$0.15/1M |
+| Budget/free | `openrouter/polaris-alpha` | Free |

-### Tier 3: Free Options
+Before first analysis in a session, ask user which model to use via AskUserQuestion. Save choice to session metadata.

-| Model | Notes |
-|-------|-------|
-| **openrouter/polaris-alpha** | FREE, good for testing workflows |
-
-### Model Selection Guide
-
-```
-Design Fidelity Validation → qwen/qwen3-vl-32b-instruct (best OCR & spatial)
-Quick Smoke Tests → google/gemini-2.5-flash (fast & cheap)
-Complex Layout Analysis → openai/gpt-4o (best reasoning)
-High Volume Testing → google/gemini-2.5-flash-lite (ultrafast)
-Budget Conscious → openrouter/polaris-alpha (free)
-```
-
----
-
-## Visual Analysis Model Selection (Interactive)
-
-**Before the first screenshot analysis in a session, ask the user which model to use.**
-
-### Step 1: Check for Saved Preference
-
-First, check if user has a saved model preference:
-
-```bash
-# Check for saved preference in project settings
-SAVED_MODEL=$(cat .claude/settings.json 2>/dev/null | jq -r '.pluginSettings.frontend.visualAnalysisModel // empty')
-
-# Or check session-specific preference
-if [[ -f "ai-docs/sessions/${SESSION_ID}/session-meta.json" ]]; then
-  SESSION_MODEL=$(jq -r '.visualAnalysisModel // empty' "ai-docs/sessions/${SESSION_ID}/session-meta.json")
-fi
-```
-
-### Step 2: If No Saved Preference, Ask User
-
-Use **AskUserQuestion** with these options:
-
-```markdown
-## Visual Analysis Model Selection
-
-For screenshot analysis and design validation, which AI vision model would you like to use?
-
-**Your choice will be remembered for this session.**
-```
-
-**AskUserQuestion options:**
-
-| Option | Label | Description |
-|--------|-------|-------------|
-| 1 | `qwen/qwen3-vl-32b-instruct` (Recommended) | Best for design fidelity - excellent OCR, spatial reasoning, detailed analysis. ~$0.06/1M tokens |
-| 2 | `google/gemini-2.5-flash` | Fast & affordable - great balance of speed and quality. ~$0.05/1M tokens |
-| 3 | `openai/gpt-4o` | Most capable - best for complex visual reasoning. ~$0.15/1M tokens |
-| 4 | `openrouter/polaris-alpha` (Free) | No cost - good for testing, basic analysis |
-| 5 | Skip visual analysis | Use embedded Claude only (no external models) |
-
-**Recommended based on task type:**
-- Design validation → Option 1 (Qwen VL)
-- Quick iterations → Option 2 (Gemini Flash)
-- Complex layouts → Option 3 (GPT-4o)
-- Budget-conscious → Option 4 (Free)
-
-### Step 3: Save User's Choice
-
-After user selects, save their preference:
-
-**Option A: Save to Session (temporary)**
-```bash
-# Update session metadata
-jq --arg model "$SELECTED_MODEL" '.visualAnalysisModel = $model' \
-  "ai-docs/sessions/${SESSION_ID}/session-meta.json" > tmp.json && \
-  mv tmp.json "ai-docs/sessions/${SESSION_ID}/session-meta.json"
-```
-
-**Option B: Save to Project Settings (persistent)**
-```bash
-# Update project settings for future sessions
-jq --arg model "$SELECTED_MODEL" \
-  '.pluginSettings.frontend.visualAnalysisModel = $model' \
-  .claude/settings.json > tmp.json && mv tmp.json .claude/settings.json
-```
-
-### Step 4: Use Selected Model
-
-Store the selected model in a variable and use it for all subsequent visual analysis:
-
-```bash
-# VISUAL_MODEL is now set to user's choice
-# Use it in all claudish calls:
-
-npx claudish --model "$VISUAL_MODEL" --stdin --quiet <