codebase-knowledge-builder

An agent skill that studies any repository and produces structured knowledge artifacts. Drop it into Claude Code, Cursor, OpenCode, or any agent that supports the Agent Skills spec, point it at a codebase, and get back self-contained Markdown documentation, one artifact per subsystem.

What it does

Most agents forget what they read three files ago. This skill fixes that by following a four-phase process:

  1. Reconnaissance -- scan the repo structure, identify the tech stack, map module boundaries
  2. Deep-dive study -- trace happy paths, error paths, and edge cases through each subsystem
  3. Artifact authoring -- fill a structured template covering architecture, key functions, gotchas, and Mermaid diagrams
  4. Delivery -- hand back self-contained Markdown artifacts that any developer (or agent) can read cold

The output is a set of knowledge artifacts. Each one covers a single subsystem and stands on its own. No prior context needed.

When to use it

  • Onboarding onto an unfamiliar codebase
  • Producing documentation for a repo that has none
  • Preparing knowledge files so other agents can work on the project without re-reading everything
  • Studying a specific subsystem (auth, database layer, API routing, etc.) in depth

Install

npx skills add OthmanAdi/codebase-knowledge-builder --skill codebase-knowledge-builder -g

Works with Claude Code, Cursor, Codex, Gemini CLI, and 40+ agents supporting the Agent Skills spec.

Manual install: Copy skills/codebase-knowledge-builder/ to your agent's skills folder.
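
The manual install is a plain directory copy. A minimal sketch in Python, assuming you run it from a checkout of this repository; `install_skill` and its `skills_folder` argument are hypothetical names, and the right destination depends on your agent:

```python
import shutil
from pathlib import Path


def install_skill(repo_root, skills_folder):
    """Copy the skill directory into skills_folder and return the new path."""
    source = Path(repo_root) / "skills" / "codebase-knowledge-builder"
    target = Path(skills_folder) / source.name
    # dirs_exist_ok=True lets a re-run overwrite an earlier copy (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Call it with the repository checkout and whatever skills directory your agent documents, for example `install_skill(".", "/path/to/agent/skills")`.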

What's inside

skills/codebase-knowledge-builder/
  SKILL.md                            # Skill definition and workflow
  references/
    recon-checklist.md                # Phase 1 checklist
    deep-dive-methodology.md          # File reading and tracing strategies
  templates/
    knowledge_artifact.md             # Output template for each subsystem

The SKILL.md stays lean (~80 lines). Detailed methodology lives in references/ and only gets loaded when needed. The template in templates/ defines the exact structure of every knowledge artifact the skill produces.
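
After a manual copy, it can be worth confirming that all four files from the listing above arrived. The following is a small sketch, not part of the skill itself; `missing_skill_files` is a hypothetical helper name:

```python
from pathlib import Path

# Relative paths taken from the directory listing above.
EXPECTED_FILES = (
    "SKILL.md",
    "references/recon-checklist.md",
    "references/deep-dive-methodology.md",
    "templates/knowledge_artifact.md",
)


def missing_skill_files(skill_dir):
    """Return the expected files that are absent from skill_dir."""
    root = Path(skill_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list means the layout matches the listing; anything else names the files that still need to be copied.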

Example output

When the skill is run on a Node.js API, for example, each artifact includes:

  • Architecture overview with design pattern identification
  • Key components table (component, file path, responsibility)
  • Step-by-step data and control flow
  • Key functions table with parameters and return values
  • Configuration and environment variable mapping
  • Gotchas and pitfalls (race conditions, caching quirks, historical fixes)
  • Extension points for adding new functionality
  • Mermaid diagrams for visual flow

How it works under the hood

The skill uses progressive disclosure. When an agent triggers it, only the SKILL.md body loads into context (~600 words). The references and template load on demand during each phase. This keeps the context window clean for the actual codebase files being studied.
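
Progressive disclosure can be pictured as a loader that reads SKILL.md up front and everything else lazily. The sketch below is an illustration of the idea only; the host agent does the actual loading, and `SkillContext` with its methods is a hypothetical name:

```python
from pathlib import Path


class SkillContext:
    """Hypothetical loader: SKILL.md up front, other files only on request."""

    def __init__(self, skill_dir):
        self.root = Path(skill_dir)
        # Only the skill body is read when the skill is triggered.
        body = (self.root / "SKILL.md").read_text(encoding="utf-8")
        self.loaded = {"SKILL.md": body}

    def load(self, relative_path):
        """Read a reference or template the first time a phase asks for it."""
        if relative_path not in self.loaded:
            path = self.root / relative_path
            self.loaded[relative_path] = path.read_text(encoding="utf-8")
        return self.loaded[relative_path]

    def word_count(self):
        """Rough size of everything currently held in context."""
        return sum(len(text.split()) for text in self.loaded.values())
```

In this model, `word_count()` stays near the size of SKILL.md until a phase calls `load("references/recon-checklist.md")` or a similar path, which is the point of the design: reference material costs context only once it is needed.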

Scratch files (recon_findings.md, per-file notes) are saved during study so the agent doesn't lose findings as it reads more files. A quality checklist, run before delivery, catches incomplete sections, missing diagrams, and leftover placeholder text.
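
The three kinds of problem the checklist looks for can be approximated mechanically. This is a rough sketch under stated assumptions, not the skill's actual checklist: the section titles in `REQUIRED_SECTIONS` are guesses based on the "Example output" list, the placeholder patterns are illustrative, and `check_artifact` is a hypothetical name.

```python
import re

# Assumed section titles; the real template in templates/knowledge_artifact.md
# may name its sections differently.
REQUIRED_SECTIONS = ("Architecture", "Key Components", "Key Functions", "Gotchas")
# Illustrative placeholder markers: TODO/TBD/FIXME and {{unfilled}} fields.
PLACEHOLDER = re.compile(r"\b(TODO|TBD|FIXME)\b|\{\{.*?\}\}")
# Opening fence of a Mermaid block, built without typing three backticks inline.
MERMAID_FENCE = "`" * 3 + "mermaid"


def check_artifact(markdown):
    """Return a list of human-readable problems found in one artifact."""
    problems = []
    headings = [
        line.lstrip("#").strip().lower()
        for line in markdown.splitlines()
        if line.startswith("#")
    ]
    for section in REQUIRED_SECTIONS:
        if not any(section.lower() in heading for heading in headings):
            problems.append(f"missing section: {section}")
    if MERMAID_FENCE not in markdown:
        problems.append("missing Mermaid diagram")
    if PLACEHOLDER.search(markdown):
        problems.append("placeholder text left in artifact")
    return problems
```

An empty result means the artifact passes these three checks; it says nothing about whether the content is accurate, which still needs a reader.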

Contributing

See CONTRIBUTING.md for guidelines on submitting issues and pull requests.

License

MIT