High-level design, patterns, and data flow. For contributors and the curious.
YoAuditor is a pipeline: clone → scan → analyze → report. The “analyze” step is pluggable: either single-call (one big LLM request) or tool-calling (agent loop). Config and CLI are merged once; the rest of the app is stateless per run.
    ┌─────────┐    ┌─────────┐    ┌──────────────┐    ┌───────────┐
    │  Clone  │ ──▶│  Scan   │ ──▶│   Analyze    │ ──▶│  Report   │
    │  Repo   │    │  Files  │    │  (LLM mode)  │    │ (MD/JSON) │
    └─────────┘    └─────────┘    └──────────────┘    └───────────┘
         │              │                │                  │
         ▼              ▼                ▼                  ▼
    repo::cloner   scanner::*         agent::*          report::*
       (git2)      (one source       (strategy)        (generator)
                    of truth)
| Pattern | Where | What it does |
|---|---|---|
| Pipeline / Orchestrator | main::run_audit | Runs the four stages in order, passes config and paths, handles --dry-run and --init-config early exits. |
| Strategy | agent::CodeAnalysisAgent::run_analysis | Picks implementation at runtime: run_single_call_analysis() vs run_tool_calling_analysis(). Same interface (Vec<ReportedIssue>), different algorithms. |
| Tool Executor | agent::tools::ToolExecutor | Executes tool calls from the LLM (e.g. list_files, read_file, report_issue). Single place that talks to the filesystem and collects issues; agent loop only does HTTP + message handling. |
| Config Merge | config::Config::merge_with_args | CLI overrides config. Optional flags (e.g. --timeout, --single-call) only override when set; required-ish ones (e.g. model, ollama_url) always come from CLI (which has defaults). |
| Adapter | scanner::ScanConfig::from(&ScannerConfig) | Turns config::ScannerConfig into scanner::ScanConfig so the scanner stays independent of the TOML shape. |
| Single source of truth (scanner) | scanner::FileScanner | All file discovery and filtering (extensions, excludes, max size, max files) lives here. Both single-call and tool-calling use it; no duplicated walk logic. |
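As an illustration of the Adapter row, the conversion could look roughly like the sketch below. The field names on both structs are assumptions made for the example (the document only names extensions, excludes, max size, and max files as filter criteria), and the extension normalization is a hypothetical detail, not something the source confirms.

```rust
// Hypothetical shapes: the real field names in YoAuditor may differ.

/// Mirrors the scanner section of the TOML file (assumed fields).
#[derive(Debug, Clone)]
pub struct ScannerConfig {
    pub extensions: Vec<String>,
    pub excludes: Vec<String>,
    pub max_file_size: u64,
    pub max_files: usize,
}

/// What the scanner itself works with (assumed fields).
#[derive(Debug, Clone, PartialEq)]
pub struct ScanConfig {
    pub extensions: Vec<String>,
    pub excludes: Vec<String>,
    pub max_file_size: u64,
    pub max_files: usize,
}

impl From<&ScannerConfig> for ScanConfig {
    fn from(cfg: &ScannerConfig) -> Self {
        ScanConfig {
            // Normalize once at the boundary (assumed behavior) so the
            // scanner never depends on how the TOML spelled an extension.
            extensions: cfg
                .extensions
                .iter()
                .map(|e| e.trim_start_matches('.').to_lowercase())
                .collect(),
            excludes: cfg.excludes.clone(),
            max_file_size: cfg.max_file_size,
            max_files: cfg.max_files,
        }
    }
}

fn main() {
    let toml_side = ScannerConfig {
        extensions: vec![".RS".to_string(), "py".to_string()],
        excludes: vec!["target".to_string()],
        max_file_size: 1024 * 1024,
        max_files: 100,
    };
    let scan_config = ScanConfig::from(&toml_side);
    println!("{:?}", scan_config);
}
```

Keeping the conversion in a `From` impl means a change to the TOML layout touches only this one function.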
    .yoauditor.toml (optional)      CLI (Args)
            │                            │
            └──────────┬─────────────────┘
                       ▼
            Config::merge_with_args
                       │
                       ▼
          Config (model, scanner, …)
       ScanConfig = ScannerConfig.into()
- One `Config` is built per run (file → merge args).
- Scanner gets a `ScanConfig` derived from `Config.scanner` (and thus from CLI when those args are set).
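The merge rule described above (optional flags override only when set, always-present values come from the CLI) can be sketched as follows. The struct fields and the exact signature of `merge_with_args` are assumptions for illustration; only the names `Config`, `Args`, `model`, `ollama_url`, `--timeout`, and `--single-call` come from this document.

```rust
/// Hypothetical subset of the parsed CLI arguments.
pub struct Args {
    pub model: String,        // always present: the CLI supplies a default
    pub ollama_url: String,   // always present: the CLI supplies a default
    pub timeout: Option<u64>, // None means "--timeout was not passed"
    pub single_call: bool,    // true only when "--single-call" was passed
}

/// Hypothetical subset of the effective configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub model: String,
    pub ollama_url: String,
    pub timeout_secs: u64,
    pub single_call_mode: bool,
}

impl Config {
    /// Assumed signature: consumes the file-level config and returns the
    /// effective one.
    pub fn merge_with_args(mut self, args: &Args) -> Self {
        // Values the CLI always has: taken unconditionally.
        self.model = args.model.clone();
        self.ollama_url = args.ollama_url.clone();
        // Optional values: override only when the user set them.
        if let Some(timeout) = args.timeout {
            self.timeout_secs = timeout;
        }
        if args.single_call {
            self.single_call_mode = true;
        }
        self
    }
}

fn main() {
    let from_file = Config {
        model: "file-model".to_string(),
        ollama_url: "http://localhost:11434".to_string(),
        timeout_secs: 120,
        single_call_mode: false,
    };
    let args = Args {
        model: "cli-model".to_string(),
        ollama_url: "http://localhost:11434".to_string(),
        timeout: None,
        single_call: true,
    };
    let effective = from_file.merge_with_args(&args);
    println!("{:?}", effective);
}
```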
    --repo URL | --local DIR
                │
                ▼
    get_repository(args) → PathBuf (repo root)
                │
                ▼
    FileScanner::new(repo_root, scan_config)
                │
       ┌────────┴────────┐
       │                 │
       ▼                 ▼
    .scan()           .collect_files()
    Vec<ScannedFile>  HashMap<path, content>
    (dry-run, tools)  (single-call only)
- Clone/local is handled once; the same `PathBuf` is used for scanner and agent.
- Dry-run stops after scan and never calls the agent.
    CodeAnalysisAgent::run_analysis()
            │
            ├── single_call_mode ──▶ run_single_call_analysis()
            │                              │
            │                              ├── scanner.collect_files()
            │                              ├── build prompt (all files)
            │                              ├── send_simple_prompt()
            │                              └── parse_issues_from_response()
            │
            └── !single_call_mode ──▶ run_tool_calling_analysis()
                                           │
                                           ├── push system + user message
                                           └── loop:
                                                 chat_with_tools()
                                                 → tool_calls? → ToolExecutor::execute()
                                                 → push tool results, prune context
                                                 → finish_analysis? → break
                                                 → else push assistant message, continue
- Single-call: one request, no tools; response is parsed as JSON lines into `ReportedIssue`s.
- Tool-calling: classic agent loop (message → LLM → tool_calls → execute → append results → repeat until `finish_analysis` or max iterations).
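The tool-calling loop can be pictured with the minimal sketch below. It replaces the HTTP call to Ollama with a closure so it runs on its own; `LlmTurn`, `ToolCall`, `run_tool_loop`, and the fields of `ReportedIssue` are hypothetical names and shapes introduced for the example, and context pruning is left out.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ReportedIssue {
    pub file: String,
    pub message: String,
}

/// One model response, reduced to the two cases the loop cares about.
pub enum LlmTurn {
    ToolCalls(Vec<ToolCall>),
    Text(String),
}

/// Only two of the tools are modelled here.
pub enum ToolCall {
    ReportIssue(ReportedIssue),
    FinishAnalysis,
}

/// `chat` stands in for the real HTTP round trip: it receives the message
/// history and returns the model's next turn.
pub fn run_tool_loop<F>(mut chat: F, max_iterations: usize) -> Vec<ReportedIssue>
where
    F: FnMut(&[String]) -> LlmTurn,
{
    let mut messages = vec!["system prompt".to_string(), "user prompt".to_string()];
    let mut issues = Vec::new();

    for _ in 0..max_iterations {
        match chat(messages.as_slice()) {
            LlmTurn::ToolCalls(calls) => {
                let mut finished = false;
                for call in calls {
                    match call {
                        ToolCall::ReportIssue(issue) => {
                            // Append the tool result so the model sees it next turn.
                            messages.push(format!("tool: recorded issue in {}", issue.file));
                            issues.push(issue);
                        }
                        ToolCall::FinishAnalysis => finished = true,
                    }
                }
                if finished {
                    break;
                }
            }
            // No tool calls: keep the assistant text and ask again.
            LlmTurn::Text(text) => messages.push(text),
        }
    }
    issues
}

fn main() {
    let mut turn = 0;
    let issues = run_tool_loop(
        |_history| {
            turn += 1;
            if turn == 1 {
                LlmTurn::ToolCalls(vec![ToolCall::ReportIssue(ReportedIssue {
                    file: "src/main.rs".to_string(),
                    message: "unwrap on user input".to_string(),
                })])
            } else {
                LlmTurn::ToolCalls(vec![ToolCall::FinishAnalysis])
            }
        },
        10,
    );
    println!("{} issue(s) after {} turn(s)", issues.len(), turn);
}
```

The iteration cap matters because a model that never calls `finish_analysis` would otherwise keep the loop running indefinitely.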
    Vec<ReportedIssue> (from agent)
            │
            ▼
    map into models::Issue (severity, category, …)
            │
    optional: filter by --min-severity
            │
            ▼
    analysis::group_by_file(issues)
            │
            ▼
    Vec<AnalyzedFile> + ReportMetadata + IssueSummary
            │
            ▼
    Report → generate_markdown_report() | generate_json_report()
            │
            ▼
    write to args.output ; optional: --fail-on → exit 2
- Domain model lives in `models`; report layer only formats and writes.
- Exit code: 0 = success, 1 = error, 2 = issues above `--fail-on` threshold.
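A rough sketch of the grouping step and the exit-code rule follows. The `Severity` variants, the `Issue` fields, and the `BTreeMap` return type are assumptions (the document says grouping produces `Vec<AnalyzedFile>`), and "at or above the threshold" is one reading of "issues above `--fail-on` threshold"; the error case (exit code 1) is not modelled.

```rust
use std::collections::BTreeMap;

/// Assumed severity levels; deriving `Ord` orders them as declared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

/// Assumed subset of the domain issue type.
#[derive(Debug, Clone)]
pub struct Issue {
    pub file: String,
    pub severity: Severity,
    pub message: String,
}

/// Groups issues by file path and sorts each group most severe first.
pub fn group_by_file(issues: Vec<Issue>) -> BTreeMap<String, Vec<Issue>> {
    let mut grouped: BTreeMap<String, Vec<Issue>> = BTreeMap::new();
    for issue in issues {
        grouped.entry(issue.file.clone()).or_default().push(issue);
    }
    for file_issues in grouped.values_mut() {
        file_issues.sort_by(|a, b| b.severity.cmp(&a.severity));
    }
    grouped
}

/// Returns 2 when a threshold is set and any issue reaches it, else 0.
pub fn exit_code(issues: &[Issue], fail_on: Option<Severity>) -> i32 {
    match fail_on {
        Some(threshold) if issues.iter().any(|i| i.severity >= threshold) => 2,
        _ => 0,
    }
}

fn main() {
    let issues = vec![
        Issue { file: "a.rs".into(), severity: Severity::Low, message: "style".into() },
        Issue { file: "a.rs".into(), severity: Severity::High, message: "panic".into() },
        Issue { file: "b.rs".into(), severity: Severity::Medium, message: "clone".into() },
    ];
    let code = exit_code(&issues, Some(Severity::High));
    for (file, group) in group_by_file(issues) {
        println!("{}: {} issue(s), first: {}", file, group.len(), group[0].message);
    }
    println!("exit code: {}", code);
}
```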
| Module | Role |
|---|---|
| main | Entrypoint, logging, run_audit orchestration, exit codes, --init-config / dry-run. |
| cli | Args (clap), validation, FailOnLevel / OutputFormat. No I/O beyond parsing. |
| config | Load TOML, default config, merge CLI into config, default_toml() for --init-config. |
| scanner | ScanConfig, FileScanner, ScannedFile. Walk tree, filter by config, list_directory for tools. Single place that knows “what is a source file”. |
| repo | Clone (git2), CloneOptions, CloneResult. No knowledge of scanner or agent. |
| agent | AgentConfig, CodeAnalysisAgent. Strategy: single-call vs tool-calling. HTTP to Ollama, message list, context pruning. |
| agent::tools | Tool definitions for Ollama, ToolExecutor, ReportedIssue. Executes tools and collects issues; uses scanner for list/read and for path safety. |
| models | Issue, Severity, AnalyzedFile, Report, ReportMetadata, IssueSummary. Shared domain types. |
| analysis | group_by_file, and helpers (e.g. aggregate, sort by severity). Used when building the report. |
| report | Markdown/JSON generation from Report. No knowledge of agent or scanner. |
- Path safety: Tool executor and scanner resolve paths under `repo_root` and use canonicalization where needed to avoid escape (e.g. `..` or symlinks).
- Config vs CLI: All “effective” settings go through `Config` after merge; the rest of the app only sees `Config` and `ScanConfig`, not raw `Args`.
- LLM boundary: Only the agent talks to Ollama. Parsing (e.g. JSON lines for single-call, tool_calls for agentic) is inside the agent. Rest of the app only sees `Vec<ReportedIssue>`.
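The path-safety rule can be sketched with the standard library alone. `resolve_under_root` is a hypothetical helper name, and this shows one common way to do the check (canonicalize both paths, then compare prefixes), not necessarily how YoAuditor implements it.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `requested` relative to `repo_root` and rejects anything that
/// ends up outside the root after `..` segments and symlinks are resolved.
pub fn resolve_under_root(repo_root: &Path, requested: &str) -> io::Result<PathBuf> {
    // canonicalize() resolves symlinks and fails if the path does not exist.
    let root = repo_root.canonicalize()?;
    let candidate = root.join(requested).canonicalize()?;

    // Component-wise prefix check on the fully resolved paths.
    if candidate.starts_with(&root) {
        Ok(candidate)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes repository root",
        ))
    }
}

fn main() {
    let root = std::env::current_dir().expect("current directory should be readable");
    match resolve_under_root(&root, "..") {
        Ok(path) => println!("allowed: {}", path.display()),
        Err(err) => println!("rejected: {}", err),
    }
}
```

Because `Path::join` discards the base when given an absolute path, an absolute request is caught by the same prefix check. Paths that do not exist are reported as errors by `canonicalize`.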
    docs/
      DESIGN.md          ← this file
    src/
      main.rs            # Orchestrator, exit codes
      cli.rs             # CLI surface
      config.rs          # Config load + merge
      models.rs          # Domain types
      scanner/           # File discovery (single source of truth)
      repo/              # Clone
      agent/             # Analysis strategy + HTTP
        tools.rs         # Tool definitions + executor
        agent_loop.rs    # Single-call + tool-calling impl
      analysis/          # Grouping and aggregates
      report/            # Markdown/JSON output
- New analysis mode: Add another branch in `CodeAnalysisAgent::run_analysis()` and implement a `run_*_analysis()` that returns `Vec<ReportedIssue>`.
- New tool: Add definition in `get_tool_definitions()`, handle in `ToolExecutor::execute()`, keep path safety and scanner use in mind.
- New output format: Add a variant to `OutputFormat` and a `generate_*_report()` that consumes `&Report`.
- New config section: Add to `Config` in `config.rs`, extend `merge_with_args` if CLI should override it.
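As an example of the output-format extension point, the sketch below adds a plain-text format next to Markdown. The `Text` variant, `generate_text_report`, the `render` dispatcher, and the simplified `Report` fields are all hypothetical; the real `Report` type carries more data.

```rust
/// Simplified stand-in for the real report type.
pub struct Report {
    pub repo: String,
    /// (file, message) pairs; the real model is richer.
    pub issues: Vec<(String, String)>,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OutputFormat {
    Markdown,
    Text, // hypothetical new variant
}

pub fn generate_markdown_report(report: &Report) -> String {
    let mut out = format!("# Audit report for {}\n\n", report.repo);
    for (file, message) in &report.issues {
        out.push_str(&format!("- `{}`: {}\n", file, message));
    }
    out
}

/// The new generator only needs `&Report`; nothing upstream changes.
pub fn generate_text_report(report: &Report) -> String {
    let mut out = format!("Audit report for {}\n", report.repo);
    for (file, message) in &report.issues {
        out.push_str(&format!("{}: {}\n", file, message));
    }
    out
}

/// One match arm per format keeps the dispatch exhaustive: adding a
/// variant without a generator is a compile error.
pub fn render(report: &Report, format: OutputFormat) -> String {
    match format {
        OutputFormat::Markdown => generate_markdown_report(report),
        OutputFormat::Text => generate_text_report(report),
    }
}

fn main() {
    let report = Report {
        repo: "demo".to_string(),
        issues: vec![("src/main.rs".to_string(), "unused variable".to_string())],
    };
    print!("{}", render(&report, OutputFormat::Text));
}
```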
Last updated to match the post-rename, post–unified-scanner design.