feat(tools): add reference integrity, size validation, and skill scaffolding #87
…folding
Port best practices from personal-plugins repo:
- validate-references.py: BFS link crawler that walks from each SKILL.md through all referenced markdown files. Catches broken links (exit 1) and orphaned reference files (warning). Covers 59 reference files including cross-skill links.
- validate-size.py: SKILL.md size checker with extraction-candidate detection. Identifies oversized code blocks (>30 lines) that should be moved to references/. Complements the existing SKILL001 markdownlint rule with detailed diagnostics.
- init-skill.py: Skill scaffolding tool for contributors. Creates correctly structured SKILL.md with frontmatter template + references/.
- markdownlint-frontmatter.cjs: Extended SKILL002 rule with frontmatter property whitelist. Rejects non-spec properties that Claude Code would silently ignore.
- skill-frontmatter.schema.json: Tightened additionalProperties to false, added license and metadata as defined extension properties.
- mise.toml: Added validate:refs, validate:size, validate (composite), and init:skill tasks. Wired validate into the build pipeline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
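The BFS crawl described for validate-references.py can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the `MD_LINK` pattern, the `crawl` and `orphans` helpers, and the assumption that only inline `[text](path.md)` links matter are all hypothetical.

```python
from __future__ import annotations

import re
from collections import deque
from pathlib import Path

# Hypothetical pattern: inline markdown links whose target ends in .md
# (an optional #anchor is allowed and ignored).
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s#]+\.md)(?:#[^)\s]*)?\)")


def crawl(skill_md: Path) -> tuple[set[Path], list[tuple[Path, str]]]:
    """Breadth-first walk over relative .md links starting at skill_md.

    Returns (reachable files, broken links as (source file, raw target)).
    """
    start = skill_md.resolve()
    reachable = {start}
    broken = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        text = current.read_text(encoding="utf-8")
        for target in MD_LINK.findall(text):
            if "://" in target:
                continue  # external URLs are out of scope for this sketch
            resolved = (current.parent / target).resolve()
            if not resolved.is_file():
                broken.append((current, target))
            elif resolved not in reachable:
                reachable.add(resolved)
                queue.append(resolved)
    return reachable, broken


def orphans(skill_md: Path, reachable: set[Path]) -> set[Path]:
    """Markdown files under references/ that the crawl never reached."""
    refs_dir = skill_md.resolve().parent / "references"
    return {p.resolve() for p in refs_dir.rglob("*.md")} - reachable
```

A caller could exit non-zero when `broken` is non-empty and only print a warning for `orphans(...)`, matching the exit-1 versus warning split described above.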
…ion candidates
New tools:
- validate-urls.py: Async HTTPS URL liveness checker using httpx. HEAD with GET fallback, concurrency-limited, GitHub token aware. Reports broken links and permanent redirects. Not in build pipeline (network-dependent); run via `mise run validate:urls`.
- validate-size.py: SKILL.md size checker that identifies code blocks over 30 lines as extraction candidates for references/.
- .url-check-ignore: Skip patterns for placeholder URLs, CloudFormation template variables, and API endpoints.
URL fixes found by the checker:
- Fix 404: amazon-location-samples repo → aws-geospatial org page
- Fix 301: help.github.com → docs.github.com (CONTRIBUTING.md)
- Fix 301: docs.github.com actions workflow path (TROUBLESHOOTING.md)
- Fix 301: docs.powertools.aws.dev → docs.aws.amazon.com/powertools
- Fix 301: aws-otel.github.io trailing slash (observability.md)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
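Extraction-candidate detection could look something like the sketch below. The 30-line threshold comes from the commit message; the `oversized_blocks` helper, its return shape, and the simple fence-toggling logic are assumptions made for illustration and may differ from validate-size.py.

```python
from __future__ import annotations

FENCE = "`" * 3          # a markdown code fence
MAX_BLOCK_LINES = 30     # threshold stated in the commit message


def oversized_blocks(markdown: str, limit: int = MAX_BLOCK_LINES) -> list[tuple[int, int]]:
    """Return (start_line, body_length) for fenced blocks longer than limit.

    start_line is the 1-based line of the opening fence; body_length counts
    only the lines between the two fences.
    """
    found = []
    fence_start = None  # line number of the currently open fence, if any
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if not line.lstrip().startswith(FENCE):
            continue
        if fence_start is None:
            fence_start = lineno          # opening fence
        else:
            body = lineno - fence_start - 1
            if body > limit:
                found.append((fence_start, body))
            fence_start = None            # closing fence
    return found
```

Each returned pair points a contributor at a block that is a candidate for moving into references/.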
Pull request overview
Adds new contributor/CI tooling to improve SKILL.md quality gates (reference integrity, size diagnostics, and frontmatter strictness) and provides a scaffolding command for creating new skills.
Changes:
- Introduces Python validators for reference integrity (BFS crawl) and SKILL.md size/extraction-candidate reporting.
- Extends the SKILL frontmatter markdownlint rule to enforce an allowlisted set of properties aligned to the schema.
- Adds a skill scaffolding tool and wires new validation tasks into `mise` (including `build`).
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tools/validate-size.py | New SKILL.md size checker with per-skill reporting and code-block extraction hints |
| tools/validate-references.py | New BFS-based link crawler to detect broken file references and orphaned reference files |
| tools/markdownlint-frontmatter.cjs | Adds a frontmatter property allowlist check to SKILL002 |
| tools/init-skill.py | New scaffolding tool to create a skill skeleton (SKILL.md + references/) |
| schemas/skill-frontmatter.schema.json | Tightens schema to disallow unknown top-level properties; defines license/metadata |
| mise.toml | Adds validate:* and init:skill tasks; runs validation as part of build |
1. validate-references.py: Detect directory-style cross-skill links
like `../aws-lambda-durable-functions/` by treating them as links
to `../aws-lambda-durable-functions/SKILL.md`. These were previously
silently skipped because the regexes required file extensions.
2. init-skill.py: Switch from str.format to string.Template to avoid
crashes when description contains `{`/`}` characters. Template uses
$-substitution which doesn't treat braces as special.
3. markdownlint-frontmatter.cjs: Load allowed properties from
skill-frontmatter.schema.json at runtime instead of duplicating
the list. Eliminates drift risk between the schema and the lint
rule. Falls back to hardcoded set if schema can't be read.
4. validate-size.py: Carry skill_md Path through results tuple instead
of re-scanning the skills list by name. Eliminates latent bug where
duplicate skill names across plugins would match the wrong file.
Dismissed: path traversal concern in validate-references.py. The tool
is read-only and the all_resources set constrains output to files
strictly under plugins/*/skills/*/references/.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
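The PR does not show how the original `str.format` call failed. One way braces can break it is when they end up in the format string itself, and the sketch below contrasts that case with `string.Template`. The `BODY` template and both helper functions are hypothetical, not taken from init-skill.py.

```python
from string import Template

# Hypothetical template; the real one in init-skill.py is not shown in the PR.
# The literal braces in the body are what str.format would try to parse.
BODY = '---\nname: $name\ndescription: $description\n---\n\nExample config: {"key": 1}\n'


def render(name: str, description: str) -> str:
    """Fill the template; only $-placeholders are treated as special."""
    return Template(BODY).substitute(name=name, description=description)


def render_with_format(name: str, description: str) -> str:
    """Same idea with str.format, for contrast; literal braces break it."""
    fmt = BODY.replace("$name", "{name}").replace("$description", "{description}")
    return fmt.format(name=name, description=description)
```

With `render`, braces in both the template body and the description pass through unchanged, while `render_with_format` raises because `{"key": 1}` is parsed as a replacement field.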
Belt-and-suspenders: reject any resolved path that escapes ROOT via symlinks or ../ traversal, even though the tool is read-only and output is already constrained by the all_resources set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
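A containment check of this kind might be written as below. The `is_within` name is hypothetical, and the sketch assumes the tool resolves paths with `pathlib`; the actual guard in validate-references.py may be structured differently.

```python
from pathlib import Path


def is_within(root: Path, candidate: Path) -> bool:
    """True if candidate, after resolving symlinks and '..', stays under root."""
    root = root.resolve()
    resolved = candidate.resolve()  # non-strict: works for missing files too
    return resolved == root or root in resolved.parents
```

Comparing resolved paths rather than raw strings matters here: a string prefix test would accept a path such as `ROOT/../outside.md`, which only leaves the root once `..` is collapsed.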
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
…it-skill
Critical fix: validate-urls.py was attaching GITHUB_TOKEN as a global Authorization header on all outbound requests. An attacker-controlled URL in a PR markdown file would exfiltrate the CI token. Now the token is only sent to github.com, api.github.com, and raw.githubusercontent.com via per-request headers.
Additional hardening:
- init-skill.py: Reject plugin names containing path traversal by verifying the resolved path is under PLUGINS_DIR.
- init-skill.py: Switch to safe_substitute to handle $ in descriptions.
- validate-urls.py: Pin httpx to >=0.28,<1 instead of unpinned.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
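The host-scoped token handling might be sketched like this. The three hosts are the ones named in the commit message; the `headers_for` helper, the `Bearer` scheme, and the extra HTTPS-only check are assumptions for illustration rather than the tool's confirmed behavior.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Hosts named in the commit message as the only ones that receive the token.
GITHUB_HOSTS = frozenset({"github.com", "api.github.com", "raw.githubusercontent.com"})


def headers_for(url: str, token: str | None) -> dict[str, str]:
    """Build per-request headers, attaching the token only for GitHub hosts."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Exact host match: substring or suffix checks would also accept
    # look-alikes such as github.com.evil.example.
    if token and parts.scheme == "https" and host in GITHUB_HOSTS:
        return {"Authorization": f"Bearer {token}"}  # header scheme is an assumption
    return {}
```

The returned dict would be passed per request instead of being set once on a shared client, so a URL pointing at any other host never sees the token.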
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
CI was failing because uv is not available in the GitHub Actions runner environment. Add uv as a mise-managed tool so `mise install` provisions it alongside node, markdownlint, etc.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9ca87b2 to 9ac56b9
Summary
Adds validation tooling ported from personal-plugins, adapted for the agent-plugins directory structure. Catches broken links, orphaned files, oversized skills, non-spec frontmatter, and stale URLs.
New tools
- `validate-references.py`: BFS link crawler that walks from each SKILL.md through all referenced markdown files. Catches broken links (exit 1) and orphaned reference files (warning). Covers all 59 reference files including cross-skill links (e.g., `aws-lambda` → `aws-serverless-deployment`) and directory-style links (e.g., `../aws-lambda-durable-functions/`). Resolved paths are clamped to the repository root.
- `validate-size.py`: SKILL.md size checker with extraction-candidate detection. Identifies oversized code blocks (>30 lines) that should be moved to `references/`. Complements the existing SKILL001 markdownlint rule with a detailed per-skill diagnostics table.
- `validate-urls.py`: Async HTTPS URL liveness checker (httpx). HEAD with GET fallback, concurrency-limited, GitHub token aware. Reports broken links and permanent redirects. Network-dependent: runs via `mise run validate:urls`, not wired into the main build.
- `init-skill.py`: Skill scaffolding for contributors, invoked as `uv run tools/init-skill.py <plugin> <skill> '<desc>'`. Creates correctly structured SKILL.md with frontmatter template + `references/.gitkeep`. Uses `string.Template` to safely handle braces in descriptions.
- `.url-check-ignore`: Skip patterns for placeholder URLs, CloudFormation template variables, and API endpoints.
Modified files
- `markdownlint-frontmatter.cjs`: Extended SKILL002 rule with frontmatter property whitelist. Loads allowed properties from `skill-frontmatter.schema.json` at runtime (with hardcoded fallback) to eliminate drift risk. Rejects non-spec properties that Claude Code would silently ignore.
- `skill-frontmatter.schema.json`: Tightened `additionalProperties` to `false`, added `license` and `metadata` as defined extension properties.
- `mise.toml`: Added `validate:refs`, `validate:size`, `validate:urls`, `validate` (composite), and `init:skill` tasks. Wired `validate` into the `build` pipeline.
URL fixes found by the checker
- `amazon-location-samples` repo → `aws-geospatial` org page (repo was split)
- `help.github.com` → `docs.github.com` (CONTRIBUTING.md)
- `docs.github.com` actions workflow path update (TROUBLESHOOTING.md)
- `docs.powertools.aws.dev` → `docs.aws.amazon.com/powertools`
- `aws-otel.github.io` trailing slash (observability.md)
Test plan
- `mise run validate:refs`: 59/59 reachable, 0 broken, 0 orphaned
- `mise run validate:size`: 6 skills, 0 errors, 0 warnings
- `mise run validate:urls`: 73 ok, 1 warning (npmjs WAF), 0 broken
- `mise run lint:md`: frontmatter whitelist accepts all existing skills
- `mise run init:skill -- deploy-on-aws test-skill 'Test. Use when testing.'` → correct structure
- `mise run init:skill` with braces in description → no crash, braces preserved
- `mise run init:skill` error cases: bad plugin, bad name, duplicate skill
- `mise run build`: full pipeline passes (lint + fmt:check + validate + security)
Generated with Claude Code
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.