
feat(seo): add sitemap.xml and robots.txt generation (#48, #60) #76

Merged

x3ek merged 4 commits into main from feat/48-60-sitemap-robots-txt on Apr 3, 2026

Conversation

@x3ek (Contributor) commented Apr 3, 2026

Summary

  • Add /sitemap.xml endpoint with loc and lastmod for homepage, post index, all published posts, and public-only pages
  • Add /robots.txt endpoint allowing all crawlers, disallowing /admin/*, /auth/*, /health, /webhooks/*, with Sitemap: directive
  • Extract shared get_all_pages() helper in services/content.py (mirrors existing get_all_posts())
  • Refactor feed.py to use shared get_all_posts() instead of inline post-fetching

Closes #48, closes #60
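The robots.txt rules listed above could be sketched as a plain-text builder like the following. This is a hypothetical helper for illustration; the actual route logic lives in src/squishmark/routers/seo.py and may differ in names and exact disallow patterns.

```python
def build_robots_txt(base_url: str) -> str:
    """Allow all crawlers, block private paths, and point at the sitemap.

    Hypothetical sketch of the rules described in the PR summary; the
    real endpoint's paths and formatting are assumptions here.
    """
    lines = ["User-agent: *", "Allow: /"]
    for path in ("/admin/", "/auth/", "/health", "/webhooks/"):
        lines.append(f"Disallow: {path}")
    lines.append(f"Sitemap: {base_url}/sitemap.xml")
    return "\n".join(lines) + "\n"
```

Serving the result as `text/plain` from a route handler is all that remains; the builder itself stays trivially unit-testable.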

Test plan

  • 22 new tests in test_seo.py (unit + integration) — all passing
  • 169 total tests pass, pyright clean, ruff format + lint clean
  • Manual verification: sitemap.xml serves valid XML with correct URLs and lastmod dates
  • Manual verification: robots.txt serves correct allow/disallow rules and sitemap directive

🤖 Generated with Claude Code

— Claude

Add /sitemap.xml with loc+lastmod for homepage, post index, all published posts, and public pages. Add /robots.txt allowing all crawlers, disallowing admin/auth/health/webhooks paths, with Sitemap directive. Extract get_all_pages() helper in content service and refactor feed.py to use shared get_all_posts().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Adds SEO support to SquishMark by introducing dynamic sitemap.xml and robots.txt endpoints backed by cached GitHub content, plus shared content helpers to reduce duplication.

Changes:

  • Add new /sitemap.xml and /robots.txt routes with cache-backed generation logic.
  • Introduce get_all_pages() content helper (mirroring existing get_all_posts()).
  • Refactor Atom feed route to use get_all_posts() instead of inline post fetching.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/squishmark/routers/seo.py | New SEO router implementing the sitemap and robots endpoints plus builders |
| src/squishmark/services/content.py | Adds shared get_all_pages() helper |
| src/squishmark/routers/feed.py | Simplifies feed generation by reusing get_all_posts() |
| src/squishmark/main.py | Registers the new SEO router in the app |
| tests/test_seo.py | Adds unit/integration tests for sitemap + robots behavior |

x3ek and others added 3 commits April 3, 2026 09:12
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ntries

Pages can now have an optional date in frontmatter. When present, it appears as lastmod in the sitemap. Pages without dates simply omit lastmod, which is valid per the sitemap spec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ap builder

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@x3ek x3ek merged commit 119df90 into main Apr 3, 2026
5 checks passed
@x3ek x3ek deleted the feat/48-60-sitemap-robots-txt branch April 3, 2026 14:35

Development

Successfully merging this pull request may close these issues.

  • Add robots.txt generation
  • Add sitemap.xml generation

2 participants