A command-line tool for SEO analysis that crawls websites to detect thin content pages and broken links.
- Crawls websites and checks word count on pages
- Identifies pages with content below a minimum word threshold (default: 300 words)
- Detects broken links (404 errors)
- Option to exclude specific paths from checking
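
The thin-content check described above reduces to counting words and comparing against the threshold. The following is a minimal sketch of that idea, not the tool's actual implementation: the function names `word_count` and `is_thin` are hypothetical, and it assumes the page text has already been extracted from the HTML.

```rust
/// Counts whitespace-separated words in already-extracted page text.
/// (Hypothetical helper; the real tool may tokenize differently.)
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Returns true when the page has fewer words than `min_words`.
/// A page with exactly `min_words` words is not considered thin here,
/// which is an assumption about how "below the threshold" is applied.
fn is_thin(text: &str, min_words: usize) -> bool {
    word_count(text) < min_words
}
```

With the default threshold of 300, a call such as `is_thin(page_text, 300)` would flag any page whose extracted text contains 299 words or fewer.
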
Requirements:
- Rust (latest stable version recommended)
- Clone the repository:

      git clone https://github.com/dunctk/thin-content-checker.git
      cd thin-content-checker

- Install globally:

      cargo install --path .

Alternatively, install directly from the Git repository without cloning it first:

    cargo install --git https://github.com/dunctk/thin-content-checker.git

Check a website for thin content:

    thin-content-checker https://example.com

Options:
- -m, --min-words <MIN_WORDS>: Set minimum word count (default: 300)
- --exclude <EXCLUDE>: Exclude paths starting with the given string (can be used multiple times)
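
Since `--exclude` matches paths that start with the given string, the filter can be pictured as a prefix test on the URL path. The sketch below illustrates that reading; `url_path` and `is_excluded` are hypothetical names, and the URL parsing is deliberately naive (a real crawler would more likely use a URL-parsing library).

```rust
/// Extracts the path portion of an absolute URL.
/// Hypothetical helper with naive parsing: it skips the scheme if present
/// and returns everything from the first '/' after the host.
fn url_path(url: &str) -> &str {
    let rest = url.split_once("://").map_or(url, |(_, r)| r);
    match rest.find('/') {
        Some(i) => &rest[i..],
        None => "/", // no path component, treat as the site root
    }
}

/// Returns true when the URL's path starts with any of the excluded prefixes.
fn is_excluded(url: &str, excludes: &[&str]) -> bool {
    let path = url_path(url);
    excludes.iter().any(|prefix| path.starts_with(*prefix))
}
```

Under this reading, `--exclude /accounts/` would skip `https://example.com/accounts/login` but still check `https://example.com/blog/accounts/`, because only the beginning of the path is compared.
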
Check with custom word threshold:

    thin-content-checker https://example.com --min-words 500

Exclude specific paths:

    thin-content-checker https://example.com --exclude /accounts/ --exclude /admin/

The tool outputs:
- List of thin content pages with word counts
- List of broken links
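
For readers who want to picture the two lists as data, here is a small sketch of how such a report could be modeled and rendered. The `ThinPage` type, the `render_report` function, and the text layout are all assumptions made for illustration; the tool's actual output format is not specified here and may differ.

```rust
/// One thin page: its URL and the number of words found (hypothetical type).
struct ThinPage {
    url: String,
    words: usize,
}

/// Renders both lists as plain text.
/// The layout is illustrative only, not the tool's real output.
fn render_report(thin: &[ThinPage], broken: &[String]) -> String {
    let mut out = String::from("Thin content pages:\n");
    for page in thin {
        out.push_str(&format!("  {} ({} words)\n", page.url, page.words));
    }
    out.push_str("Broken links:\n");
    for link in broken {
        out.push_str(&format!("  {}\n", link));
    }
    out
}
```
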

License: MIT