
Professional SEO research toolkit with 7 specialized tools for keyword research, SERP analysis, and content optimization. Python-based with centralized configuration.


HasData/python-for-seo

SEO Research Toolkit

A comprehensive Python-based SEO research toolkit powered by HasData API. This collection of tools helps SEO professionals conduct keyword research, competitive analysis, SERP intelligence, and content gap analysis at scale.

Features

Toolkit Manager

1. Google Suggest Harvester

Extract thousands of long-tail keyword variations using Google's autocomplete API.

  • Use Case: Discover untapped keyword opportunities
  • Method: Recursive alphabetical expansion (depth configurable)
  • Output: CSV file with unique suggestions
  • Speed: Concurrent processing with configurable workers
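
The alphabetical expansion can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the function name `expand_seeds` is hypothetical, and it assumes that a depth of 2 includes the depth-1 suffixes as well. Each generated string would then be sent to the autocomplete endpoint as a query.

```python
from itertools import product
from string import ascii_lowercase
from typing import List


def expand_seeds(base_keyword: str, max_depth: int) -> List[str]:
    """Return base_keyword followed by every a-z suffix of length 1..max_depth.

    Hypothetical helper: the real harvester may prune or order seeds differently.
    """
    seeds = []
    for length in range(1, max_depth + 1):
        for letters in product(ascii_lowercase, repeat=length):
            seeds.append(f"{base_keyword} {''.join(letters)}")
    return seeds


seeds = expand_seeds("coffee", 2)
print(len(seeds))   # 26 one-letter + 676 two-letter suffixes = 702
print(seeds[:3])    # ['coffee a', 'coffee b', 'coffee c']
```

The seed count grows as 26^depth, so each extra level multiplies the number of API requests by 26.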

2. Trends Breakout Analyzer

Identify rising search trends and high-volume queries before competitors.

  • Use Case: Spot emerging topics and seasonal opportunities
  • Data Source: Google Trends API
  • Output: Rising queries (growth rate) + Top queries (volume)
  • Metrics: Growth indicators and search index values

3. PAA Tree Builder

Build hierarchical question trees from "People Also Ask" boxes.

  • Use Case: Map topic authority and content cluster opportunities
  • Method: Recursive question discovery
  • Output: Nested topic structure
  • Depth: Configurable recursion levels
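
A depth-limited recursion of this kind might look like the sketch below. The names `build_paa_tree`, `render`, and `fake_fetch` are hypothetical, and the lookup table stands in for a real SERP request; the toolkit's own implementation may differ.

```python
from typing import Callable, Dict, List, Optional, Set


def build_paa_tree(
    query: str,
    fetch_questions: Callable[[str], List[str]],
    max_depth: int,
    seen: Optional[Set[str]] = None,
) -> Dict[str, dict]:
    """Recursively collect questions, stopping at max_depth and skipping repeats."""
    seen = set() if seen is None else seen
    if max_depth == 0:
        return {}
    tree: Dict[str, dict] = {}
    for question in fetch_questions(query):
        if question in seen:
            continue  # the same question often reappears under several parents
        seen.add(question)
        tree[question] = build_paa_tree(question, fetch_questions, max_depth - 1, seen)
    return tree


def render(tree: Dict[str, dict], indent: int = 1) -> List[str]:
    """Flatten the nested dict into indented '- question' lines."""
    lines = []
    for question, children in tree.items():
        lines.append("    " * indent + "- " + question)
        lines.extend(render(children, indent + 1))
    return lines


# Stand-in data (assumption): a real run would query the SERP API instead.
FAKE_PAA = {
    "coffee": ["What are the health benefits of coffee?"],
    "What are the health benefits of coffee?": [
        "Is coffee good for your heart?",
        "Does coffee help with weight loss?",
    ],
}


def fake_fetch(query: str) -> List[str]:
    return FAKE_PAA.get(query, [])


tree = build_paa_tree("coffee", fake_fetch, max_depth=2)
print("- coffee")
print("\n".join(render(tree)))
```

The shared `seen` set keeps the tree from looping when two questions suggest each other.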

4. SERP Intent Classifier

Automatically classify search intent by analyzing SERP composition.

  • Use Case: Understand what type of content ranks
  • Analysis: URL pattern recognition (blog, product, forum, video, etc.)
  • Output: Strategic content recommendations
  • Metrics: SERP composition breakdown by content type
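
To illustrate URL pattern recognition, here is a small sketch that labels each result URL by substring match and reports the share of each label. The pattern table and the function names are assumptions made for this example; the toolkit's real rules are not shown in this README.

```python
from collections import Counter
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical patterns, checked in order; the first match wins.
PATTERNS = [
    ("Video", ("youtube.com", "vimeo.com")),
    ("Forum", ("reddit.com", "quora.com", "/forum")),
    ("Product", ("/product", "/shop", "/dp/")),
    ("Blog", ("/blog", "/guide", "/article")),
]


def classify_url(url: str) -> str:
    """Return a content-type label based on the URL's host and path."""
    parsed = urlparse(url.lower())
    haystack = parsed.netloc + parsed.path
    for label, needles in PATTERNS:
        if any(needle in haystack for needle in needles):
            return label
    return "Other"


def serp_composition(urls: List[str]) -> Dict[str, float]:
    """Return the percentage of results per content type."""
    if not urls:
        return {}
    counts = Counter(classify_url(u) for u in urls)
    return {label: round(100 * n / len(urls), 1) for label, n in counts.most_common()}


urls = [
    "https://example.com/blog/instant-coffee",
    "https://www.youtube.com/watch?v=abc",
    "https://example.org/blog/best-coffee",
    "https://shop.example.com/product/123",
    "https://www.reddit.com/r/coffee",
]
composition = serp_composition(urls)
print(composition)
```

The label with the largest share plays the role of the "Dominant Type" in the tool's report.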

5. SERP Similarity Matrix

Measure keyword cannibalization and SERP overlap using Jaccard Index.

  • Use Case: Identify clustering opportunities and keyword conflicts
  • Method: URL set intersection analysis
  • Output: Interactive heatmap + common URL frequency table
  • Visualization: Seaborn-powered similarity matrix
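
The Jaccard Index of two URL sets A and B is |A ∩ B| / |A ∪ B|. The sketch below computes it for every keyword pair; the function names and sample URLs are illustrative assumptions, and the heatmap rendering is omitted.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """|A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(serps: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Score every unordered keyword pair by the overlap of their ranking URLs."""
    return {
        (k1, k2): jaccard(serps[k1], serps[k2])
        for k1, k2 in combinations(sorted(serps), 2)
    }


# Made-up result sets for three keywords.
serps = {
    "instant coffee": {"a.com/1", "b.com/2", "c.com/3"},
    "best instant coffee": {"b.com/2", "c.com/3", "d.com/4"},
    "coffee grinder": {"e.com/5"},
}
scores = pairwise_similarity(serps)
for pair, score in scores.items():
    print(pair, round(score, 2))
```

A score near 1.0 suggests two keywords can share one page, while a score near 0.0 suggests they need separate content.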

6. Content Gap Analyzer

Find missing keywords and phrases compared to ranking competitors.

  • Use Case: Optimize existing content for better rankings
  • Method: N-gram frequency analysis (1, 2, and 3-grams)
  • Data Source: Trafilatura-based content extraction
  • Output: Gap report with competitor coverage metrics
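
One way to picture the gap analysis is shown below. This sketch assumes the page text has already been extracted, uses a simple regex tokenizer with no stop-word filtering, and measures coverage as the share of competitors that use a phrase. The names `ngrams` and `content_gaps` and the `min_coverage` threshold are hypothetical.

```python
import re
from collections import Counter
from typing import List, Set, Tuple


def ngrams(text: str, n_values: Tuple[int, ...] = (1, 2, 3)) -> Set[str]:
    """Return the set of 1-, 2-, and 3-grams found in text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    grams = set()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def content_gaps(
    my_text: str, competitor_texts: List[str], min_coverage: float = 0.5
) -> List[Tuple[str, float]]:
    """List n-grams missing from my_text, with the share of competitors using them."""
    if not competitor_texts:
        return []
    mine = ngrams(my_text)
    coverage = Counter()
    for text in competitor_texts:
        coverage.update(ngrams(text))  # a set, so each competitor counts once
    total = len(competitor_texts)
    gaps = {
        gram: count / total
        for gram, count in coverage.items()
        if gram not in mine and count / total >= min_coverage
    }
    return sorted(gaps.items(), key=lambda kv: (-kv[1], kv[0]))


my_page = "Decaf coffee tastes good"
competitors = [
    "Decaf coffee lowers blood pressure",
    "Decaf coffee and blood pressure",
    "Coffee beans",
]
gaps = content_gaps(my_page, competitors)
for gram, share in gaps:
    print(f"{gram}: used by {share:.0%} of competitors")
```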

7. AI Overview Monitor

Track domain visibility in Google's AI-generated search summaries.

  • Use Case: Monitor brand presence in AI-powered search
  • Tracking: Citation index and URL detection
  • Output: Coverage report with share-of-voice metrics
  • Metrics: AI trigger rate and citation frequency
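
The two metrics can be computed roughly as in the sketch below. The record layout (`ai_overview`, `cited_urls`) and the function names are assumptions for illustration and do not reflect the actual API response fields.

```python
from typing import Dict, List
from urllib.parse import urlparse


def cites_domain(urls: List[str], domain: str) -> bool:
    """True if any URL's host is the domain itself or one of its subdomains."""
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def coverage_report(results: List[dict], domain: str) -> Dict[str, float]:
    """Compute the AI trigger rate and how often the domain is cited when triggered."""
    total = len(results)
    triggered = [r for r in results if r["ai_overview"]]
    cited = [r for r in triggered if cites_domain(r["cited_urls"], domain)]
    return {
        "trigger_rate": len(triggered) / total if total else 0.0,
        "citation_rate": len(cited) / len(triggered) if triggered else 0.0,
    }


# Hypothetical per-keyword records.
results = [
    {"keyword": "k1", "ai_overview": True, "cited_urls": ["https://www.webmd.com/a"]},
    {"keyword": "k2", "ai_overview": True, "cited_urls": ["https://example.com/b"]},
    {"keyword": "k3", "ai_overview": False, "cited_urls": []},
    {"keyword": "k4", "ai_overview": False, "cited_urls": []},
]
report = coverage_report(results, "webmd.com")
print(report)
```

Matching on the parsed host rather than on a raw substring avoids counting look-alike domains such as `notwebmd.com`.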

📦 Installation

Prerequisites

Setup

  1. Clone the repository
git clone https://github.com/yourusername/seo-research-toolkit.git
cd seo-research-toolkit
  2. Install dependencies
pip install -r requirements.txt
  3. Configure API key (choose one method):

Option A: Environment variable

export HASDATA_API_KEY="your_api_key_here"

Option B: Configuration file

echo "your_api_key_here" > .hasdata_config

Option C: Interactive setup

python seo_manager.py
# Select option [9] to configure the API key
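
For reference, reading the key from the two locations used by Options A and B could look like the sketch below. The function name `load_api_key` and the lookup order (environment variable first, then file) are assumptions; the manager script may resolve the key differently.

```python
import os
from pathlib import Path


def load_api_key(config_path: str = ".hasdata_config") -> str:
    """Return the API key from HASDATA_API_KEY, else from the config file."""
    key = os.environ.get("HASDATA_API_KEY", "").strip()
    if key:
        return key
    path = Path(config_path)
    if path.is_file():
        key = path.read_text(encoding="utf-8").strip()
        if key:
            return key
    raise RuntimeError(
        "No API key found: set HASDATA_API_KEY or create " + config_path
    )


if __name__ == "__main__":
    try:
        print("Key loaded, length:", len(load_api_key()))
    except RuntimeError as exc:
        print(exc)
```

Stripping whitespace matters for Option B, because `echo` appends a trailing newline to the file.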

Usage

⚠️ IMPORTANT: CONFIGURATION REQUIRED BEFORE USE

All tools in this toolkit are managed via the central script seo_manager.py.

If you run a tool without configuring it first, the manager will silently use default placeholder values, which are unlikely to match your real intent.

To avoid misleading results, you should always configure:

  • [8] Configure Tool Settings - keywords, domains, geo, depth, limits, etc. for each tool
  • [9] Configure API Key - your HasData API key

⚠️ No validation error is thrown when defaults are used. Always review and set your parameters before running any tool.

Interactive Mode (Recommended)

python seo_manager.py

This launches a menu-driven interface where you can select and run any tool.

Direct Tool Execution

# Run specific tool by number
python seo_manager.py 1  # Google Suggest Harvester
python seo_manager.py 2  # Trends Analyzer
# ... etc

Individual Scripts

Each tool can also be run independently:

python google_suggest_harvester.py
python trends_breakout_analyzer.py
python paa_tree_builder.py
python serp_intent_classifier.py
python serp_similarity_matrix.py
python content_gap_analyzer.py
python ai_overview_monitor.py

⚙️ Configuration

Tool-Specific Settings

Each tool has configurable parameters at the top of the script:

Google Suggest Harvester

BASE_KEYWORD = "coffee"
MAX_DEPTH = 2  # 1 = a-z, 2 = aa-zz
MAX_WORKERS = 15  # Concurrent requests (check your plan limits)
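
As a rough picture of what MAX_WORKERS controls, the sketch below fans requests out over a thread pool and deduplicates the results. The `harvest` function and the `fake_fetch` stand-in are hypothetical; a real fetcher would call the autocomplete API and should respect your plan's rate limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def harvest(
    seeds: Iterable[str],
    fetch_suggestions: Callable[[str], List[str]],
    max_workers: int = 15,
) -> List[str]:
    """Fetch suggestions for every seed concurrently and return unique keywords."""
    unique = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in seed order while requests run in parallel.
        for suggestions in pool.map(fetch_suggestions, seeds):
            unique.update(s.strip().lower() for s in suggestions)
    return sorted(unique)


def fake_fetch(seed: str) -> List[str]:
    """Stand-in for a network call (assumption): returns two canned suggestions."""
    return [seed + " beans", "Coffee Maker"]


keywords = harvest(["coffee a", "coffee b"], fake_fetch, max_workers=2)
print(keywords)
```

Threads suit this workload because each worker spends most of its time waiting on the network.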

Trends Breakout Analyzer

SEED_TOPIC = "Coffee"
date = "now 7-d"  # Time range: now 1-d, now 7-d, today 12-m, etc.
geo = "US"  # Country code

PAA Tree Builder

ROOT_KEYWORD = "coffee"
MAX_DEPTH = 2  # Recursion levels

SERP Intent Classifier

KEYWORD = "instant coffee"
deviceType = "desktop"  # or "mobile"

SERP Similarity Matrix

KEYWORDS = ["keyword1", "keyword2", ...]  # List of related terms

Content Gap Analyzer

TARGET_KEYWORD = "health benefits of decaf coffee"
MY_URL = "https://example.com/your-article"
TOP_N_COMPETITORS = 10

AI Overview Monitor

TARGET_DOMAIN = "webmd.com"
KEYWORDS = ["keyword1", "keyword2", ...]

Output Examples

Google Suggest Harvester

Finished in 45.23 seconds. Average Speed: 12.34 req/s
Done. Collected 1847 unique keywords.
Saved to long_tail_keywords_hasdata.csv

Trends Breakout Analyzer

--- Rising Queries (The Opportunity) ---
[Growth: Breakout] mushroom coffee benefits
[Growth: +450%] decaf coffee health
...

PAA Tree Builder

- coffee
    - What are the health benefits of coffee?
        - Is coffee good for your heart?
        - Does coffee help with weight loss?

SERP Intent Classifier

Dominant Type: Informational (Blog) (60.0%)
Action: Create a long-form Guide or Blog Post.

More output examples in our article: Python for SEO


Troubleshooting

Common Issues

"No trend data found"

  • Topic may be too niche or misspelled
  • Try broader keywords or different geo-targeting

"AI Overview not triggered"

  • AI Overviews are region-specific (US has highest coverage)
  • Try different device types: desktop vs mobile

Content extraction returns empty text

  • Enable JS rendering by setting jsRendering: True

Advanced Workflows

Workflow 1: New Topic Research

1. Trends Breakout Analyzer → Find rising topics
2. Google Suggest Harvester → Extract long-tail variations
3. PAA Tree Builder → Map content structure
4. SERP Intent Classifier → Determine content type

Workflow 2: Content Optimization

1. SERP Similarity Matrix → Group related keywords
2. Content Gap Analyzer → Identify missing topics
3. AI Overview Monitor → Track visibility changes

Workflow 3: Competitive Intelligence

1. SERP Intent Classifier → Analyze competitor strategies
2. Content Gap Analyzer → Reverse-engineer top pages
3. SERP Similarity Matrix → Find unique positioning opportunities

Acknowledgments

  • HasData API for providing reliable SERP and proxy infrastructure
  • Trafilatura for content extraction
  • scikit-learn for NLP capabilities

Support


Made with ☕ by SEO Professionals, for SEO Professionals
