Data Enrichment Agent with LangGraph + Bright Data

Build an AI-powered data enrichment agent that automatically researches and extracts structured data from the web.

What It Does

Takes a research topic and a JSON schema as input
Searches the web using the Bright Data SERP API
Scrapes websites using the Bright Data Web Unlocker
Uses an LLM to extract and structure the data
Returns structured JSON matching your schema

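Prerequisites

Python 3 with pip
A Bright Data API key (used for the SERP API and Web Unlocker)
An Anthropic API key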

Quick Start

1. Install dependencies

pip install -r requirements.txt

2. Set environment variables

export BRIGHT_DATA_API_KEY="your-bright-data-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"

Or copy .env.example to .env and fill in your keys.
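A quick pre-flight check can catch a missing key before the agent makes any API calls. The sketch below is not part of enrichment_agent.py: the helper name missing_keys is hypothetical, and only the two variable names come from the step above.

```python
import os

# Variable names taken from step 2; the helper itself is illustrative only.
REQUIRED_KEYS = ("BRIGHT_DATA_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment:
print(missing_keys({"BRIGHT_DATA_API_KEY": "abc"}))  # → ['ANTHROPIC_API_KEY']
```

Calling missing_keys() with no argument reads os.environ. If you keep your keys in .env, the file has to be loaded into the environment first (for example with python-dotenv) for this check to see them.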

3. Run the agent

python enrichment_agent.py

Example output (the output is LLM-generated, so exact values may differ between runs):

{
  "company_name": "Stripe",
  "industry": "Financial Technology / Payments",
  "headquarters": "San Francisco, California",
  "founded": "2010",
  "key_products": [
    "Stripe Payments",
    "Stripe Billing",
    "Stripe Connect",
    "Stripe Atlas"
  ]
}
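Since the agent is meant to return JSON matching your schema, it can be useful to verify that before passing the result downstream. The following is a minimal sketch using only the standard library. It covers just the type, required, properties, and items keywords, and the function name schema_errors and the company_schema shown here are assumptions inferred from the example output, not taken from enrichment_agent.py. For full JSON Schema support, a dedicated validator such as the jsonschema package is the better choice.

```python
# Maps JSON Schema type names to Python types (only the subset used here).
TYPE_MAP = {
    "string": str,
    "array": list,
    "object": dict,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def schema_errors(value, schema, path="$"):
    """Return a list of mismatches between value and a simple JSON schema."""
    errors = []
    expected = schema.get("type")
    if expected in TYPE_MAP:
        # bool is a subclass of int in Python, so reject it for numeric types
        bool_as_number = isinstance(value, bool) and expected in ("number", "integer")
        if not isinstance(value, TYPE_MAP[expected]) or bool_as_number:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(schema_errors(value[key], sub_schema, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(schema_errors(item, schema["items"], f"{path}[{index}]"))
    return errors


# Hypothetical schema inferred from the example output above.
company_schema = {
    "type": "object",
    "required": ["company_name", "key_products"],
    "properties": {
        "company_name": {"type": "string"},
        "founded": {"type": "string"},
        "key_products": {"type": "array", "items": {"type": "string"}},
    },
}

result = {
    "company_name": "Stripe",
    "founded": "2010",
    "key_products": ["Stripe Payments", "Stripe Billing"],
}
print(schema_errors(result, company_schema))  # → []
```

An empty list means no mismatches were found. Each error string carries a path such as $.key_products[0], so a failing field is easy to locate.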

How It Works

┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│  Input   │────▶│  Search  │────▶│  Scrape  │────▶│ Extract  │
│ (topic + │     │ (SERP    │     │ (Web     │     │ (LLM     │
│  schema) │     │  API)    │     │ Unlocker)│     │ output)  │
└──────────┘     └──────────┘     └──────────┘     └──────────┘

The agent uses a LangGraph loop: the LLM decides whether to search, scrape a page, or submit the final structured result.
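The decide-then-act loop described above can be illustrated in plain Python. This is a sketch of the control flow only, not the code in enrichment_agent.py and not the LangGraph API: the State class, the decide function (a rule-based stand-in for the LLM), and the stub tools are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    topic: str
    search_results: list = field(default_factory=list)  # URLs found so far
    pages: dict = field(default_factory=dict)            # url -> scraped text
    result: Optional[dict] = None                        # final structured output


def decide(state):
    """Stand-in for the LLM: pick the next action from the current state."""
    if not state.search_results:
        return "search"
    if any(url not in state.pages for url in state.search_results):
        return "scrape"
    return "submit"


def run_loop(state, search, scrape, extract, max_steps=10):
    """Repeat decide -> act until a result is submitted or the budget runs out."""
    for _ in range(max_steps):
        action = decide(state)
        if action == "search":
            state.search_results = search(state.topic)
        elif action == "scrape":
            url = next(u for u in state.search_results if u not in state.pages)
            state.pages[url] = scrape(url)
        else:  # "submit"
            state.result = extract(state.pages)
            break
    return state


# Stub tools so the sketch runs offline; a real agent would call the
# Bright Data SERP API, Web Unlocker, and an LLM in their place.
state = run_loop(
    State(topic="Stripe"),
    search=lambda topic: ["https://example.com/a", "https://example.com/b"],
    scrape=lambda url: f"text of {url}",
    extract=lambda pages: {"pages_read": len(pages)},
)
print(state.result)  # → {'pages_read': 2}
```

The max_steps budget keeps the loop from running forever if the decision step never chooses to submit. In LangGraph, a loop of this shape is typically expressed as graph nodes joined by a conditional edge; the sketch avoids the library so that it runs without dependencies.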

Customization

Different schema

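import asyncio

from enrichment_agent import enrich

# JSON schema describing the structure you want back
schema = {
    "type": "object",
    "properties": {
        "competitors": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "market_position": {"type": "string"},
                    "key_differentiator": {"type": "string"}
                }
            }
        }
    }
}

# enrich() is a coroutine. In a script, run it with asyncio.run();
# in a notebook or async function, use `result = await enrich(...)` instead.
result = asyncio.run(enrich("Stripe competitors in payment processing", schema))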

Geo-targeted search

Modify the serp_tool in enrichment_agent.py:

serp_tool = BrightDataSERP(
    search_engine="google",
    country="de",      # Germany
    language="de",     # German
    results_count=10,
)
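If you target several markets, it may help to keep the locale settings in one table and build the keyword arguments from it. This is a sketch under assumptions: the MARKETS table and the serp_kwargs helper are hypothetical, and only the keyword names (search_engine, country, language, results_count) come from the snippet above.

```python
# Hypothetical market table; the keys mirror the keyword arguments above.
MARKETS = {
    "germany": {"country": "de", "language": "de"},
    "france": {"country": "fr", "language": "fr"},
    "usa": {"country": "us", "language": "en"},
}


def serp_kwargs(market, results_count=10):
    """Build keyword arguments for the SERP tool for one market."""
    try:
        locale = MARKETS[market.lower()]
    except KeyError:
        raise ValueError(f"Unknown market: {market!r}") from None
    return {"search_engine": "google", "results_count": results_count, **locale}


print(serp_kwargs("Germany"))
# → {'search_engine': 'google', 'results_count': 10, 'country': 'de', 'language': 'de'}
```

The returned dictionary can then be unpacked into the tool constructor, for example serp_tool = BrightDataSERP(**serp_kwargs("germany")).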

Use OpenAI instead of Anthropic

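from langchain_openai import ChatOpenAI

# Requires the langchain-openai package and OPENAI_API_KEY set in the environment
# (ANTHROPIC_API_KEY is no longer needed in that case).
# Replace the LLM initialization in create_agent()
llm = ChatOpenAI(model="gpt-4o")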

Links

License

MIT
