SEC Filing Agent -- Always-On Investment Research

Watch the demo on YouTube

An always-on agent that watches for new SEC filings (10-Q, 10-K) from ~500 companies, and automatically produces a comprehensive value-investing research brief with a visual dashboard -- triggered by the outside world, not by a human clicking "run."

Quick Start (for judges)

1. Install (one time)

pip install -U -r requirements.txt
playwright install chromium

2. Set your Gemini API key

# Create .env in the repo root
echo GEMINI_API_KEY=your-key-here > .env

3. Trigger a simulated filing event (fastest way to see it work)

Option A -- One command, no server needed:

python -m sec_agent.demo --ticker AAPL

This runs the full agent pipeline on Apple's latest 10-Q. Browser windows open visibly while the agent researches, and an HTML dashboard opens when the run finishes (about 2-3 minutes).

Option B -- Full webhook flow (the real "always-on" architecture):

# Terminal 1: start the webhook server + SEC poller
python -m sec_agent.webhook_server

# Terminal 2: simulate a filing event hitting the webhook
python -m sec_agent.simulate_filing --ticker AAPL

Or just open your browser to http://localhost:8000/simulate/AAPL -- one click.

Check progress at http://localhost:8000/jobs or http://localhost:8000/.
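
To follow a run from a script instead of the browser, the /jobs endpoint can be polled. The sketch below is illustrative only: it assumes the response is a JSON list of job objects that each carry a `status` field with values such as `running`, `completed`, or `failed`. The real response shape is defined in webhook_server.py and may differ, and the helper names are hypothetical.

```python
import json
import time
import urllib.request

JOBS_URL = "http://localhost:8000/jobs"  # endpoint from the Quick Start above


def summarize_jobs(jobs):
    """Count jobs per status. Assumes each job is a dict with a 'status' key."""
    counts = {}
    for job in jobs:
        status = job.get("status", "unknown")
        counts[status] = counts.get(status, 0) + 1
    return counts


def fetch_jobs(url=JOBS_URL):
    """GET the jobs endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_until_idle(poll_seconds=5.0, max_polls=60):
    """Poll until no job reports 'running' (an assumed status value)."""
    for _ in range(max_polls):
        counts = summarize_jobs(fetch_jobs())
        print(counts)
        if counts.get("running", 0) == 0:
            return counts
        time.sleep(poll_seconds)
    raise TimeoutError("jobs still running after the polling window")
```

With the webhook server running, calling `wait_until_idle()` prints a status count every few seconds and returns once nothing is marked as running.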

How It Works

  SEC EDGAR RSS Feed          Webhook Server (FastAPI)           Agent
  (updated every 10 min)      http://localhost:8000              (Gemini + tools)
        |                           |                                |
        |   poll every 5 min        |                                |
        +-------------------------->|                                |
        |   new 10-Q detected       |                                |
        |   for AAPL (CIK match)    |                                |
        |                           |   POST /webhook/filing         |
        |                           +------------------------------->|
        |                           |   {cik, ticker, form_type}     |
        |                           |                                |
        |                           |        12-18 tool calls        |
        |                           |   SEC API, Yahoo, Google,      |
        |                           |   webpage visits               |
        |                           |                                |
        |                           |   <--- structured JSON ---------+
        |                           |   + HTML dashboard              |
        |                           |                                |

The three layers

Layer	What it does	Code
Poller	Polls SEC's global XBRL RSS feed, filters by a watchlist of ~500 S&P 500 companies, detects new 10-Q/10-K filings	`poller.py`, `watchlist.py`
Webhook Server	FastAPI server receives filing events (from poller or simulation), spawns agent in background	`webhook_server.py`
Research Agent	Gemini-powered agent with 8 tools, performs 12-18 tool calls, outputs structured JSON + HTML dashboard	`agent.py`, `tools.py`, `models.py`
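
The poller's filtering step can be pictured as a pure function over feed entries. This is a minimal sketch, not the code in poller.py: the entry fields (`cik`, `form_type`, `accession`), the small `WATCHLIST` mapping, and the `seen` set used for de-duplication are assumptions made for illustration.

```python
# Hypothetical watchlist: zero-padded CIK -> ticker (the real one is built in watchlist.py).
WATCHLIST = {"0000320193": "AAPL", "0000789019": "MSFT"}
TARGET_FORMS = {"10-Q", "10-K"}


def new_filing_events(entries, watchlist, seen):
    """Return webhook payloads for watched, not-yet-seen 10-Q/10-K entries.

    entries: iterable of dicts with 'cik', 'form_type', 'accession' (assumed shape).
    seen: set of accession numbers already dispatched; updated in place.
    """
    events = []
    for entry in entries:
        cik = str(entry["cik"]).zfill(10)  # normalize to the 10-digit CIK form
        if entry["form_type"] not in TARGET_FORMS:
            continue  # ignore 8-K and other form types
        if cik not in watchlist or entry["accession"] in seen:
            continue  # not a watched company, or already handled
        seen.add(entry["accession"])
        events.append(
            {"cik": cik, "ticker": watchlist[cik], "form_type": entry["form_type"]}
        )
    return events
```

Each returned dict has the `{cik, ticker, form_type}` shape shown in the diagram, so it could be POSTed to /webhook/filing as-is. Tracking accession numbers in `seen` is one simple way to avoid re-triggering the agent when the same filing shows up in consecutive polls.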

What comes from where -- data provenance

The dashboard separates hard data pulled from APIs from LLM-generated analysis:

Directly from APIs (no LLM involvement)

Data	Source	Tool
Revenue, Net Income, EPS, Assets, Cash (numbers)	SEC EDGAR XBRL API	`sec_get_xbrl_concept`
Filing metadata (date, accession, form type)	SEC EDGAR Submissions API	`sec_list_recent_filings`
Filing text (MD&A, Risk Factors, Liquidity)	SEC EDGAR Archives (raw HTML)	`sec_fetch_filing_excerpt`
Stock price, volume, 52-week range, 5-day history	Yahoo Finance API via `yfinance`	`get_stock_price`
Google search results and links	Google (via Playwright browser)	`search_google`
Article content	Live webpage (via crawl4ai browser)	`visit_webpage`
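
The first two rows correspond to public SEC EDGAR endpoints on data.sec.gov. As a rough sketch of what such requests look like (the helper names are hypothetical, and sec_data.py may build its requests differently), the URLs and the required User-Agent header can be assembled like this:

```python
import json
import urllib.request

# Placeholder default from the Configuration table; replace with a real name and email.
SEC_USER_AGENT = "YourName your@email.com"


def pad_cik(cik):
    """Normalize a CIK to the 10-digit, zero-padded form used in EDGAR URLs."""
    return str(int(cik)).zfill(10)


def submissions_url(cik):
    """Filing metadata for one company (Submissions API)."""
    return f"https://data.sec.gov/submissions/CIK{pad_cik(cik)}.json"


def company_concept_url(cik, concept, taxonomy="us-gaap"):
    """All reported values of one XBRL concept for one company."""
    return (
        "https://data.sec.gov/api/xbrl/companyconcept/"
        f"CIK{pad_cik(cik)}/{taxonomy}/{concept}.json"
    )


def fetch_json(url, user_agent=SEC_USER_AGENT):
    """GET a data.sec.gov URL, sending the User-Agent header the SEC asks for."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

For example, `fetch_json(company_concept_url("0000320193", "NetIncomeLoss"))` would request Apple's reported net income values; no API key is involved, only the header.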

Generated by the LLM (Gemini)

Data	What the LLM does
Key Metrics cards	Formats raw XBRL numbers into `current_value` / `prior_value` / `yoy_change_percent` -- the numbers are real, the LLM computes the YoY %
Score rings (0-100)	LLM rates six dimensions (revenue growth, profitability, balance sheet, management, sentiment, risk) based on the data it collected
Scored Findings table	LLM identifies the 5-12 most important findings and classifies each by category, importance, sentiment, and surprise factor
Bullet-point sections	LLM synthesizes raw data into structured bullet points for Fundamentals, Management & Operations, Risks, Market & News, and Value Investor Takeaway
Overall sentiment / importance	LLM's aggregate judgment: bullish/bearish/neutral + high/medium/low
One-line summary	A single sentence for a portfolio manager
Action flags	1-3 things to watch going forward

Key design principle: The LLM never invents financial numbers. Every reported figure in "Fundamentals" traces back to an sec_get_xbrl_concept API call; the only derived number is the YoY %, which the LLM computes from those values. The LLM's role is to read, contextualize, and score, not to fabricate.
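
Because the YoY % is the one number that is derived rather than fetched, it is easy to cross-check outside the LLM. The usual formula is (current - prior) / |prior| x 100. The sketch below is not part of the project; the function name and the choice to return None when the prior value is missing or zero are assumptions.

```python
def yoy_change_percent(current_value, prior_value):
    """Year-over-year change in percent, or None when it is undefined.

    Dividing by abs(prior_value) keeps the sign meaningful when the prior
    period was negative (for example, a loss that narrowed).
    """
    if current_value is None or prior_value is None or prior_value == 0:
        return None
    return (current_value - prior_value) * 100.0 / abs(prior_value)
```

A check like this could be run against the `current_value` / `prior_value` pair in each metric card to confirm that the reported `yoy_change_percent` matches the underlying XBRL numbers.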

The Agent's Research Strategy

The agent follows a strict 7-step protocol (12-18 tool calls total):

1. sec_list_recent_filings -- Find the latest 10-Q or 10-K with accession number
2. sec_get_xbrl_concept x4-6 -- Pull structured financial data: Revenue, Net Income, Operating Income, EPS, Total Assets, Cash
3. get_stock_price -- Live price, volume, 5-day history from Yahoo Finance
4. sec_fetch_filing_excerpt x2-3 -- Read actual filing text for MD&A ("Results of Operations"), Risk Factors, and Liquidity sections
5. search_google x2 -- Search for market reaction and analyst commentary
6. visit_webpage x1 -- Read the most relevant article in full
7. submit_result -- Produce the final structured JSON with all fields filled

You can watch the agent work in real time: with BROWSER_VISIBLE left at its default of true, both Playwright (Google search) and crawl4ai (article reading) open visible browser windows during execution.
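
As a sanity check on the call budget, the per-step counts in the protocol can be summed. They come to 12-15 calls, so the stated 12-18 range leaves a few calls of headroom; what the extra calls are used for is not specified here. The table below simply restates the seven steps, and the names `PROTOCOL` and `call_budget` are illustrative.

```python
# (tool, min calls, max calls), copied from the seven protocol steps.
PROTOCOL = [
    ("sec_list_recent_filings", 1, 1),
    ("sec_get_xbrl_concept", 4, 6),
    ("get_stock_price", 1, 1),
    ("sec_fetch_filing_excerpt", 2, 3),
    ("search_google", 2, 2),
    ("visit_webpage", 1, 1),
    ("submit_result", 1, 1),
]


def call_budget(protocol):
    """Return (minimum, maximum) total tool calls for a protocol table."""
    minimum = sum(lo for _, lo, _ in protocol)
    maximum = sum(hi for _, _, hi in protocol)
    return minimum, maximum


MIN_CALLS, MAX_CALLS = call_budget(PROTOCOL)
print(MIN_CALLS, MAX_CALLS)  # 12 15
```

Either bound sits well below the default AGENT_MAX_TURNS of 40, so the turn limit acts as a safety cap rather than a constraint on a normal run.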

Architecture

sec_agent/
  webhook_server.py    # FastAPI server + background poller thread
  simulate_filing.py   # CLI to fake-POST a filing event for testing
  poller.py            # Polls SEC XBRL RSS, filters by watchlist
  watchlist.py         # Builds CIK list for ~500 S&P 500 companies
  agent.py             # Agent definition, instructions, Gemini config
  tools.py             # 8 function_tools (SEC, Yahoo, Google, browser)
  models.py            # Pydantic schemas (FilingAnalysisResult, KeyMetric, ScoredItem)
  sec_data.py          # SEC API helpers (submissions, XBRL, document fetch)
  dashboard.py         # HTML dashboard generator with company selector
  demo.py              # One-shot demo runner (no server needed)
  browser.py           # Playwright browser manager
  crawl_browser.py     # crawl4ai browser manager
  config.py            # Central configuration (env vars, defaults)

Tech Stack

OpenAI Agents SDK -- agent orchestration, tool calling, structured output
LiteLLM + Gemini -- LLM backend (gemini-3.1-flash-lite-preview)
Pydantic -- strict typed schemas for agent output
FastAPI + uvicorn -- webhook server
Playwright -- visible browser automation for Google searches
crawl4ai -- web page content extraction
yfinance -- live stock prices
SEC EDGAR APIs -- free, no key needed (just a User-Agent header)

Webhook Endpoints

Method	Path	Description
`POST`	`/webhook/filing`	Trigger agent with `{cik, ticker, form_type, ...}`
`GET`	`/simulate/{ticker}`	One-click test -- simulates a filing for AAPL, MSFT, TSLA, etc.
`GET`	`/jobs`	JSON status of all running/completed/failed jobs
`GET`	`/health`	Health check
`GET`	`/`	Mini web dashboard with endpoints and job status
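
The "spawn the agent in the background, report status on /jobs" behavior behind these endpoints can be sketched with the standard library alone. This is not the FastAPI code in webhook_server.py: `JobRegistry`, its method names, and the job fields are hypothetical, and a plain thread stands in for whatever background mechanism the server actually uses.

```python
import threading
import uuid


class JobRegistry:
    """Thread-safe record of agent runs, keyed by job id."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def submit(self, payload, run_agent):
        """Start run_agent(payload) in a background thread; return (job_id, thread)."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {
                "ticker": payload.get("ticker"),
                "status": "running",
                "error": None,
            }
        thread = threading.Thread(
            target=self._run, args=(job_id, payload, run_agent), daemon=True
        )
        thread.start()
        return job_id, thread

    def _run(self, job_id, payload, run_agent):
        try:
            run_agent(payload)
            status, error = "completed", None
        except Exception as exc:  # record the failure instead of losing it
            status, error = "failed", str(exc)
        with self._lock:
            self._jobs[job_id].update(status=status, error=error)

    def snapshot(self):
        """Return a copy of all jobs, suitable for a JSON status response."""
        with self._lock:
            return {job_id: dict(job) for job_id, job in self._jobs.items()}
```

In this sketch a webhook handler would call `submit(...)` and return immediately, which is what lets the sender get a fast response while the multi-minute research run continues; a status endpoint would return `snapshot()`.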

Example POST body

{
  "cik": "0000320193",
  "ticker": "AAPL",
  "form_type": "10-Q",
  "entity_name": "Apple Inc.",
  "filing_date": "2025-01-31"
}
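
For reference, here is one way to send that body from Python using only the standard library. It is a sketch of a client, not the project's simulate_filing.py; the function names are hypothetical, and the URL assumes the server is running locally on port 8000 as in the Quick Start.

```python
import json
import urllib.request

WEBHOOK_URL = "http://localhost:8000/webhook/filing"

EXAMPLE_PAYLOAD = {
    "cik": "0000320193",
    "ticker": "AAPL",
    "form_type": "10-Q",
    "entity_name": "Apple Inc.",
    "filing_date": "2025-01-31",
}


def build_filing_request(payload, url=WEBHOOK_URL):
    """Build (but do not send) a JSON POST request for a filing event."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_filing_event(payload, url=WEBHOOK_URL):
    """Send the event and return (HTTP status, response text)."""
    with urllib.request.urlopen(build_filing_request(payload, url), timeout=10) as resp:
        return resp.status, resp.read().decode("utf-8")
```

With the server from Option B running, `send_filing_event(EXAMPLE_PAYLOAD)` should trigger a run that then appears under /jobs.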

Output

Each run produces two files in sec_agent/output/:

demo_AAPL.json -- full structured result (~50 fields)
demo_AAPL.html -- interactive dashboard with score rings, metric cards, bullet-point analysis, scored findings table, and company selector

When multiple companies have been analyzed, the HTML dashboard includes a dropdown selector to switch between them.
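
The output files are plain JSON and HTML, so they are easy to consume from other scripts. The sketch below lists which tickers have a complete pair of files and loads one result. It assumes the `demo_<TICKER>` naming shown above applies to every run, which may not hold for webhook-triggered runs; the function names are hypothetical.

```python
import json
from pathlib import Path

DEFAULT_OUTPUT_DIR = "sec_agent/output"


def list_analyzed_tickers(output_dir=DEFAULT_OUTPUT_DIR):
    """Return sorted tickers that have both demo_<TICKER>.json and .html."""
    tickers = []
    for json_path in sorted(Path(output_dir).glob("demo_*.json")):
        if json_path.with_suffix(".html").exists():
            tickers.append(json_path.stem[len("demo_"):])
    return tickers


def load_result(ticker, output_dir=DEFAULT_OUTPUT_DIR):
    """Load the structured JSON result for one ticker."""
    path = Path(output_dir) / f"demo_{ticker}.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

A list like the one returned by `list_analyzed_tickers()` is the kind of input a company selector needs, although dashboard.py may gather it differently.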

Configuration

All settings live in sec_agent/config.py with env-var overrides:

Variable	Default	Purpose
`GEMINI_API_KEY`	(required)	Google AI / Gemini API key
`AGENT_MODEL`	`gemini/gemini-3.1-flash-lite-preview`	LLM model
`SEC_USER_AGENT`	`YourName your@email.com`	SEC fair-access header (name + email required by SEC)
`BROWSER_VISIBLE`	`true`	Show browser windows during research
`AGENT_MAX_TURNS`	`40`	Max tool-call turns
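
The defaults-plus-override pattern in the table can be sketched as follows. This is an illustration of the pattern, not the contents of sec_agent/config.py: the `Settings` class, `load_settings`, and the accepted spellings of boolean values are assumptions.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    agent_model: str
    sec_user_agent: str
    browser_visible: bool
    agent_max_turns: int


def load_settings(env=None):
    """Build Settings from environment variables, falling back to the table's defaults."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required")
    return Settings(
        gemini_api_key=api_key,
        agent_model=env.get("AGENT_MODEL", "gemini/gemini-3.1-flash-lite-preview"),
        sec_user_agent=env.get("SEC_USER_AGENT", "YourName your@email.com"),
        browser_visible=_as_bool(env.get("BROWSER_VISIBLE", "true")),
        agent_max_turns=int(env.get("AGENT_MAX_TURNS", "40")),
    )
```

Passing a plain dict as `env` makes the function easy to test without touching the real environment, for example `load_settings({"GEMINI_API_KEY": "k", "BROWSER_VISIBLE": "false"})` for a headless run.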

Hackathon Track: Always-On Agents

This project fits the Always-On Agents track:

Triggered by the outside world -- the SEC RSS poller detects new filings and POSTs to the webhook, just like a GitHub webhook fires on a push
Stays useful after the demo -- leave the server running and it will analyze every new 10-Q/10-K from S&P 500 companies as they're filed
Not a one-off manual run -- the /simulate/{ticker} endpoint exists only for easy testing; in production the poller handles everything automatically

License

Hackathon project, released under the MIT License.
