dimknaf/cursor-hackathon


SEC Filing Agent -- Always-On Investment Research

Demo Video

Watch the demo on YouTube

An always-on agent that watches for new SEC filings (10-Q, 10-K) from ~500 companies and automatically produces a comprehensive value-investing research brief with a visual dashboard -- triggered by the outside world, not by a human clicking "run."

Quick Start (for judges)

1. Install (one time)

pip install -U -r requirements.txt
playwright install chromium

2. Set your Gemini API key

# Create .env in the repo root
echo GEMINI_API_KEY=your-key-here > .env

3. Trigger a simulated filing event (fastest way to see it work)

Option A -- One command, no server needed:

python -m sec_agent.demo --ticker AAPL

This runs the full agent pipeline on Apple's latest 10-Q. Browsers open visibly, the agent researches, and an HTML dashboard pops up when done (~2-3 min).

Option B -- Full webhook flow (the real "always-on" architecture):

# Terminal 1: start the webhook server + SEC poller
python -m sec_agent.webhook_server

# Terminal 2: simulate a filing event hitting the webhook
python -m sec_agent.simulate_filing --ticker AAPL

Or just open your browser to http://localhost:8000/simulate/AAPL -- one click.

Check progress at http://localhost:8000/jobs or http://localhost:8000/.


How It Works

  SEC EDGAR RSS Feed          Webhook Server (FastAPI)           Agent
  (updated every 10 min)      http://localhost:8000              (Gemini + tools)
        |                           |                                |
        |   poll every 5 min        |                                |
        +-------------------------->|                                |
        |   new 10-Q detected       |                                |
        |   for AAPL (CIK match)    |                                |
        |                           |   POST /webhook/filing         |
        |                           +------------------------------->|
        |                           |   {cik, ticker, form_type}     |
        |                           |                                |
        |                           |        12-18 tool calls        |
        |                           |   SEC API, Yahoo, Google,      |
        |                           |   webpage visits               |
        |                           |                                |
        |                           |<------ structured JSON --------+
        |                           |   + HTML dashboard             |
        |                           |                                |

The three layers

| Layer | What it does | Code |
|---|---|---|
| Poller | Polls SEC's global XBRL RSS feed, filters by a watchlist of ~500 S&P 500 companies, detects new 10-Q/10-K filings | poller.py, watchlist.py |
| Webhook Server | FastAPI server receives filing events (from poller or simulation), spawns agent in background | webhook_server.py |
| Research Agent | Gemini-powered agent with 8 tools, performs 12-18 tool calls, outputs structured JSON + HTML dashboard | agent.py, tools.py, models.py |
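
The poller's filtering step can be sketched roughly as follows. This is a minimal illustration, not the code in poller.py: FeedEntry, normalize_cik, and new_filings are hypothetical names, and the sketch assumes each feed item has already been parsed into a CIK, a form type, and an accession number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedEntry:
    """Hypothetical, simplified view of one parsed RSS feed item."""
    cik: str
    form_type: str
    accession: str


WATCHED_FORMS = {"10-Q", "10-K"}


def normalize_cik(cik: str) -> str:
    """Zero-pad a CIK to 10 digits so '320193' and '0000320193' compare equal."""
    return cik.strip().zfill(10)


def new_filings(entries, watchlist, seen):
    """Return entries that match the watchlist and form types and were not seen before.

    `seen` is a set of accession numbers; it is updated in place so that a
    later poll does not report the same filing twice.
    """
    watched = {normalize_cik(c) for c in watchlist}
    fresh = []
    for entry in entries:
        if entry.form_type not in WATCHED_FORMS:
            continue  # ignore 8-K, S-1, and other form types
        if normalize_cik(entry.cik) not in watched:
            continue  # company is not on the watchlist
        if entry.accession in seen:
            continue  # already handled in an earlier poll
        seen.add(entry.accession)
        fresh.append(entry)
    return fresh
```

Each returned entry would then be POSTed to /webhook/filing. Tracking accession numbers is one simple way to keep a repeated poll from triggering the agent twice for the same filing.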

What comes from where -- data provenance

The dashboard clearly separates hard data from APIs vs. LLM-generated analysis:

Directly from APIs (no LLM involvement)

| Data | Source | Tool |
|---|---|---|
| Revenue, Net Income, EPS, Assets, Cash (numbers) | SEC EDGAR XBRL API | sec_get_xbrl_concept |
| Filing metadata (date, accession, form type) | SEC EDGAR Submissions API | sec_list_recent_filings |
| Filing text (MD&A, Risk Factors, Liquidity) | SEC EDGAR Archives (raw HTML) | sec_fetch_filing_excerpt |
| Stock price, volume, 52-week range, 5-day history | Yahoo Finance API via yfinance | get_stock_price |
| Google search results and links | Google (via Playwright browser) | search_google |
| Article content | Live webpage (via crawl4ai browser) | visit_webpage |

Generated by the LLM (Gemini)

| Data | What the LLM does |
|---|---|
| Key Metrics cards | Formats raw XBRL numbers into current_value / prior_value / yoy_change_percent -- the numbers are real, the LLM computes the YoY % |
| Score rings (0-100) | LLM rates six dimensions (revenue growth, profitability, balance sheet, management, sentiment, risk) based on the data it collected |
| Scored Findings table | LLM identifies the 5-12 most important findings and classifies each by category, importance, sentiment, and surprise factor |
| Bullet-point sections | LLM synthesizes raw data into structured bullet points for Fundamentals, Management & Operations, Risks, Market & News, and Value Investor Takeaway |
| Overall sentiment / importance | LLM's aggregate judgment: bullish/bearish/neutral + high/medium/low |
| One-line summary | A single sentence for a portfolio manager |
| Action flags | 1-3 things to watch going forward |

Key design principle: The LLM never invents financial numbers. Every figure in "Fundamentals" traces back to an sec_get_xbrl_concept API call. The LLM's role is to read, contextualize, and score -- not fabricate.
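
Because the YoY percentage is the one derived number the LLM computes itself, it could be cross-checked deterministically against the raw XBRL values. The sketch below shows one way to do that. It is an illustration only: the function names and the 0.1-point tolerance are assumptions, and nothing here is claimed to exist in the repository.

```python
def yoy_change_percent(current_value: float, prior_value: float):
    """Year-over-year change in percent, or None when the prior value is zero."""
    if prior_value == 0:
        return None  # percentage change is undefined against a zero base
    # abs() in the denominator keeps the sign meaningful when the prior
    # value is negative (for example, a prior-year net loss).
    return (current_value - prior_value) / abs(prior_value) * 100.0


def yoy_matches(reported, current_value, prior_value, tolerance=0.1):
    """True when an LLM-reported YoY % agrees with the recomputed value.

    `tolerance` is in percentage points and is an arbitrary choice here.
    """
    expected = yoy_change_percent(current_value, prior_value)
    if expected is None:
        return reported is None
    return reported is not None and abs(reported - expected) <= tolerance
```

For example, with current_value=110 and prior_value=100 the recomputed change is 10%, so a reported 25% would be flagged as a mismatch.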


The Agent's Research Strategy

The agent follows a strict 7-step protocol (12-18 tool calls total):

  1. sec_list_recent_filings -- Find the latest 10-Q or 10-K with accession number
  2. sec_get_xbrl_concept x4-6 -- Pull structured financial data: Revenue, Net Income, Operating Income, EPS, Total Assets, Cash
  3. get_stock_price -- Live price, volume, 5-day history from Yahoo Finance
  4. sec_fetch_filing_excerpt x2-3 -- Read actual filing text for MD&A ("Results of Operations"), Risk Factors, and Liquidity sections
  5. search_google x2 -- Search for market reaction and analyst commentary
  6. visit_webpage x1 -- Read the most relevant article in full
  7. submit_result -- Produce the final structured JSON with all fields filled

You can watch the agent work in real time: both Playwright (Google search) and crawl4ai (article reading) open visible browser windows during execution.
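
The seven steps and their call counts can also be written down as data, which makes it easy to check a recorded run against the protocol. The sketch below is illustrative: PROTOCOL, call_budget, and check_trace are hypothetical names, not taken from agent.py. Note that the per-step counts listed above sum to 12-15 calls; the stated ceiling of 18 presumably leaves headroom for repeated calls.

```python
from collections import Counter

# (tool name, minimum calls, maximum calls), copied from the seven steps above.
PROTOCOL = [
    ("sec_list_recent_filings", 1, 1),
    ("sec_get_xbrl_concept", 4, 6),
    ("get_stock_price", 1, 1),
    ("sec_fetch_filing_excerpt", 2, 3),
    ("search_google", 2, 2),
    ("visit_webpage", 1, 1),
    ("submit_result", 1, 1),
]


def call_budget(protocol=PROTOCOL):
    """Return (min_total, max_total) tool calls implied by the per-step ranges."""
    return (
        sum(low for _, low, _ in protocol),
        sum(high for _, _, high in protocol),
    )


def check_trace(trace, protocol=PROTOCOL):
    """Return (tool, count) pairs whose call count falls outside the listed range.

    `trace` is a list of tool names in the order they were called.
    An empty result means the run stayed within the protocol.
    """
    counts = Counter(trace)
    return [
        (tool, counts[tool])
        for tool, low, high in protocol
        if not low <= counts[tool] <= high
    ]
```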


Architecture

sec_agent/
  webhook_server.py    # FastAPI server + background poller thread
  simulate_filing.py   # CLI to fake-POST a filing event for testing
  poller.py            # Polls SEC XBRL RSS, filters by watchlist
  watchlist.py         # Builds CIK list for ~500 S&P 500 companies
  agent.py             # Agent definition, instructions, Gemini config
  tools.py             # 8 function_tools (SEC, Yahoo, Google, browser)
  models.py            # Pydantic schemas (FilingAnalysisResult, KeyMetric, ScoredItem)
  sec_data.py          # SEC API helpers (submissions, XBRL, document fetch)
  dashboard.py         # HTML dashboard generator with company selector
  demo.py              # One-shot demo runner (no server needed)
  browser.py           # Playwright browser manager
  crawl_browser.py     # crawl4ai browser manager
  config.py            # Central configuration (env vars, defaults)

Tech Stack

  • OpenAI Agents SDK -- agent orchestration, tool calling, structured output
  • LiteLLM + Gemini -- LLM backend (gemini-3.1-flash-lite-preview)
  • Pydantic -- strict typed schemas for agent output
  • FastAPI + uvicorn -- webhook server
  • Playwright -- visible browser automation for Google searches
  • crawl4ai -- web page content extraction
  • yfinance -- live stock prices
  • SEC EDGAR APIs -- free, no key needed (just a User-Agent header)
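
To illustrate the last point, here is a minimal sketch of a keyless request to SEC EDGAR's public XBRL "company concept" endpoint using only the standard library. Whether sec_get_xbrl_concept uses this exact endpoint is an assumption; concept_url, build_request, and fetch_concept are hypothetical helper names.

```python
import json
import urllib.request

SEC_XBRL_BASE = "https://data.sec.gov/api/xbrl/companyconcept"


def concept_url(cik: str, concept: str, taxonomy: str = "us-gaap") -> str:
    """Build the URL for one XBRL concept; the CIK is zero-padded to 10 digits."""
    return f"{SEC_XBRL_BASE}/CIK{cik.strip().zfill(10)}/{taxonomy}/{concept}.json"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Attach the User-Agent header (name + email) to the request."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_concept(cik: str, concept: str, user_agent: str) -> dict:
    """Download and decode the facts for one concept (performs a network call)."""
    request = build_request(concept_url(cik, concept), user_agent)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as fetch_concept("320193", "NetIncomeLoss", "YourName your@email.com") would request Apple's reported net income facts; replace the placeholder User-Agent with a real name and email.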

Webhook Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /webhook/filing | Trigger agent with {cik, ticker, form_type, ...} |
| GET | /simulate/{ticker} | One-click test -- simulates a filing for AAPL, MSFT, TSLA, etc. |
| GET | /jobs | JSON status of all running/completed/failed jobs |
| GET | /health | Health check |
| GET | / | Mini web dashboard with endpoints and job status |

Example POST body

{
  "cik": "0000320193",
  "ticker": "AAPL",
  "form_type": "10-Q",
  "entity_name": "Apple Inc.",
  "filing_date": "2025-01-31"
}
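
A body like the one above can be sent from Python with the standard library alone. This is a sketch under assumptions: treating cik, ticker, and form_type as the required fields is inferred from the endpoint description, the response is assumed to be JSON, and build_filing_request and post_filing are hypothetical names rather than part of simulate_filing.py.

```python
import json
import urllib.request

# Assumed minimum set of fields, based on the {cik, ticker, form_type, ...} description.
REQUIRED_FIELDS = ("cik", "ticker", "form_type")


def build_filing_request(payload: dict, base_url: str = "http://localhost:8000"):
    """Validate the payload and wrap it in a JSON POST request to /webhook/filing."""
    missing = [field for field in REQUIRED_FIELDS if not payload.get(field)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return urllib.request.Request(
        f"{base_url}/webhook/filing",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_filing(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """Send the event to a running webhook server (performs a network call)."""
    request = build_filing_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)  # assumes the server replies with JSON
```

With the server from Option B running, post_filing({"cik": "0000320193", "ticker": "AAPL", "form_type": "10-Q"}) should queue a job that then shows up under /jobs.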

Output

Each run produces two files in sec_agent/output/:

  • demo_AAPL.json -- full structured result (~50 fields)
  • demo_AAPL.html -- interactive dashboard with score rings, metric cards, bullet-point analysis, scored findings table, and company selector

When multiple companies have been analyzed, the HTML dashboard includes a dropdown selector to switch between them.
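
As a rough sketch of how the analyzed companies could be discovered for such a selector, the snippet below scans the output directory for result files. It assumes every run follows the demo_<TICKER>.json / demo_<TICKER>.html naming seen above; analyzed_tickers is a hypothetical helper, not a function from dashboard.py, and it requires Python 3.9+ for str.removeprefix.

```python
from pathlib import Path


def analyzed_tickers(output_dir) -> list:
    """Return sorted tickers that have both a JSON result and an HTML dashboard."""
    out = Path(output_dir)
    return sorted(
        path.stem.removeprefix("demo_")
        for path in out.glob("demo_*.json")
        if path.with_suffix(".html").exists()  # skip runs with no dashboard yet
    )
```

Calling analyzed_tickers("sec_agent/output") after a few runs would return the tickers available to switch between.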


Configuration

All settings live in sec_agent/config.py with env-var overrides:

| Variable | Default | Purpose |
|---|---|---|
| GEMINI_API_KEY | (required) | Google AI / Gemini API key |
| AGENT_MODEL | gemini/gemini-3.1-flash-lite-preview | LLM model |
| SEC_USER_AGENT | YourName your@email.com | SEC fair-access header (name + email required by SEC) |
| BROWSER_VISIBLE | true | Show browser windows during research |
| AGENT_MAX_TURNS | 40 | Max tool-call turns |
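
The env-var override pattern described here can be sketched as follows. The variable names and defaults come from the table above; the Settings class, load_settings, and the set of strings accepted as "true" are assumptions for illustration and may differ from what sec_agent/config.py actually does.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the five documented settings."""
    gemini_api_key: str
    agent_model: str
    sec_user_agent: str
    browser_visible: bool
    agent_max_turns: int


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required")
    visible = env.get("BROWSER_VISIBLE", "true").strip().lower()
    return Settings(
        gemini_api_key=api_key,
        agent_model=env.get("AGENT_MODEL", "gemini/gemini-3.1-flash-lite-preview"),
        sec_user_agent=env.get("SEC_USER_AGENT", "YourName your@email.com"),
        browser_visible=visible in {"1", "true", "yes", "on"},
        agent_max_turns=int(env.get("AGENT_MAX_TURNS", "40")),
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"GEMINI_API_KEY": "k", "BROWSER_VISIBLE": "false"}), makes the override behavior easy to exercise without touching the real environment.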

Hackathon Track: Always-On Agents

This project fits the Always-On Agents track:

  • Triggered by the outside world -- the SEC RSS poller detects new filings and POSTs to the webhook, just like a GitHub webhook fires on a push
  • Stays useful after the demo -- leave the server running and it will analyze every new 10-Q/10-K from S&P 500 companies as they're filed
  • Not a one-off manual run -- the /simulate/{ticker} endpoint exists only for easy testing; in production the poller handles everything automatically

License

Hackathon project -- MIT.

About

Earnings Release Analyser Agent
