
pinglucid/polymarket-bot


Polymarket Trading Bot

An autonomous trading agent for Polymarket. Claude analyzes markets without seeing the price,
calibrates its own confidence over time, and sizes every position with Kelly criterion.


Thesis · Quick Start · How It Works · Strategies · AI Pipeline · Edge & Sizing · Risk Management · Configuration · Deployment

Caution

Paper mode is the default — virtual capital, live market data, no risk. Live mode is W.I.P and not yet finalized. Start with paper.



Live TUI dashboard — portfolio stats, strategy activity, pipeline feed, recent trades, and streaming logs


Thesis

Can an LLM find genuine edge in prediction markets, or is it just expensive noise?

This bot is a working experiment. It gives Claude autonomous decision-making power over real markets — then constrains that autonomy with calibration tracking, risk gates, and position sizing math. Every prediction is logged, scored, and used to correct future estimates. The architecture assumes the model is wrong by default and builds accountability into every layer.


Highlights

10 strategies
6 AI analysts · weather ensemble · structural scanner · contrarian · sentiment — 4 active by default

Anti-anchoring research
Market price withheld during AI evidence gathering to prevent LLM anchoring bias

Bayesian update
Model confronts its blind estimate with market price and reasons about the gap — replaces mechanical blend

Self-calibrating
Platt alpha auto-tuned per strategy by minimizing Brier score over resolved predictions

Dual LLM providers
Gemini for grounded research · Claude for ensemble estimation · automatic fallback

Thesis-based position management
LLM re-evaluates held positions every 30 min with adaptive frequency by deadline


Prerequisites

Requirement Purpose Required
Python 3.12+ Runtime Yes
Node.js 20+ Claude CLI host Yes
Claude CLI AI analysis + validation Yes
ANTHROPIC_API_KEY Claude API access Yes
GOOGLE_API_KEY Gemini research provider Recommended
Polymarket credentials Live trading (CLOB signing) Live only

Tip

scripts/setup.sh installs Python deps and generates .env, but does not install Node.js or Claude CLI. Without the CLI, the bot will start successfully but fail on the first AI analysis tick — --dry-run will not catch this.


Quick Start

git clone <repo-url> && cd polymarket-bot
./scripts/setup.sh                                    # creates venv, installs Python deps, generates .env
npm install -g @anthropic-ai/claude-code       # Claude CLI -- NOT installed by setup.sh
export ANTHROPIC_API_KEY="sk-ant-..."          # or add to .env (sourced by run.sh)
python main.py --dry-run          # verify setup (loads components, fetches markets, exits)
python main.py                    # paper trading with TUI dashboard ($1,000 virtual capital)
python main.py --logs             # paper trading with streaming logs (no TUI)


Warning

scripts/run.sh defaults to live mode. Always pass --paper explicitly. Running python main.py directly uses the mode set in config.yaml (paper by default), which is the safer entry point.

CLI Reference
Command Purpose
python main.py TUI dashboard mode (default)
python main.py --logs Streaming colorized logs, no TUI
python main.py --dry-run Load all components, fetch markets, exit
python main.py --collect-data Snapshot-only mode for building backtest data
python main.py --mode paper|live Override config.yaml mode
python main.py --config PATH Use alternate config file
./scripts/run.sh --paper Run via wrapper (activates venv, sources .env)
./scripts/ctl.sh up Docker: build and start in paper mode
./scripts/ctl.sh dashboard Docker: attach to live TUI (detach: Ctrl+P, Ctrl+Q)

How It Works

The bot runs a single asyncio event loop. An asyncio.Queue receives events from three concurrent producers: a REST price poller (every 60s), a WebSocket feed (real-time), and a resolution poller (every 5 min). The EventHandler drains the queue, running a synchronous fast path on every event and spawning the slow path as a background task on timer ticks.
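The producer/consumer split can be sketched with a minimal asyncio queue. This is an illustrative sketch, not the bot's actual classes; the event shapes, the STOP sentinel, and all names here are hypothetical:

```python
import asyncio

async def producer(queue: asyncio.Queue, events):
    """Stand-in for one of the three producers (poller / WebSocket / resolver)."""
    for ev in events:
        await queue.put(ev)
    await queue.put(("STOP", None))  # sentinel so this demo terminates

async def consumer(queue: asyncio.Queue, handled: list):
    """Stand-in for the EventHandler draining the shared queue."""
    while True:
        kind, payload = await queue.get()
        if kind == "STOP":
            return handled
        handled.append(kind)  # the synchronous fast path would run here

async def demo():
    q: asyncio.Queue = asyncio.Queue()
    handled: list = []
    await asyncio.gather(
        producer(q, [("MARKET_UPDATE", {}), ("TIMER_TICK", {})]),
        consumer(q, handled),
    )
    return handled
```

In the real loop the consumer never exits; the sentinel exists only to make the demo finite.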

Tick Lifecycle

flowchart TD
    A([Event from Queue]) --> B{Event type}

    B -- MARKET_UPDATE --> C[Update market state]
    B -- TIMER_TICK --> D[Increment tick<br>periodic checkpoint]
    B -- MARKET_RESOLVED --> RES[Settle positions at $1/$0<br>backfill Brier scores<br>update knowledge base]

    C & D --> FP

    subgraph FP["Fast Path — synchronous, every event"]
        FP1[Update portfolio prices<br>best_bid valuation] --> FP2[Exit manager<br>scans all positions]
        FP2 --> FP3{Exit triggered?}
        FP3 -- yes --> FP4[Execute SELL immediately<br>bypass risk gate + validator]
    end

    FP3 -- no --> G{TIMER_TICK<br>warmup done<br>prev task done?}
    FP4 --> G

    G -- yes --> SP
    G -- no --> WAIT([Wait for next event])

    subgraph SP["Slow Path — background asyncio.Task"]
        SP1[Run all strategies in parallel] --> SP2[Signal funnel<br>6-stage filter]
        SP2 --> SP3[Risk manager<br>11 checks per signal]
        SP3 --> SP4{Approved?}
        SP4 -- rejected --> SP5([Log rejection])
        SP4 -- approved --> SP6{AI analyst<br>source?}
        SP6 -- "yes → skip validator" --> SP7["Execute in parallel<br>asyncio.gather"]
        SP6 -- no --> SP8{AI validator<br>enabled?}
        SP8 -- yes --> SP9[Claude approval gate]
        SP9 -- approved --> SP7
        SP9 -- rejected --> SP10([AI REJECTED])
        SP8 -- "no → execute directly" --> SP7
        SP7 --> SP11[Paper or Live executor<br>portfolio update + DB log]
    end

    G -- "monitor interval<br>elapsed" --> MON

    subgraph MON["Position Monitor — background task, every 30 min"]
        MON1[Re-evaluate held positions via LLM<br>adaptive frequency by deadline] --> MON2{Verdict}
        MON2 -- "thesis valid" --> MON3([Hold])
        MON2 -- "extend hold" --> MON4([+24h, max 1 extension])
        MON2 -- "thesis invalid" --> MON5[Execute exit immediately]
    end
Path Behavior
Fast path Synchronous, every tick. Updates portfolio prices using best_bid (not midpoint — that's what you'd actually get on exit). Checks exit conditions: take-profit, edge decay, time expiry, approaching resolution. Exits execute immediately, bypassing risk gate and validator.
Slow path Background asyncio.Task, TIMER_TICK only. Runs all strategies in parallel → 6-stage funnel → 11-check risk gate. AI analyst signals skip the Claude validator (already AI-sourced); other signals pass through the approval gate.
Position monitor Separate background task, every 30 min. Re-evaluates held positions via LLM, with frequency adapted to deadline proximity. Can extend hold time (max 1 extension) or trigger immediate exit on thesis invalidation. Staleness failsafe: if the monitor hasn't run for 2 cycles, edge-decay exits are re-enabled.
Adaptive re-eval frequency
Time to Deadline Re-eval Frequency
< 24 hours Every 10-min cycle
1–3 days Every 3rd cycle (30 min)
3–7 days Every 5th cycle (50 min)
7+ days Every 7th cycle (70 min)
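The cycle-skipping schedule in the table reduces to a small mapping. A sketch with hypothetical names; the real thresholds live in the bot's position monitor:

```python
def reeval_interval_cycles(hours_to_deadline: float) -> int:
    """Map time-to-deadline to a re-eval interval in 10-min monitor cycles."""
    if hours_to_deadline < 24:
        return 1   # every cycle (10 min)
    if hours_to_deadline < 72:
        return 3   # every 3rd cycle (30 min)
    if hours_to_deadline < 168:
        return 5   # every 5th cycle (50 min)
    return 7       # every 7th cycle (70 min)

def due_for_reeval(cycle: int, hours_to_deadline: float) -> bool:
    """True when this monitor cycle should re-evaluate the position."""
    return cycle % reeval_interval_cycles(hours_to_deadline) == 0
```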

System Overview

flowchart LR
    subgraph Ingestion["Data Ingestion"]
        MM["Market Monitor<br>1,500 markets<br>REST + WebSocket"]
    end

    subgraph Strategies["Strategy Layer"]
        AI["6 AI Analysts<br>politics / crypto / sports<br>econ / tech / general"]
        WX["Weather<br>GFS 30-member ensemble"]
        CX["Complexity<br>3 structural signals"]
    end

    subgraph Pipeline["Signal Pipeline"]
        SF["Signal Funnel<br>6-stage filter"]
        RM["Risk Manager<br>11-check gate"]
        AV["AI Validator<br>fail-closed"]
    end

    subgraph Exec["Execution"]
        P["Paper Simulator"]
        L["Live CLOB<br>(W.I.P)"]
    end

    subgraph State["State"]
        PORT["Portfolio<br>fee-inclusive<br>cost basis"]
        DB["SQLite"]
        KB["Knowledge Base<br>per-category"]
    end

    LLM["Claude / Gemini"] --> AI
    LLM --> AV
    MM --> AI & WX & CX
    AI & WX & CX --> SF --> RM --> AV
    AV --> P & L --> PORT --> DB
    AV --> KB

Tip

Resilience — Every long-running coroutine runs inside a supervisor with up to 50 restarts and exponential backoff (capped at 5 min). Portfolio checkpoints to SQLite every 10 ticks, surviving crashes and SIGKILL.
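The supervisor pattern, restart on crash with exponential backoff up to a cap, might look roughly like this (an illustrative sketch, not the bot's actual resilience.py; argument names are assumptions):

```python
import asyncio

async def supervise(coro_factory, max_restarts: int = 50,
                    base_delay: float = 1.0, cap: float = 300.0):
    """Restart a long-running coroutine on crash, backing off exponentially
    up to `cap` seconds (5 min)."""
    for attempt in range(max_restarts + 1):
        try:
            await coro_factory()
            return  # clean exit: stop supervising
        except asyncio.CancelledError:
            raise   # shutdown: never swallow cancellation
        except Exception:
            delay = min(base_delay * 2 ** attempt, cap)
            await asyncio.sleep(delay)
```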

Config hot-reload — ConfigManager polls config.yaml every 30 seconds. Risk limits, strategy parameters, and exit profiles update in place — no restart needed. Research provider changes require a restart.


Strategies

AI Analysts (6 categories)

Each analyst inherits from ai_analyst_base.py, which handles LLM calls, caching, Platt calibration, and Kelly sizing. Up to six LLM calls per market (the reconciliation call is conditional):

  1. Research — 3-thread evidence gathering (base rate, current events, structural factors) via web search. Market price is deliberately withheld to prevent anchoring bias.
  2. Independent estimation — three parallel LLM calls, each with its own context window. No analyst can see another's output. Median probability becomes the estimate; spread drives confidence.
  3. Reconciliation (conditional) — when ensemble spread exceeds 0.15, a supervisor call identifies the source of disagreement and produces a reconciled estimate. Recovers markets that would otherwise be dropped.
  4. Bayesian update — the model sees its blind estimate alongside the market price and reasons about whether the deviation is justified. Replaces the old mechanical blend formula.
Category Tags Perspectives Min Edge Kelly Max Hold
Politics elections, geopolitics Historical · Current evidence · Structural 6% 0.15 48h
Crypto BTC, ETH, SOL, tokens Technical · Momentum · Sentiment 4% 0.10 24h
Sports NBA, NFL, MLB, F1, MMA Statistical · Matchup · Market 6% 0.15 24h
Economics Fed, inflation, GDP, tariffs Consensus · Data-driven · Surprise risk 5% 0.15 48h
Tech AI, launches, semiconductors Timeline · Technical · Strategic 5% 0.15 48h
General everything unclaimed Base-rate · Evidence · Contrarian 5% 0.20 120h
Advanced AI features
Feature Description
Crypto enrichment Live Binance technicals (RSI-14, SMA-20/50, EMA-12/26, VWAP-24h, funding rates) and Fear & Greed Index injected into the prompt — real data, not hallucination
Batch research Markets sharing the same underlying asset (e.g., "Bitcoin $90k / $95k / $100k") are grouped into a single research call. Reduces LLM calls from N to 1+N
Cross-category dedup Global claim registry (1-hour TTL) prevents multiple categories from analyzing the same market
Sibling co-evaluation Timeframe siblings (same event, different deadlines) pulled into the same tick; funnel keeps the best-scoring variant
Cross-market coherence Already-estimated probabilities from related markets are injected as priors, enforcing consistency across outcome variants
Category exclusion Politics excludes oil/crypto/commodities; Economics excludes crypto. Each market routes to exactly one specialist
3-thread research Base rate, current events, and structural factors searched in parallel within a single Gemini call — finds evidence single-query misses (e.g., electoral system rules, pollster bias)
Supervisor reconciliation When ensemble spread > 0.15, a supervisor call identifies the source of disagreement and reconciles. Recovers ~30% of markets that would otherwise be dropped
Bayesian update Blind estimate confronted with market price — model reasons about whether its deviation is justified. Replaces mechanical blend formula
Timeout backoff Markets causing LLM timeouts get geometric cooldowns: 15 min → 30 min → 1h → 2h → 4h max
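The geometric timeout cooldown (15 min doubling to a 4-hour cap) is one line of arithmetic. A sketch with a hypothetical function name:

```python
def timeout_cooldown_minutes(consecutive_timeouts: int) -> int:
    """Cooldown for a market whose LLM calls keep timing out:
    15 -> 30 -> 60 -> 120 -> 240 min, capped at 240."""
    base, cap = 15, 240
    return min(base * 2 ** max(consecutive_timeouts - 1, 0), cap)
```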

Other Strategies

Strategy Type Description
Weather GFS Ensemble No LLM calls. Open-Meteo 30-member ensemble forecasts. Parse question → geocode → fetch forecast → count members per bucket → trade mispriced outcomes. Supports temp, precip, snowfall, wind.
Complexity Structural Scanner Zero LLM calls. Three signals: complement spread (informed flow), volume spike (detection-only), resolution proximity (midpoint ~50%, expiry <48h).
Contrarian Mechanical Disabled by default. Bets against extreme consensus: >95% → NO, <5% → YES. Requires 48h+ to expiry.
Sentiment News-driven Disabled by default. NewsAPI headlines + LLM impact assessment. 5-min cooldown per token.

AI Analysis Pipeline

flowchart TD
    A["Market candidates<br>volume-sorted, liquid first"] --> B["Phase A: Research<br>Gemini + Google Search grounding<br>or Claude CLI + web search"]

    B --> |"price WITHHELD<br>prevents anchoring"| C["Phase B: Independent Estimation<br>3 parallel LLM calls<br>separate context windows"]

    C --> REC{"Spread > 0.15?"}
    REC -- yes --> RECON["Supervisor Reconciliation<br>identify disagreement source<br>produce reconciled estimate"]
    RECON --> D["Platt Scaling<br>correct RLHF underconfidence<br>alpha configurable per strategy"]
    REC -- no --> D

    D --> BAY["Bayesian Update<br>confront blind estimate with market price<br>model reasons about the gap"]

    BAY --> E["Edge Calculation<br>subtract fees + spread + slippage<br>category-aware (see below)"]

    E --> F{Edge Validation}
    F --> |"net edge > 30%<br>in liquid market"| G["Reject: implausible<br>model error"]
    F --> |"edge < min_edge"| H["Insufficient edge<br>not deduped, retried"]
    F --> |"stale price<br>0.48-0.52, low vol"| I["Skip: no reliable<br>price signal"]
    F --> |"valid edge"| K["Kelly Sizing<br>uncertainty discount<br>+ inventory adjustment"]
    K --> J["SIGNAL<br>deduped for 1 hour"]

Dual providers — Gemini 2.5 Flash Lite for 3-thread research (native Google Search grounding), Claude for independent parallel estimation (3 calls, separate context windows). Research uses grounding only — no thinking — combining both is a known Gemini API bug. Automatic Claude CLI fallback on Gemini failure.

Online learning — When a position resolves, Claude extracts one actionable lesson and appends it to data/knowledge/{category}.md. These lessons are injected into future prompts (capped at 100 lines per category).

EdgeStatus dedup — SIGNAL, IMPLAUSIBLE_EDGE, and ZERO_SIZE permanently dedup a market. INSUFFICIENT_EDGE and LOW_CONFIDENCE are intentionally retried — they depend on price movement.

Calibration Lifecycle

flowchart TD
    A["AI estimates probability<br>raw_prob → Platt scaling → blended_prob"] --> B["Log to predictions table<br>raw, calibrated, blended, market_price, edge"]
    B --> C["Trade executes<br>edge_status = SIGNAL"]

    D["Resolution Poller<br>every 30 min"] --> E{Market resolved?}
    E -- no --> D
    E -- yes --> F["Backfill outcome<br>1.0 = YES won, 0.0 = NO won"]
    F --> G["Compute Brier component<br>(calibrated_prob - outcome)²"]
    G --> H{"30+ resolved<br>for this strategy?"}
    H -- no --> I["Use default alpha<br>(1.3)"]
    H -- yes --> J["Grid-search alpha<br>0.8 to 2.5, step 0.05<br>minimize Brier score"]
    J --> K["Update strategy's<br>platt_alpha in memory"]
    K --> A

RLHF training makes LLMs systematically under-confident — probabilities cluster toward 50%. Platt scaling amplifies log-odds: calibrated = sigmoid(alpha * log(p / (1-p))). Default alpha 1.3; auto-tuned once 30+ predictions resolve per strategy.

The dashboard shows the aggregate Brier score with quality labels (excellent < 0.10 · good < 0.20 · fair < 0.30 · poor).
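The calibration step follows directly from the formula above: Platt-scale with a candidate alpha, score with Brier, grid-search. Function names are illustrative; the real tuner reads from the predictions table:

```python
import math

def platt_scale(p: float, alpha: float) -> float:
    """calibrated = sigmoid(alpha * log(p / (1 - p))); alpha > 1 pushes
    probabilities away from 0.5, countering RLHF underconfidence."""
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-alpha * logit))

def tune_alpha(probs, outcomes, lo: float = 0.8, hi: float = 2.5,
               step: float = 0.05, default: float = 1.3) -> float:
    """Grid-search alpha to minimize mean Brier score over resolved
    predictions; keep the default until 30+ have resolved."""
    if len(probs) < 30:
        return default
    best_alpha, best_brier = default, float("inf")
    alpha = lo
    while alpha <= hi + 1e-9:
        brier = sum((platt_scale(p, alpha) - y) ** 2
                    for p, y in zip(probs, outcomes)) / len(probs)
        if brier < best_brier:
            best_alpha, best_brier = alpha, brier
        alpha += step
    return round(best_alpha, 2)
```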

Edge & Sizing Mathematics

Three mathematical stages transform a calibrated probability into a sized trade signal. Each stage corrects for a specific class of error: the Bayesian update corrects for LLM anchoring risk, edge calculation corrects for transaction costs, and Kelly sizing corrects for estimation uncertainty.

Bayesian Update

After Platt scaling, the model confronts its blind estimate with the market price and reasons about the gap. This replaces the old mechanical blend formula (weighted average of LLM and market). The model explicitly evaluates whether its deviation from the market is justified by specific information, and adjusts accordingly.

flowchart TD
    A["calibrated_prob<br>Platt-scaled median of ensemble"] --> C["Bayesian Update Call"]
    B["market_price<br>midpoint from order book"] --> C

    C --> D{"Is deviation<br>justified?"}

    D -- "YES: specific info<br>market hasn't priced in" --> E["Keep estimate or<br>move slightly toward market"]
    D -- "NO: market with this<br>volume likely knows more" --> F["Move substantially<br>toward market price"]

    E --> G["adjusted_prob<br>with reasoning"]
    F --> G

    G --> H["Safety: direction preserved<br>update can't flip the edge"]

The Bayesian update produces the most interpretable reasoning in the whole pipeline. Example: blind estimate 0.63, market 0.34. Old formula: 0.58 (mechanical). Bayesian: 0.44 with reasoning: "over-weighted structural factors, market at this volume prices polling correctly."
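The "direction preserved" safety at the end of the diagram can be read as a one-line clamp: the update may shrink the gap to the market price, but never cross it. A sketch; the assumption that a would-be flip clamps to the market price (edge collapses to zero rather than reversing) is mine:

```python
def preserve_direction(blind: float, market: float, adjusted: float) -> float:
    """Keep the adjusted probability on the same side of the market price
    as the blind estimate, so the update cannot flip the edge's sign."""
    if blind >= market:
        return max(adjusted, market)
    return min(adjusted, market)
```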

Edge Calculation Pipeline

Net edge deducts all real-world costs from the raw probability edge — category-aware Polymarket fees, half-spread, and slippage — before comparing to the strategy's minimum edge threshold.

flowchart TD
    A["blended_prob"] --> B["raw_edge = blended_prob - market_price"]

    B --> C{Category}

    C -- "crypto" --> D["entry_fee = p * 0.25 * (p*(1-p))^2<br>exit_fee = ep * 0.25 * (ep*(1-ep))^2"]
    C -- "sports" --> E["entry_fee = p * 0.0175 * p*(1-p)<br>exit_fee = ep * 0.0175 * ep*(1-ep)"]
    C -- "politics / economics<br>tech / general" --> F["entry_fee = 0<br>exit_fee = 0"]

    D & E & F --> G["spread_cost = market_spread / 2"]

    G --> H["net_edge = |raw_edge|<br>- spread_cost<br>- entry_fee - exit_fee<br>- slippage_pct"]

    H --> I{net_edge < min_edge?}
    I -- yes --> J["INSUFFICIENT_EDGE<br>not deduped — retried"]
    I -- no --> K{net_edge > 30%?}
    K -- yes --> L["IMPLAUSIBLE_EDGE<br>deduped — likely model error"]
    K -- no --> M["Kelly sizing →"]

Note

Politics, economics, tech, and general markets have zero fees on Polymarket (2026). Only crypto and sports incur fees, with crypto fees peaking at ~1.56% at the midpoint and sports at ~0.44%. The same fee formulas are applied in both edge calculation and paper trade simulation for consistency.
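The fee and net-edge arithmetic from the diagram, written out (a sketch; function names are illustrative):

```python
def polymarket_fee(price: float, category: str) -> float:
    """Per-side fee following the diagram's formulas; only crypto and
    sports incur fees in this model."""
    pq = price * (1.0 - price)
    if category == "crypto":
        return price * 0.25 * pq ** 2
    if category == "sports":
        return price * 0.0175 * pq
    return 0.0  # politics / economics / tech / general: fee-free

def net_edge(blended_prob: float, market_price: float, exit_price: float,
             spread: float, slippage_pct: float, category: str) -> float:
    """net_edge = |raw_edge| - half-spread - entry fee - exit fee - slippage."""
    raw_edge = abs(blended_prob - market_price)
    return (raw_edge
            - spread / 2.0
            - polymarket_fee(market_price, category)   # entry fee
            - polymarket_fee(exit_price, category)     # exit fee
            - slippage_pct)
```

At the 0.50 midpoint this gives ~0.78% per side for crypto (~1.56% round trip) and ~0.22% per side for sports (~0.44% round trip), matching the peaks quoted above.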

Kelly Sizing Pipeline

Kelly criterion sizes each bet proportional to edge, then applies three sequential discounts: an uncertainty penalty from ensemble confidence, a per-strategy fractional Kelly cap, and an inventory adjustment that reduces size as topic exposure grows.

flowchart TD
    A["true_prob<br>calibrated, trade-side adjusted"] --> B["b = (1 / price) - 1<br>implied payout odds"]

    B --> C{confidence<br>provided?}
    C -- yes --> D["uncertainty = clamp((1 - conf) * 0.15, 0.02, 0.08)<br>p = true_prob - uncertainty"]
    C -- no --> E["p = true_prob"]

    D & E --> F["q = 1 - p<br>kelly = (b*p - q) / b"]

    F --> G{kelly <= 0?}
    G -- yes --> H["Return $0 — no edge"]

    G -- no --> I["fraction = kelly * kelly_fraction<br>0.10 - 0.20 per strategy"]

    I --> J["Topic inventory check<br>count positions in same cluster<br>(BTC, Trump, Fed, oil, ...)"]
    J --> K["inventory_scale = max(0, 1 - count / max)<br>linear decay to zero"]

    K --> L["bet = fraction * inventory_scale * portfolio_value"]
    L --> M["Final = min(bet, max_bet_usd)"]
Discount Source Effect
Uncertainty penalty Ensemble spread → confidence Low agreement → subtract 2–8% from probability before Kelly
Fractional Kelly Per-strategy config (0.10–0.20) Caps bet at 10–20% of full Kelly — reduces variance
Inventory adjustment Open positions in same topic Linear decay: 3 BTC positions out of max 3 → Kelly multiplied by 0

Risk Management

Every signal passes a sequential gate — any single check can reject:

# Check Prevents
0 Halt status Trading during drawdown circuit breaker
1a Duplicate guard Doubling into already-held positions
1b Max open positions Exceeding position cap (25)
2 Liquidity + spread Zero-volume, crossed, or wide-spread markets
3 Price sanity Prices outside 0.01–0.99
4 Position size cap Single position > 2% of portfolio
5 Strategy budget Strategy exceeding max_capital_usd ceiling
6 Daily drawdown Daily PnL < -10% → halt all trading
7 Event correlation Max 3 positions per event
8 Topic correlation Max 2 positions per topic (BTC, ETH, Trump, Fed, oil, etc.)
9 Balance check Insufficient capital (with $5 buffer)
10 Deployment pacing Max 5% deployed per rolling hour

Important

Position count and loss cooldown are re-checked at execution time (not just risk-check time), closing race conditions where parallel asyncio.gather executions could collectively exceed limits.

Position Lifecycle

flowchart TD
    A([BUY order fills]) --> B[Create Position<br>fee-inclusive cost basis]
    B --> C[Compute exit params<br>TP price / max hold / edge decay]
    C --> D([Position held<br>updated every tick])

    D --> E{Exit Manager<br>checks every tick}
    E -- "price >= take_profit" --> F[TAKE_PROFIT]
    E -- "price drop >30%<br>OR time >60%" --> G[EDGE_DECAYED]
    E -- "max hold exceeded" --> H[TIME_EXPIRY]
    E -- "near resolution" --> I[APPROACHING_EXPIRY]
    E -- "no exit trigger" --> D

    D --> J{Position Monitor<br>every 30 min}
    J -- "thesis still valid" --> D
    J -- "extend hold time" --> K["+24h<br>max 1 extension<br>capped by market deadline"]
    K --> D
    J -- "thesis invalidated" --> L[THESIS_EXIT]

    F & G & H & I & L --> M[Execute SELL<br>bypass risk gate]
    M --> N[Calculate realized PnL]
    N --> O[Log trade + update DB]
    O --> P{Loss?}
    P -- yes --> Q[6-hour cooldown<br>on this market]
    P -- no --> R[Update knowledge file<br>via Claude]

Binary prediction markets pay $1 or $0 at resolution — interim dips are usually noise, not thesis invalidation. No hard stop-loss or trailing stop. Instead:

Exit Type Trigger Rationale
Take-profit 7–14% gain (strategy-dependent) Prediction markets rarely offer more
Edge decayed Price drops >30% OR >60% of hold elapsed OR-logic soft stop-loss
Time expiry Max hold exceeded (24h–120h) Limits capital lock-up
Approaching expiry Near resolution deadline (2h buffer) Avoid liquidity drought
Thesis invalidation Monitor LLM determines thesis invalid Fundamental outlook change
Monitor staleness 2 consecutive missed re-evals Failsafe: re-enables edge decay
Additional exit safeguards
  • Loss cooldown — Same market blocked for 6 hours after a loss. Checked at both strategy selection and execution stages.
  • Complement liquidity — NO token buys require complement best_bid >= $0.05. Illiquid complements are permanently deduped.
  • Minimum hold — Exit checks suppressed for 60s after entry, preventing same-tick buy/sell oscillation.
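The exit checks above, in the order the tables suggest (a sketch; the edge_decay_enabled flag stands in for the monitor's ability to disable edge decay, and all names are illustrative):

```python
def exit_reason(entry_price: float, current_price: float,
                take_profit_price: float, held_hours: float,
                max_hold_hours: float, edge_decay_enabled: bool = True):
    """Return the first exit trigger that fires, or None to keep holding."""
    if held_hours * 3600 < 60:
        return None                      # minimum hold: suppress exits for 60s
    if current_price >= take_profit_price:
        return "TAKE_PROFIT"
    drop = (entry_price - current_price) / entry_price
    held_frac = held_hours / max_hold_hours
    if edge_decay_enabled and (drop > 0.30 or held_frac > 0.60):
        return "EDGE_DECAYED"            # OR-logic soft stop-loss
    if held_frac >= 1.0:
        return "TIME_EXPIRY"
    return None
```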

Configuration

All settings live in config.yaml. Changes take effect within 30 seconds via hot-reload — no restart needed (except research provider).

Section Key Settings Default
mode "paper" or "live" "paper"
risk.max_position_pct Max single position as % of portfolio 2%
risk.daily_drawdown_limit_pct Daily drawdown halt threshold 10%
risk.max_open_positions Concurrent position cap 25
risk.max_deployment_per_hour_pct Hourly deployment pacing 5%
strategies.<name>.min_edge Minimum edge to signal 4–6%
strategies.<name>.kelly_fraction Kelly sizing fraction 0.10–0.20
strategies.<name>.platt_alpha Platt scaling alpha (>1 = away from 50%) 1.3
research.research_provider "gemini" or "claude" for Phase A "gemini"
research.estimation_provider "claude" or "gemini" for Phase B "claude"
strategies.<name>.reconciliation_spread_threshold Ensemble spread triggering supervisor reconciliation 0.15
ai_validation.enabled Claude approval gate for non-AI signals true
position_monitor.max_extensions Max hold-time extensions per position 1
paper_trading.initial_balance_usd Starting virtual capital $1,000
Environment Variables
Variable Required Purpose
ANTHROPIC_API_KEY Yes Claude CLI (all AI strategies + validator). Must be in the shell environment.
GOOGLE_API_KEY If using Gemini Gemini research provider. Raises RuntimeError if missing.
GEMINI_API_KEY No Alternative to GOOGLE_API_KEY (checked second)
POLYMARKET_PRIVATE_KEY Live only EOA private key for CLOB order signing
POLYMARKET_FUNDER_ADDRESS Live only Polymarket proxy wallet address
DATABASE_URL No Override default SQLite path (sqlite:///bot_data.db)

Note

OpenAI and NewsAPI keys are read from config.yaml, not environment variables.

Data Sources
Source Used By Key Notes
Polymarket Gamma API Market discovery, resolution Rate-limited 5 req/s
Polymarket CLOB API Order books, execution Live Circuit breaker: 5 failures → 60s cooldown
Polymarket WebSocket Real-time bid/ask Auto-reconnect with backoff
Claude CLI Estimation, validation, learning Yes Subprocess; max 3 concurrent
Gemini 2.5 Flash Lite Grounded research Yes Auto-fallback to Claude
Open-Meteo GFS weather (30 members) Free; 10K req/day
Binance API Crypto technicals, funding 60s cache
CoinGecko Crypto spot prices 60s cache
Alternative.me Fear & Greed Index Crypto analyst
NewsAPI Breaking news Yes Sentiment only (disabled)

Deployment

Local

./scripts/setup.sh                                    # one-time: venv + Python deps + .env
npm install -g @anthropic-ai/claude-code       # one-time: Claude CLI (not in setup.sh)
export ANTHROPIC_API_KEY="sk-ant-..."          # or add to .env

./scripts/run.sh --paper                              # paper trading (ALWAYS pass --paper explicitly)
./scripts/run.sh --paper --logs                       # streaming logs instead of TUI

Warning

scripts/run.sh defaults to live mode (line 8: MODE="live"). If credentials are set in .env, running without --paper will trade real money. Use python main.py directly for the safest default (paper via config.yaml).

Docker

cp .env.example .env                          # fill in API keys
./scripts/ctl.sh up                           # build and start (paper mode by default)
./scripts/ctl.sh dashboard                    # attach to live TUI
./scripts/ctl.sh logs bot                     # tail log file
./scripts/ctl.sh down                         # stop everything

Two containers: bot (trading engine + TUI) and collector (background snapshots for backtesting). Data persists in a Docker volume at /app/data. The image includes Node.js 20 and Claude CLI automatically.

All Docker commands
Command Purpose
./scripts/ctl.sh up Build and start bot + collector (paper mode)
./scripts/ctl.sh up --live Start in live mode (credential check first)
./scripts/ctl.sh down Stop all services
./scripts/ctl.sh status Show container status
./scripts/ctl.sh dashboard Attach to TUI — detach: Ctrl+P, Ctrl+Q
./scripts/ctl.sh logs bot|collector Tail container logs
./scripts/ctl.sh backtest --start DATE --end DATE Run backtest in container
./scripts/ctl.sh build Rebuild images
./scripts/ctl.sh restart Restart all services

Backtesting

python main.py --collect-data                                         # collect snapshots
python -m backtest.runner --start 2025-01-01 --end 2025-03-01        # replay through pipeline
python -m backtest.walk_forward --start 2025-01-01 --end 2025-03-01  # rolling optimization

Metrics: win rate · PnL · profit factor · expectancy · max drawdown · Sharpe · Sortino.


Project Structure

Full file tree
polymarket-bot/
├── main.py                       # Entry point, supervisor restart loops, daily reset
├── config.yaml                   # All settings (hot-reloadable, 30s poll)
├── core/
│   ├── event_handler.py          # Tick loop, fast/slow path split, position monitor
│   ├── market_monitor.py         # REST + WebSocket + cross-book price sync
│   ├── portfolio.py              # Positions, PnL, fee-inclusive cost basis (best_bid valuation)
│   ├── exit_manager.py           # Take-profit, edge decay, expiry, thesis exits
│   ├── signal_funnel.py          # 6-stage filter: confidence → cap → dedup → rank → global cap
│   ├── ai_validator.py           # Claude approval gate (fail-closed) + knowledge learning
│   ├── calibration_tracker.py    # Prediction logging, Brier scores, Platt alpha auto-tuning
│   ├── config_manager.py         # Hot-reload config watcher (30s mtime poll)
│   ├── models.py                 # Signal, Order, Position, MarketState dataclasses
│   └── resilience.py             # CircuitBreaker, RateLimiter, retry_with_backoff
├── strategies/
│   ├── base.py                   # BaseStrategy + StrategyEngine (3-phase parallel dispatch)
│   ├── ai_analyst_base.py        # Shared AI logic: cache, Kelly, Platt, claims, dedup, thesis
│   ├── ai_analyst.py             # 6 category subclasses with specialized prompts
│   ├── gemini_research.py        # Gemini provider with Google Search grounding
│   ├── market_grouper.py         # Batch research grouping by underlying asset (regex)
│   ├── weather.py                # GFS ensemble forecasting (no LLM)
│   ├── open_meteo.py             # Geocoding + ensemble API client
│   ├── complexity.py             # Market structure scanner (3 structural signals, no LLM)
│   ├── contrarian.py             # Extreme consensus reversal strategy
│   ├── sentiment.py              # News-driven LLM impact assessment
│   ├── crypto_data.py            # Binance technicals + Fear & Greed
│   ├── crypto_prices.py          # CoinGecko price fetcher
│   ├── signal_funnel.py          # Per-category signal caps before risk manager
│   └── calibration.py            # Platt scaling function
├── risk/
│   └── manager.py                # 11-check sequential risk gate
├── execution/
│   ├── executor.py               # Routes to paper or live
│   ├── paper.py                  # Paper simulator (Polymarket fee formula + slippage)
│   └── live.py                   # CLOB API (EIP-712 signed orders on Polygon) — W.I.P
├── dashboard/
│   ├── cli.py                    # Textual TUI (positions, pipeline feed, calibration)
│   └── metrics.py                # Pipeline stage tracking (17 stages, TTL eviction)
├── learning/
│   └── analyze.py                # Daily analysis + auto-parameter adjustment
├── backtest/
│   ├── runner.py                 # Full pipeline replay engine
│   ├── walk_forward.py           # Walk-forward parameter optimization
│   ├── data_collector.py         # Background market snapshots
│   └── metrics.py                # Sharpe, Sortino, max drawdown, profit factor
├── db/
│   └── schema.py                 # SQLAlchemy models (6 tables) + auto-migration on startup
├── data/
│   ├── knowledge/                # Per-category knowledge files (evolving at runtime)
│   └── research/                 # LLM forecasting research notes
├── scripts/
│   ├── setup.sh                  # Install deps, configure environment
│   ├── run.sh                    # Start bot (--paper or --live)
│   ├── ctl.sh                    # Docker control (up, down, dashboard, logs)
│   ├── market_analyzer.py        # Full-universe market analysis (standalone)
│   └── live_trade.py             # Interactive live trade script
├── tests/
│   └── test_ai_validator.py      # Validator unit tests
├── Dockerfile                    # Multi-stage build, Node.js 20 + Claude CLI, non-root user
├── docker-compose.yml            # bot + collector services, named volume
└── requirements.txt
Database Schema (6 tables)
Table Purpose Key Fields
trades Audit log of every fill token_id, strategy, side, price, shares, pnl, outcome
predictions Every AI estimate condition_id, raw_prob, calibrated_prob, edge, brier_component
daily_snapshots End-of-day summary total_value, daily_pnl, strategy_breakdown (JSON)
position_snapshots Crash recovery token_id, shares, avg_entry_price, extensions_used
portfolio_state_snapshots Capital state available_capital, daily_start_value, realized_pnl
errors Error log timestamp, error_message, strategy, context

Portfolio state uses SQLite savepoints for atomic writes — crashes mid-checkpoint roll back cleanly.
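The savepoint pattern, shown here with raw sqlite3 in autocommit mode (a sketch: the bot actually goes through SQLAlchemy, and the schema is reduced to one column for illustration):

```python
import sqlite3

def checkpoint(conn: sqlite3.Connection, available_capital: float) -> None:
    """Atomic portfolio checkpoint: a crash mid-write rolls back to the
    savepoint instead of leaving a partial row."""
    cur = conn.cursor()
    cur.execute("SAVEPOINT portfolio_ckpt")
    try:
        cur.execute(
            "INSERT INTO portfolio_state_snapshots (available_capital) VALUES (?)",
            (available_capital,),
        )
        cur.execute("RELEASE SAVEPOINT portfolio_ckpt")       # commit
    except sqlite3.Error:
        cur.execute("ROLLBACK TO SAVEPOINT portfolio_ckpt")   # undo partial write
        cur.execute("RELEASE SAVEPOINT portfolio_ckpt")
        raise
```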


Troubleshooting

Common issues
Problem Cause Fix
FileNotFoundError on first AI analysis Claude CLI not installed npm install -g @anthropic-ai/claude-code
--dry-run passes but bot fails on first tick --dry-run doesn't spawn Claude CLI Install CLI and set ANTHROPIC_API_KEY
RuntimeError: No Gemini API key found Config uses Gemini but no key Set GOOGLE_API_KEY or switch to claude
TUI not rendering Missing rich or textual pip install rich textual
Docker: knowledge files missing data/ excluded by .dockerignore Mount data/knowledge/ as volume
./scripts/run.sh trades real money Defaults to live mode Always pass --paper
WebSocket keeps disconnecting Network issues / API downtime Auto-reconnect; REST poller as fallback
Bot runs but never trades All strategies disabled or no markets match tag filter Check strategies.<name>.enabled and polymarket.relevant_tags in config.yaml
AI validator blocks all trades 3 consecutive Claude CLI errors → fail-closed Check ANTHROPIC_API_KEY, Claude CLI installation, subprocess availability
Config changes have no effect Changed research_provider (requires restart) Most settings hot-reload; provider changes need a restart
google-genai not installed research_provider: "gemini" but package missing pip install google-genai

Known Limitations

Limitation Details
No hard stop-loss Intentional for binary markets ($1/$0), but positions can draw down before edge decay triggers
Calibration cold start Auto-tune requires 30+ resolved predictions per strategy; uses default alpha (1.3) until then
Gemini thinking + grounding Cannot be combined — research uses grounding only, estimation uses thinking only
Docker knowledge gap .dockerignore excludes data/; fresh deployments start without accumulated knowledge
Live mode is W.I.P execution/live.py exists but is not battle-tested — paper mode is the only fully validated execution path
Single-node only No distributed mode, Kubernetes, or cloud templates

Disclaimer

This is experimental software. Paper mode uses virtual capital — no financial risk. Live mode executes real trades on Polygon and can lose real money. Not financial advice.


MIT License