An autonomous trading agent for Polymarket. Claude analyzes markets without seeing the price,
calibrates its own confidence over time, and sizes every position with the Kelly criterion.
Thesis · Quick Start · How It Works · Strategies · AI Pipeline · Edge & Sizing · Risk Management · Configuration · Deployment
Caution
Paper mode is the default — virtual capital, live market data, no risk. Live mode is a work in progress and not yet finalized. Start with paper.
Live TUI dashboard — portfolio stats, strategy activity, pipeline feed, recent trades, and streaming logs
Can an LLM find genuine edge in prediction markets, or is it just expensive noise?
This bot is a working experiment. It gives Claude autonomous decision-making power over real markets — then constrains that autonomy with calibration tracking, risk gates, and position sizing math. Every prediction is logged, scored, and used to correct future estimates. The architecture assumes the model is wrong by default and builds accountability into every layer.
10 strategies · Anti-anchoring research · Bayesian update · Self-calibrating · Dual LLM providers · Thesis-based position management
| Requirement | Purpose | Required |
|---|---|---|
| Python 3.12+ | Runtime | Yes |
| Node.js 20+ | Claude CLI host | Yes |
| Claude CLI | AI analysis + validation | Yes |
| `ANTHROPIC_API_KEY` | Claude API access | Yes |
| `GOOGLE_API_KEY` | Gemini research provider | Recommended |
| Polymarket credentials | Live trading (CLOB signing) | Live only |
Tip
scripts/setup.sh installs Python deps and generates .env, but does not install Node.js or Claude CLI. Without the CLI, the bot will start successfully but fail on the first AI analysis tick — --dry-run will not catch this.
```bash
git clone <repo-url> && cd polymarket-bot
./scripts/setup.sh                        # creates venv, installs Python deps, generates .env
npm install -g @anthropic-ai/claude-code  # Claude CLI -- NOT installed by setup.sh
export ANTHROPIC_API_KEY="sk-ant-..."     # or add to .env (sourced by run.sh)
python main.py --dry-run                  # verify setup (loads components, fetches markets, exits)
python main.py                            # paper trading with TUI dashboard ($1,000 virtual capital)
python main.py --logs                     # paper trading with streaming logs (no TUI)
```
Warning
`scripts/run.sh` defaults to live mode — always pass `--paper` explicitly. Running `python main.py` directly defaults to the mode in `config.yaml` (paper), which is the safer default.
CLI Reference
| Command | Purpose |
|---|---|
| `python main.py` | TUI dashboard mode (default) |
| `python main.py --logs` | Streaming colorized logs, no TUI |
| `python main.py --dry-run` | Load all components, fetch markets, exit |
| `python main.py --collect-data` | Snapshot-only mode for building backtest data |
| `python main.py --mode paper\|live` | Override `config.yaml` mode |
| `python main.py --config PATH` | Use alternate config file |
| `./scripts/run.sh --paper` | Run via wrapper (activates venv, sources `.env`) |
| `./scripts/ctl.sh up` | Docker: build and start in paper mode |
| `./scripts/ctl.sh dashboard` | Docker: attach to live TUI (detach: Ctrl+P, Ctrl+Q) |
The bot runs a single asyncio event loop. An asyncio.Queue receives events from three concurrent producers: a REST price poller (every 60s), a WebSocket feed (real-time), and a resolution poller (every 5 min). The EventHandler drains the queue and runs two paths concurrently on each tick.
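The loop shape described above can be sketched as follows — a minimal illustration with made-up producer names and shortened intervals, not the bot's actual classes:

```python
# Minimal sketch: several concurrent producers feed one asyncio.Queue,
# and a single handler drains it (the bot's EventHandler pattern).
import asyncio

async def rest_poller(queue: asyncio.Queue, ticks: int) -> None:
    # Stand-in for the 60s REST price poller; interval shortened for the sketch.
    for _ in range(ticks):
        await queue.put(("MARKET_UPDATE", {"price": 0.42}))
        await asyncio.sleep(0)

async def timer(queue: asyncio.Queue, ticks: int) -> None:
    # Stand-in for the periodic timer producer.
    for _ in range(ticks):
        await queue.put(("TIMER_TICK", None))
        await asyncio.sleep(0)

async def handler(queue: asyncio.Queue, total: int) -> list:
    # Drains the queue; the fast path would run here on every event.
    seen = []
    for _ in range(total):
        event_type, _payload = await queue.get()
        seen.append(event_type)
        queue.task_done()
    return seen

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    consumer = asyncio.create_task(handler(queue, 4))
    await asyncio.gather(rest_poller(queue, 2), timer(queue, 2))
    return await consumer

events = asyncio.run(main())
```

A third producer (the resolution poller) would be one more coroutine in the same `gather`.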
```mermaid
flowchart TD
A([Event from Queue]) --> B{Event type}
B -- MARKET_UPDATE --> C[Update market state]
B -- TIMER_TICK --> D[Increment tick<br>periodic checkpoint]
B -- MARKET_RESOLVED --> RES[Settle positions at $1/$0<br>backfill Brier scores<br>update knowledge base]
C & D --> FP
subgraph FP["Fast Path — synchronous, every event"]
FP1[Update portfolio prices<br>best_bid valuation] --> FP2[Exit manager<br>scans all positions]
FP2 --> FP3{Exit triggered?}
FP3 -- yes --> FP4[Execute SELL immediately<br>bypass risk gate + validator]
end
FP3 -- no --> G{TIMER_TICK<br>warmup done<br>prev task done?}
FP4 --> G
G -- yes --> SP
G -- no --> WAIT([Wait for next event])
subgraph SP["Slow Path — background asyncio.Task"]
SP1[Run all strategies in parallel] --> SP2[Signal funnel<br>6-stage filter]
SP2 --> SP3[Risk manager<br>11 checks per signal]
SP3 --> SP4{Approved?}
SP4 -- rejected --> SP5([Log rejection])
SP4 -- approved --> SP6{AI analyst<br>source?}
SP6 -- "yes → skip validator" --> SP7["Execute in parallel<br>asyncio.gather"]
SP6 -- no --> SP8{AI validator<br>enabled?}
SP8 -- yes --> SP9[Claude approval gate]
SP9 -- approved --> SP7
SP9 -- rejected --> SP10([AI REJECTED])
SP8 -- "no → execute directly" --> SP7
SP7 --> SP11[Paper or Live executor<br>portfolio update + DB log]
end
G -- "monitor interval<br>elapsed" --> MON
subgraph MON["Position Monitor — background task, every 30 min"]
MON1[Re-evaluate held positions via LLM<br>adaptive frequency by deadline] --> MON2{Verdict}
MON2 -- "thesis valid" --> MON3([Hold])
MON2 -- "extend hold" --> MON4([+24h, max 1 extension])
MON2 -- "thesis invalid" --> MON5[Execute exit immediately]
end
```
| Path | Behavior |
|---|---|
| Fast path | Synchronous, every tick. Updates portfolio prices using best_bid (not midpoint — that's what you'd actually get on exit). Checks exit conditions: take-profit, edge decay, time expiry, approaching resolution. Exits execute immediately, bypassing risk gate and validator. |
| Slow path | Background asyncio.Task, TIMER_TICK only. Runs all strategies in parallel → 6-stage funnel → 11-check risk gate. AI analyst signals skip the Claude validator (already AI-sourced); other signals pass through the approval gate. |
| Position monitor | Separate background task, every 30 min. Re-evaluates held positions via LLM with adaptive frequency by deadline proximity. Can extend hold time (max 1 extension) or trigger immediate exit on thesis invalidation. Staleness failsafe: if monitor hasn't run in 2 cycles, edge decay re-enables. |
Adaptive re-eval frequency
| Time to Deadline | Re-eval Frequency |
|---|---|
| < 24 hours | Every 10-min cycle |
| 1–3 days | Every 3rd cycle (30 min) |
| 3–7 days | Every 5th cycle (50 min) |
| 7+ days | Every 7th cycle (70 min) |
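The cadence table above reduces to a small lookup on time-to-deadline. A hedged sketch (the function name is illustrative; thresholds and the 10-minute cycle length come from the table):

```python
def reeval_every_n_cycles(hours_to_deadline: float) -> int:
    """How many 10-minute monitor cycles to wait between position re-evals."""
    if hours_to_deadline < 24:
        return 1   # every cycle (10 min)
    if hours_to_deadline < 72:
        return 3   # every 3rd cycle (30 min)
    if hours_to_deadline < 168:
        return 5   # every 5th cycle (50 min)
    return 7       # every 7th cycle (70 min)
```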
```mermaid
flowchart LR
subgraph Ingestion["Data Ingestion"]
MM["Market Monitor<br>1,500 markets<br>REST + WebSocket"]
end
subgraph Strategies["Strategy Layer"]
AI["6 AI Analysts<br>politics / crypto / sports<br>econ / tech / general"]
WX["Weather<br>GFS 30-member ensemble"]
CX["Complexity<br>3 structural signals"]
end
subgraph Pipeline["Signal Pipeline"]
SF["Signal Funnel<br>6-stage filter"]
RM["Risk Manager<br>11-check gate"]
AV["AI Validator<br>fail-closed"]
end
subgraph Exec["Execution"]
P["Paper Simulator"]
L["Live CLOB<br>(W.I.P)"]
end
subgraph State["State"]
PORT["Portfolio<br>fee-inclusive<br>cost basis"]
DB["SQLite"]
KB["Knowledge Base<br>per-category"]
end
LLM["Claude / Gemini"] --> AI
LLM --> AV
MM --> AI & WX & CX
AI & WX & CX --> SF --> RM --> AV
AV --> P & L --> PORT --> DB
AV --> KB
```
Tip
Resilience — Every long-running coroutine runs inside a supervisor with up to 50 restarts and exponential backoff (capped at 5 min). Portfolio checkpoints to SQLite every 10 ticks, surviving crashes and SIGKILL.
Config hot-reload — ConfigManager polls config.yaml every 30 seconds. Risk limits, strategy parameters, and exit profiles update inline — no restart needed. Research provider changes require a restart.
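The supervisor pattern described above might look like the following sketch — restart a crashing coroutine with exponential backoff capped at a maximum delay and restart count. Names and exact delays are assumptions for illustration, not the bot's actual code:

```python
import asyncio

async def supervise(coro_factory, max_restarts: int = 50,
                    base_delay: float = 1.0, cap: float = 300.0) -> int:
    """Re-run coro_factory until it exits cleanly or restarts are exhausted."""
    restarts = 0
    while restarts <= max_restarts:
        try:
            await coro_factory()
            return restarts          # coroutine exited cleanly
        except Exception:
            restarts += 1
            # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
            delay = min(base_delay * 2 ** (restarts - 1), cap)
            await asyncio.sleep(delay)
    return restarts

# Demo: a task that fails twice, then succeeds.
attempts = 0
async def flaky() -> None:
    global attempts
    attempts += 1
    if attempts < 3:
        raise RuntimeError("boom")

restarts = asyncio.run(supervise(flaky, base_delay=0.01))
```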
Each analyst inherits from `ai_analyst_base.py`, which handles LLM calls, caching, Platt calibration, and Kelly sizing. Up to five LLM calls per market:
- Research — 3-thread evidence gathering (base rate, current events, structural factors) via web search. Market price is deliberately withheld to prevent anchoring bias.
- Independent estimation — three parallel LLM calls, each with its own context window. No analyst can see another's output. Median probability becomes the estimate; spread drives confidence.
- Reconciliation (conditional) — when ensemble spread exceeds 0.15, a supervisor call identifies the source of disagreement and produces a reconciled estimate. Recovers markets that would otherwise be dropped.
- Bayesian update — the model sees its blind estimate alongside the market price and reasons about whether the deviation is justified. Replaces the old mechanical blend formula.
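The independent-estimation step above reduces three blind estimates to a median, with the spread between them driving confidence. A minimal sketch — the spread-to-confidence mapping is an illustrative assumption; only the median rule and the 0.15 reconciliation threshold come from the text:

```python
from statistics import median

def aggregate_estimates(probs: list) -> tuple:
    """Reduce independent blind estimates to (estimate, spread, confidence)."""
    est = median(probs)
    spread = max(probs) - min(probs)
    confidence = max(0.0, 1.0 - spread)  # wide disagreement -> low confidence (assumed mapping)
    return est, spread, confidence

est, spread, conf = aggregate_estimates([0.58, 0.62, 0.70])
# spread 0.12 is under the 0.15 threshold, so no supervisor reconciliation call
```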
| Category | Tags | Perspectives | Min Edge | Kelly | Max Hold |
|---|---|---|---|---|---|
| Politics | elections, geopolitics | Historical · Current evidence · Structural | 6% | 0.15 | 48h |
| Crypto | BTC, ETH, SOL, tokens | Technical · Momentum · Sentiment | 4% | 0.10 | 24h |
| Sports | NBA, NFL, MLB, F1, MMA | Statistical · Matchup · Market | 6% | 0.15 | 24h |
| Economics | Fed, inflation, GDP, tariffs | Consensus · Data-driven · Surprise risk | 5% | 0.15 | 48h |
| Tech | AI, launches, semiconductors | Timeline · Technical · Strategic | 5% | 0.15 | 48h |
| General | everything unclaimed | Base-rate · Evidence · Contrarian | 5% | 0.20 | 120h |
Advanced AI features
| Feature | Description |
|---|---|
| Crypto enrichment | Live Binance technicals (RSI-14, SMA-20/50, EMA-12/26, VWAP-24h, funding rates) and Fear & Greed Index injected into the prompt — real data, not hallucination |
| Batch research | Markets sharing the same underlying asset (e.g., "Bitcoin $90k / $95k / $100k") are grouped into a single research call — one shared research call plus per-market estimation, instead of separate research per market |
| Cross-category dedup | Global claim registry (1-hour TTL) prevents multiple categories from analyzing the same market |
| Sibling co-evaluation | Timeframe siblings (same event, different deadlines) pulled into the same tick; funnel keeps the best-scoring variant |
| Cross-market coherence | Already-estimated probabilities from related markets are injected as priors, enforcing consistency across outcome variants |
| Category exclusion | Politics excludes oil/crypto/commodities; Economics excludes crypto. Each market routes to exactly one specialist |
| 3-thread research | Base rate, current events, and structural factors searched in parallel within a single Gemini call — finds evidence single-query misses (e.g., electoral system rules, pollster bias) |
| Supervisor reconciliation | When ensemble spread > 0.15, a supervisor call identifies the source of disagreement and reconciles. Recovers ~30% of markets that would otherwise be dropped |
| Bayesian update | Blind estimate confronted with market price — model reasons about whether its deviation is justified. Replaces mechanical blend formula |
| Timeout backoff | Markets causing LLM timeouts get geometric cooldowns: 15 min → 30 min → 1h → 2h → 4h max |
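The timeout backoff in the last row is a simple geometric progression, which might be sketched as (function name illustrative; the 15-minute base and 4-hour cap come from the table):

```python
def timeout_cooldown_minutes(consecutive_timeouts: int) -> int:
    """15 min -> 30 min -> 1 h -> 2 h -> capped at 4 h."""
    return min(15 * 2 ** (consecutive_timeouts - 1), 240)
```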
| Strategy | Type | Description |
|---|---|---|
| Weather | GFS Ensemble | No LLM calls. Open-Meteo 30-member ensemble forecasts. Parse question → geocode → fetch forecast → count members per bucket → trade mispriced outcomes. Supports temp, precip, snowfall, wind. |
| Complexity | Structural Scanner | Zero LLM calls. Three signals: complement spread (informed flow), volume spike (detection-only), resolution proximity (midpoint ~50%, expiry <48h). |
| Contrarian | Mechanical | Disabled by default. Bets against extreme consensus: >95% → NO, <5% → YES. Requires 48h+ to expiry. |
| Sentiment | News-driven | Disabled by default. NewsAPI headlines + LLM impact assessment. 5-min cooldown per token. |
```mermaid
flowchart TD
A["Market candidates<br>volume-sorted, liquid first"] --> B["Phase A: Research<br>Gemini + Google Search grounding<br>or Claude CLI + web search"]
B --> |"price WITHHELD<br>prevents anchoring"| C["Phase B: Independent Estimation<br>3 parallel LLM calls<br>separate context windows"]
C --> REC{"Spread > 0.15?"}
REC -- yes --> RECON["Supervisor Reconciliation<br>identify disagreement source<br>produce reconciled estimate"]
RECON --> D["Platt Scaling<br>correct RLHF underconfidence<br>alpha configurable per strategy"]
REC -- no --> D
D --> BAY["Bayesian Update<br>confront blind estimate with market price<br>model reasons about the gap"]
BAY --> E["Edge Calculation<br>subtract fees + spread + slippage<br>category-aware (see below)"]
E --> F{Edge Validation}
F --> |"net edge > 30%<br>in liquid market"| G["Reject: implausible<br>model error"]
F --> |"edge < min_edge"| H["Insufficient edge<br>not deduped, retried"]
F --> |"stale price<br>0.48-0.52, low vol"| I["Skip: no reliable<br>price signal"]
F --> |"valid edge"| K["Kelly Sizing<br>uncertainty discount<br>+ inventory adjustment"]
K --> J["SIGNAL<br>deduped for 1 hour"]
```
Dual providers — Gemini 2.5 Flash Lite for 3-thread research (native Google Search grounding), Claude for independent parallel estimation (3 calls, separate context windows). Research uses grounding only — no thinking — because combining the two triggers a known Gemini API bug. Automatic Claude CLI fallback on Gemini failure.
Online learning — When a position resolves, Claude extracts one actionable lesson and appends it to data/knowledge/{category}.md. These lessons are injected into future prompts (capped at 100 lines per category).
EdgeStatus dedup — SIGNAL, IMPLAUSIBLE_EDGE, and ZERO_SIZE permanently dedup a market. INSUFFICIENT_EDGE and LOW_CONFIDENCE are intentionally retried — they depend on price movement.
```mermaid
flowchart TD
A["AI estimates probability<br>raw_prob → Platt scaling → blended_prob"] --> B["Log to predictions table<br>raw, calibrated, blended, market_price, edge"]
B --> C["Trade executes<br>edge_status = SIGNAL"]
D["Resolution Poller<br>every 30 min"] --> E{Market resolved?}
E -- no --> D
E -- yes --> F["Backfill outcome<br>1.0 = YES won, 0.0 = NO won"]
F --> G["Compute Brier component<br>(calibrated_prob - outcome)²"]
G --> H{"30+ resolved<br>for this strategy?"}
H -- no --> I["Use default alpha<br>(1.3)"]
H -- yes --> J["Grid-search alpha<br>0.8 to 2.5, step 0.05<br>minimize Brier score"]
J --> K["Update strategy's<br>platt_alpha in memory"]
K --> A
```
RLHF training makes LLMs systematically under-confident — probabilities cluster toward 50%. Platt scaling amplifies log-odds: `calibrated = sigmoid(alpha * log(p / (1 - p)))`. Default alpha is 1.3, auto-tuned once 30+ predictions resolve per strategy. The dashboard shows the aggregate Brier score with quality labels (excellent < 0.10 · good < 0.20 · fair < 0.30 · poor).
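Both pieces — the scaling formula and the Brier-minimizing grid search over alpha (0.8 to 2.5, step 0.05) — fit in a few lines. A sketch with illustrative function names:

```python
import math

def platt_scale(p: float, alpha: float) -> float:
    """Amplify log-odds by alpha: sigmoid(alpha * logit(p))."""
    logit = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-alpha * logit))

def tune_alpha(resolved: list) -> float:
    """resolved: (raw_prob, outcome) pairs. Grid-search alpha to minimize Brier."""
    best_alpha, best_brier = 1.3, float("inf")
    alpha = 0.8
    while alpha <= 2.5 + 1e-9:
        brier = sum((platt_scale(p, alpha) - y) ** 2
                    for p, y in resolved) / len(resolved)
        if brier < best_brier:
            best_alpha, best_brier = alpha, brier
        alpha += 0.05
    return round(best_alpha, 2)

# alpha > 1 pushes probabilities away from 0.5:
# platt_scale(0.60, 1.3) ≈ 0.629
```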
Three mathematical stages transform a calibrated probability into a sized trade signal. Each stage is designed to correct for a specific class of error: blend weighting corrects for LLM anchoring risk, edge calculation corrects for transaction costs, and Kelly sizing corrects for estimation uncertainty.
After Platt scaling, the model confronts its blind estimate with the market price and reasons about the gap. This replaces the old mechanical blend formula (weighted average of LLM and market). The model explicitly evaluates whether its deviation from the market is justified by specific information, and adjusts accordingly.
```mermaid
flowchart TD
A["calibrated_prob<br>Platt-scaled median of ensemble"] --> C["Bayesian Update Call"]
B["market_price<br>midpoint from order book"] --> C
C --> D{"Is deviation<br>justified?"}
D -- "YES: specific info<br>market hasn't priced in" --> E["Keep estimate or<br>move slightly toward market"]
D -- "NO: market with this<br>volume likely knows more" --> F["Move substantially<br>toward market price"]
E --> G["adjusted_prob<br>with reasoning"]
F --> G
G --> H["Safety: direction preserved<br>update can't flip the edge"]
```
The Bayesian update produces the most interpretable reasoning in the whole pipeline. Example: blind estimate 0.63, market 0.34. Old formula: 0.58 (mechanical). Bayesian: 0.44 with reasoning: "over-weighted structural factors, market at this volume prices polling correctly."
Net edge deducts all real-world costs from the raw probability edge — category-aware Polymarket fees, half-spread, and slippage — before comparing to the strategy's minimum edge threshold.
```mermaid
flowchart TD
A["blended_prob"] --> B["raw_edge = blended_prob - market_price"]
B --> C{Category}
C -- "crypto" --> D["entry_fee = p * 0.25 * (p*(1-p))^2<br>exit_fee = ep * 0.25 * (ep*(1-ep))^2"]
C -- "sports" --> E["entry_fee = p * 0.0175 * p*(1-p)<br>exit_fee = ep * 0.0175 * ep*(1-ep)"]
C -- "politics / economics<br>tech / general" --> F["entry_fee = 0<br>exit_fee = 0"]
D & E & F --> G["spread_cost = market_spread / 2"]
G --> H["net_edge = |raw_edge|<br>- spread_cost<br>- entry_fee - exit_fee<br>- slippage_pct"]
H --> I{net_edge < min_edge?}
I -- yes --> J["INSUFFICIENT_EDGE<br>not deduped — retried"]
I -- no --> K{net_edge > 30%?}
K -- yes --> L["IMPLAUSIBLE_EDGE<br>deduped — likely model error"]
K -- no --> M["Kelly sizing →"]
```
Note
Politics, economics, tech, and general markets have zero fees on Polymarket (2026). Only crypto and sports incur fees, with crypto fees peaking at ~1.56% at the midpoint and sports at ~0.44%. The same fee formulas are applied in both edge calculation and paper trade simulation for consistency.
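The net-edge computation in the diagram translates directly to code. A hedged sketch — the fee formulas are taken from the diagram; the function signature and the exit-price estimate default are illustrative assumptions:

```python
def net_edge(blended_prob: float, market_price: float, category: str,
             market_spread: float, slippage_pct: float,
             exit_price_est: float = None) -> float:
    """Raw probability edge minus fees, half-spread, and slippage."""
    raw_edge = abs(blended_prob - market_price)
    p = market_price
    ep = exit_price_est if exit_price_est is not None else market_price
    if category == "crypto":
        entry_fee = p * 0.25 * (p * (1 - p)) ** 2
        exit_fee = ep * 0.25 * (ep * (1 - ep)) ** 2
    elif category == "sports":
        entry_fee = p * 0.0175 * p * (1 - p)
        exit_fee = ep * 0.0175 * ep * (1 - ep)
    else:  # politics / economics / tech / general: zero fees (2026)
        entry_fee = exit_fee = 0.0
    return raw_edge - market_spread / 2 - entry_fee - exit_fee - slippage_pct

# Crypto fee at the midpoint: 0.5 * 0.25 * 0.0625 = 0.0078125 per side (~1.56% round-trip)
edge = net_edge(0.60, 0.50, "crypto", market_spread=0.02, slippage_pct=0.005)
```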
Kelly criterion sizes each bet proportional to edge, then applies three sequential discounts: an uncertainty penalty from ensemble confidence, a per-strategy fractional Kelly cap, and an inventory adjustment that reduces size as topic exposure grows.
```mermaid
flowchart TD
A["true_prob<br>calibrated, trade-side adjusted"] --> B["b = (1 / price) - 1<br>implied payout odds"]
B --> C{confidence<br>provided?}
C -- yes --> D["uncertainty = clamp((1 - conf) * 0.15, 0.02, 0.08)<br>p = true_prob - uncertainty"]
C -- no --> E["p = true_prob"]
D & E --> F["q = 1 - p<br>kelly = (b*p - q) / b"]
F --> G{kelly <= 0?}
G -- yes --> H["Return $0 — no edge"]
G -- no --> I["fraction = kelly * kelly_fraction<br>0.10 - 0.20 per strategy"]
I --> J["Topic inventory check<br>count positions in same cluster<br>(BTC, Trump, Fed, oil, ...)"]
J --> K["inventory_scale = max(0, 1 - count / max)<br>linear decay to zero"]
K --> L["bet = fraction * inventory_scale * portfolio_value"]
L --> M["Final = min(bet, max_bet_usd)"]
```
| Discount | Source | Effect |
|---|---|---|
| Uncertainty penalty | Ensemble spread → confidence | Low agreement → subtract 2–8% from probability before Kelly |
| Fractional Kelly | Per-strategy config (0.10–0.20) | Caps bet at 10–20% of full Kelly — reduces variance |
| Inventory adjustment | Open positions in same topic | Linear decay: 3 BTC positions out of max 3 → Kelly multiplied by 0 |
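The full sizing pipeline from the diagram and table above can be sketched in one function. The clamp bounds, fractional-Kelly range, and inventory decay follow the diagram; the function signature and default values (e.g. `max_bet_usd`) are illustrative assumptions:

```python
def kelly_bet(true_prob: float, price: float, portfolio_value: float,
              confidence: float = None, kelly_fraction: float = 0.15,
              topic_positions: int = 0, max_topic_positions: int = 3,
              max_bet_usd: float = 50.0) -> float:
    """Kelly stake with uncertainty, fractional-Kelly, and inventory discounts."""
    b = (1 / price) - 1                       # implied payout odds
    p = true_prob
    if confidence is not None:
        # Ensemble disagreement penalty, clamped to [0.02, 0.08].
        uncertainty = min(max((1 - confidence) * 0.15, 0.02), 0.08)
        p -= uncertainty
    kelly = (b * p - (1 - p)) / b
    if kelly <= 0:
        return 0.0                            # no edge after the discount
    fraction = kelly * kelly_fraction         # fractional Kelly cap
    inventory_scale = max(0.0, 1 - topic_positions / max_topic_positions)
    return min(fraction * inventory_scale * portfolio_value, max_bet_usd)

# p=0.60, price=0.50, conf=0.9 -> discounted p=0.58, kelly=0.16,
# fraction=0.024, bet = $24 on a $1,000 portfolio
bet = kelly_bet(0.60, 0.50, 1000.0, confidence=0.9)
```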
Every signal passes a sequential gate — any single check can reject:
| # | Check | Prevents |
|---|---|---|
| 0 | Halt status | Trading during drawdown circuit breaker |
| 1a | Duplicate guard | Doubling into already-held positions |
| 1b | Max open positions | Exceeding position cap (25) |
| 2 | Liquidity + spread | Zero-volume, crossed, or wide-spread markets |
| 3 | Price sanity | Prices outside 0.01–0.99 |
| 4 | Position size cap | Single position > 2% of portfolio |
| 5 | Strategy budget | Strategy exceeding max_capital_usd ceiling |
| 6 | Daily drawdown | Daily PnL < -10% → halt all trading |
| 7 | Event correlation | Max 3 positions per event |
| 8 | Topic correlation | Max 2 positions per topic (BTC, ETH, Trump, Fed, oil, etc.) |
| 9 | Balance check | Insufficient capital (with $5 buffer) |
| 10 | Deployment pacing | Max 5% deployed per rolling hour |
Important
Position count and loss cooldown are re-checked at execution time (not just risk-check time), closing race conditions where parallel asyncio.gather executions could collectively exceed limits.
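A sequential veto gate like the one in the table can be expressed as a list of checks where the first rejection short-circuits. A sketch — check names mirror the table, but the data shapes and lambdas are illustrative assumptions:

```python
# Each check inspects a signal dict and returns a rejection reason, or None to pass.
def run_gate(signal: dict, checks: list) -> tuple:
    for check in checks:
        reason = check(signal)
        if reason is not None:
            return False, reason      # any single check can reject
    return True, "approved"

checks = [
    lambda s: "halted" if s.get("halted") else None,                       # 0: halt status
    lambda s: "duplicate" if s["token_id"] in s["held"] else None,         # 1a: duplicate guard
    lambda s: "max positions" if len(s["held"]) >= 25 else None,           # 1b: position cap
    lambda s: "price sanity" if not 0.01 <= s["price"] <= 0.99 else None,  # 3: price sanity
]

ok, reason = run_gate(
    {"halted": False, "token_id": "t1", "held": {"t2"}, "price": 0.42},
    checks,
)
```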
```mermaid
flowchart TD
A([BUY order fills]) --> B[Create Position<br>fee-inclusive cost basis]
B --> C[Compute exit params<br>TP price / max hold / edge decay]
C --> D([Position held<br>updated every tick])
D --> E{Exit Manager<br>checks every tick}
E -- "price >= take_profit" --> F[TAKE_PROFIT]
E -- "price drop >30%<br>OR time >60%" --> G[EDGE_DECAYED]
E -- "max hold exceeded" --> H[TIME_EXPIRY]
E -- "near resolution" --> I[APPROACHING_EXPIRY]
E -- "no exit trigger" --> D
D --> J{Position Monitor<br>every 30 min}
J -- "thesis still valid" --> D
J -- "extend hold time" --> K["+24h<br>max 1 extension<br>capped by market deadline"]
K --> D
J -- "thesis invalidated" --> L[THESIS_EXIT]
F & G & H & I & L --> M[Execute SELL<br>bypass risk gate]
M --> N[Calculate realized PnL]
N --> O[Log trade + update DB]
O --> P{Loss?}
P -- yes --> Q[6-hour cooldown<br>on this market]
P -- no --> R[Update knowledge file<br>via Claude]
```
Binary prediction markets pay $1 or $0 at resolution — interim dips are usually noise, not thesis invalidation. No hard stop-loss or trailing stop. Instead:
| Exit Type | Trigger | Rationale |
|---|---|---|
| Take-profit | 7–14% gain (strategy-dependent) | Prediction markets rarely offer more |
| Edge decayed | Price drops >30% OR >60% of hold elapsed | OR-logic soft stop-loss |
| Time expiry | Max hold exceeded (24h–120h) | Limits capital lock-up |
| Approaching expiry | Near resolution deadline (2h buffer) | Avoid liquidity drought |
| Thesis invalidation | Monitor LLM determines thesis invalid | Fundamental outlook change |
| Monitor staleness | 2 consecutive missed re-evals | Failsafe: re-enables edge decay |
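The tick-level triggers in the table above might be sketched as follows. The 30% / 60% / 2-hour numbers come from the table; the `edge_decay_enabled` flag reflects that edge decay can be toggled by the staleness failsafe; names and ordering are illustrative assumptions:

```python
def check_exit(price: float, entry_price: float, take_profit_price: float,
               hold_elapsed_h: float, max_hold_h: float,
               hours_to_resolution: float,
               edge_decay_enabled: bool = True) -> str:
    """Return the first triggered exit reason, or None to keep holding."""
    if price >= take_profit_price:
        return "TAKE_PROFIT"
    if hold_elapsed_h >= max_hold_h:
        return "TIME_EXPIRY"                  # limits capital lock-up
    drop = (entry_price - price) / entry_price
    if edge_decay_enabled and (drop > 0.30 or hold_elapsed_h > 0.60 * max_hold_h):
        return "EDGE_DECAYED"                 # OR-logic soft stop-loss
    if hours_to_resolution <= 2:
        return "APPROACHING_EXPIRY"           # 2h liquidity buffer
    return None                               # hold
```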
Additional exit safeguards
- Loss cooldown — Same market blocked for 6 hours after a loss. Checked at both strategy selection and execution stages.
- Complement liquidity — NO token buys require complement `best_bid` >= $0.05. Illiquid complements are permanently deduped.
- Minimum hold — Exit checks suppressed for 60s after entry, preventing same-tick buy/sell oscillation.
All settings live in config.yaml. Changes take effect within 30 seconds via hot-reload — no restart needed (except research provider).
| Setting | Description | Default |
|---|---|---|
| `mode` | `"paper"` or `"live"` | `"paper"` |
| `risk.max_position_pct` | Max single position as % of portfolio | 2% |
| `risk.daily_drawdown_limit_pct` | Daily drawdown halt threshold | 10% |
| `risk.max_open_positions` | Concurrent position cap | 25 |
| `risk.max_deployment_per_hour_pct` | Hourly deployment pacing | 5% |
| `strategies.<name>.min_edge` | Minimum edge to signal | 4–6% |
| `strategies.<name>.kelly_fraction` | Kelly sizing fraction | 0.10–0.20 |
| `strategies.<name>.platt_alpha` | Platt scaling alpha (>1 = away from 50%) | 1.3 |
| `research.research_provider` | `"gemini"` or `"claude"` for Phase A | `"gemini"` |
| `research.estimation_provider` | `"claude"` or `"gemini"` for Phase B | `"claude"` |
| `strategies.<name>.reconciliation_spread_threshold` | Ensemble spread triggering supervisor reconciliation | 0.15 |
| `ai_validation.enabled` | Claude approval gate for non-AI signals | `true` |
| `position_monitor.max_extensions` | Max hold-time extensions per position | 1 |
| `paper_trading.initial_balance_usd` | Starting virtual capital | $1,000 |
Environment Variables
| Variable | Required | Purpose |
|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | Claude CLI (all AI strategies + validator). Must be in the shell environment. |
| `GOOGLE_API_KEY` | If using Gemini | Gemini research provider. Raises `RuntimeError` if missing. |
| `GEMINI_API_KEY` | No | Alternative to `GOOGLE_API_KEY` (checked second) |
| `POLYMARKET_PRIVATE_KEY` | Live only | EOA private key for CLOB order signing |
| `POLYMARKET_FUNDER_ADDRESS` | Live only | Polymarket proxy wallet address |
| `DATABASE_URL` | No | Override default SQLite path (`sqlite:///bot_data.db`) |
Note
OpenAI and NewsAPI keys are read from `config.yaml`, not environment variables.
Data Sources
| Source | Used By | Key | Notes |
|---|---|---|---|
| Polymarket Gamma API | Market discovery, resolution | — | Rate-limited 5 req/s |
| Polymarket CLOB API | Order books, execution | Live | Circuit breaker: 5 failures → 60s cooldown |
| Polymarket WebSocket | Real-time bid/ask | — | Auto-reconnect with backoff |
| Claude CLI | Estimation, validation, learning | Yes | Subprocess; max 3 concurrent |
| Gemini 2.5 Flash Lite | Grounded research | Yes | Auto-fallback to Claude |
| Open-Meteo | GFS weather (30 members) | — | Free; 10K req/day |
| Binance API | Crypto technicals, funding | — | 60s cache |
| CoinGecko | Crypto spot prices | — | 60s cache |
| Alternative.me | Fear & Greed Index | — | Crypto analyst |
| NewsAPI | Breaking news | Yes | Sentiment only (disabled) |
```bash
./scripts/setup.sh                        # one-time: venv + Python deps + .env
npm install -g @anthropic-ai/claude-code  # one-time: Claude CLI (not in setup.sh)
export ANTHROPIC_API_KEY="sk-ant-..."     # or add to .env
./scripts/run.sh --paper                  # paper trading (ALWAYS pass --paper explicitly)
./scripts/run.sh --paper --logs           # streaming logs instead of TUI
```

Warning
`scripts/run.sh` defaults to live mode (line 8: `MODE="live"`). If credentials are set in `.env`, running without `--paper` will trade real money. Use `python main.py` directly for the safest default (paper via `config.yaml`).
```bash
cp .env.example .env         # fill in API keys
./scripts/ctl.sh up          # build and start (paper mode by default)
./scripts/ctl.sh dashboard   # attach to live TUI
./scripts/ctl.sh logs bot    # tail log file
./scripts/ctl.sh down        # stop everything
```

Two containers: `bot` (trading engine + TUI) and `collector` (background snapshots for backtesting). Data persists in a Docker volume at `/app/data`. The image includes Node.js 20 and Claude CLI automatically.
All Docker commands
| Command | Purpose |
|---|---|
| `./scripts/ctl.sh up` | Build and start bot + collector (paper mode) |
| `./scripts/ctl.sh up --live` | Start in live mode (credential check first) |
| `./scripts/ctl.sh down` | Stop all services |
| `./scripts/ctl.sh status` | Show container status |
| `./scripts/ctl.sh dashboard` | Attach to TUI — detach: Ctrl+P, Ctrl+Q |
| `./scripts/ctl.sh logs bot\|collector` | Tail container logs |
| `./scripts/ctl.sh backtest --start DATE --end DATE` | Run backtest in container |
| `./scripts/ctl.sh build` | Rebuild images |
| `./scripts/ctl.sh restart` | Restart all services |
```bash
python main.py --collect-data                                        # collect snapshots
python -m backtest.runner --start 2025-01-01 --end 2025-03-01        # replay through pipeline
python -m backtest.walk_forward --start 2025-01-01 --end 2025-03-01  # rolling optimization
```

Metrics: win rate · PnL · profit factor · expectancy · max drawdown · Sharpe · Sortino.
Full file tree
polymarket-bot/
├── main.py # Entry point, supervisor restart loops, daily reset
├── config.yaml # All settings (hot-reloadable, 30s poll)
├── core/
│ ├── event_handler.py # Tick loop, fast/slow path split, position monitor
│ ├── market_monitor.py # REST + WebSocket + cross-book price sync
│ ├── portfolio.py # Positions, PnL, fee-inclusive cost basis (best_bid valuation)
│ ├── exit_manager.py # Take-profit, edge decay, expiry, thesis exits
│ ├── signal_funnel.py # 6-stage filter: confidence → cap → dedup → rank → global cap
│ ├── ai_validator.py # Claude approval gate (fail-closed) + knowledge learning
│ ├── calibration_tracker.py # Prediction logging, Brier scores, Platt alpha auto-tuning
│ ├── config_manager.py # Hot-reload config watcher (30s mtime poll)
│ ├── models.py # Signal, Order, Position, MarketState dataclasses
│ └── resilience.py # CircuitBreaker, RateLimiter, retry_with_backoff
├── strategies/
│ ├── base.py # BaseStrategy + StrategyEngine (3-phase parallel dispatch)
│ ├── ai_analyst_base.py # Shared AI logic: cache, Kelly, Platt, claims, dedup, thesis
│ ├── ai_analyst.py # 6 category subclasses with specialized prompts
│ ├── gemini_research.py # Gemini provider with Google Search grounding
│ ├── market_grouper.py # Batch research grouping by underlying asset (regex)
│ ├── weather.py # GFS ensemble forecasting (no LLM)
│ ├── open_meteo.py # Geocoding + ensemble API client
│ ├── complexity.py # Market structure scanner (3 structural signals, no LLM)
│ ├── contrarian.py # Extreme consensus reversal strategy
│ ├── sentiment.py # News-driven LLM impact assessment
│ ├── crypto_data.py # Binance technicals + Fear & Greed
│ ├── crypto_prices.py # CoinGecko price fetcher
│ ├── signal_funnel.py # Per-category signal caps before risk manager
│ └── calibration.py # Platt scaling function
├── risk/
│ └── manager.py # 11-check sequential risk gate
├── execution/
│ ├── executor.py # Routes to paper or live
│ ├── paper.py # Paper simulator (Polymarket fee formula + slippage)
│ └── live.py # CLOB API (EIP-712 signed orders on Polygon) — W.I.P
├── dashboard/
│ ├── cli.py # Textual TUI (positions, pipeline feed, calibration)
│ └── metrics.py # Pipeline stage tracking (17 stages, TTL eviction)
├── learning/
│ └── analyze.py # Daily analysis + auto-parameter adjustment
├── backtest/
│ ├── runner.py # Full pipeline replay engine
│ ├── walk_forward.py # Walk-forward parameter optimization
│ ├── data_collector.py # Background market snapshots
│ └── metrics.py # Sharpe, Sortino, max drawdown, profit factor
├── db/
│ └── schema.py # SQLAlchemy models (6 tables) + auto-migration on startup
├── data/
│ ├── knowledge/ # Per-category knowledge files (evolving at runtime)
│ └── research/ # LLM forecasting research notes
├── scripts/
│ ├── setup.sh # Install deps, configure environment
│ ├── run.sh # Start bot (--paper or --live)
│ ├── ctl.sh # Docker control (up, down, dashboard, logs)
│ ├── market_analyzer.py # Full-universe market analysis (standalone)
│ └── live_trade.py # Interactive live trade script
├── tests/
│ └── test_ai_validator.py # Validator unit tests
├── Dockerfile # Multi-stage build, Node.js 20 + Claude CLI, non-root user
├── docker-compose.yml # bot + collector services, named volume
└── requirements.txt
Database Schema (6 tables)
| Table | Purpose | Key Fields |
|---|---|---|
| `trades` | Audit log of every fill | token_id, strategy, side, price, shares, pnl, outcome |
| `predictions` | Every AI estimate | condition_id, raw_prob, calibrated_prob, edge, brier_component |
| `daily_snapshots` | End-of-day summary | total_value, daily_pnl, strategy_breakdown (JSON) |
| `position_snapshots` | Crash recovery | token_id, shares, avg_entry_price, extensions_used |
| `portfolio_state_snapshots` | Capital state | available_capital, daily_start_value, realized_pnl |
| `errors` | Error log | timestamp, error_message, strategy, context |
Portfolio state uses SQLite savepoints for atomic writes — crashes mid-checkpoint roll back cleanly.
Common issues
| Problem | Cause | Fix |
|---|---|---|
| `FileNotFoundError` on first AI analysis | Claude CLI not installed | `npm install -g @anthropic-ai/claude-code` |
| `--dry-run` passes but bot fails on first tick | `--dry-run` doesn't spawn Claude CLI | Install CLI and set `ANTHROPIC_API_KEY` |
| `RuntimeError: No Gemini API key found` | Config uses Gemini but no key | Set `GOOGLE_API_KEY` or switch to `claude` |
| TUI not rendering | Missing `rich` or `textual` | `pip install rich textual` |
| Docker: knowledge files missing | `data/` excluded by `.dockerignore` | Mount `data/knowledge/` as volume |
| `./scripts/run.sh` trades real money | Defaults to live mode | Always pass `--paper` |
| WebSocket keeps disconnecting | Network issues / API downtime | Auto-reconnect; REST poller as fallback |
| Bot runs but never trades | All strategies disabled or no markets match tag filter | Check `strategies.<name>.enabled` and `polymarket.relevant_tags` in `config.yaml` |
| AI validator blocks all trades | 3 consecutive Claude CLI errors → fail-closed | Check `ANTHROPIC_API_KEY`, Claude CLI installation, subprocess availability |
| Config changes have no effect | Changed `research_provider` (requires restart) | Most settings hot-reload; provider changes need a restart |
| `google-genai` not installed | `research_provider: "gemini"` but package missing | `pip install google-genai` |
| Limitation | Details |
|---|---|
| No hard stop-loss | Intentional for binary markets ($1/$0), but positions can draw down before edge decay triggers |
| Calibration cold start | Auto-tune requires 30+ resolved predictions per strategy; uses default alpha (1.3) until then |
| Gemini thinking + grounding | Cannot be combined — research uses grounding only, estimation uses thinking only |
| Docker knowledge gap | .dockerignore excludes data/; fresh deployments start without accumulated knowledge |
| Live mode is a work in progress | `execution/live.py` exists but is not battle-tested — paper mode is the only fully validated execution path |
| Single-node only | No distributed mode, Kubernetes, or cloud templates |
This is experimental software. Paper mode uses virtual capital — no financial risk. Live mode executes real trades on Polygon and can lose real money. Not financial advice.
MIT License
