Statistical Arbitrage Engine built from scratch using Python.
This project demonstrates my interest in quantitative finance and statistical modeling by implementing a pairs trading strategy end to end, from data acquisition and pair selection to signal generation, backtesting, and visualization. It applies mathematical techniques like cointegration, z-score normalization, and time series analysis to financial market data.
This engine identifies cointegrated pairs and simulates a mean-reversion trading strategy using the spread between their prices.
- Collects and preprocesses stock data using yfinance.
- Uses Engle-Granger cointegration tests to identify statistically viable pairs.
- Constructs a spread between the two assets and generates trading signals based on Z-score thresholds.
- Backtests the strategy with configurable parameters and visualizes the spread with entry/exit markers.
- Cointegration Testing for identifying long-term relationships between assets.
- Z-Score-based Signal Generation for statistical entry/exit rules.
- Backtesting with trade tracking, P&L logging, and optional slippage parameters.
- Data Cleaning, Resampling, and Visualization using Python’s data stack.
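
As a rough sketch of z-score-based signal generation, the function below normalizes the spread with a rolling mean and standard deviation, then opens a position when the z-score crosses an entry threshold and closes it when the z-score falls back inside an exit band. The function name `zscore_signals`, the default thresholds, and the toy spread are assumptions; the logic in `signals.py` may differ.

```python
import numpy as np
import pandas as pd


def zscore_signals(spread, window=20, entry_z=2.0, exit_z=0.5):
    """Hypothetical helper: rolling z-score plus a simple position state machine.

    position = -1 means short the spread, +1 long the spread, 0 flat.
    """
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std()
    z = (spread - mean) / std

    position = 0
    positions = []
    for value in z:
        if np.isnan(value):
            pass  # not enough history yet: keep the current position
        elif position == 0 and value > entry_z:
            position = -1  # spread unusually high: bet on it falling
        elif position == 0 and value < -entry_z:
            position = 1  # spread unusually low: bet on it rising
        elif position != 0 and abs(value) < exit_z:
            position = 0  # spread has reverted toward its mean: close
        positions.append(position)

    return pd.DataFrame(
        {"spread": spread, "z": z, "position": positions}, index=spread.index
    )


# Toy spread (an assumption): oscillates around zero with one spike at index 20.
values = [1.0 if i % 2 == 0 else -1.0 for i in range(30)]
values[20] = 10.0
demo = zscore_signals(pd.Series(values), window=10)
print(demo.iloc[18:24])
```

In this toy series the spike pushes the z-score above the entry threshold, so a short position opens at index 20 and closes two steps later once the z-score is back inside the exit band.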
| File | Description |
|---|---|
| `main.py` | Orchestrates the workflow: fetch → analyze → signal → backtest → plot |
| `fetch_data.py` | Downloads OHLC data using yfinance |
| `cointegration.py` | Computes and filters statistically cointegrated pairs |
| `signals.py` | Creates entry/exit signals based on Z-score of spread |
| `backtester.py` | Executes trades, tracks P&L, logs strategy performance |
| `pair_analysis.py` | Filters and scores pairs based on correlation and stationarity |
| `plotter.py` | Visualizes spread and signals on a matplotlib chart |
| `streamlit_app.py` | Interactive Streamlit dashboard for parameter tuning and visualization |
Install dependencies using:

    pip install -r requirements.txt

requirements.txt includes:
- numpy
- pandas
- matplotlib
- statsmodels
- yfinance
- streamlit

Run the full workflow:

    python main.py

You can customize:
- The ticker universe (e.g., S&P 500 components)
- Start/end date
- Cointegration p-value threshold
- Z-score entry/exit rules
- Lookback window
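
One way to keep these settings in a single place is a small validated configuration object. The sketch below is an assumption about how that could look, not a description of `main.py`: the class name `StrategyConfig`, its field names, and the default values (including the placeholder tickers) are all hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class StrategyConfig:
    """Hypothetical container for the customizable parameters listed above."""

    tickers: Tuple[str, ...] = ("KO", "PEP", "XOM", "CVX")  # placeholder universe
    start: date = date(2020, 1, 1)
    end: date = date(2023, 12, 31)
    p_value_threshold: float = 0.05  # cointegration p-value cutoff
    entry_z: float = 2.0  # open a trade when |z| exceeds this
    exit_z: float = 0.5  # close a trade when |z| falls below this
    lookback: int = 30  # rolling window length in bars

    def __post_init__(self):
        # Fail early on settings that cannot produce a sensible backtest.
        if self.start >= self.end:
            raise ValueError("start must be before end")
        if not 0.0 < self.p_value_threshold < 1.0:
            raise ValueError("p_value_threshold must be between 0 and 1")
        if not 0.0 <= self.exit_z < self.entry_z:
            raise ValueError("exit_z must be non-negative and below entry_z")
        if self.lookback < 2:
            raise ValueError("lookback must be at least 2")


config = StrategyConfig()
# replace() returns a modified copy, which is convenient for parameter sweeps.
tuned = replace(config, entry_z=2.5, lookback=60)
print(tuned)
```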
Launch the interactive dashboard:

    streamlit run streamlit_app.py

Customize in the sidebar:
- Tickers (X & Y)
- Start/End Date
- Entry/Exit Z-score thresholds
- Rolling window size
- Notional value per trade
The dashboard outputs:
- Cointegration test results
- Spread & z-score plots
- Trade signals and timing
- Cumulative P&L in dollars

Plot markers:
- Green ▲ = LONG entry
- Red ▼ = SHORT entry
- Blue ● = CLOSE (mean reversion)
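
To show how a position series and a notional value per trade can turn into a dollar P&L curve, here is a simplified sketch. It assumes dollar-neutral sizing (the same notional on each leg), applies yesterday's position to today's returns, and charges slippage in basis points on every position change. The function name `cumulative_pnl` and these sizing and cost conventions are assumptions; `backtester.py` may account for trades differently.

```python
import pandas as pd


def cumulative_pnl(price_y, price_x, position, notional=10_000.0, slippage_bps=0.0):
    """Hypothetical helper: cumulative dollar P&L of a pairs position.

    position = +1 means long Y / short X, -1 the reverse, 0 flat.
    """
    ret_y = price_y.pct_change().fillna(0.0)
    ret_x = price_x.pct_change().fillna(0.0)

    # The position decided at bar t-1 earns the return realized at bar t.
    held = position.shift(1).fillna(0.0)
    gross = held * notional * (ret_y - ret_x)

    # Each unit change in position trades both legs and pays slippage on each.
    trades = position.diff().abs().fillna(position.abs())
    costs = trades * 2 * notional * slippage_bps / 10_000.0

    return (gross - costs).cumsum()


# Toy example: Y rises 10% while X is flat and the strategy is long the spread.
price_y = pd.Series([100.0, 110.0, 110.0])
price_x = pd.Series([100.0, 100.0, 100.0])
position = pd.Series([1, 1, 0])

pnl = cumulative_pnl(price_y, price_x, position, notional=10_000.0, slippage_bps=5)
print(pnl)
```

In this toy case the trade earns about $1,000 gross and pays $10 of slippage on entry and again on exit, so the curve ends near $980.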

