Skip to content

DVampire/FinWorld

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

21 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment

Python 3.11 License: MIT arXiv Website

⚠️ Preview Version Notice: This is currently a preview version of FinWorld. Some code components are still being updated and may not be fully available in this repository. We appreciate your patience and look forward to providing the complete implementation soon. Thank you for your understanding!

FinWorld is a comprehensive, all-in-one open-source platform that provides end-to-end support for the entire financial AI workflow, from data acquisition to experimentation and deployment. Built on the foundation of unified AI paradigms, heterogeneous data integration, and advanced agent automation, FinWorld addresses the critical limitations of existing financial AI platforms.

Paper: FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment (Arxiv) Website: FinWorld Project Website - Interactive demo and detailed results

FinWorld Architecture Figure 1: Overview of FinWorld's comprehensive architecture and workflow

🎯 Research Contributions

Our main contributions are threefold:

  1. Unified Framework: We propose a unified, end-to-end framework for training and evaluation of ML, DL, RL, LLMs, and LLM agents, covering four critical financial AI task types including time series forecasting, algorithmic trading, portfolio management, and LLM applications.

  2. Modular Design: The framework features a modular architecture that enables flexible construction of custom models and tasks, including the development of personalized LLM agents. The system supports efficient distributed training and testing across multiple environments.

  3. Comprehensive Benchmark: We provide support for multimodal heterogeneous data with over 800 million samples, establishing a comprehensive benchmark for the financial AI community. Extensive experiments across four task types demonstrate the framework's flexibility and effectiveness.

πŸš€ Key Features

πŸ”„ Multi-task Support

  • Time Series Forecasting: Advanced models including Autoformer, Crossformer, DLinear, TimesNet, PatchTST, TimeMixer, and TimeXer
  • Algorithmic Trading: RL-based (PPO, SAC) and ML/DL-based trading strategies
  • Portfolio Management: Multi-asset portfolio optimization with various risk measures
  • LLM Applications: Financial reasoning and LLM agent training with RL fine-tuning

πŸ“Š Multimodal Data Integration

  • Structured Data: OHLCV price data, technical indicators, financial factors
  • Unstructured Data: News articles, financial reports, earnings calls
  • Multi-source Support: FMP, Alpaca, AKShare, TuShare APIs
  • Market Coverage: DJ30, SP500, SSE50, HS300 across US and Chinese markets

🧠 Comprehensive AI Paradigms

  • Machine Learning: LightGBM, XGBoost, and traditional ML models
  • Deep Learning: Transformer, LSTM, and state-of-the-art architectures
  • Reinforcement Learning: PPO, SAC, and custom RL algorithms
  • Large Language Models: GPT-4.1, Claude-4-Sonnet, Qwen series, and custom LLMs
  • LLM Agents: Multi-agent systems with tool use and reasoning capabilities

πŸ› οΈ Advanced Automation

  • Distributed Training: Multi-GPU training and testing support
  • Auto Presentation: Automated report generation and visualization
  • Experiment Tracking: Integration with WandB and TensorBoard
  • Modular Architecture: Extensible framework for rapid prototyping

πŸ“‹ Requirements

  • Python 3.11+
  • CUDA 12.4+ (for GPU acceleration)
  • Conda or Miniconda
  • 16+ GB RAM (recommended for large datasets)

πŸ› οΈ Installation

1. Create Conda Environment

conda create -n finworld python=3.11
conda activate finworld

2. Install Dependencies

# Install base dependencies
make install-base

# Install browser automation tools
make install-browser

# Install VERL framework
make install-verl

Alternative Installation with Poetry

# Install Poetry
pip install poetry

# Install dependencies
poetry install

πŸš€ Quick Start

1. Download Financial Data

# Download DJ30 data (example)
python scripts/download/download.py --config configs/download/dj30/dj30_fmp_price_1day.py
python scripts/download/download.py --config configs/download/dj30/dj30_fmp_price_1min.py

2. Train RL Trading Models

# Train PPO trading models for multiple stocks
CUDA_VISIBLE_DEVICES=0 python scripts/rl_trading/train.py --config=configs/rl_trading/ppo/AAPL_ppo_trading.py

3. Train Portfolio Models

# Train PPO portfolio models for different indices
CUDA_VISIBLE_DEVICES=0 python scripts/rl_portfolio/train.py --config=configs/rl_portfolio/ppo/dj30_ppo_portfolio.py

4. Use Pre-built Scripts

# Run example scripts
bash examples/ppo_trading.sh
bash examples/ppo_portfolio.sh
bash examples/download.sh

πŸ“Š Empirical Results

Time Series Forecasting

Our comprehensive evaluation on DJ30 and HS300 datasets demonstrates the superiority of deep learning approaches:

  • TimeXer achieves MAE of 0.0529 and MSE of 0.0062 on DJ30, significantly outperforming LightGBM (MAE: 0.1392, MSE: 0.0235)
  • TimeMixer and TimeXer show superior performance on HS300 with MAEs of 0.3804 and 0.3727 respectively
  • Deep learning models consistently achieve higher RankICIR scores compared to traditional ML methods

Algorithmic Trading

RL-based methods demonstrate clear advantages in trading performance:

  • SAC achieves 101.55% ARR on TSLA with superior risk-adjusted returns
  • PPO attains 2.10 SR on META, outperforming all baseline methods
  • RL methods consistently deliver higher returns and better risk metrics across all evaluated stocks

Portfolio Management

RL-based portfolio optimization shows significant improvements:

  • SAC achieves up to 31.2% annualized returns on SP500 with Sharpe ratios above 1.5
  • RL methods consistently outperform rule-based and ML-based approaches
  • Superior risk-adjusted performance across all major indices

LLM Applications

Our FinReasoner model demonstrates state-of-the-art performance:

  • Financial Reasoning: Leads all four benchmarks (FinQA, FinEval, ConvFinQA, CFLUE)
  • Trading Capabilities: Strong performance across all evaluated stocks with comprehensive risk management
  • Domain-specific Training: Outperforms generic instruction-tuned models

πŸ—οΈ Architecture Overview

FinWorld employs a layered, object-oriented architecture with seven core layers:

1. Configuration Layer

  • Built on mmengine for unified experiment management
  • Registry mechanism for flexible component management
  • Support for configuration inheritance and overrides

2. Dataset Layer

  • Downloader Module: Multi-source data acquisition (FMP, Alpaca, AKShare)
  • Processor Module: Feature engineering and preprocessing (Alpha158 factors)
  • Dataset Module: Task-specific data organization
  • Environment Module: RL environment encapsulation

3. Model Layer

  • ML Models: Traditional ML algorithms (LightGBM, XGBoost)
  • DL Models: Neural architectures (Transformer, LSTM, VAE)
  • RL Models: Actor-critic networks with financial constraints
  • LLM Models: Unified interface for commercial and open-source LLMs

4. Training Layer

  • Optimizer: Adam, AdamW, SGD with gradient centralization
  • Loss: Regression, classification, and RL surrogate losses
  • Scheduler: Cosine, linear, and adaptive learning rate scheduling
  • Trainer: Task-specific training pipelines with distributed support

5. Evaluation Layer

  • Metrics: Financial-specific metrics (ARR, SR, MDD, CR, SoR)
  • Visualization: K-line charts, cumulative returns, compass plots
  • Standardized Protocols: Consistent evaluation across all tasks

6. Task Layer

  • Time Series Forecasting: Multi-step prediction with various horizons
  • Algorithmic Trading: Single-asset trading with risk management
  • Portfolio Management: Multi-asset allocation with constraints
  • LLM Applications: Financial reasoning and agent training

7. Presentation Layer

  • Auto-reporting: LaTeX technical reports and HTML dashboards
  • Multi-channel Publishing: GitHub, GitHub Pages, and experiment tracking
  • Version Control: Systematic archiving and knowledge transfer

πŸ“ Project Structure

FinWorld/
β”œβ”€β”€ configs/                    # Configuration files
β”‚   β”œβ”€β”€ _asset_list_/          # Asset list configurations
β”‚   β”œβ”€β”€ agent/                 # Agent configurations
β”‚   β”œβ”€β”€ download/              # Data download configurations
β”‚   β”œβ”€β”€ finreasoner/           # Financial reasoning configs
β”‚   β”œβ”€β”€ ml_portfolio/          # ML portfolio configurations
β”‚   β”œβ”€β”€ ml_trading/            # ML trading configurations
β”‚   β”œβ”€β”€ process/               # Data processing configurations
β”‚   β”œβ”€β”€ rl_portfolio/          # RL portfolio configurations
β”‚   β”œβ”€β”€ rl_trading/            # RL trading configurations
β”‚   β”œβ”€β”€ rule_portfolio/        # Rule-based portfolio configs
β”‚   β”œβ”€β”€ rule_trading/          # Rule-based trading configs
β”‚   β”œβ”€β”€ storm/                 # Storm framework configs
β”‚   β”œβ”€β”€ time/                  # Time series model configs
β”‚   └── vae/                   # VAE model configurations
β”œβ”€β”€ finworld/                  # Core framework
β”‚   β”œβ”€β”€ agent/                 # Multi-agent system
β”‚   β”œβ”€β”€ base/                  # Base classes and utilities
β”‚   β”œβ”€β”€ calendar/              # Calendar management
β”‚   β”œβ”€β”€ config/                # Configuration management
β”‚   β”œβ”€β”€ data/                  # Data processing modules
β”‚   β”œβ”€β”€ diffusion/             # Diffusion models
β”‚   β”œβ”€β”€ downloader/            # Data downloaders
β”‚   β”œβ”€β”€ downstream/            # Downstream tasks
β”‚   β”œβ”€β”€ environment/           # Trading environments
β”‚   β”œβ”€β”€ evaluator/             # Evaluation metrics
β”‚   β”œβ”€β”€ exception/             # Exception handling
β”‚   β”œβ”€β”€ factor/                # Factor models
β”‚   β”œβ”€β”€ log/                   # Logging system
β”‚   β”œβ”€β”€ loss/                  # Loss functions
β”‚   β”œβ”€β”€ memory/                # Memory management
β”‚   β”œβ”€β”€ metric/                # Performance metrics
β”‚   β”œβ”€β”€ models/                # AI models and architectures
β”‚   β”œβ”€β”€ mverl/                 # Multi-agent VERL
β”‚   β”œβ”€β”€ optimizer/             # Optimization algorithms
β”‚   β”œβ”€β”€ plot/                  # Visualization tools
β”‚   β”œβ”€β”€ processor/             # Data processors
β”‚   β”œβ”€β”€ proxy/                 # Proxy management
β”‚   β”œβ”€β”€ reducer/               # Dimensionality reduction
β”‚   β”œβ”€β”€ scheduler/             # Task scheduling
β”‚   β”œβ”€β”€ task/                  # Task definitions
β”‚   β”œβ”€β”€ tools/                 # Utility tools and integrations
β”‚   β”œβ”€β”€ trainer/               # Training frameworks
β”‚   β”œβ”€β”€ trajectory/            # Trajectory management
β”‚   β”œβ”€β”€ utils/                 # Utility functions
β”‚   └── verify/                # Verification tools
β”œβ”€β”€ scripts/                   # Training and execution scripts
β”œβ”€β”€ examples/                  # Example usage scripts
β”œβ”€β”€ tests/                     # Unit tests
β”œβ”€β”€ libs/                      # External libraries (VERL)
β”œβ”€β”€ res/                       # Resources and assets
└── tools/                     # Development tools

🎯 Supported Markets & Data Sources

Markets

  • US Markets: DJ30 (Dow Jones 30), SP500 (S&P 500)
  • Chinese Markets: SSE50 (Shanghai Stock Exchange 50), HS300 (CSI 300)

Data Sources

  • FMP: Financial Modeling Prep - Comprehensive financial data
  • Alpaca: US market data and news feeds
  • AKShare: Chinese market data and financial information
  • TuShare: Chinese financial data and research tools

Data Types

  • Price Data: OHLCV at daily and minute frequencies
  • News Data: Financial news and sentiment analysis
  • Technical Indicators: Alpha158 factors and custom indicators
  • LLM Reasoning: Financial QA datasets and reasoning benchmarks

πŸ”§ Configuration System

FinWorld uses a flexible configuration system based on YAML files and the mmengine framework:

  • Experiment Management: Centralized configuration for reproducibility
  • Component Registry: Flexible component instantiation and management
  • Inheritance Support: Configuration inheritance and override mechanisms
  • Validation: Automatic configuration validation and error checking

πŸ“Š Performance Metrics

Trading & Portfolio Metrics

  • ARR: Annualized Rate of Return
  • SR: Sharpe Ratio
  • MDD: Maximum Drawdown
  • CR: Calmar Ratio
  • SoR: Sortino Ratio
  • VOL: Volatility

Forecasting Metrics

  • MAE: Mean Absolute Error
  • MSE: Mean Squared Error
  • RankIC: Rank Information Coefficient
  • RankICIR: Rank Information Coefficient Information Ratio

LLM Metrics

  • Accuracy: Task-specific accuracy scores
  • F1 Score: Precision and recall balance
  • Financial Reasoning: Domain-specific evaluation metrics

🀝 Contributing

We welcome contributions from the research community! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow the existing code style and architecture patterns
  • Add comprehensive tests for new functionality
  • Update documentation for new features
  • Ensure compatibility with existing components

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

We thank the following organizations and tools for their contributions:

  • VERL: Reinforcement learning framework
  • FMP: Financial data API
  • Alpaca: Market data provider
  • AKShare: Chinese financial data
  • MMEngine: Training framework

πŸ“ž Support & Citation

Support

For questions, issues, or contributions:

  • Open an issue on GitHub
  • Visit our Project Website for interactive demo and detailed results
  • Check the documentation in the docs/ directory
  • Review example scripts in the examples/ directory

Citation

If you find FinWorld useful in your research, please cite our paper:

@article{zhang2025finworld,
  title={FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment},
  author={Zhang, Wentao and Zhao, Yilei and Zong, Chuqiao and Wang, Xinrun and An, Bo},
  journal={arXiv preprint arXiv:2508.02292},
  year={2025}
}

πŸ“š Related Papers

🌐 Project Website

Visit our interactive project website for detailed results, visualizations, and comprehensive documentation:

https://dvampire.github.io/FinWorld/

The website features:

  • Interactive architecture overview
  • Detailed empirical results with visualizations
  • Platform comparison tables
  • Quick start guides and examples

FinWorld - Empowering Financial AI Research and Applications

Built with ❀️ by the FinWorld Team

About

FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published