AlphaZero for Reversi (Multi-Player, Custom Maps)

This project implements AlphaZero to learn and play a custom variant of the Reversi (Othello) board game using self-play and Monte Carlo Tree Search (MCTS). It supports:

Arbitrary map sizes and shapes
Multiple players (not just 2!)
Integration with a fast C++ Reversi backend
Parallel self-play for efficient training

💡 The actual Reversi game logic is implemented in C++ and exposed via Python. For details, see the Reversi C++ Engine readme.

🧠 Core Idea

This project reimplements the AlphaZero algorithm for general multi-player Reversi using:

A custom PyTorch ResNet with two heads:
- A policy head predicting the move distribution
- A value head estimating each player's final score
Dirichlet noise for exploration
Temperature-controlled sampling to encourage diverse openings
Self-play to generate training data with MCTS-based moves
Support for custom board generators (maps, sizes, and obstacles)
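The two exploration mechanisms in the list above can be sketched in plain Python. This is a minimal illustration rather than the project's code: the function names `add_dirichlet_noise` and `sample_move`, and the defaults `alpha=0.3` and `epsilon=0.25`, are assumptions chosen for the example. The first function mixes Dirichlet noise into the root priors; the second samples a move from MCTS visit counts at a given temperature.

```python
import random


def add_dirichlet_noise(priors, alpha=0.3, epsilon=0.25, rng=random):
    """Mix Dirichlet(alpha) noise into root priors (dict: move -> probability).

    alpha and epsilon are illustrative values, not this project's settings.
    """
    moves = list(priors)
    # A Dirichlet sample is a vector of Gamma(alpha, 1) draws, normalized.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in moves]
    total = sum(gammas)
    if total == 0.0:
        # Degenerate draw (all zeros): fall back to the unmodified priors.
        return dict(priors)
    return {
        move: (1.0 - epsilon) * priors[move] + epsilon * g / total
        for move, g in zip(moves, gammas)
    }


def sample_move(visit_counts, temperature=1.0, rng=random):
    """Choose a move from MCTS visit counts (dict: move -> visits).

    Assumes at least one move has a non-zero visit count.
    A temperature near 0 picks the most-visited move; a temperature of 1
    samples in proportion to the visit counts.
    """
    moves = list(visit_counts)
    best = max(moves, key=lambda m: visit_counts[m])
    if temperature < 1e-3:
        return best
    # Divide by the largest count first so the exponent cannot overflow.
    top = visit_counts[best]
    weights = [(visit_counts[m] / top) ** (1.0 / temperature) for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes both functions reproducible, which helps when debugging self-play games.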

🚀 Training

Train AlphaZero from scratch with:

python train.py

You can configure:

- Number of self-play games
- MCTS simulations
- Batch size and epochs
- Parallelism

Checkpoints are saved in models/checkpoints/.
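As a rough picture of how these options might be grouped, the sketch below collects them in one dataclass. Everything here is hypothetical: the class name `TrainConfig`, the field names, the default values, and the checkpoint file naming are invented for illustration and may not match what `train.py` actually exposes. Only the `models/checkpoints` directory comes from this README.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical bundle of the training options listed above."""

    num_self_play_games: int = 100   # games generated per training iteration
    num_mcts_simulations: int = 200  # MCTS simulations per move
    batch_size: int = 64
    num_epochs: int = 4              # passes over the collected data
    num_workers: int = 4             # parallel self-play workers
    checkpoint_dir: str = "models/checkpoints"

    def __post_init__(self):
        # Reject non-positive values early instead of failing mid-training.
        for name in ("num_self_play_games", "num_mcts_simulations",
                     "batch_size", "num_epochs", "num_workers"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be >= 1")

    def checkpoint_path(self, iteration):
        """Return an illustrative checkpoint path for a training iteration."""
        return Path(self.checkpoint_dir) / f"model_{iteration:04d}.pt"
```

Keeping the options in a single frozen object makes it easy to log the exact settings next to each checkpoint.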

Evaluate

Evaluate AlphaZero with the evaluate script.

📊 Features

✅ Multi-player support (not just 2 players)
✅ Pluggable board generation
✅ Parallel self-play for speedup
✅ Full AlphaZero training loop
✅ Temperature annealing for more deterministic endgames
✅ Modular and extensible code
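Temperature annealing can take several forms, and this README does not say which one the project uses. The sketch below shows two common shapes under assumed names and thresholds: a step schedule that switches to greedy play after a fixed number of moves, and a linear decay over the expected game length. The names `temperature_for_move` and `linear_temperature`, and the defaults `anneal_after=12`, `high=1.0`, and `low`, are hypothetical.

```python
def temperature_for_move(move_index, high=1.0, low=0.0, anneal_after=12):
    """Step schedule: explore for the first moves, then play greedily.

    anneal_after is an illustrative threshold, not the project's setting.
    """
    if move_index < 0:
        raise ValueError("move_index must be non-negative")
    return high if move_index < anneal_after else low


def linear_temperature(move_index, total_moves, high=1.0, low=0.1):
    """Linear schedule: decay from high to low across the game.

    total_moves is the expected game length, which on custom maps could be
    derived from the number of playable squares.
    """
    # Clamp progress to [0, 1] so overly long games stay at the low value.
    progress = min(max(move_index / max(total_moves, 1), 0.0), 1.0)
    return high + (low - high) * progress
```

On maps of very different sizes, a schedule tied to game length, like the linear one, keeps the share of exploratory moves roughly constant, whereas a fixed step threshold does not.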

🧩 Reversi C++ Engine

All game logic (valid moves, scoring, disqualification, next-player logic) is implemented in optimized C++. To understand the board format, blocked fields, and player handling, see the Reversi C++ Engine readme.

📈 Future Improvements

- Add GUI with PyGame or Web Interface
- Visualization of MCTS tree
- More advanced heuristics for initial value network bootstrap
- League training (train against past models)

🧠 Acknowledgements

- DeepMind AlphaZero Paper
- Leela Chess Zero for inspiration on training infrastructure
- Your own sweat and GPU hours.

📜 License

MIT License – use freely; we do not care, but credit would be nice.
