
ChineseChess-AlphaZero

About

Chinese Chess reinforcement learning by AlphaZero methods.

This project is based on these main resources:

  1. DeepMind's October 19th, 2017 publication: Mastering the Game of Go without Human Knowledge.
  2. The Reversi, Chess, and Chinese chess implementations of the DeepMind ideas that @mokemokechicken, @Akababa, and @TDteach built in their repos: https://github.com/mokemokechicken/reversi-alpha-zero, https://github.com/Akababa/Chess-Zero, https://github.com/TDteach/AlphaZero_ChineseChess
  3. A Chinese chess engine with a GUI: https://github.com/mm12432/MyChess

Note: The distributed version is in the distributed branch.

Note: This repo is still under construction. There is also a slower implementation of the MCTS player (the previous version); see the old branch.

Contributing

In order to build a strong Chinese chess AI following the same techniques as AlphaZero, we need to run this as a distributed project, since it requires a huge amount of computation.

If you want to join us in building the best Chinese chess AI in the world, start with the distributed branch noted above.

Environment

Python 3.6.3+ with Keras running on TensorFlow (see Setup below).

Modules

Reinforcement Learning

This AlphaZero implementation consists of two workers: self (Self-Play, which generates training data by playing games against itself with the current BestModel) and opt (the Trainer, which trains the model on that data and produces new models).

For the sake of faster training (since I don't have 5,000 TPUs), another two workers are involved: eval, which evaluates the NextGenerationModel against the current BestModel, and sl, which trains a model on human game records (supervised learning).
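As a rough illustration of how the four workers fit together behind one entry point (the subcommand names match the sections below; the worker functions here are hypothetical stand-ins, not this repo's real entry points):

    import argparse

    # Hypothetical stand-ins for the real worker entry points.
    def start_self_play():  print("self: generate games with the BestModel")
    def start_training():   print("opt: train on self-play data")
    def start_evaluation(): print("eval: pit NextGenerationModel vs BestModel")
    def start_supervised(): print("sl: train on human game records")

    WORKERS = {"self": start_self_play, "opt": start_training,
               "eval": start_evaluation, "sl": start_supervised}

    parser = argparse.ArgumentParser(prog="cchess_alphazero/run.py")
    parser.add_argument("cmd", choices=sorted(WORKERS))
    WORKERS[parser.parse_args().cmd]()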

GUI

Requirement: pygame

python cchess_alphazero/run.py play

Screenshots

(Screenshot: the game board.)

You can choose different board/piece styles and play either side; see Play with human.

How to use

Setup

Install libraries

pip install -r requirements.txt

If you want to use CPU only, replace tensorflow-gpu with tensorflow in requirements.txt.

Make sure Keras is using the TensorFlow backend and that you have Python 3.6.3+.
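A quick sanity check for both requirements (a minimal sketch; it assumes Keras is already installed):

    import sys
    from keras import backend as K

    # The project needs Python 3.6.3 or newer.
    assert sys.version_info >= (3, 6, 3), "Python 3.6.3+ is required"
    # Keras must be configured with the TensorFlow backend.
    assert K.backend() == "tensorflow", "Keras is not using TensorFlow"
    print("environment looks OK")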

Configuration

PlayDataConfig

PlayConfig, PlayWithHumanConfig

Basic Usage

Self-Play

python cchess_alphazero/run.py self

When executed, self-play will start using the BestModel. If the BestModel does not exist, a new random model will be created and become the BestModel.
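The load-or-create behavior amounts to something like this (a sketch; the weight path and helper name are hypothetical, not the repo's actual names):

    import os

    BEST_MODEL_PATH = "data/model/model_best_weight.h5"  # hypothetical path

    def load_or_create_best_model(model):
        """Return a model initialized as the BestModel."""
        if os.path.exists(BEST_MODEL_PATH):
            # A BestModel already exists: self-play uses it as-is.
            model.load_weights(BEST_MODEL_PATH)
        else:
            # No BestModel yet: the fresh random model becomes the BestModel.
            model.save_weights(BEST_MODEL_PATH)
        return model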

options

Trainer

python cchess_alphazero/run.py opt

When executed, training will start. The current BestModel will be loaded, and the trained model will be saved at the end of every epoch as the new BestModel.
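With Keras, the save-every-epoch behavior can be expressed with a standard ModelCheckpoint callback (a sketch, not necessarily how this repo wires it; the path is hypothetical):

    from keras.callbacks import ModelCheckpoint

    # Overwrite the BestModel weights at the end of every training epoch.
    save_best = ModelCheckpoint("data/model/model_best_weight.h5",  # hypothetical path
                                save_weights_only=True)
    # model.fit(states, [policies, values], epochs=..., callbacks=[save_best])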

options

View training logs in TensorBoard

tensorboard --logdir logs/

Then open http://<The Machine IP>:6006/ in a browser.
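The logs/ directory is populated during training; with Keras that is typically done through the TensorBoard callback (a sketch of one way to wire it, not necessarily how this repo does):

    from keras.callbacks import TensorBoard

    # Write loss/metric scalars under logs/ so `tensorboard --logdir logs/` finds them.
    tb_logger = TensorBoard(log_dir="logs/")
    # model.fit(..., callbacks=[tb_logger])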

Play with human

python cchess_alphazero/run.py play

When executed, the BestModel will be loaded to play against a human.

options

Note: Before you start, you need to download/find a font file (.ttc), rename it to PingFang.ttc, and put it into cchess_alphazero/play_games. I removed the font file from this repo because it is too big, but you can download it from here.

Evaluator

python cchess_alphazero/run.py eval

When executed, the NextGenerationModel will be evaluated against the current BestModel. If the NextGenerationModel does not exist, the worker will wait, checking every 5 minutes, until it appears.
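The wait-and-poll behavior amounts to a simple loop like this (a sketch; the weight path is hypothetical):

    import os
    import time

    NEXT_MODEL_PATH = "data/model/next_generation/model_weight.h5"  # hypothetical path

    # Block until a NextGenerationModel appears, checking every 5 minutes.
    while not os.path.exists(NEXT_MODEL_PATH):
        time.sleep(300)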

options

Supervised Learning

python cchess_alphazero/run.py sl

When executed, training will start. The current SLBestModel will be loaded, and the trained model will be saved at the end of every epoch as the new SLBestModel.

About the data

I have two data sources: one is downloaded from https://wx.jcloud.com/market/packet/10479; the other is crawled from http://game.onegreen.net/chess/Index.html (with the --onegreen option).

options