Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env
This repository implements a simple imitation learning algorithm: DAGGER.
In this example, the agent learns only the steering command, a value in [-1, 1]; the speed is computed
automatically in gym_torcs.TorcsEnv.
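As a rough illustration of the DAGGER loop, here is a minimal sketch in plain NumPy. The toy one-dimensional lane environment, the scripted expert, and the linear least-squares policy are all assumptions made for this sketch; the repository itself trains on image observations from TorcsEnv with TensorLayer. The names ToyLaneEnv, expert_steer, fit_policy, rollout, and dagger are hypothetical.

```python
import numpy as np


def expert_steer(obs):
    """Hypothetical expert: steer back toward the lane centre, clipped to [-1, 1]."""
    return float(np.clip(-2.0 * obs, -1.0, 1.0))


class ToyLaneEnv:
    """Toy stand-in for TorcsEnv (an assumption): the observation is the lateral offset."""

    def __init__(self, seed=0, max_steps=200):
        self.rng = np.random.default_rng(seed)
        self.max_steps = max_steps

    def reset(self):
        self.x = self.rng.uniform(-0.5, 0.5)
        self.t = 0
        return self.x

    def step(self, steer):
        steer = float(np.clip(steer, -1.0, 1.0))
        self.x += 0.1 * steer + self.rng.normal(0.0, 0.02)
        self.t += 1
        # The episode ends when the car leaves the lane or the step budget runs out.
        done = abs(self.x) > 1.0 or self.t >= self.max_steps
        return self.x, done


def fit_policy(observations, actions):
    """Fit a linear policy steer = w * obs + b by least squares."""
    X = np.column_stack([observations, np.ones(len(observations))])
    w, b = np.linalg.lstsq(X, np.asarray(actions), rcond=None)[0]
    return lambda obs: float(np.clip(w * obs + b, -1.0, 1.0))


def rollout(env, policy):
    """Drive one episode with `policy` and return the observations it visited."""
    observations = []
    obs, done = env.reset(), False
    while not done:
        observations.append(obs)
        obs, done = env.step(policy(obs))
    return observations


def dagger(env, n_iterations=5):
    # Iteration 0: the expert drives and labels its own observations.
    data_obs = rollout(env, expert_steer)
    data_act = [expert_steer(o) for o in data_obs]
    policy = fit_policy(data_obs, data_act)

    steps_per_iteration = []
    for _ in range(n_iterations):
        # The learner drives; the expert only labels the states the learner reached.
        visited = rollout(env, policy)
        steps_per_iteration.append(len(visited))
        # Aggregate the newly labelled data with everything collected so far, then retrain.
        data_obs += visited
        data_act += [expert_steer(o) for o in visited]
        policy = fit_policy(data_obs, data_act)
    return policy, steps_per_iteration


policy, steps = dagger(ToyLaneEnv(seed=0))
print(steps)  # episode length of the learner at each DAGGER iteration
```

The key design point is that the learner, not the expert, chooses which states are visited after the first iteration, while the expert supplies the steering labels for those states.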
- Ubuntu (the only platform tested)
- Python 3
- TensorLayer and TensorFlow
- Gym-Torcs
Setting Up
Setting up the environment is somewhat tedious, but any incorrect configuration will cause the example to fail. After installing Gym-Torcs, follow the steps below to confirm that everything works:
Open a terminal:
- Run sudo torcs -vision to start a game.
- Select Race --> Practice --> Configure Race and set the driver to scr_server 1 instead of player.
- Open the Torcs server by selecting Race --> Practice --> New Race. Torcs should then stay on a blue screen showing several lines of text.
Open another terminal:
- Run
in the second terminal; it shows how the fake AI controls the car.
- Press F2 to see the driver view.
Set image size to 64x64x3:
- The model is trained on 64x64 RGB observations.
- Run sudo torcs -vision to start a game.
- Select Options --> Display --> 64x64 --> Apply.
Make sure everything above works, and then run:
It starts a Torcs server at the beginning of every episode and terminates the server when the car crashes or the speed is too low.
Note that the self-contained
is modified from Gym-Torcs; you can try different settings (such as the default speed and the termination speed) by modifying it.
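The termination rule described above (end the episode when the car crashes or the speed is too low) could be expressed roughly as follows. The function name, the threshold values, and the start-up grace period are hypothetical and are not taken from the repository's environment file; the grace period is included only because a car starting from rest would otherwise be stopped immediately.

```python
def should_terminate(crashed, speed, step, min_speed=5.0, grace_steps=100):
    """Return True when the episode should end (hypothetical rule and defaults)."""
    if crashed:
        return True
    # Ignore low speed during the first steps, while the car is still accelerating.
    return step > grace_steps and speed < min_speed


# Example: a slow car early in the episode is not stopped, but a slow car later on is.
print(should_terminate(crashed=False, speed=0.0, step=10))
print(should_terminate(crashed=False, speed=1.0, step=150))
```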
After Episode 1, the car crashes after 315 steps.
After Episode 3, the car does not crash anymore.
The number of steps and episodes may vary depending on the parameter initialization.