A step towards procedural terrain generation with GANs
Authors: Christopher Beckham, Christopher Pal
Procedural generation in video games is the algorithmic generation of content intended to increase replay value by interleaving the gameplay with elements of unpredictability. This is in contrast to the more traditional, 'handcrafted' generation of content, which is generally of higher quality but comes at the added expense of labour. A prominent game whose premise is almost entirely based on procedural terrain generation is Minecraft, in which the player explores a vast open world whose terrain is built entirely from voxels ('volumetric pixels'), allowing the player to manipulate the terrain (e.g. dig tunnels, build walls) and explore interesting landscapes (e.g. beaches, jungles, caves).
So far, terrains have been procedurally generated through a host of algorithms designed to mimic real-life terrain. Prominent examples include Perlin noise and diamond-square, in which a greyscale image (a heightmap) is generated from a noise source and, when rendered in 3D as a mesh, produces a terrain. While these methods are quite fast, the terrains they generate are quite simple in nature. Software such as L3DT employs more sophisticated algorithms that let the user control what kind of terrain they desire (e.g. mountains, lakes, valleys), and while these can produce very impressive terrains, it still seems an exciting endeavour to leverage the power of deep generative models (such as the GAN) to learn to generate terrain directly from raw data, without having to manually write the generation algorithms.
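For concreteness, here is a small self-contained sketch (not code from this repo) of the diamond-square algorithm mentioned above; it grows a greyscale heightmap from a few random corner values by repeatedly averaging and perturbing midpoints.

```python
# A minimal diamond-square sketch (illustrative only, not part of this repo).
import numpy as np

def diamond_square(n=9, roughness=0.6, seed=0):
    """Return a (2**n + 1) x (2**n + 1) heightmap with values in [0, 1]."""
    rng = np.random.RandomState(seed)
    size = 2**n + 1
    hm = np.zeros((size, size))
    # Seed the four corners with random values.
    hm[0, 0], hm[0, -1], hm[-1, 0], hm[-1, -1] = rng.rand(4)
    step, scale = size - 1, 1.0
    while step > 1:
        half = step // 2
        # Diamond step: centre of each square = mean of its 4 corners + noise.
        for y in range(half, size, step):
            for x in range(half, size, step):
                avg = (hm[y - half, x - half] + hm[y - half, x + half] +
                       hm[y + half, x - half] + hm[y + half, x + half]) / 4.0
                hm[y, x] = avg + (rng.rand() - 0.5) * scale
        # Square step: midpoint of each edge = mean of its neighbours + noise.
        for y in range(0, size, half):
            for x in range((y + half) % step, size, step):
                nbrs = []
                if y >= half:       nbrs.append(hm[y - half, x])
                if y + half < size: nbrs.append(hm[y + half, x])
                if x >= half:       nbrs.append(hm[y, x - half])
                if x + half < size: nbrs.append(hm[y, x + half])
                hm[y, x] = np.mean(nbrs) + (rng.rand() - 0.5) * scale
        step, scale = half, scale * roughness
    # Normalise to [0, 1] so it can be saved as a greyscale heightmap.
    return (hm - hm.min()) / (hm.max() - hm.min())

heightmap = diamond_square(n=9)  # 513 x 513 heightmap
```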
Datasets
In this work, we leverage extremely high-resolution terrain and heightmap data provided by the NASA Visible Earth project, in conjunction with generative adversarial networks (GANs), to create a two-stage pipeline in which heightmaps can be randomly generated, along with texture maps inferred from those heightmaps. Concretely, we synthesise 512px height and texture maps using random 512px crops from the original NASA images (of size 21600px x 10800px), as seen in the images below. (Note: per-pixel resolution is 25km, so a 512px crop corresponds to ~13k square km.)
<a href="https://eoimages.gsfc.nasa.gov/images/imagerecords/73000/73934/gebco_08_rev_elev_21600x10800.png"> <img src="https://github.com/christopher-beckham/gan-heightmaps/raw/master/md/earth_heightmap.png" /> </a>

<a href="https://eoimages.gsfc.nasa.gov/images/imagerecords/74000/74218/world.200412.3x21600x10800.jpg"> <img src="https://github.com/christopher-beckham/gan-heightmaps/raw/master/md/earth_texture.jpg" /> </a>
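For illustration, here is a rough sketch of how aligned 512px (heightmap, texture) crops can be taken from the two source images linked above. This is not necessarily the exact preprocessing behind the provided .h5 file; the filenames are simply the downloads behind those links.

```python
# A rough sketch of sampling aligned 512px crops from the full-resolution
# NASA images (assumption: the repo's actual .h5 preprocessing may differ).
import numpy as np
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # the source images are 21600 x 10800
heightmap = np.array(Image.open("gebco_08_rev_elev_21600x10800.png").convert("L"))
texture   = np.array(Image.open("world.200412.3x21600x10800.jpg"))

def random_crop_pair(crop=512, rng=np.random):
    """Return an aligned (heightmap, texture) crop of size crop x crop."""
    h, w = heightmap.shape[:2]
    y, x = rng.randint(0, h - crop), rng.randint(0, w - crop)
    return heightmap[y:y + crop, x:x + crop], texture[y:y + crop, x:x + crop]

hm_crop, tex_crop = random_crop_pair()
```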
Results

We show some heightmaps and texture maps generated by the model. Note that we refer to the heightmap generator as the 'DCGAN' (since it essentially follows the DCGAN paper), and to the texture generator as 'pix2pix' (since it is based on the conditional image-to-image translation GAN paper).
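Putting the two stages together, sampling a terrain looks roughly like the sketch below. Here `dcgan_gen` and `p2p_gen` are hypothetical handles to the trained Keras-like generators; the actual model names, shapes, and tensor layouts in this repo may differ.

```python
# A minimal sketch of the two-stage sampling pipeline described above.
import numpy as np

def sample_terrain(dcgan_gen, p2p_gen, batch_size=1, z_dim=100):
    # Stage 1: draw z ~ p(z) and synthesise heightmaps with the DCGAN generator.
    z = np.random.normal(0, 1, size=(batch_size, z_dim)).astype("float32")
    heightmaps = dcgan_gen.predict(z)        # e.g. (N, 512, 512, 1)
    # Stage 2: translate each heightmap into a texture map with the pix2pix generator.
    textures = p2p_gen.predict(heightmaps)   # e.g. (N, 512, 512, 3)
    return heightmaps, textures
```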
DCGAN
Here are some heightmaps generated by the DCGAN part of the network at roughly 590 epochs.
<img src="https://github.com/christopher-beckham/gan-heightmaps/blob/master/output/test1_repeatnod_fixp2p_nobn/dump_a_bakup_593ish/0.png" width="256" height="256" /> <img src="https://github.com/christopher-beckham/gan-heightmaps/blob/master/output/test1_repeatnod_fixp2p_nobn/dump_a_bakup_593ish/3.png" width="256" height="256" /> <img src="https://github.com/christopher-beckham/gan-heightmaps/blob/master/output/test1_repeatnod_fixp2p_nobn/dump_a_bakup_593ish/14.png" width="256" height="256" />
<img src="https://github.com/christopher-beckham/gan-heightmaps/blob/master/output/test1_repeatnod_fixp2p_nobn/dump_a_bakup_593ish/12.png" width="256" height="256" /> <img src="https://github.com/christopher-beckham/gan-heightmaps/blob/master/output/test1_repeatnod_fixp2p_nobn/dump_a_bakup_593ish/17.png" width="256" height="256" /> <img src="https://github.com/christopher-beckham/gan-heightmaps/blob/master/output/test1_repeatnod_fixp2p_nobn/dump_a_bakup_593ish/16.png" width="256" height="256" />
Click here to see a video showing linear interpolations between 100 randomly generated heightmaps (and their corresponding textures).
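A minimal sketch of such an interpolation, assuming it is done linearly in the DCGAN's latent space and reusing the hypothetical `dcgan_gen`/`p2p_gen` handles from the earlier sketch:

```python
# Decode a sequence of frames along a straight line between two latent codes.
import numpy as np

def interpolate(dcgan_gen, p2p_gen, z_a, z_b, steps=30):
    frames = []
    for t in np.linspace(0.0, 1.0, steps):
        z = (1.0 - t) * z_a + t * z_b          # linear interpolation in z-space
        hm = dcgan_gen.predict(z[None, ...])   # heightmap for this z
        tex = p2p_gen.predict(hm)              # matching texture map
        frames.append((hm[0], tex[0]))
    return frames
```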
pix2pix
Texture maps generated from ground-truth heightmaps at roughly 600 epochs (generating texture maps from generated heightmaps performed similarly):
<img src="https://github.com/christopher-beckham/gan-heightmaps/raw/master/output/test1_repeatnod_fixp2p_nobn/out_600.png" width="800" />(TODO: texture outputs can be patchy, probably because the discriminator is actually a PatchGAN. I will need to run some experiments using a 1x1 discriminator instead.)
Here is one of the generated heightmaps + corresponding texture map rendered in Unity:
<img src="https://github.com/christopher-beckham/gan-heightmaps/raw/master/md/unity_render_24.png" width="800" />Running the code
First we need to download the data. You can find the .h5 file here. A quick notebook to visualise the data can also be found in `notebooks/visualise_data.ipynb`.
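If you just want to peek at the data outside the notebook, something like the following works. The key names `xt`/`yt` are assumptions, so list the file's contents (or check `notebooks/visualise_data.ipynb`) for the actual layout.

```python
# A quick sketch for inspecting the downloaded .h5 file (key names assumed).
import h5py
import matplotlib.pyplot as plt

with h5py.File("terrain.h5", "r") as f:   # path to wherever you saved the file
    f.visit(print)                        # list every dataset in the file
    hm  = f["xt"][0]                      # hypothetical: first heightmap
    tex = f["yt"][0]                      # hypothetical: its texture map
    # (transpose tex if the data is stored channels-first)

plt.subplot(1, 2, 1); plt.imshow(hm.squeeze(), cmap="gray"); plt.title("heightmap")
plt.subplot(1, 2, 2); plt.imshow(tex); plt.title("texture")
plt.show()
```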
To run an experiment, you need to specify a particular experiment. These are defined in `experiments.py`. I recommend you use the experiment called `test1_nobn_bilin_both`. The `experiments.py` file is what you run to launch experiments, and it expects two command-line arguments in the form `<experiment name> <mode>`. For example, to run `test1_nobn_bilin_both` we simply do:

```
python experiments.py test1_nobn_bilin_both train
```

(NOTE: you will need to modify this file and change the URL to point to where your .h5 file is.)
For all the Theano bells and whistles, I create a bash script like so:
```bash
#!/bin/bash

env
PYTHONUNBUFFERED=1 \
THEANO_FLAGS=mode=FAST_RUN,device=cuda,floatX=float32,nvcc.fastmath=True,dnn.conv.algo_fwd=time_once,dnn.conv.algo_bwd_filter=time_once,dnn.conv.algo_bwd_data=time_once \
python experiments.py test1_nobn_bilin_both train
```
Various things will be dumped into the results folder (for the aforementioned experiment, this is `output/test1_nobn_bilin_both`) while it is training. The files you can expect to see are:
* `results.txt`: various metrics in .csv format that you can plot to examine training (see the plotting sketch after this list).
* `gen_dcgan.png`, `disc_dcgan.png`: architecture diagrams for the DCGAN generator and discriminator.
* `gen_p2p.png`, `disc_p2p.png`: architecture diagrams for the P2P generator and discriminator.
* `out_*.png`: outputs from the P2P GAN, showing the predicted Y from the ground truth X.
* `dump_a/*.png`: outputs from the DCGAN, in which X's are synthesised from z's drawn from the prior p(z).
* `dump_train/*.a.png`, `dump_train/*.b.png`: high-res ground truth X's along with their predicted Y's for the training set.
* `dump_valid/*.a.png`, `dump_valid/*.b.png`: high-res ground truth X's along with their predicted Y's for the validation set.
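As a starting point for plotting `results.txt`, here is a small sketch. The column names are not assumed; anything whose name contains "loss" gets plotted, so check the file's header to see exactly which metrics were logged.

```python
# Plot training metrics from results.txt (column names discovered at runtime).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("output/test1_nobn_bilin_both/results.txt")
print(df.columns.tolist())            # see which metrics were logged
for col in df.columns:
    if "loss" in col:                 # e.g. generator / discriminator losses
        plt.plot(df[col], label=col)
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```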
This code was inspired by -- and uses code from -- Costa et al.'s vess2ret, which is a pix2pix implementation used for retinal image synthesis. Some code was also used from the keras-adversarial library.