
Tensorflow TCN

The explanation and graphs in this README.md refer to Keras-TCN.

A Temporal Convolutional Network implemented with TensorFlow 1.13 (eager execution).

Why Temporal Convolutional Network?

<p align="center"> <img src="misc/Dilated_Conv.png"> <b>Visualization of a stack of dilated causal convolutional layers (Wavenet, 2016)</b><br><br> </p>

API

Arguments

tcn = TemporalConvNet(num_channels, kernel_size, dropout)

Input shape

3D tensor with shape (batch_size, timesteps, input_dim).

Output shape

The output shape depends on the task (see the examples below).
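For concreteness, here is a minimal sketch of building a batch in this layout with NumPy; the commented-out model call uses illustrative, hypothetical hyper-parameter values:

```python
import numpy as np

batch_size, timesteps, input_dim = 32, 100, 1

# One batch of univariate sequences: (batch_size, timesteps, input_dim)
x = np.random.rand(batch_size, timesteps, input_dim).astype(np.float32)

# Hypothetical usage (channel sizes chosen only for illustration):
# tcn = TemporalConvNet(num_channels=[25, 25, 25, 25], kernel_size=7, dropout=0.0)
# y = tcn(tf.convert_to_tensor(x), training=True)
```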

Receptive field

<p align="center"> <img src="https://user-images.githubusercontent.com/40159126/41830054-10e56fda-7871-11e8-8591-4fa46680c17f.png"> <b>ks = 2, dilations = [1, 2, 4, 8], 1 block</b><br><br> </p> <p align="center"> <img src="https://user-images.githubusercontent.com/40159126/41830618-a8f82a8a-7874-11e8-9d4f-2ebb70a31465.jpg"> <b>ks = 2, dilations = [1, 2, 4, 8], 2 blocks</b><br><br> </p> <p align="center"> <img src="https://user-images.githubusercontent.com/40159126/41830628-ae6e73d4-7874-11e8-8ecd-cea37efa33f1.jpg"> <b>ks = 2, dilations = [1, 2, 4, 8], 3 blocks</b><br><br> </p>
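A quick way to sanity-check these figures: in a stack of dilated causal convolutions, each layer with dilation d extends the receptive field by (ks - 1) * d. The sketch below assumes one convolution per dilation level per block; residual blocks that apply two convolutions per level would double the per-block term.

```python
def receptive_field(kernel_size, dilations, num_blocks):
    """Receptive field of stacked dilated causal convolutions
    (assuming one convolution per dilation level per block)."""
    return 1 + num_blocks * (kernel_size - 1) * sum(dilations)

print(receptive_field(2, [1, 2, 4, 8], 1))  # 16
print(receptive_field(2, [1, 2, 4, 8], 2))  # 31
print(receptive_field(2, [1, 2, 4, 8], 3))  # 46
```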

Run

Each task has a separate folder. Inside each folder you will usually find utils.py, model.py, and train.py: utils.py generates the data, model.py builds the TCN model, and train.py trains it. The hyper-parameters in train.py are set via argparse. Pre-trained models are saved in weights/.

cd adding_problem/
python train.py # run adding problem task

cd copy_memory/
python train.py # run copy memory task

cd mnist_pixel/
python train.py # run sequential mnist pixel task

cd word_ptb/
python train.py # run PennTreebank word-level language model task

Training details for each task can be found in the README.md of each folder.

Tasks

Adding Task

The task consists of feeding a large array of decimal numbers to the network, along with a boolean array of the same length. The objective is to sum the two decimals at the positions where the boolean array contains a 1.

<p align="center"> <img src="misc/Adding_Task.png"> <b>Adding Problem Task</b><br><br> </p>
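Data for this task can be generated along the following lines (a NumPy sketch; the actual generator lives in adding_problem/utils.py and may differ in details):

```python
import numpy as np

def adding_problem_batch(batch_size, seq_len, seed=None):
    """Generate one batch: each input step has (value, marker) features;
    the target is the sum of the two marked values."""
    rng = np.random.default_rng(seed)
    values = rng.random((batch_size, seq_len))
    markers = np.zeros((batch_size, seq_len))
    for i in range(batch_size):
        idx = rng.choice(seq_len, size=2, replace=False)  # pick the two 1s
        markers[i, idx] = 1.0
    x = np.stack([values, markers], axis=-1)            # (batch, seq_len, 2)
    y = (values * markers).sum(axis=1, keepdims=True)   # (batch, 1)
    return x, y
```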

Copy Memory Task

The copy memory task consists of a very large array, most of which is empty.

The idea is to copy the content of a short vector x, placed at the start, to the end of the large array. The task is made harder by increasing the number of 0s in the middle.

<p align="center"> <img src="misc/Copy_Memory_Task.png"> <b>Copy Memory Task</b><br><br> </p>
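A common construction (used in the original TCN paper) makes the input a sequence of length T + 20: ten digits in 1..8, then zeros, then eleven delimiter symbols (9) signalling where to start copying. A sketch assuming this layout (the repo's copy_memory/utils.py may differ in details):

```python
import numpy as np

def copy_memory_batch(batch_size, T, seed=None):
    """Input: [x (10 digits), T-1 zeros, 11 nines]; the target is all
    zeros except the last 10 positions, which repeat x."""
    rng = np.random.default_rng(seed)
    seq = rng.integers(1, 9, size=(batch_size, 10))        # the vector x
    blank = np.zeros((batch_size, T - 1), dtype=np.int64)  # the long middle
    marker = np.full((batch_size, 11), 9, dtype=np.int64)  # "start copying"
    x = np.concatenate([seq, blank, marker], axis=1)       # (batch, T + 20)
    y = np.concatenate(
        [np.zeros((batch_size, T + 10), dtype=np.int64), seq], axis=1)
    return x, y
```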

Sequential MNIST

The idea here is to consider MNIST images as 1-D sequences and feed them to the network. This task is particularly hard because each sequence has 28*28 = 784 elements. To classify correctly, the network has to remember the entire sequence. Standard LSTMs are unable to perform well on this task.

<p align="center"> <img src="misc/Sequential_MNIST_Task.png"> <b>Sequential MNIST</b><br><br> </p>
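Turning the images into sequences is just a reshape: each 28x28 image becomes 784 time steps with one feature per step, matching the (batch_size, timesteps, input_dim) input layout above.

```python
import numpy as np

images = np.zeros((32, 28, 28), dtype=np.float32)  # a dummy MNIST-like batch
sequences = images.reshape(32, 28 * 28, 1)         # (batch, 784 timesteps, 1 feature)
```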

PennTreebank

In word-level language modeling, each element of the sequence is a word, and the model is expected to predict the next word in the text. We evaluate the Temporal Convolutional Network as a word-level language model on PennTreebank.

References