RNNProteins
World record on the CB513 dataset, trained on cullpdb+profile_6133_filtered (68.9% Q8 with the best single-performing model); data available at: http://www.princeton.edu/~jzthree/datasets/ICML2014/
By Alexander Rosenberg Johansen
Previous best single-model result: 68.3% Q8, by Deep CNF
Reproducing results
Installation
Please refer to Lasagne's installation guide for installation and setup of a GPU environment on an Ubuntu 14.04 machine.
Getting the repository
Go to the desired folder for the repo and run in a terminal:
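For example (using the repository URL from the configuration link further below):
git clone https://github.com/alrojo/RNNProteins.git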
Training models
Use train.py once the data is unzipped in the data folder, e.g.
python train.py baseline_L2
Best network elaborated
Configuration file: https://github.com/alrojo/RNNProteins/blob/master/configurations/avg1.py (a rough code sketch follows the layer list below)
- InputLayer
- 3x ConvLayer(InputLayer, filter_size=3-5-7) + Batch Normalization
- DenseLayer1([ConcatLayer, InputLayer]) + Batch Normalization
- LSTMLayerF(DenseLayer1, Forward)
- LSTMLayerB([DenseLayer1, LSTMLayerF], Backward)
- DenseLayer2([LSTMLayerF, LSTMLayerB], dropout=0.5)
- OutLayer(DenseLayer2)
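The following is a minimal Lasagne sketch of that layer stack, assuming batch-first (batch, sequence, features) input; the layer sizes, nonlinearities and initializers here are placeholders, the actual values live in configurations/avg1.py.

# Minimal sketch of the avg1 architecture (hypothetical sizes; see configurations/avg1.py)
import lasagne
from lasagne.layers import (InputLayer, DimshuffleLayer, Conv1DLayer, ConcatLayer,
                            ReshapeLayer, DenseLayer, DropoutLayer, LSTMLayer,
                            batch_norm)
from lasagne.nonlinearities import rectify, softmax

BATCH, SEQ_LEN, N_FEAT = None, 700, 42   # sequence length / input features (assumed)
N_FILT, N_HID, N_CLASSES = 64, 400, 8    # hypothetical filter and hidden sizes

# InputLayer: (batch, sequence, features)
l_in = InputLayer((BATCH, SEQ_LEN, N_FEAT))

# Conv1DLayer expects (batch, channels, length), so shuffle dimensions first.
l_conv_in = DimshuffleLayer(l_in, (0, 2, 1))

# 3x ConvLayer(InputLayer, filter_size=3/5/7) + Batch Normalization
convs = [batch_norm(Conv1DLayer(l_conv_in, N_FILT, fs, pad='same',
                                nonlinearity=rectify))
         for fs in (3, 5, 7)]
l_conv = DimshuffleLayer(ConcatLayer(convs, axis=1), (0, 2, 1))

# DenseLayer1([ConcatLayer, InputLayer]) + Batch Normalization -- note the skip
# connection from the raw input; the dense layer is applied per time step by
# flattening (batch, seq) into one axis.
l_skip = ConcatLayer([l_conv, l_in], axis=2)
l_flat = ReshapeLayer(l_skip, (-1, 3 * N_FILT + N_FEAT))
l_dense1 = batch_norm(DenseLayer(l_flat, N_HID, nonlinearity=rectify))
l_dense1 = ReshapeLayer(l_dense1, (-1, SEQ_LEN, N_HID))

# LSTMLayerF(DenseLayer1, Forward)
l_lstm_f = LSTMLayer(l_dense1, N_HID)

# LSTMLayerB([DenseLayer1, LSTMLayerF], Backward) -- the "vertical" link
l_lstm_b = LSTMLayer(ConcatLayer([l_dense1, l_lstm_f], axis=2), N_HID,
                     backwards=True)

# DenseLayer2([LSTMLayerF, LSTMLayerB], dropout=0.5)
l_cat = DropoutLayer(ConcatLayer([l_lstm_f, l_lstm_b], axis=2), p=0.5)
l_dense2 = DenseLayer(ReshapeLayer(l_cat, (-1, 2 * N_HID)), N_HID,
                      nonlinearity=rectify)

# OutLayer(DenseLayer2): per-residue softmax over the 8 secondary structure classes
l_out = DenseLayer(l_dense2, N_CLASSES, nonlinearity=softmax)
l_out = ReshapeLayer(l_out, (-1, SEQ_LEN, N_CLASSES))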
Gradients are clipped when their norm grows too large and output probabilities are clipped away from 0 and 1. RMSProp is used as the optimizer with L2 weight decay of 0.0001.
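A rough Lasagne/Theano sketch of that training setup, with assumed clipping thresholds and learning rate (see train.py for the actual values):

# Sketch of the training objective (l_out comes from the architecture sketch above)
import theano.tensor as T
import lasagne

targets = T.imatrix('targets')                       # (batch, seq) class indices, assumed

predictions = lasagne.layers.get_output(l_out)       # (batch, seq, 8)
predictions = T.clip(predictions, 1e-7, 1 - 1e-7)    # probability clipping
loss = lasagne.objectives.categorical_crossentropy(
    predictions.reshape((-1, 8)), targets.reshape((-1,))).mean()

# L2 weight decay of 1e-4 on all network parameters
loss += 1e-4 * lasagne.regularization.regularize_network_params(
    l_out, lasagne.regularization.l2)

# Clip the gradient norm before handing the gradients to RMSProp
params = lasagne.layers.get_all_params(l_out, trainable=True)
grads = T.grad(loss, params)
grads = lasagne.updates.total_norm_constraint(grads, max_norm=5.0)  # max_norm assumed
updates = lasagne.updates.rmsprop(grads, params, learning_rate=1e-3)  # lr assumed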
Project elaboration: started July 2015, still ongoing
This project is a continuation of Søren Sønderby's (github.com/skaae) previous results on CB513: http://arxiv.org/abs/1412.7828, supervised by Ole Winther (cogsys.imm.dtu.dk/staff/winther/).
My project was to recreate Søren's results and to test: convolutional layers across time, L2 regularization, "vertical" links (feeding the forward LSTM into the backward LSTM), batch normalization, different optimizers, etc.
It took me approximately 3 months (with a grid search of 200-300 models) before I managed to achieve validation results similar to Søren's (apparently a DropoutLayer in the DenseLayer before the first LSTM hurt model performance, which is why it took so long to reproduce Søren's results).
After achieving similar performance I started applying the various "new" techniques to my neural network. It took another 100-150 models of grid searching over various combinations, which led to the model in "Best network elaborated". Running the final test (which will be reported in a methods paper later) gave a performance increase of 1.5% compared to Søren's results. Note the use of the skip connection and that the three convolutional layers all operate directly on the input.
Other possible routes to improved results include batch-normalized RNNs as used by Baidu (http://arxiv.org/abs/1512.02595) and applying Batch Normalization after the LSTM once Lasagne supports masked batch normalization.
The article will be submitted as a Methods paper to Nature together with some other research from Ole Winther's lab. My first draft is in the article folder.
Next up: make a 10-model average and finish the ImageNet12-article-style drawings of the final network.