LSTM Implementation in Caffe

Note that the master branch of Caffe now supports LSTM (Jeff Donahue's implementation has been merged). This repo is no longer maintained.

Speed comparison (Titan X, 3-layer LSTM with 2048 units)

Jeff's code is more modular, whereas this code is optimized specifically for LSTM: it computes the gradient w.r.t. the recurrent weights with a single matrix computation.
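The idea behind that optimization can be sketched as follows. This is not the repo's actual Caffe code, just a NumPy illustration with made-up sizes: accumulating the recurrent-weight gradient one timestep at a time gives the same result as stacking the hidden states and errors and doing one matrix product.

```python
import numpy as np

T, H = 50, 64                        # timesteps, hidden units (illustrative)
rng = np.random.default_rng(0)
h = rng.standard_normal((T, H))      # hidden states h_1 .. h_T
delta = rng.standard_normal((T, H))  # errors backpropagated to the recurrent input

# Per-timestep accumulation: dW += outer(h_{t-1}, delta_t) for each t
dW_loop = np.zeros((H, H))
for t in range(1, T):
    dW_loop += np.outer(h[t - 1], delta[t])

# Single matrix computation over all timesteps at once
dW_single = h[:-1].T @ delta[1:]

assert np.allclose(dW_loop, dW_single)
```

Collapsing the loop into one GEMM lets the GPU do the whole accumulation in a single large, well-utilized kernel, which is where the backward-pass speedup in the tables below comes from.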

| Code | Forward (ms) | Backward (ms) | Total (ms) |
| --- | --- | --- | --- |
| This code | 248 | 291 | 539 |
| Jeff's code | 264 | 462 | 726 |

| Code | Forward (ms) | Backward (ms) | Total (ms) |
| --- | --- | --- | --- |
| This code | 131 | 118 | 249 |
| Jeff's code | 140 | 290 | 430 |

| Code | Forward (ms) | Backward (ms) | Total (ms) |
| --- | --- | --- | --- |
| This code | 49 | 59 | 108 |
| Jeff's code | 52 | 92 | 144 |

| Code | Forward (ms) | Backward (ms) | Total (ms) |
| --- | --- | --- | --- |
| This code | 29 | 26 | 55 |
| Jeff's code | 30 | 61 | 91 |

Example

Example code is in /examples/lstm_sequence/. In this example, an LSTM network is trained to generate a predefined sequence without any inputs. This experiment was introduced by the Clockwork RNN paper. Four different LSTM networks and shell scripts (.sh) for training them are provided. Each script generates a log file containing the predicted sequence and the true sequence, and you can use plot_result.m to visualize the result. The results of the four LSTM networks will be as follows:
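To make the "no inputs" setup concrete, here is a minimal NumPy sketch of the generation mechanism, with assumed shapes and untrained random weights rather than the example's actual trained networks: the LSTM is driven by a zero input at every step, so the emitted sequence depends only on its weights and evolving internal state.

```python
import numpy as np

H = 15                                        # hidden units (illustrative)
rng = np.random.default_rng(0)
Wh = rng.standard_normal((4 * H, H)) * 0.1    # recurrent weights for gates i, f, o, g
b = np.zeros(4 * H)                           # gate biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generate(T):
    """Run the LSTM for T steps with zero external input; emit one scalar per step."""
    h = np.zeros(H)
    c = np.zeros(H)
    out = []
    for _ in range(T):
        z = Wh @ h + b                        # input contribution is zero
        i = sigmoid(z[0 * H:1 * H])           # input gate
        f = sigmoid(z[1 * H:2 * H])           # forget gate
        o = sigmoid(z[2 * H:3 * H])           # output gate
        g = np.tanh(z[3 * H:4 * H])           # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
        out.append(h.sum())                   # stand-in for a 1-D readout layer
    return np.array(out)

seq = generate(320)                           # a 320-step generated sequence
```

Training then consists of fitting the weights so that this zero-input rollout matches the target sequence, which is exactly what the provided .sh scripts do with the four Caffe network definitions.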