Awesome
tf-lstm-char-cnn
Tensorflow port of Yoon Kim's Torch7 code. See also similar project here that failed to reproduce Kim's results and was apparently abandoned by the author. Many pieces of code are borrowed from it.
Installation
You need tensorflow version 1.0 and python (obviously).
Running
python train.py -h
python evaluate.py -h
Training
python train.py
will train large model from Yoon Kim's paper.
Evaluate
python evaluate.py --load_model cv/epoch024_4.4962.model
evaluates this model on the test dataset
Generate
python generate.py --load_model cv/epoch024_4.4962.model
generates random text from the loaded model
Model
Model_1
graph is used for inference (computing validation loss and perplexity during training)
Model
graph is used for training.
Pre-trained model
Here is a ZIP file containing model files (created with TF 1.0). This model was trained with the default parameters and acheved the accuracy of the published result.
Pre-trained model 60Mb
Results
Learning rate | Train/Valid/Test loss | Train/Valid/Test perplexity |
---|---|---|
1.0 | 3.815 / 4.407 / 4.369 | 35.40 / 82.02 / 79.03 |
Note that model DOES reproduce the published result.
Training times (legacy, with TF 0.12)
Timings were recorded on AWS EC2 instancies:
c4.8xlarge
- 32 CPUs, no GPUsg2.2xlarge
- 8 CPU, 1 GPU (K520)p2.xlarge
- 4 CPU, 1 GPU (K80)
Timing | c4.8xlarge | g2.2xlarge | p2.xlarge |
---|---|---|---|
Secs per batch | 0.98 | 2.85 | 0.32 |
Secs per epoch | 1404 | 3779 | 428 |
Takes 3 hours to complete training (25 epochs) on p2.xlarge
machine
Generating random text
default charges during the straight summer of the collapse of it works that it <unk> to the u.s. is very <unk>
mr. boyd 's remarks are crucial to the champion to a close <unk> in N he says
explains it 's only one of the many really identified
it did n't end a deal we do n't know what 's no longer viable
<unk> <unk> <unk> 's <unk> hearing his head
nora more than a$ N million has had no control in N it would do that could win any new other <unk> corporate and real-estate <unk> a future number of names <unk> at telling an <unk> outside the international institution
the extensive approach is a real form of unpublished football <unk> as the <unk> bells on negotiation refund
in other words the others who paid $ N a figure
what 's really the <unk> who can use a vast hundred the transaction would give it a better use of the <unk> and
the demise of the topic is fatal than we can do anything that is like the change in reality says justice <unk> <unk> of <unk> & <unk> a direct mail marketing firm
when when lawyer <unk> <unk> <unk> as corporate crime remains a distant younger woman says i 've seen however an injection in the new york money fund
some hearings have detectors with <unk> have survived mr. achenbaum 's veto incorporated the <unk> delicate center which <unk> a heavy medium that lawyers call are confusing big inquiries for the forecasting two outlets to had the claims
the afghan people have a series that 's really heard to share legislation for <unk> without all their racial cosmetic ties
he is going to hold his ...