Awesome
Torch Starter
This is a simple Torch7 starter package. It can be used as a simplified kickoff point for a Torch project.
<img src='http://i.imgur.com/3a5fAAy.png' width='500'>I pieced together this package largely from Torch7 resources online. I mostly just copied the code, and stripped a lot of extra functionality out, to make it easier to hack on.
If something is not clear, or could be made more simple, please let me know. The goal is to be simple.
Installation
If you are at CSAIL, you can use my Torch installation:
. /data/vision/torralba/commonsense/torch/install/bin/torch-activate
export LD_LIBRARY_PATH=/data/vision/torralba/commonsense/cudnnv5/cuda/lib64:$LD_LIBRARY_PATH
Otherwise, installation is fairly simple. You need to install:
- Torch7
- cunn for training on GPU
- cudnn for faster training on GPU
- tds for some data structures
- display for graphs
You can install all of these with the commands:
# install torch first
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh
# install libraries
luarocks install cunn
luarocks install cudnn
luarocks install tds
luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
Learning Resources
Model
I trained an AlexNet-esque network on Places365 with this code, which you can download here. This model obtains 50% top-1 accuracy on the validation set. This is slightly worse than the published result because we didn't do averaging over 10 crops.
If you use this model, please cite the Places2 paper (of which I am not affiliated). Note this model is slightly different from the AlexNet in Caffe. Notable differences: no groups in the convolutions, no spatial normalization, batch normalizaiton, trained with Adam instead of SGD, and sampling with replacement. It is unclear to me whether these changes have a significant impact on performance.
Data Setup
By default, we assume you have a text file that lists your dataset. This text does not store your dataset; it just lists filepaths to it, and any meta data. Each line in this text file represents one training example, and its associated category ID. The syntax of the line should be:
<filename><tab><number>
For example:
bedroom/IMG_050283.jpg 5
bedroom/IMG_237761.jpg 5
office/IMG_838222.jpg 10
The <number>
should start counting at 1.
After you create this file, open main.lua
and change data_list
to point to this file. You can specify a data_root
too, which will be prepended to each filename.
Training
Define your model in the net
variable. By default, it is AlexNet. To learn more about the modules you can use, see nn. You can also adjust your loss with the criterion
variable.
Remember to also adjust any options in opt
, such as the learning rate and the number of classes. Setting these hyperparameters is a bit of an art, but generally it is recommended to use a learning rate of 0.001
and batch size of at least 64
, but 128
or 256
may be better if you have the memory. For a systematic study, see this paper.
Finally, to start training, just do:
$ CUDA_VISIBLE_DEVICES=0 th main.lua
where you replace the number after CUDA_VISIBLE_DEVICES
with the GPU you want to run on.
You can find which GPU to use with $ nvidia-smi
on our GPU cluster. Note: this number is 0-indexed, unlike the rest of Torch!
During training, it will dump snapshots to the checkpoints/
directory every epoch. Each time you start a new experiment, you should change the name
(in opt
), to avoid overwriting previous experiments. The code will not warn you about this (to keep things simple).
Evaluation
To evaluate your model, you can use the eval.lua
script. It mostly follows the same format as main.lua
. It reads your validation/testing dataset from a file similar to before, and sequentially runs through it, calculating both the top-1 and top-5 accuracy.
Graphics, Logs
If you want to see graphics and the loss over time, in a different shell on the same machine, run this command:
$ th -ldisplay.start 8000 0.0.0.0
then navigate to http://HOST.csail.mit.edu:8000
in your browser. Every 10th iteration it will
push graphs.
On the CSAIL vision cluster, you can run this code out-of-the-box, and it will start to train AlexNet on the Places2 database.