Self-Tuning Networks

This repository contains the code used for the paper Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions (ICLR 2019).
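
At a high level, a Self-Tuning Network replaces the model's weights with a structured best-response function of the hyperparameters, then alternates between fitting that function on the training loss and updating the hyperparameters on the validation loss. The following is a minimal, self-contained sketch of this alternating scheme for a toy L2-regularized linear model; all names (h, W0, V, the synthetic data) are illustrative and do not correspond to this repository's API:

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Synthetic stand-ins for the training and validation splits.
x_tr, y_tr = torch.randn(100, 10), torch.randn(100, 1)
x_va, y_va = torch.randn(50, 10), torch.randn(50, 1)

# Unconstrained hyperparameter; the effective L2 coefficient is softplus(h).
h = torch.nn.Parameter(torch.tensor(0.0))

# Structured best-response: the weights are an affine function of the
# hyperparameter, W(h) = W0 + h * V, so the validation loss is
# differentiable with respect to h through the predicted weights.
W0 = torch.nn.Parameter(0.1 * torch.randn(10, 1))
V = torch.nn.Parameter(torch.zeros(10, 1))

opt_inner = torch.optim.SGD([W0, V], lr=1e-2)  # best-response parameters
opt_outer = torch.optim.SGD([h], lr=1e-2)      # hyperparameter

for step in range(1000):
    # Inner step: perturb the hyperparameter so the affine response is fit
    # over a neighborhood, then minimize the regularized training loss.
    h_pert = (h + 0.5 * torch.randn(())).detach()
    W = W0 + h_pert * V
    loss_tr = F.mse_loss(x_tr @ W, y_tr) + F.softplus(h_pert) * W.pow(2).sum()
    opt_inner.zero_grad()
    loss_tr.backward()
    opt_inner.step()

    # Outer step: update the hyperparameter on the unregularized validation
    # loss, backpropagating through W(h).
    loss_va = F.mse_loss(x_va @ (W0 + h * V), y_va)
    opt_outer.zero_grad()
    loss_va.backward()
    opt_outer.step()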

Requirements

The code requires Python 3.6 and PyTorch 0.4.1 (with torchvision); the remaining Python dependencies are listed in requirements.txt.

Setup

The following is an example of how to create an environment with the appropriate versions of the dependencies:

conda create -n stn-env python=3.6
source activate stn-env
conda install pytorch=0.4.1 cuda80 -c pytorch
conda install torchvision -c pytorch
pip install -r requirements.txt
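
To verify that the environment picked up the expected PyTorch build, you can run:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"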

Experiments

CNN Experiments

The CNN code in this repository is built on the Cutout codebase. These commands should be run from inside the cnn folder.

To train a Self-Tuning CNN:

python hypertrain.py --tune_all --tune_scales --entropy_weight=1e-3 --save
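
Here --tune_all and --tune_scales correspond to tuning all supported hyperparameters and the scales of the perturbation distributions placed over them, and --entropy_weight sets the weight on the entropy term the paper uses to keep those scales from collapsing to zero. As a rough illustration (assuming Gaussian perturbations; this is not the repository's exact code), the entropy term reduces to a sum of log-scales:

import torch

# Hypothetical log-scales of Gaussian perturbations over some hyperparameters.
log_sigma = torch.nn.Parameter(torch.zeros(5))

# The entropy of a diagonal Gaussian is sum(log sigma) plus a constant, so
# rewarding entropy with a small weight keeps the perturbations from vanishing.
entropy_weight = 1e-3
entropy_bonus = entropy_weight * log_sigma.sum()
# objective = validation_loss - entropy_bonus  (validation_loss defined elsewhere)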

To train a baseline CNN:

python train_basic.py

LSTM Experiments

The LSTM code in this repository is built on the AWD-LSTM codebase. The commands for the LSTM experiments should be run from inside the lstm folder.

First, download the PTB dataset:

./getdata.sh
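
Assuming getdata.sh follows the AWD-LSTM convention and places the corpus under data/penn/, you can sanity-check the download with:

wc -l data/penn/train.txt data/penn/valid.txt data/penn/test.txt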

Schedule Experiments

The commands in this section can be used to reproduce the results in Table 1 of the paper.

Schedules with Different Hyperparameter Initializations

Tuning Multiple LSTM Hyperparameters

Run the following command to tune the input/hidden/output/embedding dropout, weight DropConnect, and the coefficients of activation regularization (alpha) and temporal activation regularization (beta):

python train.py --seed=3 --tune_all --save_dir=st-lstm
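
The resulting hyperparameter schedules can then be inspected (the repository includes save_dropouto_schedule_plot.py, which appears to plot the output-dropout schedule). As a generic sketch, assuming you have logged one "epoch,value" pair per line to a CSV file of your own:

import csv
import matplotlib.pyplot as plt

# Hypothetical log: one "epoch,value" row per epoch, written during training.
epochs, values = [], []
with open("dropouto_schedule.csv") as f:
    for epoch, value in csv.reader(f):
        epochs.append(int(epoch))
        values.append(float(value))

plt.plot(epochs, values)
plt.xlabel("Epoch")
plt.ylabel("Output dropout rate")
plt.savefig("dropouto_schedule.png")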

Project Structure

.
├── README.md
├── cnn
│   ├── datasets
│   │   ├── __init__.py
│   │   ├── cifar.py
│   │   └── loaders.py
│   ├── hypermodels
│   │   ├── __init__.py
│   │   ├── alexnet.py
│   │   ├── hyperconv2d.py
│   │   ├── hyperlinear.py
│   │   └── small.py
│   ├── hypertrain.py
│   ├── logger.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── alexnet.py
│   │   └── small.py
│   ├── train_basic.py
│   └── util
│       ├── __init__.py
│       ├── cutout.py
│       ├── dropout.py
│       └── hyperparameter.py
├── lstm
│   ├── data.py
│   ├── embed_regularize.py
│   ├── getdata.sh
│   ├── hyperlstm.py
│   ├── locked_dropout.py
│   ├── logger.py
│   ├── model_basic.py
│   ├── save_dropouto_schedule_plot.py
│   ├── train.py
│   ├── train_basic.py
│   ├── utils.py
│   └── weight_drop.py
├── requirements.txt
└── stn_utils
    ├── __init__.py
    └── hyperparameter.py

7 directories, 34 files

Citation

If you use this code, please cite:

@inproceedings{STN2019,
  title={Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions},
  author={Matthew MacKay and Paul Vicol and Jonathan Lorraine and David Duvenaud and Roger Grosse},
  booktitle={{International Conference on Learning Representations (ICLR)}},
  year={2019}
}