WoodFisher

This repository accompanies the NeurIPS 2020 paper on WoodFisher: Efficient Second-Order Approximation for Neural Network Compression (Singh & Alistarh, 2020).

<img src="https://github.com/IST-DASLab/WoodFisher/blob/main/woodfisher_camera_ready_uncompressed.png" width="500">

The code is based on PyTorch and has been tested on version 1.2.0+cu92.

Full-WoodFisher pruners

Blockwise-WoodFisher pruners

Additional Blockwise-WoodFisher-based pruners

Baseline pruners

Structure of the repo

Args and config.yaml description

We use a yaml config file to specify all things related to training and pruning: choice of pruners and optimizers, scheduling, and so on. Writing this file is best explained with an example, which you can find in configs/example_config.yaml. As shown, the file should be split into two top-level sections: pruners and trainers (in any order). Second-level entries for both pruners and trainers list names of instances that are used to refer to them from the runner scripts. These names can be arbitrarily chosen by the user; however, if you are using only one model for training and pruning (which is most often the case), it is easiest to name your single trainer default_trainer to avoid having to specify a custom name for the runner script. For example, example_config.yaml has two pruners, pruner_1 and pruner_2, and a default_trainer.
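
For orientation, the overall layout looks roughly like the following sketch (the instance names mirror the example config, but the comments only indicate where the actual settings go; refer to configs/example_config.yaml for the real keys):

pruners:
  pruner_1:          # settings of the first pruner instance go here
  pruner_2:          # settings of the second pruner instance go here
trainers:
  default_trainer:   # training settings for the single trainer go here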

For convenience, you can also update the config file on the fly via command-line arguments. To do so, pass --update-config together with whichever arguments you want to override, e.g., --prune-wdecay 0 to set the weight decay used during pruning to zero.
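
For instance, to launch a run with the pruning-time weight decay overridden to zero (the runner invocation below is only a placeholder for however you normally start a run):

$ python <runner script> <your usual args> --update-config --prune-wdecay 0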

Pruning Schedule

For the results in the paper, we fix a budget of 100 epochs for the entire gradual pruning process (i.e., including finetuning). However, you might want to adjust this schedule depending on your needs, either compressing or lengthening it. This can be achieved via the flags --prune-start, --prune-freq, --prune-end, and --epochs.
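
As an illustration, a 100-epoch budget could be laid out roughly as follows; the values and the runner invocation are purely illustrative, and we assume --prune-start/--prune-end mark the epochs at which gradual pruning begins and ends, with --prune-freq giving the gap between pruning steps (see options.py for the exact meanings):

$ python <runner script> <your usual args> --prune-start 20 --prune-freq 4 --prune-end 70 --epochs 100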

More details on these flags can be found in the argument descriptions contained in the options.py file.

Other useful flags

--recompute-num 2 --recompute-schedule poly --recompute-degree 2

Memory-management

If you are pruning moderate- to large-sized networks, there are some additional things you might want to take care of in order to run the WoodFisher-based pruning variants. The flags below will likely be useful in this scenario (an example that combines them follows the flag descriptions):

--fittable-params: This is one of the key flags; it controls the block/chunk size considered along the diagonal of the Hessian. If this chunk size is c, the memory consumed is of the order O(cd), where d denotes the total number of parameters in the model. Hence, choose c based on the model size and the amount of memory available. E.g., for ResNet50, which has ~25M params, a chunk size of `1000-2000` should be good, while for `MobileNetV1`, which has `~4.5M` params, a decent chunk size would be between `5000-20000` (for a sense of scale, with d ≈ 25M and c = 2000 the estimate already has on the order of 5 x 10^10 entries).

--offload-inv: This offloads the Hessian inverse, computed as an intermediate step of pruning, onto the CPU.

--offload-grads: This offloads the loss gradients, computed over the various samples, onto the CPU.

--fisher-cpu: Shifts some parts of the computation itself (not just the storage) onto the CPU.

--fisher-split-grads: Trades off speed against memory while doing the Woodbury updates on the GPU; it is not needed if you are using the CPU anyway. The flag expects an integer argument via an additional flag --fisher-parts, and reduces the memory requirement from `2 * Hessian size` to `(1 + 1/k) * Hessian size`, where `k` stands for the value of the --fisher-parts flag (default value is 5).
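
Putting these together, a run on a ResNet50-scale model might add flags along the following lines. The runner invocation is a placeholder, the chunk size is only illustrative, and we assume the offload flags are simple on/off switches; check options.py for the exact semantics:

$ python <runner script> <your usual args> --fittable-params 2000 --offload-inv --offload-grads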

Setup

First, clone the repository by running the following command:

$ git clone https://github.com/IST-DASLab/WoodFisher

After that, do not forget to run source setup.sh.

Tensorboard Support

First of all, ensure that your torch package is at version 1.1.0 or above. Then install the nightly release of tensorboard:

$ pip install tb-nightly

After that, ensure that the future package is installed, or install it by running the following command in the terminal:

$ pip install future

Extensions and Pull Requests

We welcome community contributions that further enrich the current codebase, whether by improving efficiency, adding support for additional network types, or improving the aesthetics of the code. Feel free to send a pull request in such a scenario, ideally alongside the csv file generated by running the tests contained in the tests/ folder. An example results.csv is also provided, against which you can match your results to check that everything is still alright (expect a rough match rather than exact agreement, as the exact numbers depend on the PyTorch version, the inherent randomness across platforms, etc.).

Some examples of pull requests are labelled in the code as PR.

Acknowledgements

We thank Alex Shevchenko and Ksenia Korovina for providing an initial framework that we could repurpose for the implementation of our work.

Reference

This codebase corresponds to the paper WoodFisher: Efficient Second-Order Approximation for Neural Network Compression. If you use any of the code or the provided models for your research, please consider citing the paper as follows:

@inproceedings{NEURIPS2020_d1ff1ec8,
 author = {Singh, Sidak Pal and Alistarh, Dan},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
 pages = {18098--18109},
 publisher = {Curran Associates, Inc.},
 title = {WoodFisher: Efficient Second-Order Approximation for Neural Network Compression},
 url = {https://proceedings.neurips.cc/paper/2020/file/d1ff1ec86b62cd5f3903ff19c3a326b2-Paper.pdf},
 volume = {33},
 year = {2020}
}