# How many degrees of freedom do we need to train deep networks?
This repository contains source code for the ICLR 2022 paper *How many degrees of freedom do we need to train deep networks: a loss landscape perspective* by Brett W. Larsen, Stanislav Fort, Nic Becker, and Surya Ganguli (arXiv version).
This code was developed and tested using JAX v0.1.74, JAXlib v0.1.52, and Flax v0.2.0. We intend to update the repository in the future with additional versions of the scripts that work with the `flax.linen` module.
## Top-Level Scripts
- `burn_in_subspace.py`: Script for the random affine subspace and burn-in affine subspace experiments. To run random affine subspaces, set the parameter `init_iters` to 0. A minimal sketch of the underlying subspace parameterization follows this list.
- `lottery_subspace.py`: Script for the lottery subspace experiments.
- `lottery_ticket.py`: Script for the lottery ticket experiments.
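All three subspace scripts share one idea: the full weight vector lives in R^D, but training is restricted to d degrees of freedom z through theta = theta_0 + P z, where theta_0 is the initialization and P is a D x d basis. The sketch below illustrates that parameterization on a toy linear classifier. It is illustrative only, not the repository's implementation; the model, data, and names such as `unflatten` are assumptions.

```python
# Minimal sketch of training in a random affine subspace with JAX.
# Hypothetical example, NOT the repository's code: model, loss, and
# helper names here are illustrative only.
import jax
import jax.numpy as jnp

D = 64 * 10 + 10        # full parameter count of a toy linear model (64 -> 10)
d = 20                  # subspace dimension (degrees of freedom)

key = jax.random.PRNGKey(0)
k_theta, k_P, k_x, k_y = jax.random.split(key, 4)

theta0 = 0.1 * jax.random.normal(k_theta, (D,))       # random init in R^D
P, _ = jnp.linalg.qr(jax.random.normal(k_P, (D, d)))  # orthonormal D x d basis

x = jax.random.normal(k_x, (128, 64))                 # toy inputs
y = jax.random.randint(k_y, (128,), 0, 10)            # toy labels

def unflatten(theta):
    # Split the flat weight vector into the toy model's weight and bias.
    W = theta[: 64 * 10].reshape(64, 10)
    b = theta[64 * 10 :]
    return W, b

def loss(z):
    # The trainable variable lives in R^d; full weights are theta0 + P @ z.
    W, b = unflatten(theta0 + P @ z)
    logits = x @ W + b
    logp = jax.nn.log_softmax(logits)
    return -jnp.mean(logp[jnp.arange(128), y])

grad_fn = jax.jit(jax.grad(loss))
z = jnp.zeros(d)                     # z = 0 starts exactly at theta0
for _ in range(200):
    z = z - 0.5 * grad_fn(z)         # plain gradient descent on z only
```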
## Sub-Functions
- `architectures.py`: Model files.
- `data_utils.py`: Functions for saving out data.
- `generate_data.py`: Functions to set up datasets for training.
- `logging_tools.py`: Setup for the logger; generates an automatic experiment name with a timestamp.
- `training_utils.py`: Functions related to projecting to and training in a subspace; see the chain-rule sketch after this list.
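For intuition on the projection step, note that the gradient with respect to the subspace coordinates is just the full-space gradient pulled back through the basis. A hedged sketch of that chain rule follows, reusing `theta0` and `P` from the sketch above; `grad_in_subspace` is a hypothetical name, not the actual API of `training_utils.py`.

```python
import jax

def grad_in_subspace(full_loss, theta0, P, z):
    """Gradient of full_loss(theta0 + P @ z) with respect to z.

    By the chain rule this equals P.T @ grad_theta: the full-space
    gradient projected onto the subspace spanned by the columns of P.
    """
    g_theta = jax.grad(full_loss)(theta0 + P @ z)  # gradient in R^D
    return P.T @ g_theta                           # pull back to R^d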
## Citation
```bibtex
@inproceedings{LaFoBeGa22,
  title={How many degrees of freedom do we need to train deep networks: a loss landscape perspective},
  author={Brett W. Larsen and Stanislav Fort and Nic Becker and Surya Ganguli},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=ChMLTGRjFcU}
}
```