State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks

This repository contains the code for reproducing the results of STDS (State Transition of Dendritic Spines) from the paper. It is modified from the open-source code of SEW ResNet.

Directory Tree

.
├── CIFAR10
│   ├── model.py
│   ├── optim.py
│   ├── train.py
│   └── logs
└── ImageNet
    ├── optim.py
    ├── sew_resnet.py
    ├── train.py
    ├── utils.py
    └── logs
        ├── linear
        └── sine

Dependency

The major dependencies of this code are listed below:

# Name                    Version
cudatoolkit               10.2.89
cudnn                     8.2.1.32
cupy                      9.6.0
numpy                     1.21.4
python                    3.7.11 
pytorch                   1.9.1
spikingjelly              <Specific Version>
tensorboard               2.7.0
torchvision               0.10.1

Note: the required version of spikingjelly is specified in the Usage section.

Environment

Running this code requires an NVIDIA GPU; it has been tested with CUDA 10.2 on Ubuntu 16.04.

Each trial on ImageNet requires 8 GPUs. For CIFAR-10, each trial requires only a single GPU.

Usage

This code requires a specific version of the open-source SNN framework SpikingJelly. To install it, first clone the repository from GitHub:

$ git clone https://github.com/fangwei123456/spikingjelly.git

Then check out the commit used in our experiments and install:

$ cd spikingjelly
$ git checkout d8cc6a5
$ python setup.py install

With the dependencies above installed, you should be able to run the following commands:

ImageNet

Dense training:

$ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --cos_lr_T 320 --model sew_resnet18 -b 32 --output-dir <log dir> --tb --print-freq 4096 --amp --connect_f ADD --T 4 --lr 0.1 --epoch 320 --data-path <dataset path> --sparse-function identity

Our proposed algorithm:

$ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --cos_lr_T 320 --model sew_resnet18 -b 32 --output-dir <log dir> --tb --print-freq 4096 --amp --connect_f ADD --T 4 --lr 0.1 --epoch 320 --data-path <dataset path> --sparse-function stmod --flat-width <D> --gradual <scheduler type>

Grad R:

$ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --cos_lr_T 320 --model sew_resnet18 -b 32 --output-dir <log dir> --tb --print-freq 4096 --amp --connect_f ADD --T 4 --lr 0.1 --epoch 320 --alpha-gr <alpha in Grad R> --data-path <dataset path> --sparse-function stmod --flat-width <mu in Grad R>

The TensorBoard logs and checkpoints will be placed in two separate directories in ./logs.

Running Arguments

| Argument | Description | Value | Type |
| --- | --- | --- | --- |
| `--cos_lr_T` | Total steps of cosine annealing learning-rate scheduler | 320 | int |
| `-b`, `--batch-size` | Training batch size | 32 | int |
| `--alpha-gr` | Hyperparameter $\alpha$ in Grad R | None | float |
| `--data-path` | Path of datasets | | str |
| `--output-dir` | Path for dumping models and logs | | str |
| `--print-freq` | Frequency of status printing during training | 4096 | int |
| `--amp` | Whether to use mixed-precision training | | bool |
| `--connect_f` | Connection function of SEW ResNet | ADD | str |
| `-T` | Simulation time-steps of SNNs | 4 | int |
| `--lr` | Learning rate | 0.1 | float |
| `--epoch` | Number of training epochs | 320 | int |
| `--sparse-function` | Reparameterization function | 'stmod' for pruning, 'identity' for training a dense model | str |
| `--flat-width` | Hyperparameter $D$ in our work and $\mu$ in Grad R | | float |
| `--gradual` | Scheduler type | 'sine', 'linear' | str |
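For intuition, the `--gradual` option grows the flat width over training according to the chosen schedule. The exact formulas live in `optim.py`; the sketch below shows plausible shapes for the 'linear' and 'sine' options and is an illustration, not the repository's actual implementation.

```python
import math

def flat_width_schedule(step, total_steps, target_d, mode="linear"):
    """Illustrative schedule growing the flat width from 0 to target_d.

    Hypothetical stand-in for the --gradual schedules; the real
    definitions are in optim.py of this repo.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    if mode == "linear":
        # Grows proportionally with training progress.
        return target_d * progress
    if mode == "sine":
        # Grows quickly at first, then saturates smoothly at target_d.
        return target_d * math.sin(0.5 * math.pi * progress)
    raise ValueError(f"unknown scheduler type: {mode}")
```

Both schedules start at 0 (no pruning pressure) and reach the target flat width at the end of training; the 'sine' variant front-loads the growth.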

CIFAR-10

Dense training:

$ python train.py --dataset-dir <dataset path> --dump-dir . --sparse-function identity --amp

Our proposed algorithm:

$ python train.py --dataset-dir <dataset path> --dump-dir . --sparse-function stmod --gradual <scheduler type> --flat-width <D> --amp

Running Arguments

| Argument | Description | Value | Type |
| --- | --- | --- | --- |
| `-b`, `--batch-size` | Training batch size | 16 | int |
| `--lr` | Learning rate | 1e-4 | float |
| `--dataset-dir` | Path of datasets | | str |
| `--dump-dir` | Path for dumping models and logs | | str |
| `-T` | Simulation time-steps of SNNs | 8 | int |
| `-N`, `--epoch` | Number of training epochs | 2048 | int |
| `-test` | Whether to run testing only | | bool |
| `--amp` | Whether to use mixed-precision training | | bool |
| `--sparse-function` | Reparameterization function | 'stmod' for pruning, 'identity' for training a dense model | str |
| `--flat-width` | Hyperparameter $D$ in our work and $\mu$ in Grad R | | float |
| `--gradual` | Scheduler type | 'sine', 'linear' | str |
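The `--sparse-function stmod` option reparameterizes each weight through a function with a flat region of width $D$ around zero, so that small latent weights map to exactly zero (pruned) connections. The soft-threshold form below is an assumption for illustration; the actual 'stmod' function is defined in this repository's `optim.py`/`model.py`.

```python
import numpy as np

def soft_threshold(theta, d):
    """Map latent weights theta to effective weights with a flat region
    of width d around zero: w = sign(theta) * max(|theta| - d, 0).

    Hypothetical sketch of a flat-width reparameterization; not the
    repo's exact 'stmod' definition.
    """
    return np.sign(theta) * np.maximum(np.abs(theta) - d, 0.0)

theta = np.array([-0.3, -0.05, 0.0, 0.02, 0.4])
w = soft_threshold(theta, d=0.1)
# Entries with |theta| <= d become exactly zero, i.e. pruned connections.
```

Gradually increasing $D$ during training (via `--gradual`) widens the flat region, pushing more weights to exact zero as training progresses.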

Citation

Please cite the following if this work is useful for your research:

@InProceedings{pmlr-v162-chen22ac,
  title     = {State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks},
  author    = {Chen, Yanqi and Yu, Zhaofei and Fang, Wei and Ma, Zhengyu and Huang, Tiejun and Tian, Yonghong},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {3701--3715},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/chen22ac/chen22ac.pdf},
  url       = {https://proceedings.mlr.press/v162/chen22ac.html}
}