Deep Hyperspherical Learning

By Weiyang Liu, Yan-Ming Zhang, Xingguo Li, Zhiding Yu, Bo Dai, Tuo Zhao, Le Song

License

SphereNet is released under the MIT License (refer to the LICENSE file for details).

Updates

Contents

  1. Introduction
  2. Citation
  3. Requirements
  4. Usage
  5. Results
  6. Notes
  7. Third-party re-implementation
  8. Contact

Introduction

This repository contains an example TensorFlow implementation of SphereNets, which were introduced in the NIPS 2017 paper "Deep Hyperspherical Learning" (arXiv). SphereNets converge faster and more stably than their CNN counterparts while yielding comparable or even better classification accuracy.

Hyperspherical learning is inspired by an interesting observation about the 2D Fourier transform. From the image below, we can see that magnitude information is not crucial for recognizing identity, whereas phase information is very important. By dropping the magnitude information, SphereNets reduce the learning space and therefore converge faster. Hyperspherical learning provides a new framework for improving convolutional neural networks.

<img src="asserts/2dfourier.png" width="52%" height="52%">
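The magnitude/phase observation can be reproduced in a few lines of NumPy. The sketch below is only an illustration and is not part of this repository: the function names are hypothetical, and two random arrays stand in for real images. It builds a hybrid signal from the Fourier magnitude of one input and the phase of another, so you can compare the result against each source.

```python
import numpy as np


def swap_magnitude_phase(magnitude_src, phase_src):
    """Combine the Fourier magnitude of one image with the phase of another."""
    spec_mag = np.fft.fft2(magnitude_src)
    spec_phase = np.fft.fft2(phase_src)
    # Keep |F(magnitude_src)| and the unit-modulus phase factor of F(phase_src).
    hybrid_spec = np.abs(spec_mag) * np.exp(1j * np.angle(spec_phase))
    # For real inputs the hybrid spectrum is Hermitian, so the imaginary
    # part of the inverse transform is only numerical noise.
    return np.real(np.fft.ifft2(hybrid_spec))


def correlation(a, b):
    """Pearson correlation between two arrays of the same shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


# Stand-in data (assumption): replace with two grayscale images of equal size.
rng = np.random.RandomState(0)
image_a = rng.rand(8, 8)
image_b = rng.rand(8, 8)

hybrid = swap_magnitude_phase(image_a, image_b)
print("correlation with magnitude source:", correlation(hybrid, image_a))
print("correlation with phase source:", correlation(hybrid, image_b))
```

With real photographs in place of the random arrays, the hybrid can be displayed to check visually which source it resembles.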

The features learned by SphereNets are also interesting. The 2D features that SphereNets learn on MNIST are more compact and have a larger margin between classes. From the image below, we can see that the local behavior of convolutions can lead to dramatic differences in the final features, even when they are supervised by the same standard softmax loss. Hyperspherical learning provides a new perspective on convolutions and deep feature learning.

<img src="asserts/MNIST_featvis.jpg" width="51%" height="51%">

In addition, hyperspherical learning leads to a well-performing normalization technique, SphereNorm. In our implementation, SphereNorm can essentially be viewed as the SphereConv operator.
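To make the idea of an angle-only operator concrete, here is a minimal NumPy sketch of a SphereConv-style response for a single flattened kernel and patch. It is not the code in spherenet.py: the function name and the `kind` strings are hypothetical, and the linear form (1 - 2θ/π) is assumed from the paper's description. The response depends only on the angle θ between the two vectors, so rescaling either one leaves it unchanged.

```python
import numpy as np


def sphere_response(w, x, kind="cosine", eps=1e-12):
    """Angular response g(theta) between a flattened kernel w and a patch x."""
    # Cosine of the angle: inner product divided by the product of norms.
    denom = max(np.linalg.norm(w) * np.linalg.norm(x), eps)
    cos_theta = np.clip(np.dot(w, x) / denom, -1.0, 1.0)
    if kind == "cosine":
        return float(cos_theta)
    if kind == "linear":
        # Assumed linear form: maps theta in [0, pi] to [1, -1].
        theta = np.arccos(cos_theta)
        return float(1.0 - 2.0 * theta / np.pi)
    raise ValueError("unknown kind: %r" % (kind,))


w = np.array([1.0, 2.0, -1.0])
x = np.array([0.5, -1.0, 3.0])

# Rescaling the kernel or the patch does not change the response.
print(sphere_response(w, x), sphere_response(10.0 * w, 0.1 * x))
```

A full layer would apply this response at every sliding-window position and for every kernel; the single-vector form is used here only to keep the sketch short.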

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{liu2017deep,
    title={Deep Hyperspherical Learning},
    author={Liu, Weiyang and Zhang, Yan-Ming and Li, Xingguo and Yu, Zhiding and Dai, Bo and Zhao, Tuo and Song, Le},
    booktitle={Advances in Neural Information Processing Systems},
    pages={3953--3963},
    year={2017}
}

Requirements

  1. Python 2.7
  2. TensorFlow (Tested on version 1.01)
  3. numpy

Usage

Part 1: Setup

Part 2: Train Baseline/SphereNets

Part 3: Train Baseline/SphereResNets

Configuration

The default setting of SphereNet is Cosine SphereConv + Standard Softmax Loss. To change the type of SphereConv, open spherenet.py and change the norm variable.

The w_norm variable can be changed in the same way to use the weight-normalized softmax loss (combined with different types of SphereConv). Setting w_norm to none selects the standard softmax loss.

Examples of setting these two variables are provided in the examples/ folder.
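The difference between the standard and the weight-normalized softmax loss can be sketched in NumPy as follows. This is an illustration under an assumption, not the repository's implementation: it takes weight normalization to mean rescaling each class weight vector to unit L2 norm before the final inner product, and the function and argument names are hypothetical. The options actually accepted by w_norm in spherenet.py may differ.

```python
import numpy as np


def softmax_cross_entropy(logits, label):
    """Cross-entropy of a single example, computed with a stable log-softmax."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[label])


def classifier_loss(feature, weights, label, normalize_weights=True, eps=1e-12):
    """Softmax loss on top of a linear classifier with one weight column per class."""
    if normalize_weights:
        # Assumption: each class weight vector is rescaled to unit L2 norm.
        col_norms = np.linalg.norm(weights, axis=0, keepdims=True)
        weights = weights / np.maximum(col_norms, eps)
    logits = np.dot(feature, weights)
    return softmax_cross_entropy(logits, label)


feature = np.array([1.0, -2.0, 0.5])
weights = np.array([[0.2, -1.0], [0.4, 0.3], [-0.7, 2.0]])
rescaled = weights * np.array([10.0, 0.1])  # change only the column norms

# With normalization the loss ignores the norm of each class weight vector.
print(classifier_loss(feature, weights, 0), classifier_loss(feature, rescaled, 0))
# Without normalization the two losses differ.
print(classifier_loss(feature, weights, 0, normalize_weights=False),
      classifier_loss(feature, rescaled, 0, normalize_weights=False))
```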

Results

Part 1: Convergence

The convergence curves for the baseline CNN and several types of SphereNets are shown below.

<img src="asserts/convergence.jpg" width="52%" height="52%">

Part 2: Best testing accuracy on CIFAR-10 (SphereNet-9)

Part 3: Best testing accuracy on CIFAR-10+ (SphereResNet-32)

Part 4: Training log (SphereNet-9)

Part 5: Training log (SphereResNet-32)

Notes

Third-party re-implementation

Contact