Almost-Orthogonal Lipschitz (AOL) Layers

Code for the paper Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks. It includes the proposed layers, AOL-Dense and AOL-Conv2D, as well as code for the models, our proposed loss function, and metrics.
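The core idea behind the AOL layers is a rescaling of the weight matrix that makes the resulting linear map 1-Lipschitz. As a minimal sketch (based on the paper's formulation, not the repository's actual implementation), a dense weight matrix $W$ is multiplied by $D^{-1/2}$, where $D$ is diagonal with $D_{ii} = \sum_j |W^T W|_{ij}$:

```python
import numpy as np

def aol_rescale(weights: np.ndarray) -> np.ndarray:
    """Rescale a dense weight matrix so the resulting layer is 1-Lipschitz.

    Sketch of the AOL bound: with D_ii = sum_j |W^T W|_ij, the rescaled
    matrix W @ diag(D)^(-1/2) has spectral norm at most 1.
    """
    wtw = np.abs(weights.T @ weights)   # entrywise absolute value of W^T W
    d = wtw.sum(axis=1)                 # row sums give the diagonal of D
    d = np.where(d > 0, d, 1.0)         # guard against all-zero columns
    return weights / np.sqrt(d)         # broadcasting computes W @ diag(d)^(-1/2)

# The spectral norm of the rescaled matrix is bounded by 1:
w = np.random.default_rng(0).normal(size=(64, 32))
p = aol_rescale(w)
assert np.linalg.norm(p, ord=2) <= 1.0 + 1e-9
```

In training, this rescaling is applied to the parameter matrix in every forward pass, so gradient updates act on the unconstrained parameters while the effective layer always stays 1-Lipschitz.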

Requirements:

How to train an AOL-Network

Specify the setting using the Run-ID. For example, in order to train AOL-Small on CIFAR10, use

python train_aol_network.py 0

Run-IDs correspond to the following settings:

| Run ID | Model | Dataset | Loss Offset |
|--------|------------|----------|----------------|
| 0 | AOL-Small | CIFAR10 | $\sqrt{2}$ |
| 1 | AOL-Medium | CIFAR10 | $\sqrt{2}$ |
| 2 | AOL-Large | CIFAR10 | $\sqrt{2}$ |
| 10 | AOL-FC | CIFAR10 | $\sqrt{2}$ |
| 11 | AOL-CONV | CIFAR10 | $\sqrt{2}$ |
| 12 | AOL-STD | CIFAR10 | $\sqrt{2}$ |
| 20 | AOL-Small | CIFAR10 | $\sqrt{2}/16$ |
| 21 | AOL-Small | CIFAR10 | $\sqrt{2}/4$ |
| 22 | AOL-Small | CIFAR10 | $\sqrt{2}$ |
| 23 | AOL-Small | CIFAR10 | $4\sqrt{2}$ |
| 24 | AOL-Small | CIFAR10 | $16\sqrt{2}$ |
| 30 | AOL-Small | CIFAR100 | $\sqrt{2}$ |
| 31 | AOL-Medium | CIFAR100 | $\sqrt{2}$ |
| 32 | AOL-Large | CIFAR100 | $\sqrt{2}$ |

Results

The results for CIFAR10 and CIFAR100 can be found in the tables below. We report the standard accuracy (Std Acc) as well as the Certified Robust Accuracy (CRA) for different amounts of input perturbation.

| CIFAR 10 | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
|------------|---------|------------|------------|-------------|-------|
| AOL-Small | 70.3% | 62.9% | 55.0% | 47.7% | 22.3% |
| AOL-Medium | 71.3% | 64.0% | 56.1% | 49.1% | 23.3% |
| AOL-Large | 71.4% | 64.1% | 56.5% | 49.4% | 23.8% |

Results on CIFAR100:

| CIFAR 100 | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
|------------|---------|------------|------------|-------------|-------|
| AOL-Small | 42.4% | 32.5% | 24.8% | 19.2% | 6.7% |
| AOL-Medium | 43.2% | 33.7% | 26.0% | 20.2% | 7.2% |
| AOL-Large | 43.7% | 33.7% | 26.3% | 20.7% | 7.8% |
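For a 1-Lipschitz network, a prediction is certified robust at radius $\epsilon$ if the top score exceeds every other score by more than $\sqrt{2}\,\epsilon$ (a perturbation of $\ell_2$-norm $\epsilon$ can change the difference of two scores by at most $\sqrt{2}\,\epsilon$). A minimal sketch of how CRA can be computed from model scores under this standard margin criterion (function name and signature are illustrative, not the repository's API):

```python
import numpy as np

def certified_robust_accuracy(scores, labels, eps, lipschitz_constant=1.0):
    """Fraction of inputs that are correctly classified AND certified robust.

    For an L-Lipschitz (in the l2 norm) classifier, a perturbation of norm eps
    can change the difference of any two output scores by at most
    sqrt(2) * L * eps, so a prediction is certified if the top score exceeds
    all others by more than that margin.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    preds = scores.argmax(axis=1)
    top = scores.max(axis=1)
    # second-largest score per row
    rest = scores.copy()
    rest[np.arange(len(rest)), preds] = -np.inf
    runner_up = rest.max(axis=1)
    certified = (top - runner_up) > np.sqrt(2.0) * lipschitz_constant * eps
    return float(np.mean((preds == labels) & certified))

# Example: both predictions correct; at eps=1 only the first margin (2.0)
# exceeds sqrt(2), so only half the inputs are certified.
cra = certified_robust_accuracy([[3.0, 1.0], [1.0, 2.0]], [0, 1], eps=1.0)
```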

Ablation Studies

We also report the results for the two ablation studies. We trained the AOL-Small model on CIFAR10 with different offsets for the loss function. (We also rescaled the temperature parameter proportionally.) The results are in the table below.

| Loss Offset | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
|---------------|---------|------------|------------|-------------|-------|
| $\sqrt{2}/16$ | 80.3% | 46.4% | 17.5% | 3.6% | 0.0% |
| $\sqrt{2}/4$ | 77.8% | 63.0% | 47.5% | 32.8% | 2.2% |
| $\sqrt{2}$ | 70.6% | 62.8% | 54.5% | 47.7% | 22.5% |
| $4\sqrt{2}$ | 59.9% | 55.0% | 50.6% | 46.3% | 31.1% |
| $16\sqrt{2}$ | 48.5% | 45.4% | 42.4% | 39.7% | 29.0% |
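The table above shows the expected trade-off: a small offset favors clean accuracy, a large offset favors certified robustness. One common way such an offset enters the loss, sketched here as an assumption about the parameterization (the exact form in this repository may differ), is as a margin subtracted from the true-class score before a temperature-scaled softmax cross-entropy:

```python
import numpy as np

def offset_cross_entropy(scores, labels, offset=np.sqrt(2.0), temperature=0.25):
    """Cross-entropy with a margin (offset) subtracted from the true-class score.

    NOTE: illustrative sketch only; the default temperature and the exact
    offset/temperature interplay are assumptions, not the repository's values.
    """
    scores = np.asarray(scores, dtype=float).copy()
    labels = np.asarray(labels)
    n = len(scores)
    scores[np.arange(n), labels] -= offset        # demand a margin on the true class
    z = scores / temperature                      # temperature scaling
    z -= z.max(axis=1, keepdims=True)             # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(n), labels].mean())
```

With this formulation a larger offset penalizes small margins more strongly, which matches the trend in the table: higher CRA at large perturbations at the cost of standard accuracy.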

We also report the results for different models: a network consisting purely of fully-connected layers (AOL-FC), a relatively standard convolutional architecture where the number of channels is doubled whenever the resolution decreases (AOL-STD), and a convolutional network that multiplies the number of channels by 4 whenever the spatial resolution decreases, in order to keep the number of activations constant for the first few layers (AOL-Conv):

| CIFAR 10 | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
|----------|---------|------------|------------|-------------|-------|
| AOL-FC | 67.1% | 58.5% | 50.3% | 42.4% | 17.6% |
| AOL-STD | 65.4% | 56.9% | 48.3% | 40.5% | 16.2% |
| AOL-Conv | 68.5% | 60.2% | 52.3% | 45.2% | 19.5% |

Citations