# Almost-Orthogonal Lipschitz (AOL) Layers
Code for the paper *Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks*. It includes the layers we propose, AOL-Dense and AOL-Conv2D, as well as the models, our proposed loss function, and the evaluation metrics.
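The core idea behind the AOL layers is a rescaling of the parameter matrix that guarantees the resulting linear map is 1-Lipschitz. Below is a minimal sketch of that rescaling for the dense case; the function and argument names are ours, and the actual implementation in this repository may differ in details such as broadcasting and numerical safeguards.

```python
import tensorflow as tf

def aol_rescale_kernel(kernel, epsilon=1e-6):
    # Sketch of the AOL rescaling for a dense layer. With P denoting the raw
    # parameter matrix, the layer uses W = P @ D, where D is diagonal with
    # D_jj = (sum_k |P^T P|_jk)^(-1/2). Here `kernel` has the Keras shape
    # (input_dim, output_dim), so P = kernel^T and D rescales the input axis.
    ppt = tf.abs(tf.matmul(kernel, kernel, transpose_b=True))  # |P^T P|, shape (in, in)
    row_sums = tf.reduce_sum(ppt, axis=1)                      # sum_k |P^T P|_jk
    scale = tf.math.rsqrt(tf.maximum(row_sums, epsilon))       # diagonal of D
    return kernel * scale[:, tf.newaxis]                       # rescaled kernel
```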
## Requirements
- Python 3.8
- TensorFlow 2.7
## How to train an AOL-Network
Specify the setting using the Run-ID. For example, to train AOL-Small on CIFAR10, use

```bash
python train_aol_network.py 0
```
Run-IDs correspond to the following settings:
Run ID | Model | Dataset | Loss Offset |
---|---|---|---|
0 | AOL-Small | CIFAR10 | $\sqrt{2}$ |
1 | AOL-Medium | CIFAR10 | $\sqrt{2}$ |
2 | AOL-Large | CIFAR10 | $\sqrt{2}$ |
10 | AOL-FC | CIFAR10 | $\sqrt{2}$ |
11 | AOL-CONV | CIFAR10 | $\sqrt{2}$ |
12 | AOL-STD | CIFAR10 | $\sqrt{2}$ |
20 | AOL-Small | CIFAR10 | $\sqrt{2}$/16 |
21 | AOL-Small | CIFAR10 | $\sqrt{2}$/4 |
22 | AOL-Small | CIFAR10 | $\sqrt{2}$ |
23 | AOL-Small | CIFAR10 | 4 $\sqrt{2}$ |
24 | AOL-Small | CIFAR10 | 16 $\sqrt{2}$ |
30 | AOL-Small | CIFAR100 | $\sqrt{2}$ |
31 | AOL-Medium | CIFAR100 | $\sqrt{2}$ |
32 | AOL-Large | CIFAR100 | $\sqrt{2}$ |
## Results
The results for CIFAR10 and CIFAR100 can be found in the tables below. We report the standard accuracy (Std Acc) as well as the certified robust accuracy (CRA) for different amounts of input perturbation. (A sketch of how CRA is evaluated follows the tables.)
CIFAR 10 | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
---|---|---|---|---|---|
AOL-Small | 70.3% | 62.9% | 55.0% | 47.7% | 22.3% |
AOL-Medium | 71.3% | 64.0% | 56.1% | 49.1% | 23.3% |
AOL-Large | 71.4% | 64.1% | 56.5% | 49.4% | 23.8% |
Results on CIFAR100:
CIFAR 100 | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
---|---|---|---|---|---|
AOL-Small | 42.4% | 32.5% | 24.8% | 19.2% | 6.7% |
AOL-Medium | 43.2% | 33.7% | 26.0% | 20.2% | 7.2% |
AOL-Large | 43.7% | 33.7% | 26.3% | 20.7% | 7.8% |
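For reference, certified robust accuracy for a Lipschitz network can be evaluated as follows: a prediction counts as certified at radius $\epsilon$ (in the $l_2$ norm) if it is correct and the gap between the two largest scores exceeds $\sqrt{2} \cdot L \cdot \epsilon$, where $L$ is the Lipschitz constant of the network (1 for AOL networks). The sketch below assumes plain NumPy arrays of logits and labels; the names are ours.

```python
import numpy as np

def certified_robust_accuracy(logits, labels, radius, lipschitz_constant=1.0):
    # A prediction is certified robust at the given (l2) radius if the margin
    # between the top score and the runner-up exceeds sqrt(2) * L * radius.
    top_two = np.sort(logits, axis=1)[:, -2:]
    margins = top_two[:, 1] - top_two[:, 0]
    correct = logits.argmax(axis=1) == labels
    certified = margins > np.sqrt(2.0) * lipschitz_constant * radius
    return float(np.mean(correct & certified))
```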
## Ablation Studies
We also report the results of two ablation studies. First, we trained the AOL-Small model on CIFAR10 with different offsets for the loss function, rescaling the temperature parameter proportionally. The results are in the table below; a sketch of the offset loss follows the table.
Loss Offset | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
---|---|---|---|---|---|
$\sqrt{2}$/16 | 80.3% | 46.4% | 17.5% | 3.6% | 0.0% |
$\sqrt{2}$/4 | 77.8% | 63.0% | 47.5% | 32.8% | 2.2% |
$\sqrt{2}$ | 70.6% | 62.8% | 54.5% | 47.7% | 22.5% |
4 $\sqrt{2}$ | 59.9% | 55.0% | 50.6% | 46.3% | 31.1% |
16 $\sqrt{2}$ | 48.5% | 45.4% | 42.4% | 39.7% | 29.0% |
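A minimal sketch of an offset cross-entropy loss of this kind is given below: the score of the correct class is lowered by the offset before the temperature-scaled softmax, which pushes the network towards score margins large enough for certification. The function name, default values, and exact form are illustrative; the loss used in this repository may differ in details.

```python
import tensorflow as tf

def offset_cross_entropy(labels_onehot, scores, offset=2.0 ** 0.5, temperature=1.0):
    # Subtract the offset from the correct-class score only, then apply a
    # temperature-scaled softmax cross-entropy. Larger offsets demand larger
    # score margins and therefore trade standard accuracy for robustness.
    shifted = scores - offset * labels_onehot
    return tf.nn.softmax_cross_entropy_with_logits(
        labels=labels_onehot, logits=shifted / temperature)
```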
We also report the results for different models: a network consisting purely of fully-connected layers (AOL-FC), a relatively standard convolutional architecture in which the number of channels is multiplied by 2 whenever the resolution decreases (AOL-STD), and a convolutional network that multiplies the number of channels by 4 whenever the spatial resolution decreases, keeping the number of activations constant for the first few layers (AOL-Conv; a sketch of this channel schedule follows the table):
CIFAR 10 | Std Acc | CRA 36/255 | CRA 72/255 | CRA 108/255 | CRA 1 |
---|---|---|---|---|---|
AOL-FC | 67.1% | 58.5% | 50.3% | 42.4% | 17.6% |
AOL-STD | 65.4% | 56.9% | 48.3% | 40.5% | 16.2% |
AOL-Conv | 68.5% | 60.2% | 52.3% | 45.2% | 19.5% |
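To illustrate the difference in channel schedules: halving the spatial resolution divides the number of activations by 4, so multiplying the number of channels by 4 at each downsampling step keeps the activation count constant. The sketch below uses plain Keras layers purely to show the shapes; it is not the AOL architecture itself.

```python
import tensorflow as tf

def channel_schedule_demo(base_channels=16):
    # 32x32x16 -> 16x16x64 -> 8x8x256: each downsampling step halves the
    # resolution and quadruples the channels, so the activation count stays
    # constant (32*32*16 == 16*16*64 == 8*8*256).
    inputs = tf.keras.Input(shape=(32, 32, 3))
    x = tf.keras.layers.Conv2D(base_channels, 3, padding="same")(inputs)
    channels = base_channels
    for _ in range(2):
        channels *= 4
        x = tf.keras.layers.Conv2D(channels, 2, strides=2)(x)  # downsample by 2
    return tf.keras.Model(inputs, x)
```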