PyTorch implementation of NovoGrad

Install

```
pip install novograd
```

Notice

When using NovoGrad, the learning rate scheduler plays an important role. Do not forget to set one.
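
For example, a minimal sketch, assuming the package exposes a `NovoGrad` class with an Adam-style constructor (the import path and signature are assumptions, not confirmed by this README):

```python
import torch
from novograd import NovoGrad  # assumed import path and class name

model = torch.nn.Linear(784, 10)
optimizer = NovoGrad(model.parameters(), lr=0.01, betas=(0.95, 0.98), weight_decay=0.001)
# Pair the optimizer with a decaying schedule, e.g. cosine annealing over the training run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=3)
```

Remember to call `scheduler.step()` once per epoch so the learning rate actually decays.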

Performance

MNIST

All optimizers were trained for 3 epochs with the same network architecture.

| Optimizer | Test Acc (%) | lr | lr scheduler | beta1 | beta2 | weight decay |
|---|---|---|---|---|---|---|
| Momentum SGD | 96.92 | 0.01 | None | 0.9 | N/A | 0.001 |
| Adam | 96.72 | 0.001 | None | 0.9 | 0.999 | 0.001 |
| AdamW | 97.34 | 0.001 | None | 0.9 | 0.999 | 0.001 |
| NovoGrad | 97.55 | 0.01 | cosine | 0.95 | 0.98 | 0.001 |
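
The NovoGrad row can be reproduced along these lines; a rough sketch, where the network, batch size, and the `NovoGrad` constructor signature are assumptions, and only the hyperparameters come from the table above:

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from novograd import NovoGrad  # assumed import path and class name

# Simple MNIST classifier; the actual architecture used above is not specified.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

train_set = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# Hyperparameters from the NovoGrad row: lr=0.01, betas=(0.95, 0.98), weight decay 0.001.
optimizer = NovoGrad(model.parameters(), lr=0.01, betas=(0.95, 0.98), weight_decay=0.001)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=3)
criterion = nn.CrossEntropyLoss()

for epoch in range(3):  # trained for 3 epochs, as above
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch
```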

References

Boris Ginsburg, Patrice Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Huyen Nguyen, Jonathan M. Cohen, "Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks," arXiv:1905.11286 [cs.LG], https://arxiv.org/pdf/1905.11286.pdf