# ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate
Official implementation of "ADOPT: Modified Adam Can Converge with Any β<sub>2</sub> with the Optimal Rate", presented at NeurIPS 2024.
## Requirements

ADOPT requires PyTorch 2.4.0 or later.
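If you want to fail fast on older installs, a minimal runtime guard could look like the sketch below; the check itself is illustrative and not part of this repository.

```python
import torch

# Illustrative guard for the requirement above (not part of adopt.py).
major, minor = (int(v) for v in torch.__version__.split(".")[:2])
if (major, minor) < (2, 4):
    raise RuntimeError(f"ADOPT requires PyTorch >= 2.4.0, found {torch.__version__}")
```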
## Usage

You can use ADOPT just like any other PyTorch optimizer by copying `adopt.py` into your project. To switch from Adam to ADOPT, simply replace the optimizer as follows:
```python
from adopt import ADOPT
# optimizer = Adam(model.parameters(), lr=1e-3)
optimizer = ADOPT(model.parameters(), lr=1e-3)
```
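As a minimal end-to-end sketch, ADOPT drops into a standard PyTorch training loop with the same call pattern as Adam. The model, data, and loss below are placeholders purely for illustration, not part of this repository.

```python
import torch
import torch.nn as nn
from adopt import ADOPT

# Placeholder model and synthetic data, for illustration only.
model = nn.Linear(10, 1)
optimizer = ADOPT(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()  # ADOPT update, same interface as Adam
```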
If you are using AdamW as your default optimizer, set `decoupled=True` for ADOPT:
```python
# optimizer = AdamW(model.parameters(), lr=1e-3)
optimizer = ADOPT(model.parameters(), lr=1e-3, decoupled=True)
```
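For an AdamW-style setup with weight decay, the swap might look like the sketch below. The `weight_decay` argument is assumed here to mirror AdamW's interface; check `adopt.py` for the exact signature.

```python
from adopt import ADOPT

# Assumes ADOPT exposes a `weight_decay` argument analogous to AdamW's;
# decoupled=True applies the decay in decoupled (AdamW-style) fashion.
# optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
optimizer = ADOPT(model.parameters(), lr=1e-3, weight_decay=1e-2, decoupled=True)
```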
## Citation

If you use ADOPT in your research, please cite the paper:
```bibtex
@inproceedings{taniguchi2024adopt,
  author    = {Taniguchi, Shohei and Harada, Keno and Minegishi, Gouki and Oshima, Yuta and Jeong, Seong Cheol and Nagahara, Go and Iiyama, Tomoshi and Suzuki, Masahiro and Iwasawa, Yusuke and Matsuo, Yutaka},
  booktitle = {Advances in Neural Information Processing Systems},
  title     = {ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate},
  year      = {2024}
}
```