Discriminative learning rates PyTorch

Adaptation of discriminative learning rates from the Fastai library for standard PyTorch.

This is an adaptation of the functions from the fastai library so that they can be used with standard PyTorch. Please see the fastai GitHub repository for the original implementation.

Example of use

# Discriminative learning rates using Mask R-CNN (ResNet-50 FPN backbone) with SGD and CyclicLR
import torch
import torchvision

# discriminative_lr_params is provided by this repository
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
params, lr_arr, _ = discriminative_lr_params(model, slice(1e-5, 1e-3))
optim = torch.optim.SGD(params, lr=1e-3, momentum=0.9, weight_decay=1e-1)
lr_scheduler = torch.optim.lr_scheduler.CyclicLR(optim, base_lr=list(lr_arr), max_lr=list(lr_arr * 100))
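
The optimizer and scheduler are then used as in any PyTorch training loop. The sketch below shows one training step; `images` and `targets` are assumed to already be a batch in the torchvision detection format (they are not part of the original example):

# One training step (sketch); CyclicLR is stepped once per batch
model.train()
loss_dict = model(images, targets)   # Mask R-CNN returns a dict of losses in training mode
loss = sum(loss_dict.values())
optim.zero_grad()
loss.backward()
optim.step()
lr_scheduler.step()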

We can display the per-group learning rates using optim.state_dict().
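
For example, the learning rate assigned to each parameter group can be printed with a short loop (a minimal sketch reusing the optim object from the example above):

# Print the learning rate of every parameter group in the optimizer
for i, group in enumerate(optim.state_dict()["param_groups"]):
    print(f"Group {i}: {group['lr']}")

The output corresponds to the table below.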

Parameter | Lr*
--------- | -----------
Group 0   | 1e-05
Group 1   | 1.06512e-05
Group 2   | 1.13447e-05
Group 3   | 1.2085e-05
...       | ...
Group 70  | 0.00083
Group 71  | 0.00088
Group 72  | 0.00094
Group 73  | 0.00099

* Learning rates are rounded here.

Difference from the fastai version

The main difference is that here each layer gets its own independent learning rate, whereas in the fastai implementation layers are grouped into blocks that share a learning rate. A rough illustration of the per-layer spread is sketched below.
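
As an illustration of this per-layer spread (an assumption about the general approach, not the repository's exact code), the values in the table above are consistent with a geometric interpolation between the two ends of the slice:

import numpy as np

# Hypothetical helper: one learning rate per layer, spaced geometrically
# between the lower and upper bound of the slice
def per_layer_lrs(lr_min, lr_max, n_layers):
    return np.geomspace(lr_min, lr_max, num=n_layers)

lrs = per_layer_lrs(1e-5, 1e-3, 74)   # 74 groups, as in the table above
print(lrs[:3])                        # roughly matches the first rows of the table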