Pretrained models for Pytorch (Work in progress)

The goal of this repo is to help reproduce research paper results and to provide access to pretrained ConvNets through a unique interface/API inspired by torchvision.

Installation

  1. Python 3 with Anaconda
  2. PyTorch, with or without CUDA

Install from pip

  1. pip install pretrainedmodels

Install from repo

  1. git clone https://github.com/Cadene/pretrained-models.pytorch.git
  2. cd pretrained-models.pytorch
  3. python setup.py install
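
To check that the installation succeeded, you can list the registered models from the command line (a small sanity check, not part of the original instructions):

$ python -c "import pretrainedmodels; print(len(pretrainedmodels.model_names), 'models available')"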

Quick examples

import pretrainedmodels
print(pretrainedmodels.model_names)
> ['fbresnet152', 'bninception', 'resnext101_32x4d', 'resnext101_64x4d', 'inceptionv4', 'inceptionresnetv2', 'alexnet', 'densenet121', 'densenet169', 'densenet201', 'densenet161', 'resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152', 'inceptionv3', 'squeezenet1_0', 'squeezenet1_1', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn', 'vgg19_bn', 'vgg19', 'nasnetalarge']
print(pretrainedmodels.pretrained_settings['nasnetalarge'])
> {'imagenet': {'url': 'http://data.lip6.fr/cadene/pretrainedmodels/nasnetalarge-a1897284.pth', 'input_space': 'RGB', 'input_size': [3, 331, 331], 'input_range': [0, 1], 'mean': [0.5, 0.5, 0.5], 'std': [0.5, 0.5, 0.5], 'num_classes': 1000}, 'imagenet+background': {'url': 'http://data.lip6.fr/cadene/pretrainedmodels/nasnetalarge-a1897284.pth', 'input_space': 'RGB', 'input_size': [3, 331, 331], 'input_range': [0, 1], 'mean': [0.5, 0.5, 0.5], 'std': [0.5, 0.5, 0.5], 'num_classes': 1001}}
model_name = 'nasnetalarge' # could be fbresnet152 or inceptionresnetv2
model = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained='imagenet')
model.eval()
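
Since every model is exposed as a function at the top level of the pretrainedmodels module, the __dict__ lookup above is equivalent to calling the constructor directly:

# equivalent to pretrainedmodels.__dict__['nasnetalarge'](...)
model = pretrainedmodels.nasnetalarge(num_classes=1000, pretrained='imagenet')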

Note: By default, models will be downloaded to your $HOME/.torch folder. You can modify this behavior using the $TORCH_MODEL_ZOO variable as follows: export TORCH_MODEL_ZOO="/local/pretrainedmodels"

import torch
import pretrainedmodels.utils as utils

load_img = utils.LoadImage()

# transformations depending on the model
# rescale, center crop, normalize, and others (ex: ToBGR, ToRange255)
tf_img = utils.TransformImage(model) 

path_img = 'data/cat.jpg'

input_img = load_img(path_img)
input_tensor = tf_img(input_img)         # 3x400x225 -> 3x299x299 size may differ
input_tensor = input_tensor.unsqueeze(0) # 3x299x299 -> 1x3x299x299
input = torch.autograd.Variable(input_tensor,
    requires_grad=False)

output_logits = model(input) # 1x1000
output_features = model.features(input) # 1x2048x14x14, size may differ
output_logits = model.logits(output_features) # 1x1000
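
To turn the logits above into readable predictions, apply a softmax and take the top-5 classes. A minimal sketch, assuming an imagenet_classes list mapping the 1000 class indices to labels (such a list is not shipped with this repo):

import torch.nn.functional as F

probs = F.softmax(output_logits, dim=1)      # 1x1000 class probabilities
top5_probs, top5_idx = probs.topk(5, dim=1)  # five most likely classes
for p, idx in zip(top5_probs[0], top5_idx[0]):
    # imagenet_classes is a hypothetical index -> label list
    print(imagenet_classes[idx.item()], p.item())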

A few use cases

Compute imagenet logits

$ python examples/imagenet_logits.py -h
> nasnetalarge, resnet152, inceptionresnetv2, inceptionv4, ...
$ python examples/imagenet_logits.py -a nasnetalarge --path_img data/cat.png
> 'nasnetalarge': 'data/cat.png' is a 'tiger cat'

Compute imagenet evaluation metrics

$ python examples/imagenet_eval.py /local/common-data/imagenet_2012/images -a nasnetalarge -b 20 -e
> * Acc@1 92.693, Acc@5 96.13

Evaluation on imagenet

Accuracy on validation set (single model)

Results were obtained using (center cropped) images of the same size as during the training process.

| Model | Version | Acc@1 | Acc@5 |
| --- | --- | --- | --- |
| NASNet-A-Mobile | Tensorflow | 74.0 | 91.6 |
| NASNet-A-Mobile | Our porting | 74.08 | 91.74 |
| NASNet-A-Large | Tensorflow | 82.693 | 96.163 |
| NASNet-A-Large | Our porting | 82.566 | 96.086 |
| InceptionResNetV2 | Tensorflow | 80.4 | 95.3 |
| InceptionV4 | Tensorflow | 80.2 | 95.3 |
| InceptionResNetV2 | Our porting | 80.170 | 95.234 |
| InceptionV4 | Our porting | 80.062 | 94.926 |
| DualPathNet107_5k | Our porting | 79.746 | 94.684 |
| ResNeXt101_64x4d | Torch7 | 79.6 | 94.7 |
| DualPathNet131 | Our porting | 79.432 | 94.574 |
| DualPathNet92_5k | Our porting | 79.400 | 94.620 |
| DualPathNet98 | Our porting | 79.224 | 94.488 |
| Xception | Keras | 79.000 | 94.500 |
| ResNeXt101_64x4d | Our porting | 78.956 | 94.252 |
| Xception | Our porting | 78.888 | 94.292 |
| ResNeXt101_32x4d | Torch7 | 78.8 | 94.4 |
| ResNet152 | Pytorch | 78.428 | 94.110 |
| ResNeXt101_32x4d | Our porting | 78.188 | 93.886 |
| FBResNet152 | Torch7 | 77.84 | 93.84 |
| DenseNet161 | Pytorch | 77.560 | 93.798 |
| ResNet101 | Pytorch | 77.438 | 93.672 |
| FBResNet152 | Our porting | 77.386 | 93.594 |
| InceptionV3 | Pytorch | 77.294 | 93.454 |
| DenseNet201 | Pytorch | 77.152 | 93.548 |
| DualPathNet68b_5k | Our porting | 77.034 | 93.590 |
| DenseNet169 | Pytorch | 76.026 | 92.992 |
| ResNet50 | Pytorch | 76.002 | 92.980 |
| DualPathNet68 | Our porting | 75.868 | 92.774 |
| DenseNet121 | Pytorch | 74.646 | 92.136 |
| VGG19_BN | Pytorch | 74.266 | 92.066 |
| ResNet34 | Pytorch | 73.554 | 91.456 |
| BNInception | Our porting | 73.522 | 91.560 |
| VGG16_BN | Pytorch | 73.518 | 91.608 |
| VGG19 | Pytorch | 72.080 | 90.822 |
| VGG16 | Pytorch | 71.636 | 90.354 |
| VGG13_BN | Pytorch | 71.508 | 90.494 |
| VGG11_BN | Pytorch | 70.452 | 89.818 |
| ResNet18 | Pytorch | 70.142 | 89.274 |
| VGG13 | Pytorch | 69.662 | 89.264 |
| VGG11 | Pytorch | 68.970 | 88.746 |
| SqueezeNet1_1 | Pytorch | 58.250 | 80.800 |
| SqueezeNet1_0 | Pytorch | 58.108 | 80.428 |
| Alexnet | Pytorch | 56.432 | 79.194 |

Note: the Pytorch version of ResNet152 is not a port of the Torch7 model but has been retrained by Facebook.

Beware, the accuracy reported here is not always representative of the transferable capacity of the network on other tasks and datasets. You must try them all! :P

Reproducing results

Please see the Compute imagenet evaluation metrics section above.

Documentation

Available models

NASNet*

Source: TensorFlow Slim repo

FaceBook ResNet*

Source: Torch7 repo of FaceBook

They are slightly different from the ResNet* of torchvision. ResNet152 is currently the only one available.

Inception*

Source: TensorFlow Slim repo and Pytorch/Vision repo for inceptionv3

BNInception

Source: Trained with Caffe by Xiong Yuanjun

ResNeXt*

Source: ResNeXt repo of FaceBook

DualPathNetworks

Source: MXNET repo of Chen Yunpeng

The porting has been made possible by Ross Wightman in his PyTorch repo.

As you can see here, DualPathNetworks allow you to try different scales. The default one in this repo is 0.875, meaning that the original input size is 256 before cropping to 224.

'imagenet+5k' means that the network has been pretrained on imagenet5k before being finetuned on imagenet1k.
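
For example, the crop ratio can be changed through the scale argument of utils.TransformImage. A sketch, assuming the scale keyword; dpn92 accepts the 'imagenet+5k' setting described above:

model = pretrainedmodels.__dict__['dpn92'](num_classes=1000, pretrained='imagenet+5k')
tf_img = utils.TransformImage(model, scale=0.875)  # rescale to input_size/0.875 before cropping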

Xception

Source: Keras repo

The porting has been made possible by T Standley.

TorchVision

Source: Pytorch/Vision repo

(inceptionv3 included in Inception*)

Model API

Once a pretrained model has been loaded, you can use it as follows.

Important note: All images must be loaded using PIL, which scales the pixel values between 0 and 1.

model.input_size

Attribute of type list composed of 3 numbers: the number of color channels, the height, and the width of the input image.

Example: [3, 299, 299] for inception* networks, [3, 224, 224] for resnet* networks.

model.input_space

Attribute of type str representing the color space of the image. Can be RGB or BGR.

model.input_range

Attribute of type list composed of 2 numbers: the minimal and the maximal pixel value.

Example: [0, 1] for resnet* and inception* networks, [0, 255] for the bninception network.

model.mean

Attribute of type list composed of 3 numbers which are used to normalize the input image (subtracted "color-channel-wise").

Example: [0.5, 0.5, 0.5] for inception* networks, [0.485, 0.456, 0.406] for resnet* networks.

model.std

Attribute of type list composed of 3 numbers which are used to normalize the input image (divided "color-channel-wise").

Example: [0.5, 0.5, 0.5] for inception* networks, [0.229, 0.224, 0.225] for resnet* networks.
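
Taken together, these attributes describe the preprocessing that utils.TransformImage applies. A rough hand-rolled equivalent for an RGB model with input_range [0, 1], using torchvision (it skips the ToBGR/ToRange255 conversions that TransformImage also handles):

import torchvision.transforms as transforms

preprocess = transforms.Compose([
    transforms.Resize(int(model.input_size[1] / 0.875)),   # rescale (default 0.875 crop ratio)
    transforms.CenterCrop(model.input_size[1]),            # center crop to the expected size
    transforms.ToTensor(),                                 # PIL image -> float tensor in [0, 1]
    transforms.Normalize(mean=model.mean, std=model.std),  # (x - mean) / std per channel
])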

model.features

/!\ work in progress (may not be available)

Method which is used to extract the features from the image.

Example when the model is loaded using fbresnet152:

print(input_224.size())            # (1,3,224,224)
output = model.features(input_224)
print(output.size())               # (1,2048,1,1)

print(input_448.size())            # (1,3,448,448)
output = model.features(input_448)
print(output.size())               # (1,2048,7,7)
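
Since the features come out as a 4D tensor, a common follow-up (not specific to this repo) is to flatten them into a single descriptor vector:

output = model.features(input_224)          # (1,2048,1,1)
features = output.view(output.size(0), -1)  # (1,2048) flat feature vector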

model.logits

/!\ work in progress (may not be available)

Method which is used to classify the features from the image.

Example when the model is loaded using fbresnet152:

output = model.features(input_224) 
print(output.size())               # (1,2048, 1, 1)
output = model.logits(output)
print(output.size())               # (1,1000)

model.forward

Method used to call model.features and model.logits. It can be overwritten as desired.

Note: A good practice is to use model.__call__ as your function of choice to forward an input to your model. See the example below.

# Without model.__call__
output = model.forward(input_224)
print(output.size())      # (1,1000)

# With model.__call__
output = model(input_224)
print(output.size())      # (1,1000)

model.last_linear

Attribute of type nn.Linear. This module is the last one to be called during the forward pass.

Example when the model is loaded using fbresnet152:

print(input_224.size())            # (1,3,224,224)
output = model.features(input_224) 
print(output.size())               # (1,2048,1,1)
output = model.logits(output)
print(output.size())               # (1,1000)

# fine tuning
dim_feats = model.last_linear.in_features # =2048
nb_classes = 4
model.last_linear = nn.Linear(dim_feats, nb_classes)
output = model(input_224)
print(output.size())               # (1,4)

# features extraction
model.last_linear = pretrainedmodels.utils.Identity()
output = model(input_224)
print(output.size())               # (1,2048)
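
When fine-tuning as above, a common recipe (not specific to this repo) is to freeze the pretrained weights so that only the new classifier is trained:

# freeze every pretrained parameter...
for param in model.parameters():
    param.requires_grad = False

# ...then replace the classifier: the new layer is created afterwards,
# so its parameters keep requires_grad=True and remain trainable
model.last_linear = nn.Linear(dim_feats, nb_classes)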

Reproducing

Hand porting of ResNet152

th pretrainedmodels/fbresnet/resnet152_dump.lua
python pretrainedmodels/fbresnet/resnet152_load.py

Automatic porting of ResNeXt

https://github.com/clcarwin/convert_torch_to_pytorch

Hand porting of NASNet, InceptionV4 and InceptionResNetV2

https://github.com/Cadene/tensorflow-model-zoo.torch

Acknowledgement

Thanks to the deep learning community and especially to the contributors of the PyTorch ecosystem.