Computer Vision Pretrained Models

What is a pre-trained model?

A pre-trained model is a model that someone else has already trained to solve a similar problem. Instead of building a model from scratch, you can use a model trained on another problem as a starting point. A pre-trained model may not be 100% accurate for your application.

For example, suppose you want to build a self-driving car. You can spend years building a decent image recognition algorithm from scratch, or you can take the Inception model (a pre-trained model) from Google, which was trained on ImageNet data, and use it to identify the objects in your images.
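
The idea of starting from someone else's trained model can be sketched in a few lines. The example below is a toy illustration in plain Python, not a real Inception or ImageNet workflow: `PRETRAINED_WEIGHTS`, `extract_features`, `predict`, `train_head`, and the sample data are all hypothetical names and values invented for this sketch. It keeps a fixed feature extractor frozen and trains only a small new classifier on top of it.

```python
import math

# Hypothetical stand-in for the body of a pre-trained network. In practice
# these weights would be loaded from a published model, not written by hand.
PRETRAINED_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def extract_features(x):
    """Run the frozen layer: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def predict(head, x):
    """Probability of class 1 from the new head applied to frozen features."""
    weights, bias = head
    score = sum(w * f for w, f in zip(weights, extract_features(x))) + bias
    return 1.0 / (1.0 + math.exp(-score))


def train_head(samples, labels, epochs=1000, lr=0.1):
    """Fit only a new logistic-regression head; the extractor is never updated."""
    weights, bias = [0.0] * len(PRETRAINED_WEIGHTS), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Gradient of the log loss with respect to the score.
            error = predict((weights, bias), x) - y
            feats = extract_features(x)
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


# Toy "new task": two small, well-separated clusters of 2-D points.
SAMPLES = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.2), (0.1, 0.3),
           (0.9, 0.8), (1.0, 0.7), (0.8, 1.0), (0.7, 0.9)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    head = train_head(SAMPLES, LABELS)
    for x, y in zip(SAMPLES, LABELS):
        print(x, "label:", y, "predicted probability:", round(predict(head, x), 3))
```

In a real project the frozen part would be a network loaded from one of the repositories listed in this document, and typically only the final layers would be retrained on your own data.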

Other Pre-trained Models

Model Deployment library

Framework

Model visualization

You can see visualizations of each model's network architecture by using Netron.

Tensorflow <a name="tensorflow"/>

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| ObjectDetection | Localizing and identifying multiple objects in a single image. | Tensorflow | Apache License |
| Mask R-CNN | The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone. | Tensorflow | The MIT License (MIT) |
| Faster-RCNN | This is an experimental Tensorflow implementation of Faster RCNN - a convnet for object detection with a region proposal network. | Tensorflow | MIT License |
| YOLO TensorFlow | This is a TensorFlow implementation of YOLO: Real-Time Object Detection. | Tensorflow | Custom |
| YOLO TensorFlow ++ | TensorFlow implementation of 'YOLO: Real-Time Object Detection', with training and actual support for real-time running on mobile devices. | Tensorflow | GNU GENERAL PUBLIC LICENSE |
| MobileNet | MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature. | Tensorflow | The MIT License (MIT) |
| DeepLab | Deep labeling for semantic image segmentation. | Tensorflow | Apache License |
| Colornet | Neural Network to colorize grayscale images. | Tensorflow | Not Found |
| SRGAN | Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. | Tensorflow | Not Found |
| DeepOSM | Train TensorFlow neural nets with OpenStreetMap features and satellite imagery. | Tensorflow | The MIT License (MIT) |
| Domain Transfer Network | Implementation of Unsupervised Cross-Domain Image Generation. | Tensorflow | MIT License |
| Show, Attend and Tell | Attention Based Image Caption Generator. | Tensorflow | MIT License |
| android-yolo | Real-time object detection on Android using the YOLO network, powered by TensorFlow. | Tensorflow | Apache License |
| DCSCN Super Resolution | This is a TensorFlow implementation of "Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network", a deep learning based Single-Image Super-Resolution (SISR) model. | Tensorflow | Not Found |
| GAN-CLS | This is an experimental TensorFlow implementation of synthesizing images. | Tensorflow | Not Found |
| U-Net | For Brain Tumor Segmentation. | Tensorflow | Not Found |
| Improved CycleGAN | Unpaired Image to Image Translation. | Tensorflow | MIT License |
| Im2txt | Image-to-text neural network for image captioning. | Tensorflow | Apache License |
| SLIM | Image classification models in TF-Slim. | Tensorflow | Apache License |
| DELF | Deep local features for image matching and retrieval. | Tensorflow | Apache License |
| Compression | Compressing and decompressing images using a pre-trained Residual GRU network. | Tensorflow | Apache License |
| AttentionOCR | A model for real-world image text extraction. | Tensorflow | Apache License |

<div align="right"> <b><a href="#framework">↥ Back To Top</a></b> </div>

Keras <a name="keras"/>

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| Mask R-CNN | The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone. | Keras | The MIT License (MIT) |
| VGG16 | Very Deep Convolutional Networks for Large-Scale Image Recognition. | Keras | The MIT License (MIT) |
| VGG19 | Very Deep Convolutional Networks for Large-Scale Image Recognition. | Keras | The MIT License (MIT) |
| ResNet | Deep Residual Learning for Image Recognition. | Keras | The MIT License (MIT) |
| ResNet50 | Deep Residual Learning for Image Recognition. | Keras | The MIT License (MIT) |
| Nasnet | NASNet refers to Neural Architecture Search Network, a family of models that were designed automatically by learning the model architectures directly on the dataset of interest. | Keras | The MIT License (MIT) |
| MobileNet | MobileNet v1 models for Keras. | Keras | The MIT License (MIT) |
| MobileNet V2 | MobileNet v2 models for Keras. | Keras | The MIT License (MIT) |
| MobileNet V3 | MobileNet v3 models for Keras. | Keras | The MIT License (MIT) |
| efficientnet | Rethinking Model Scaling for Convolutional Neural Networks. | Keras | The MIT License (MIT) |
| Image analogies | Generate image analogies using neural matching and blending. | Keras | The MIT License (MIT) |
| Popular Image Segmentation Models | Implementation of Segnet, FCN, UNet and other models in Keras. | Keras | MIT License |
| Ultrasound nerve segmentation | This tutorial shows how to use Keras library to build deep neural network for ultrasound image nerve segmentation. | Keras | MIT License |
| DeepMask object segmentation | This is a Keras-based Python implementation of DeepMask- a complex deep neural network for learning object segmentation masks. | Keras | Not Found |
| Monolingual and Multilingual Image Captioning | This is the source code that accompanies Multilingual Image Description with Neural Sequence Models. | Keras | BSD-3-Clause License |
| pix2pix | Keras implementation of Image-to-Image Translation with Conditional Adversarial Networks by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. | Keras | Not Found |
| Colorful Image colorization | B&W to color. | Keras | Not Found |
| CycleGAN | Implementation of Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. | Keras | MIT License |
| DualGAN | Implementation of DualGAN: Unsupervised Dual Learning for Image-to-Image Translation. | Keras | MIT License |
| Super-Resolution GAN | Implementation of Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. | Keras | MIT License |

<div align="right"> <b><a href="#framework">↥ Back To Top</a></b> </div>

PyTorch <a name="pytorch"/>

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| detectron2 | Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. | PyTorch | Apache License 2.0 |
| FastPhotoStyle | A Closed-form Solution to Photorealistic Image Stylization. | PyTorch | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License |
| pytorch-CycleGAN-and-pix2pix | Image-to-image translation in PyTorch. | PyTorch | BSD License |
| maskrcnn-benchmark | Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch. | PyTorch | MIT License |
| deep-image-prior | Image restoration with neural networks but without learning. | PyTorch | Apache License 2.0 |
| StarGAN | StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. | PyTorch | MIT License |
| faster-rcnn.pytorch | This project is a faster Faster R-CNN implementation, aimed at accelerating the training of Faster R-CNN object detection models. | PyTorch | MIT License |
| pix2pixHD | Synthesizing and manipulating 2048x1024 images with conditional GANs. | PyTorch | BSD License |
| Augmentor | Image augmentation library in Python for machine learning. | PyTorch | MIT License |
| albumentations | Fast image augmentation library. | PyTorch | MIT License |
| Deep Video Analytics | Deep Video Analytics is a platform for indexing and extracting information from videos and images. | PyTorch | Custom |
| semantic-segmentation-pytorch | PyTorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset. | PyTorch | BSD 3-Clause License |
| An End-to-End Trainable Neural Network for Image-based Sequence Recognition | This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR. | PyTorch | The MIT License (MIT) |
| UNIT | PyTorch Implementation of our Coupled VAE-GAN algorithm for Unsupervised Image-to-Image Translation. | PyTorch | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License |
| Neural Sequence labeling model | Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. | PyTorch | Apache License |
| faster rcnn | This is a PyTorch implementation of Faster RCNN. This project is mainly based on py-faster-rcnn and TFFRCNN. For details about R-CNN please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun. | PyTorch | MIT License |
| pytorch-semantic-segmentation | PyTorch for Semantic Segmentation. | PyTorch | MIT License |
| EDSR-PyTorch | PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution'. | PyTorch | MIT License |
| image-classification-mobile | Collection of classification models pretrained on the ImageNet-1K. | PyTorch | MIT License |
| FaderNetworks | Fader Networks: Manipulating Images by Sliding Attributes - NIPS 2017. | PyTorch | Creative Commons Attribution-NonCommercial 4.0 International Public License |
| neuraltalk2-pytorch | Image captioning model in PyTorch (finetunable CNN in branch with_finetune). | PyTorch | MIT License |
| RandWireNN | Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition". | PyTorch | Not Found |
| stackGAN-v2 | PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++. | PyTorch | MIT License |
| Detectron models for Object Detection | This code allows you to use some of the Detectron models for object detection from Facebook AI Research with PyTorch. | PyTorch | Apache License |
| DEXTR-PyTorch | This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. | PyTorch | GNU GENERAL PUBLIC LICENSE |
| pointnet.pytorch | PyTorch implementation for "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation". | PyTorch | MIT License |
| self-critical.pytorch | This repository includes unofficial implementations of Self-critical Sequence Training for Image Captioning and Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. | PyTorch | MIT License |
| vnet.pytorch | A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. | PyTorch | BSD 3-Clause License |
| piwise | Pixel-wise segmentation on VOC2012 dataset using PyTorch. | PyTorch | BSD 3-Clause License |
| pspnet-pytorch | PyTorch implementation of PSPNet segmentation network. | PyTorch | Not Found |
| pytorch-SRResNet | PyTorch implementation for Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. | PyTorch | The MIT License (MIT) |
| PNASNet.pytorch | PyTorch implementation of PNASNet-5 on ImageNet. | PyTorch | Apache License |
| img_classification_pk_pytorch | Quickly comparing your image classification models with the state-of-the-art models. | PyTorch | Not Found |
| Deep Neural Networks are Easily Fooled | High Confidence Predictions for Unrecognizable Images. | PyTorch | MIT License |
| pix2pix-pytorch | PyTorch implementation of "Image-to-Image Translation Using Conditional Adversarial Networks". | PyTorch | Not Found |
| NVIDIA/semantic-segmentation | A PyTorch Implementation of Improving Semantic Segmentation via Video Propagation and Label Relaxation, In CVPR2019. | PyTorch | CC BY-NC-SA 4.0 license |
| Neural-IMage-Assessment | A PyTorch Implementation of Neural IMage Assessment. | PyTorch | Not Found |
| torchxrayvision | Pretrained models for chest X-ray (CXR) pathology predictions. Medical, Healthcare, Radiology. | PyTorch | Apache License |
| pytorch-image-models | PyTorch image models, scripts, pretrained weights -- (SE)ResNet/ResNeXT, DPN, EfficientNet, MixNet, MobileNet-V3/V2, MNASNet, Single-Path NAS, FBNet, and more. | PyTorch | Apache License 2.0 |

<div align="right"> <b><a href="#framework">↥ Back To Top</a></b> </div>

Caffe <a name="caffe"/>

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| OpenPose | OpenPose represents the first real-time multi-person system to jointly detect human body, hand, and facial keypoints (in total 130 keypoints) on single images. | Caffe | Custom |
| Fully Convolutional Networks for Semantic Segmentation | Fully Convolutional Models for Semantic Segmentation. | Caffe | Not Found |
| Colorful Image Colorization | Colorful Image Colorization. | Caffe | BSD-2-Clause License |
| R-FCN | R-FCN: Object Detection via Region-based Fully Convolutional Networks. | Caffe | MIT License |
| cnn-vis | Inspired by Google's recent Inceptionism blog post, cnn-vis is an open-source tool that lets you use convolutional neural networks to generate images. | Caffe | The MIT License (MIT) |
| DeconvNet | Learning Deconvolution Network for Semantic Segmentation. | Caffe | Custom |

<div align="right"> <b><a href="#framework">↥ Back To Top</a></b> </div>

MXNet <a name="mxnet"/>

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| Faster RCNN | Region Proposal Network solves object detection as a regression problem. | MXNet | Apache License, Version 2.0 |
| SSD | SSD is a unified framework for object detection with a single network. | MXNet | MIT License |
| Faster RCNN+Focal Loss | An unofficial implementation of Focal Loss for Dense Object Detection. | MXNet | Not Found |
| CNN-LSTM-CTC | Three different models for text recognition, each using a CTC loss layer so that text images do not need to be segmented. | MXNet | Not Found |
| Faster_RCNN_for_DOTA | This is the official repo of the paper DOTA: A Large-scale Dataset for Object Detection in Aerial Images. | MXNet | Apache License |
| RetinaNet | Focal loss for Dense Object Detection. | MXNet | Not Found |
| MobileNetV2 | This is an MXNet implementation of the MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. | MXNet | Apache License |
| neuron-selectivity-transfer | This code is a re-implementation of the ImageNet classification experiments in the paper Like What You Like: Knowledge Distill via Neuron Selectivity Transfer. | MXNet | Apache License |
| MobileNetV2 | This is a Gluon implementation of the MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. | MXNet | Apache License |
| sparse-structure-selection | This code is a re-implementation of the ImageNet classification experiments in the paper Data-Driven Sparse Structure Selection for Deep Neural Networks. | MXNet | Apache License |
| FastPhotoStyle | A Closed-form Solution to Photorealistic Image Stylization. | MXNet | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License |

<div align="right"> <b><a href="#framework">↥ Back To Top</a></b> </div>

Contributions

Your contributions are always welcome! Please have a look at contributing.md.

License

MIT License