Awesome Computer Vision

A curated list of awesome libraries, frameworks, tools, datasets, and research papers in computer vision, covering topics such as object detection, image segmentation, 3D vision, and more.

Contents

Libraries and Frameworks
Tools and Applications
Object Detection
Image Segmentation
3D Computer Vision
Face Recognition
Datasets
Research Papers
Learning Resources
Books
Community
Contribute
License

Libraries and Frameworks

OpenCV - An open-source library providing computer vision and machine learning algorithms for image and video analysis.
Detectron2 - A PyTorch-based library for object detection and segmentation developed by Facebook AI Research.
Dlib - A modern C++ toolkit with machine learning algorithms and tools for computer vision.
Scikit-Image - A Python library for image processing, built on top of NumPy and SciPy.
MediaPipe - A cross-platform framework by Google for building multimodal ML solutions, including face detection and pose estimation.
DeepLabV3 - A deep learning model for semantic image segmentation.
YOLO (You Only Look Once) - A real-time object detection system.
MMDetection - An open-source object detection toolbox based on PyTorch.

Tools and Applications

LabelImg - An open-source image annotation tool for labeling datasets.
OpenPose - A real-time library for multi-person keypoint detection and human pose estimation.
VGG Image Annotator (VIA) - A lightweight tool for manual image annotation.
DeepFaceLab - A tool for creating deepfakes using face swapping.
ImageAI - A Python library that helps developers build applications with deep learning and computer vision capabilities.

Object Detection

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015) - The paper introducing Faster R-CNN for object detection.
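SSD: Single Shot MultiBox Detector (2016) - A single-stage detector that predicts bounding boxes and classes in one forward pass, enabling real-time object detection.
You Only Look Once: Unified, Real-Time Object Detection (2016) - The paper introducing YOLO, a fast single-stage object detection algorithm.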
EfficientDet: Scalable and Efficient Object Detection (2020) - A model architecture focused on balancing accuracy and efficiency.

Image Segmentation

U-Net: Convolutional Networks for Biomedical Image Segmentation (2015) - A convolutional neural network designed for biomedical image segmentation.
DeepLabV3+: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (2018) - An advanced model for semantic segmentation.
Mask R-CNN (2017) - An extension of Faster R-CNN for instance segmentation.
PSPNet: Pyramid Scene Parsing Network (2017) - A semantic segmentation model using a pyramid pooling module.

3D Computer Vision

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (2017) - A deep learning model for 3D point cloud processing.
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (2020) - A method that represents 3D scenes with neural networks to synthesize novel views.
Open3D - An open-source library for 3D data processing and visualization.
COLMAP - A general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline.

Face Recognition

FaceNet: A Unified Embedding for Face Recognition and Clustering (2015) - A model for face recognition using deep learning.
DeepFace: Closing the Gap to Human-Level Performance in Face Verification (2014) - A method for facial recognition developed by Facebook.
OpenFace - An open-source deep learning model for face recognition.
DeepFaceLab - An open-source face-swapping tool for creating deepfakes (also listed under Tools and Applications).

Datasets

ImageNet - A large-scale image dataset used for image classification and object detection.
COCO (Common Objects in Context) - A dataset for object detection, segmentation, and captioning tasks.
PASCAL VOC - A dataset for visual object category recognition and detection.
CelebA - A large-scale face attributes dataset.
KITTI - A dataset for autonomous driving research.

Research Papers

Visual Attention Mechanisms (2015) - Introduction of visual attention mechanisms in neural networks.
Self-Supervised Learning for Visual Representation (2020) - A study on self-supervised learning techniques for computer vision tasks.
BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis (2018) - A generative adversarial network for high-quality, large-scale image synthesis.

Learning Resources

Stanford CS231n: Convolutional Neural Networks for Visual Recognition - A popular course on computer vision and convolutional networks.
Deep Learning for Computer Vision - Part of Andrew Ng’s Deep Learning Specialization on Coursera.
PyImageSearch - A blog and resource hub for computer vision tutorials.
Kaggle: Computer Vision Datasets - A collection of computer vision datasets on Kaggle.

Books

Deep Learning for Computer Vision by Rajalingappaa Shanmugamani - A guide to deep learning techniques in computer vision.
Computer Vision: Algorithms and Applications by Richard Szeliski - A comprehensive book on computer vision algorithms.
Learning OpenCV by Gary Bradski and Adrian Kaehler - A practical guide to using the OpenCV library.

Community

Reddit: r/ComputerVision - A subreddit for computer vision discussions.
PyImageSearch Community - A forum for discussing computer vision and image processing.
CVPR Conference - The IEEE Conference on Computer Vision and Pattern Recognition.
Kaggle Computer Vision Forum - A community for computer vision discussions on Kaggle.

Contribute

Contributions are welcome!
