Awesome Computer Vision
A curated list of awesome libraries, frameworks, tools, datasets, and research papers in computer vision, covering topics such as object detection, image segmentation, 3D vision, and more.
Contents
- Libraries and Frameworks
- Tools and Applications
- Object Detection
- Image Segmentation
- 3D Computer Vision
- Face Recognition
- Datasets
- Research Papers
- Learning Resources
- Books
- Community
- Contribute
- License
Libraries and Frameworks
- OpenCV - An open-source library providing computer vision and machine learning algorithms for image and video analysis.
- Detectron2 - A high-performance object detection library developed by Facebook AI Research.
- Dlib - A modern C++ toolkit with machine learning algorithms and tools for computer vision.
- Scikit-Image - A Python library for image processing, built on top of SciPy.
- MediaPipe - A cross-platform framework by Google for building multimodal ML solutions, including face detection and pose estimation.
- DeepLabV3 - A deep learning model for semantic image segmentation.
- YOLO (You Only Look Once) - A real-time object detection system.
- MMDetection - An open-source object detection toolbox based on PyTorch.
Tools and Applications
- LabelImg - An open-source image annotation tool for labeling datasets.
- OpenPose - A real-time multi-person keypoint detection library for human pose estimation.
- VGG Image Annotator (VIA) - A lightweight tool for manual image annotation.
- DeepFaceLab - A tool for creating deepfakes using face swapping.
- ImageAI - A Python library for building applications with deep learning-based computer vision.
Object Detection
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015) - Introduces the Region Proposal Network, which shares convolutional features with the detection network.
- SSD: Single Shot MultiBox Detector (2016) - A single-stage detector that predicts boxes from multiple feature maps in one forward pass.
- YOLO: You Only Look Once: Unified, Real-Time Object Detection (2016) - Frames detection as a single regression problem from image pixels to bounding boxes and class probabilities.
- EfficientDet: Scalable and Efficient Object Detection (2020) - A model architecture focused on balancing accuracy and efficiency.
Image Segmentation
- U-Net: Convolutional Networks for Biomedical Image Segmentation (2015) - A convolutional neural network designed for biomedical image segmentation.
- DeepLabV3+: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (2018) - Extends DeepLabV3 with a decoder module that refines segmentation along object boundaries.
- Mask R-CNN (2017) - An extension of Faster R-CNN for instance segmentation.
- PSPNet: Pyramid Scene Parsing Network (2017) - A semantic segmentation model using a pyramid pooling module.
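The pyramid pooling idea mentioned for PSPNet can be illustrated with a minimal pure-Python sketch. It averages a single-channel feature map over 1x1, 2x2, 3x3, and 6x6 grids. The function names `adaptive_avg_pool` and `pyramid_pool` and the toy 6x6 input are illustrative assumptions, not code from the paper, and the convolution, upsampling, and concatenation steps of the full module are left out.

```python
def adaptive_avg_pool(feature_map, bins):
    """Average a 2D feature map (list of rows) into a bins x bins grid."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        # Row range for bin i: floor of the start, ceiling of the end.
        r0 = (i * height) // bins
        r1 = ((i + 1) * height + bins - 1) // bins
        row = []
        for j in range(bins):
            c0 = (j * width) // bins
            c1 = ((j + 1) * width + bins - 1) // bins
            cells = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(feature_map, bin_sizes=(1, 2, 3, 6)):
    """Pool the same feature map at several grid resolutions."""
    return {bins: adaptive_avg_pool(feature_map, bins) for bins in bin_sizes}


# Toy input (assumption): a 6x6 map whose value at (r, c) is r * 6 + c.
feature_map = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
pyramid = pyramid_pool(feature_map)

for bins, grid in pyramid.items():
    print(bins, len(grid), "x", len(grid[0]))
```

The coarsest level summarizes the whole map in one value, while finer levels keep progressively more spatial detail.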
3D Computer Vision
- PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (2017) - A deep learning model for 3D point cloud processing.
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (2020) - A model for representing 3D scenes using neural networks.
- Open3D - An open-source library for 3D data processing and visualization.
- COLMAP - A general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline.
Face Recognition
- FaceNet: A Unified Embedding for Face Recognition and Clustering (2015) - A model for face recognition using deep learning.
- DeepFace: Closing the Gap to Human-Level Performance in Face Verification (2014) - A method for facial recognition developed by Facebook.
- OpenFace - An open-source deep learning model for face recognition.
- DeepFaceLab - An open-source tool for face swapping and deepfake creation.
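Embedding-based approaches such as FaceNet compare faces by distance in an embedding space. The sketch below shows that comparison step only, assuming the embeddings already exist. The three-dimensional vectors, the helper names, and the threshold of 1.0 are all hypothetical values chosen for illustration; real embeddings are much higher-dimensional and the threshold is tuned on validation data.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(a, b, threshold=1.0):
    """Treat two embeddings as the same person if they are close enough.

    The threshold is a hypothetical value, not one taken from any paper.
    """
    return squared_distance(l2_normalize(a), l2_normalize(b)) < threshold


# Toy embeddings (assumption): two views of one person and one of another.
anchor = [0.9, 0.1, 0.2]
positive = [0.8, 0.2, 0.1]
negative = [-0.1, 0.9, 0.3]

print(same_identity(anchor, positive))
print(same_identity(anchor, negative))
```

Clustering works on the same distances: embeddings that fall within the threshold of each other are grouped under one identity.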
Datasets
- ImageNet - A large-scale image dataset used for image classification and object detection.
- COCO (Common Objects in Context) - A dataset for object detection, segmentation, and captioning tasks.
- PASCAL VOC - A dataset for visual object category recognition and detection.
- CelebA - A large-scale face attributes dataset.
- KITTI - A dataset for autonomous driving research.
Research Papers
- Visual Attention Mechanisms (2015) - Introduction of visual attention mechanisms in neural networks.
- Self-Supervised Learning for Visual Representation (2020) - A study on self-supervised learning techniques for computer vision tasks.
- BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis (2018) - A generative model for high-quality image synthesis.
Learning Resources
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition - A popular course on computer vision and convolutional networks.
- Convolutional Neural Networks - The computer vision course in Andrew Ng’s Deep Learning Specialization on Coursera.
- PyImageSearch - A blog and resource hub for computer vision tutorials.
- Kaggle: Computer Vision Datasets - A collection of computer vision datasets on Kaggle.
Books
- Deep Learning for Computer Vision by Rajalingappaa Shanmugamani - A guide to deep learning techniques in computer vision.
- Computer Vision: Algorithms and Applications by Richard Szeliski - A comprehensive book on computer vision algorithms.
- Learning OpenCV by Gary Bradski and Adrian Kaehler - A practical guide to using the OpenCV library.
Community
- Reddit: r/ComputerVision - A subreddit for computer vision discussions.
- PyImageSearch Community - A forum for discussing computer vision and image processing.
- CVPR Conference - The IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- Kaggle Computer Vision Forum - A community for computer vision discussions on Kaggle.
Contribute
Contributions are welcome!