
awesome-local-global-descriptor

These are my personal notes on local and global descriptors, written to make it easier for anyone to get into these fields. If you find anything you want to add, feel free to open an issue or email me.

This repo is also a by-product of the literature survey for our paper UR2KiD. If you find this repo useful, please consider citing our paper.

@article{yang2020ur2kid,
  title={UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision},
  author={Yang*, Tsun-Yi and Nguyen*, Duy-Kien and Heijnen, Huub and Balntas, Vassileios},
  journal={arXiv preprint arXiv:2001.07252},
  year={2020}
}

This repo will be constantly updated.

Author: Tsun-Yi Yang (shamangary@hotmail.com)

Online talks

Year	Topic	Link
[ECCV20]	MLAD Workshop	morning, afternoon
[3DV20]	3DGV Talk: Marc Pollefeys - 3D geometric vision	youtube
[CVPR20]	Image Matching Workshop	youtube
[CVPR20]	CVPR2020 tutorial: Local Features: From SIFT to Differentiable Methods	youtube
[CVPR20]	Deep Visual SLAM Frontends: SuperPoint, SuperGlue, and SuperMaps	youtube

Local matching pipeline

In this section, I review sparse keypoint matching and its pipeline.
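As a rough illustration of the last step of such a pipeline, the sketch below matches two sets of descriptors by mutual nearest neighbor combined with a ratio test. The function names, the ratio of 0.8, and the toy 2-D descriptors are assumptions made for this example; they are not taken from any paper listed here.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_two(query, candidates):
    """Return (best_index, best_dist, second_best_dist) for one query descriptor."""
    dists = sorted((l2(query, c), j) for j, c in enumerate(candidates))
    best_dist, best_idx = dists[0]
    second = dists[1][0] if len(dists) > 1 else math.inf
    return best_idx, best_dist, second


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Mutual nearest-neighbor matching with a ratio test.

    Returns a list of (index_in_a, index_in_b) pairs.
    """
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest_two(d, desc_b) for d in desc_a]
    b_to_a = [nearest_two(d, desc_a)[0] for d in desc_b]
    matches = []
    for i, (j, best, second) in enumerate(a_to_b):
        is_mutual = b_to_a[j] == i            # b's nearest neighbor points back to a
        passes_ratio = best < ratio * second  # best match clearly beats the runner-up
        if is_mutual and passes_ratio:
            matches.append((i, j))
    return matches


# Toy 2-D descriptors (hypothetical data).
desc_a = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
desc_b = [[1.0, 0.05], [0.02, 1.0]]
matches = match_descriptors(desc_a, desc_b)
```

The third descriptor in `desc_a` is almost equally close to both candidates, so the ratio test rejects it. Real pipelines usually replace the brute-force search with an approximate nearest-neighbor index.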

1. Keypoint detection

This subsection reviews keypoint detection and the estimation of keypoint orientation, scale, or affine transformation.

Year	Paper	Link	Code
[CVPR20]	Holistically-Attracted Wireframe Parsing	arXiv	github
[CVPR20]	KeyPose: Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects	arXiv	link
[3DV19]	SIPs: Succinct Interest Points from Unsupervised Inlierness Probability Learning	arXiv	Github
[ICCV19]	Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters	PDF	Github
[ECCV18]	Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability	arXiv	Github
[CVPR17]	Learning Discriminative and Transformation Covariant Local Feature Detectors	PDF	Github
[CVPR17]	Quad-networks: unsupervised learning to rank for interest point detection	PDF	-
[CVPR16]	Learning to Assign Orientations to Feature Points	-	Github
[CVPR15]	TILDE: a Temporally Invariant Learned DEtector	arXiv	Github

Year	Paper	link	Code
[ECCV20]	DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization	link	github
[ICCV19]	USIP: Unsupervised Stable Interest Point Detection from 3D Point Clouds	arXiv	Github
[arXiv19]	Self-Supervised 3D Keypoint Learning for Ego-motion Estimation	arXiv	Github

2. Keypoint description (local descriptor)

Over the last few decades, research has focused on patch descriptors.

Hand-crafted

Year	Paper	link	Code
[CVPR16]	Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales	PDF	Github
[CVPR15]	Domain-Size Pooling in Local Descriptors: DSP-SIFT	PDF	-
[CVPR15]	BOLD - Binary Online Learned Descriptor For Efficient Image Matching	PDF	Github
[CVPR13]	Boosting binary keypoint descriptors	-	-
[CVPR12]	FREAK: Fast retina keypoint	-	-
[CVPR12]	Three things everyone should know to improve object retrieval	PDF	-
[IPOL11]	ASIFT: An Algorithm for Fully Affine Invariant Comparison	-	-
[ICCV11]	BRISK: Binary robust invariant scalable keypoints	-	-
[ICCV11]	ORB: An efficient alternative to SIFT or SURF	-	-
[ICCV11]	Local intensity order pattern for feature description	-	-
[CVIU06]	Speeded-up robust features (SURF)	-	-
[ECCV06]	SURF: Speeded up robust features	-	-
[IJCV04]	Distinctive image features from scale-invariant keypoints	-	Github

Deep learning

Year	Paper	link	Code
[TIP19]	Learning Local Descriptors by Optimizing the Keypoint-Correspondence Criterion: Applications to Face Matching, Learning from Unlabeled Videos and 3D-Shape Retrieval	arXiv	Github
[ICCV19]	Beyond Cartesian Representations for Local Descriptors	PDF	-
[CVPR19]	SOSNet: Second Order Similarity Regularization for Local Descriptor Learning	arXiv,Page	Github
[ECCV18]	GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints	-	Github
[CVPR18]	Local Descriptors Optimized for Average Precision	Page	-
[NIPS17]	Working hard to know your neighbor's margins: Local descriptor learning loss	arXiv	Github
[ICCV17]	DeepCD: Learning Deep Complementary Descriptors for Patch Representations	PDF	Github
[CVPR17]	L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space	PDF	Github
[arXiv16]	PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors	arXiv	Github
[BMVC16]	Learning local feature descriptors with triplets and shallow convolutional neural networks	PDF	Github
[ICCV15]	Discriminative Learning of Deep Convolutional Feature Point Descriptors	Page	Github
[CVPR15]	MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching	PDF	-
[CVPR15]	Learning to compare image patches via convolutional neural networks	PDF	Github

Year	Paper	link	Code
[arXiv19]	DEEPPOINT3D: LEARNING DISCRIMINATIVE LOCAL DESCRIPTORS USING DEEP METRIC LEARNING ON 3D POINT CLOUDS	arXiv	-

3. End-to-end matching pipeline

Recently, a growing number of papers embed the whole matching pipeline (keypoint detection and keypoint description) in a single framework.

Year	Paper	link	Code
[arXiv20]	Dense Semantic 3D Map Based Long-Term Visual Localization with Hybrid Features	arXiv	-
[arXiv20]	D2D: Learning to find good correspondences for image matching and manipulation	arXiv	-
[arXiv20]	DISK: Learning local features with policy gradient	arXiv	-
[arXiv20]	D2D: Keypoint Extraction with Describe to Detect Approach	arXiv	-
[arXiv20]	HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning	arXiv	-
[arXiv20]	Learning Feature Descriptors using Camera Pose Supervision	arXiv	-
[arXiv20]	Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions	arXiv	github
[arXiv20]	S2DNet: Learning Accurate Correspondences for Sparse-to-Dense Feature Matching	arXiv	-
[CVPR20]	ASLFeat: Learning Local Features of Accurate Shape and Localization	arXiv	github,tfmatch
[CVPR20]	Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task	arXiv	-
[WACV19]	DGC-Net: Dense Geometric Correspondence Network	arXiv	github
[NIPS19]	R2D2: Repeatable and Reliable Detector and Descriptor	arXiv,Page	Github
[ICCV19]	ELF: Embedded Localisation of Features in Pre-Trained CNN	PDF	Github
[CVPR19]	RF-Net: An End-to-End Image Matching Network based on Receptive Field	arXiv	Github
[CVPR19]	D2-Net: A Trainable CNN for Joint Description and Detection of Local Features	arXiv,Page	Github
[BMVC19]	Matching Features without Descriptors: Implicitly Matched Interest Points	PDF	github
[CVPRW18]	SuperPoint: Self-Supervised Interest Point Detection and Description	arXiv	Github,3rd_party
[NIPS18]	LF-Net: Learning Local Features from Images	PDF	Github
[ECCV16]	LIFT: Learned Invariant Feature Points	-	Github

Year	Paper	link	Code
[CVPR20]	D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features	arXiv	github
[arXiv20]	StickyPillars: Robust feature matching on point clouds using Graph Neural Networks	arXiv	-

3.5. Dense descriptor

Unlike local keypoint descriptors, which depend on detected keypoints, some works compute a dense descriptor representation for the whole image.

Year	Paper	link	Code
[ICRA20]	GN-Net: The Gauss-Newton Loss for Multi-Weather Relocalization	arXiv, MyNote	Web
[ICCV17]	CLKN: Cascaded Lucas-Kanade Networks for Image Alignment	PDF	-

4. Geometric verification or learning based matcher

After matching, standard RANSAC and its variants are usually adopted for outlier removal.
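To make the outlier-removal step concrete, here is a minimal RANSAC sketch. It assumes the simplest possible model, a pure 2-D translation between the two images, so one match is a minimal sample. The function name, the 2-pixel threshold, the iteration count, and the toy matches are all assumptions for illustration; real pipelines fit a homography, fundamental matrix, or essential matrix instead.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2-D translation from putative matches with RANSAC.

    matches: list of ((xa, ya), (xb, yb)) keypoint pairs.
    Returns (translation, inlier_indices).
    """
    if not matches:
        raise ValueError("need at least one match")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation.
        (xa, ya), (xb, yb) = rng.choice(matches)
        tx, ty = xb - xa, yb - ya
        # Score the hypothesis by counting matches it explains.
        inliers = [
            i
            for i, ((pxa, pya), (pxb, pyb)) in enumerate(matches)
            if (pxa + tx - pxb) ** 2 + (pya + ty - pyb) ** 2 <= threshold ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis (mean offset).
    n = len(best_inliers)
    tx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    ty = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (tx, ty), best_inliers


# Four matches that agree on a shift of (10, 5), plus two outliers (hypothetical data).
toy_matches = [
    ((0, 0), (10, 5)),
    ((1, 2), (11, 7)),
    ((3, 1), (13, 6)),
    ((4, 4), (14, 9)),
    ((2, 2), (40, -3)),
    ((5, 1), (0, 30)),
]
translation, inliers = ransac_translation(toy_matches)
```

The variants listed below mostly change how samples are drawn (for example PROSAC) or how hypotheses are scored (for example MAGSAC), while keeping this hypothesize-and-verify loop.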

Algorithm based

Year	Paper	link	Code
[ECCV20]	Making Affine Correspondences Work in Camera Geometry Computation	arXiv	github
[arXiv20]	AdaLAM: Revisiting Handcrafted Outlier Detection	arXiv	github
[arXiv20]	Multi-View Optimization of Local Feature Geometry	arXiv	-
[CVPR19]	MAGSAC: Marginalizing Sample Consensus	PDF	Github
[CVPR16]	Progressive Feature Matching with Alternate Descriptor Selection and Correspondence Enrichment	PDF	-
[CVPR13]	Robust Feature Matching with Alternate Hough and Inverted Hough Transforms	PDF	-
[ECCV12]	Improving Image-Based Localization by Active Correspondence Search	PDF	-
[CVPR05]	Matching with PROSAC – Progressive Sample Consensus	PDF	-
[CVPR05]	Two-View Geometry Estimation Unaffected by a Dominant Plane	PDF	Github

Deep learning based

Year	Paper	link	Code
[ECCV20]	Online Invariance Selection for Local Feature Descriptors	arXiv	github
[CVPR20]	SuperGlue: Learning Feature Matching with Graph Neural Networks	arXiv	Github
[CVPR20]	High-dimensional Convolutional Networks for Geometric Pattern Recognition	arXiv, youtube	-
[CVPR20]	ACNe: Attentive Context Normalization for Robust Permutation-Equivariant Learning	arXiv	github
[arXiv20]	RANSAC-Flow: generic two-stage image alignment	arXiv, youtube	page,Github
[ICCV19]	NG-RANSAC for Epipolar Geometry from Sparse Correspondences	arXiv	Github
[ICCV19]	Learning Two-View Correspondences and Geometry Using Order-Aware Network	arXiv	Github
[CVPR18]	Learning to Find Good Correspondences	-	Github

Image registration

Year	Paper	link	Code
[arXiv20]	Deep Global Registration	arXiv, youtube	-
[Access18]	Multi-Temporal Remote Sensing Image Registration Using Deep Convolutional Features	PDF	Github

Global retrieval

Since global retrieval usually searches over a large number of candidates, there are several ways to generate a single description for each image.

1. Feature aggregation

Hand-crafted

When only hand-crafted local descriptors are available, people usually aggregate a set of local descriptors into a single description.
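As one example of such aggregation, the sketch below computes a basic VLAD vector: each local descriptor is assigned to its nearest codebook center, the residuals are summed per center, and the result is L2-normalized. It assumes the codebook has already been learned (for example by k-means), and it leaves out refinements such as power normalization or intra-normalization. The function name and toy data are made up for illustration.

```python
import math


def vlad(descriptors, centers):
    """Aggregate local descriptors into one VLAD vector.

    descriptors: list of D-dimensional local descriptors from one image.
    centers: list of K codebook centers (assumed to be precomputed).
    Returns a flat list of length K * D, L2-normalized.
    """
    k, d = len(centers), len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]
    for desc in descriptors:
        # Hard-assign the descriptor to its nearest center.
        nearest = min(
            range(k),
            key=lambda c: sum((x - y) ** 2 for x, y in zip(desc, centers[c])),
        )
        # Accumulate the residual (descriptor minus center).
        for t in range(d):
            residuals[nearest][t] += desc[t] - centers[nearest][t]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Two centers and three 2-D descriptors (hypothetical data).
centers = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [2.0, 0.0], [10.0, 14.0]]
image_vector = vlad(descriptors, centers)
```

The output length depends only on the codebook size and descriptor dimension, not on how many local descriptors the image has, which is what makes the result usable as a single global description.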

Year	Paper	link	Code
[ICCV13] / [IJCV15]	To aggregate or not to aggregate: Selective match kernels for image search / Image search with selective match kernels: aggregation across single and multiple images	ICCV / IJCV	Official: matlab, from DELF (tensorflow)
[CVPR13]	All about VLAD	PDF	-
[ECCV10]	Improving the fisher kernel for large-scale image classification	PDF	-
[CVPR07]	Object retrieval with large vocabularies and fast spatial matching	PDF	-
[CVPR06]	Fisher kernels on visual vocabularies for image categorization	PDF	-

Deep learning

A similar idea, but using deep learning to adapt the classical algorithms.

Year	Paper	link	Code
[ECCV16]	CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples.	PDF	-
[CVPR16]	NetVLAD: CNN architecture for weakly supervised place recognition	Page	Github

2. Real-valued descriptor

One single representation from the image.

Year	Paper	link	Code
[ECCV20]	Learning and aggregating deep local descriptors for instance-level recognition	arXiv	github
[ECCV20]	Predicting Visual Overlap of Images Through Interpretable Non-Metric Box Embeddings	arXiv	github
[ECCV20]	Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval	arXiv	github
[ECCV20]	SOLAR: Second-Order Loss and Attention for Image Retrieval	arXiv	-
[ECCV20]	Unifying Deep Local and Global Features for Efficient Image Search	arXiv	-
[arXiv19]	ACTNET: end-to-end learning of feature activations and multi-stream aggregation for effective instance image retrieval	arXiv	-
[TIP19]	REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval	arXiv	-
[ICCV19]	Learning with Average Precision: Training Image Retrieval with a Listwise Loss	arXiv	Github
[CVPR19]	Detect-to-Retrieve: Efficient Regional Aggregation for Image Search	PDF	Github
[TPAMI18]	Fine-tuning CNN Image Retrieval with No Human Annotation	arXiv	Github
[IJCV17]	End-to-end Learning of Deep Visual Representations for Image Retrieval	arXiv	Github
[ICCV17]	Large-Scale Image Retrieval with Attentive Deep Local Features	-	Github
[ECCV16]	CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples	arXiv	Github

3. Binary descriptor and quantization

For more compact representation, a binary descriptor can be generated from hashing or thresholding. Quantization is also very popular in large-scale image retrieval.
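The thresholding route can be sketched in a few lines: each dimension of a real-valued descriptor becomes one bit, and images are then compared by Hamming distance. The zero threshold, the function names, and the toy database are assumptions for this example. Quantization methods such as product quantization work differently and are not shown.

```python
def binarize(descriptor, threshold=0.0):
    """Pack a real-valued descriptor into an int bit code, one bit per dimension."""
    code = 0
    for value in descriptor:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(code_a, code_b):
    """Number of differing bits between two codes."""
    return bin(code_a ^ code_b).count("1")


# Hypothetical 4-D global descriptors.
query = [0.3, -0.2, 0.8, -0.5]
database = {
    "img_a": [0.1, -0.4, 0.6, -0.1],
    "img_b": [-0.3, 0.2, 0.7, -0.9],
    "img_c": [-0.5, 0.9, -0.2, 0.4],
}

query_code = binarize(query)
codes = {name: binarize(vec) for name, vec in database.items()}
# Rank database images by increasing Hamming distance to the query.
ranked = sorted(codes, key=lambda name: hamming(query_code, codes[name]))
```

Storing one bit per dimension shrinks the index considerably compared with floats, and the XOR-and-count comparison is cheap, at the cost of some retrieval accuracy.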

Year	Paper	link	Code
[ICCVW19]	DAME WEB: DynAmic MEan with Whitening Ensemble Binarization for Landmark Retrieval without Human Annotation	PDF	Github
[CVPR19]	FastAP: Deep Metric Learning to Rank	PDF	Github
[CVPR18]	Hashing as Tie-Aware Learning to Rank	PDF	Github
[AAAI18]	Deep Region Hashing for Generic Instance Search from Image	-	-
[TPAMI18]	Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks	-	-
[TPAMI13]	Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval	PDF	-
[TPAMI10]	Product quantization for nearest neighbor search	PDF	-

4. Pre-processing/Post-processing

Anything that can boost performance in the pre- or post-processing stage, such as rectification, re-ranking, or query expansion.
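As one concrete post-processing example, the sketch below implements a simple form of average query expansion: the query is re-issued as the mean of itself and its top-ranked results. The function names, `top_k=2`, and the toy descriptors are assumptions for illustration. Practical systems often verify the top results geometrically before averaging them.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank(query, database):
    """Return database indices sorted by decreasing cosine similarity."""
    q = normalize(query)
    scores = [dot(q, normalize(d)) for d in database]
    return sorted(range(len(database)), key=lambda i: -scores[i])


def average_query_expansion(query, database, top_k=2):
    """Re-issue the query as the mean of itself and its top_k results."""
    first_pass = rank(query, database)[:top_k]
    pool = [normalize(query)] + [normalize(database[i]) for i in first_pass]
    expanded = [sum(col) / len(pool) for col in zip(*pool)]
    return rank(expanded, database)


# Hypothetical 3-D global descriptors.
query = [1.0, 0.0, 0.0]
database = [
    [0.9, 0.4, 0.0],
    [0.8, 0.6, 0.0],
    [0.5, 0.0, 0.85],
    [0.45, 0.9, 0.0],
]
initial = rank(query, database)
expanded = average_query_expansion(query, database)
```

In this toy data the expanded query drifts toward the first two results, so the fourth image, which resembles them, overtakes the third after expansion.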

Year	Paper	link	Code
[arXiv20]	Image Stylization for Robust Features	arXiv	-
[ECCV20]	Single-Image Depth Prediction Makes Feature Matching Easier	arXiv	github
[CVPR19]	Local features and visual words emerge in activations	PDF	-
[CVPR12]	Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking	PDF	-

5. 3D point cloud

Year	Paper	link	Code
[CVPR18]	PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition	arXiv	Github

Multi-tasking local and global descriptors

Some works cover both local descriptors and global retrieval, since the two share similar activations and applications.

Year	Paper	link	Code
[arXiv20]	UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision	arXiv	-
[CVPR19]	ContextDesc: Local Descriptor Augmentation with Cross-Modality Context	-	Github
[CVPR19]	From Coarse to Fine: Robust Hierarchical Localization at Large Scale with HF-Net	arXiv	Github
[ICCV17]	Large-Scale Image Retrieval with Attentive Deep Local Features (DELF)	-	Github

Review type paper

Year	Paper	link	Code
[arXiv18]	From handcrafted to deep local features	arXiv	-
[CVPR17]	Comparative Evaluation of Hand-Crafted and Learned Local Features	PDF	-

Metric learning

Year	Paper	link	Code
[arXiv20]	Metric learning: cross-entropy vs. pairwise losses	arXiv	-
[arXiv19]	A Metric Learning Reality Check	arXiv	-

SfM

Year	Paper	link	Code
[arXiv20]	Reducing Drift in Structure from Motion using Extended Features	arXiv	-

MVS

Year	Paper	link	Code
[CVPR20]	Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement	arXiv	github
[CVPR20]	BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks	arXiv	github

View Synthesis/Novel view/Image completion

Year	Paper	link	Code
[ECCV20]	Flow-edge Guided Video Completion	arXiv	link
[arXiv20]	Reference Pose Generation for Visual Localization via Learned Features and View Synthesis	arXiv	-
[CVPR20]	BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks	arXiv	github

Segmentation localization

Year	Paper	link	Code
[ICCV19]	Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization	arXiv	github

Benchmarks

Local matching

Year	Paper	link	Code	Note
[arXiv20]	Image Matching across Wide Baselines: From Paper to Practice	arXiv	github
[CVPR17]	HPatches: A benchmark and evaluation of handcrafted and learned local descriptors	arXiv	Github	Hpatches
[TPAMI11]	Discriminative learning of local image descriptors	Page	-	UBC/Brown dataset (subsets: Liberty (New York), Notre Dame (Paris) and Half Dome (Yosemite))
[CVPR08]	On Benchmarking Camera Calibration and MultiView Stereo for High Resolution Imagery

Global retrieval

Year	Paper	link	Code	Note
[CVPR18]	Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking	Page	Github	ROxford5k, RParis6k
[CVPR07]	Object retrieval with large vocabularies and fast spatial matching	Page	-	Oxford5k
[CVPR08]	Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases	Page	-	Paris6k

Localization (both local matching and global retrieval)

Year	Paper	link	Code	Note
[ECCV20]	Map-based Localization for Autonomous Driving	web	github1, github2	-
[CVPR18]	Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions	PDF,Page	Github	Aachen-day-night, Robotcar, CMU-seasons

Toolbox

Year	Paper	link
[2020]	Kapture	github
[2020]	hloc - the hierarchical localization toolbox	github
[2020]	pyslamv2	github