<div align="center"> <img src="https://github.com/spaceml-org/Image-Similarity-Search/blob/master/icons/image-sim-complex-header.jpg" > <p align="center"> Published by <a href="http://spaceml.org/">SpaceML</a> • <a href="https://arxiv.org/abs/2012.10610">About SpaceML</a> </p> </div>
What is Image Similarity Search?
Similarity search applies operations such as nearest-neighbour search to the latent embeddings of a dataset.
Image Similarity Search is an app that performs fast image retrieval with PyTorch models, to help interpret their embedding space. It takes a pre-trained PyTorch model, a dataset, and a query image, and uses the model to retrieve the examples in the dataset that are most similar to the query image.
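To make the idea concrete, here is a minimal pure-Python sketch of brute-force nearest-neighbour search over embeddings. It is an illustration only, not the app's implementation (the app relies on FAISS); the function name `nearest_neighbours` and the toy 2-D embeddings are hypothetical.

```python
import heapq
import math


def nearest_neighbours(query, embeddings, k):
    """Return the indices of the k embeddings closest to query (Euclidean distance)."""
    distances = ((math.dist(query, vec), idx) for idx, vec in enumerate(embeddings))
    return [idx for _, idx in heapq.nsmallest(k, distances)]


# Hypothetical 2-D embeddings; a real model would produce much larger vectors.
embeddings = [(0.0, 0.0), (1.0, 1.0), (0.9, 1.2), (5.0, 5.0)]
neighbours = nearest_neighbours((1.0, 1.0), embeddings, k=2)
print(neighbours)
```

Brute-force search compares the query against every embedding, which is why a dedicated index library becomes useful as the dataset grows.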
How does it work?
<a href="https://ibb.co/MNT50qW"><img src="https://i.ibb.co/tCfPryk/Screenshot-2021-09-01-at-8-23-35-PM.png" alt="Screenshot-2021-09-01-at-8-23-35-PM" border="0"></a>
That's it really. The two files provided contain functions that:
- Generate embeddings from your model for your dataset.
- Index your image embeddings in a FAISS index file.
- Perform similarity search to retrieve the N closest images to your query.
- Display the nearest images; the app lets you query the same index file as many times as you like.
Usage
The app was built with Streamlit, and it can be run locally by launching a Streamlit server from the repository directory. If you do not have Streamlit installed, follow the steps under the Dependencies section before you get started.
Dependencies
Install the necessary packages from requirements.txt using
pip install -r requirements.txt
The app uses Facebook AI's FAISS package to perform similarity search. Install it by following the FAISS installation instructions for your hardware.
The app is supported on both CPU and CUDA-enabled devices.
<a href="https://ibb.co/Ct0Wxrz"><img src="https://i.ibb.co/PYCtBSN/Screenshot-2021-09-24-at-12-43-18-PM.png" alt="Screenshot-2021-09-24-at-12-43-18-PM" border="0" width="100%"></a>
streamlit run app.py
Steps
(Ref. image above)
- Upload the model file in .pt or .pth format (ignore the default file limit). Note: State dicts are not supported due to the underlying class dependency.
- Enter the absolute path of the dataset to be indexed. Note: The dataset must be in PyTorch ImageFolder format.
- Enter the output embedding size of the model. E.g., the Global Average Pooling layer of a ResNet-50 outputs 2048-dimensional vectors.
- Enter the number of neighbours to be displayed for the given query image.
- Please wait while the index is generated. It is stored as index.bin in your working directory.
- Upload the query image and view similar images from your dataset as indexed by your model.
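Before entering the dataset path, it can help to confirm that the directory follows the ImageFolder layout, i.e. one sub-directory per class with the images inside. The sketch below uses only the standard library; the helper name `summarize_image_folder` and the extension list are assumptions for illustration, not part of the app or of torchvision.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust it to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def summarize_image_folder(root):
    """Map each class sub-directory of root to the number of image files it holds."""
    root = Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Demo on a throwaway directory with two hypothetical classes.
demo_root = Path(tempfile.mkdtemp())
for class_name, files in {"forest": ["a.jpg", "b.PNG"], "river": ["notes.txt"]}.items():
    (demo_root / class_name).mkdir()
    for name in files:
        (demo_root / class_name / name).touch()

counts = summarize_image_folder(demo_root)
print(counts)
```

An empty result, or a class with zero images, suggests the path points at the wrong level of the dataset directory.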
Samples
The repository contains a sample model trained on the UC Merced LandUse Dataset for quick demonstration of Image Similarity Search. Use the model under samples/uc_merced.pt and download the dataset using the commands
wget http://weegee.vision.ucmerced.edu/datasets/UCMerced_LandUse.zip
unzip -qq UCMerced_LandUse.zip
The dataset is downloaded and unzipped in your present working directory.
TODO
- Add FAISS Support
- Command Line Tool -> Streamlit App
- Allow uploading existing indices
- Enable Interactive TSNE plots
Pull requests are more than welcome!
Citation
If you find Image Similarity Search useful in your research, please consider citing the GitHub code for this tool:
@code{
title={Image Similarity Search},
url={https://github.com/spaceml-org/Image-Similarity-Search/},
year={2021}
}