VISTA: Vision Transformer enhanced by U-Net and Image Colorfulness Frame Filtration for Automatic Retail Checkout
Code for our CVPR 2022 Workshop paper, "VISTA: Vision Transformer enhanced by U-Net and Image Colorfulness Frame Filtration for Automatic Retail Checkout". The method achieved 3rd place in the AI City Challenge 2022 Track 4: Multi-Class Product Counting & Recognition for Automated Retail Checkout. See here for details.
[arXiv]
<p align="center"> <a href="#"><img src="./media/vista.png"/></a> <br /> <em> Figure 1. Illustration of the overall segmentation and classification pipeline </em> </p>
1. Specification of dependencies
This code requires Python 3.8.12 and PyTorch 1.8.2. Run `pip install -r requirements.txt` to install all the dependencies.
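Before training, it can help to confirm that the active environment matches the versions stated above. The following is a minimal sketch, not part of this repository: the function names `version_matches` and `check_environment` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. It ignores local build suffixes such as `+cu111` when comparing versions.

```python
import sys
from importlib import metadata

# Versions stated in this README.
EXPECTED_PYTHON = (3, 8, 12)
EXPECTED_TORCH = "1.8.2"


def version_matches(installed, expected):
    """Compare version strings, ignoring a local build suffix such as '+cu111'."""
    return installed.split("+")[0] == expected


def check_environment():
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    if sys.version_info[:3] != EXPECTED_PYTHON:
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(f"Python {found} found, README states 3.8.12")
    try:
        # Assumption: PyTorch is installed under the distribution name "torch".
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if not version_matches(torch_version, EXPECTED_TORCH):
            problems.append(
                f"PyTorch {torch_version} found, README states {EXPECTED_TORCH}"
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

A mismatch is reported as a warning rather than an error, since nearby patch versions may still work; the README only states the versions the code was developed with.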
2a. Segmentation training code
See `training/segmentation` for details.
2b. Classification training code
See `training/classification` for details.
3. Inference code
After steps 2a and 2b, make sure both the segmentation and classification models are present in the `test/models` directory. Then see `README.md` for details.
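The presence check can be scripted before running inference. This is a minimal sketch under stated assumptions: the helper names `list_model_files` and `models_ready` are hypothetical, and because this README does not name the model files, the check only counts regular files in `test/models` and expects at least two (one per model). Placeholder files such as `.gitkeep` would also be counted, so treat a passing result as a hint rather than a guarantee.

```python
from pathlib import Path


def list_model_files(models_dir="test/models"):
    """Return the sorted names of regular files found in models_dir."""
    directory = Path(models_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


def models_ready(models_dir="test/models", minimum=2):
    """True if at least `minimum` files exist (assumed: one file per model)."""
    return len(list_model_files(models_dir)) >= minimum


if __name__ == "__main__":
    files = list_model_files()
    print("Files in test/models:", files)
    if not models_ready():
        print("WARNING: expected segmentation and classification models in test/models")
```

Run it from the repository root so that the relative path `test/models` resolves correctly.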
4. Citation
A citation will be added here.
Acknowledgements
We thank the AICITY 22 organizers for making the data available for use. We also thank Giga Tech Ltd. for funding this work.