Creating a Forensic Database of Shoeprints from Online Shoe-Tread Photos

This is the official project page for our paper:

Creating a Forensic Database of Shoeprints from Online Shoe-Tread Photos

Samia Shafique, Bailey Kong, Shu Kong*, and Charless Fowlkes*

WACV 2023

project page, dataset, code, pdf, poster, video

Abstract

<p align="justify"> Shoe-tread impressions are one of the most common types of evidence left at crime scenes. However, the utility of such evidence is limited by the lack of databases of footwear prints that cover the large and growing number of distinct shoe models. Moreover, the database is preferred to contain the 3D shape, or depth, of shoe-tread photos so as to allow for extracting shoeprints to match a query (crime-scene) print. We propose to address this gap by leveraging shoe-tread photos collected by online retailers. The core challenge is to predict depth maps for these photos. As they do not have ground-truth 3D shapes allowing for training depth predictors, we exploit synthetic data that does. We develop a method, termed ShoeRinsics, that learns to predict depth from fully supervised synthetic data and unsupervised retail image data. In particular, we find domain adaptation and intrinsic image decomposition techniques effectively mitigate the synthetic-real domain gap and yield significantly better depth predictions. To validate our method, we introduce 2 validation sets consisting of shoe-tread image and print pairs and define a benchmarking protocol to quantify the quality of predicted depth. On this benchmark, ShoeRinsics outperforms existing methods of depth prediction and synthetic-to-real domain adaptation. </p>

Keywords: Shoeprints, Forensic Evidence, Depth Prediction, Intrinsic Decomposition, and Domain Adaptation

Overview

<p align="center"> <img src='git/figures/architecture.png' width='500'/> </p> <p align="justify"> Predicting depth for shoe-tread images (collected by online retailers) is the core challenge in constructing a shoeprint database for forensic use. We develop a method termed ShoeRinsics to learn depth predictors. The flowchart depicts how we train ShoeRinsics using annotated synthetic and un-annotated real images. We use domain adaptation (via image translators G<sub>S→R</sub> and G<sub>R→S</sub>) and intrinsic image decomposition (via decomposer F and renderer R) techniques to mitigate synthetic-real domain gaps. Our method achieves significantly better depth prediction on real shoe-tread images than the prior art. </p>

Datasets

<p align="center"> <img src='git/figures/all_dataset_samples.png' width='500'/> </p> <p align="justify"> We introduce 2 training datasets (<i>syn-train</i> and <i>real-train</i>) and 2 validation datasets (<i>real-val</i> and <i>real-FID-val</i>) in this work. The figure above shown example shoe-treads from each dataset. Note that to analyze the models’ robustness to novel shoe types, we constrain our training sets to contain only brand-new athletic shoes while letting real-val also include formal and used (worn) shoes. </p>

The details and download links of each dataset are as follows:

  1. <b>Syn-train</b>: Our synthetic dataset (<i>syn-train</i>) contains synthetic shoe-tread images and their intrinsic annotations (depth, albedo, normal, and lighting). We synthesize a shoe-tread image with a given depth map, an albedo map, and a lighting environment. We pass these to a physically-based rendering engine (Mitsuba) to generate the synthetic image. The final syn-train set contains 88,408 shoe-treads with paired ground-truth intrinsic images. Download

  2. <b>Real-train</b>: Online retailers use photos of shoes, including shoe-tread images, for advertising. <i>Real-train</i> consists of 3,543 such shoe-tread images and their masks (computed by a simple network that segments out the shoe-treads). This dataset has no ground-truth annotations and contains only new athletic shoes. Download

  3. <b>Real-val</b>: This dataset contains 36 sets of shoe-tread images, ground-truth shoeprints, and masks. We create it by collecting shoes, photographing their treads, and capturing their prints using the <i>block printing technique</i>. It covers three shoe categories: 22 new athletic, 6 new formal, and 8 used (worn) athletic shoes. Further details are provided in the README file. Download

  4. <b>Real-FID-val</b>: This dataset contains 41 sets of shoe-tread images, shoeprints, and masks. The shoeprints come from the FID300 dataset, while the shoe-tread images are downloaded separately from online retailers (i.e., they are disjoint from those in the real-train set). We match FID300 prints (used as the ground truth) to the downloaded shoe-tread images and align them manually. All shoe-treads in this set are new athletic shoes, with masks provided to segment out the treads. Download

You can view and download all the datasets together here. A sketch of how the validation sets might be loaded is given below.
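
The exact directory layout is described in each dataset's README. As an illustration only, the sketch below assumes a hypothetical layout with images/, prints/, and masks/ subfolders and shows how real-val (or real-FID-val) triplets might be read; adjust the paths to match the downloaded data.

```python
# Hypothetical loader for the real validation sets (image / print / mask triplets).
# Directory and file names are assumptions; see the dataset README for the actual layout.
from pathlib import Path
from PIL import Image


def load_real_val_triplets(root="../data/real_val"):
    """Yield (shoe-tread image, ground-truth print, mask) triplets as PIL images."""
    root = Path(root)
    for img_path in sorted((root / "images").glob("*.png")):   # assumed subfolder names
        print_path = root / "prints" / img_path.name
        mask_path = root / "masks" / img_path.name
        yield (
            Image.open(img_path).convert("RGB"),
            Image.open(print_path).convert("L"),
            Image.open(mask_path).convert("L"),
        )


# Usage example: inspect the first triplet.
for tread, gt_print, mask in load_real_val_triplets():
    print(tread.size, gt_print.size, mask.size)
    break
```
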

Pretrained Models

Our pretrained model is available for download here. We additionally provide pretrained versions of our supporting models (translator, renderer). All pretrained models can be downloaded together here.

Testing

Use the following command to generate predictions using our pretrained model and test them with our proposed metric:

python test.py --weights_decomposer=../models/decomposer_best_state.t7 --dataroot=../data/ --val_dataset_dir=real_val
<p align="justify"> Note that weights_decomposer should specify the path to the pretrained model. Dataroot should specify the path to the root directory which holds all datasets used in the experiments. Val_dataset_dir should name the directory for the validation dataset used (real_val or real_FID_val). </p>

Our predictions on real-val and real-FID-val are available here.
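
For intuition on why predicted depth is useful, a depth map can be reduced to a binary shoeprint by thresholding the masked depth values and compared against a ground-truth print. The toy snippet below illustrates this idea with an IoU score on random data; it is not the benchmarking protocol defined in the paper (see the paper and test.py for the actual metric), and the threshold and normalization are arbitrary choices for illustration.

```python
# Toy illustration: threshold a predicted depth map into a binary print and score
# it against a ground-truth print with IoU. NOT the paper's benchmarking protocol.
import numpy as np


def depth_to_print(depth, mask, threshold=0.5):
    """Treat pixels closer than the threshold (inside the mask) as contact points."""
    depth = (depth - depth.min()) / (np.ptp(depth) + 1e-8)   # normalize to [0, 1]
    return (depth < threshold) & (mask > 0)


def iou(pred_print, gt_print):
    inter = np.logical_and(pred_print, gt_print).sum()
    union = np.logical_or(pred_print, gt_print).sum()
    return inter / max(union, 1)


# Random stand-ins for a predicted depth map, mask, and ground-truth print.
rng = np.random.default_rng(0)
depth = rng.random((128, 64))
mask = np.ones((128, 64))
gt = rng.random((128, 64)) > 0.5
print(iou(depth_to_print(depth, mask), gt))
```
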

Training

<p align="justify"> We train our network in stages. We outline each step below. Note that paths should be set appropriately. Dataroot specifies the path to the root directory containing datasets. Syn_train_dataset_dir and real_train_dataset_dir are the names of the synthetic and real training dataset directories. Weights_translator, weights_renderer, and weights_decomposer should specify path to corresponding saved models. </p>
  1. Train the decomposer with synthetic data.
python train.py --dataroot=../data/ --syn_train_dataset_dir=syn_train --real_train_dataset_dir=real_train --train_net
  2. Train the renderer with synthetic data.
python train.py --dataroot=../data/ --syn_train_dataset_dir=syn_train --real_train_dataset_dir=real_train --train_renderer
  3. Train the translator using code from CycleGAN's official release. Our pretrained translator can be downloaded here.
  4. Finetune the renderer with translated synthetic data.
python train.py --dataroot=../data/ --weights_translator=../models/translator_best_state.t7 --weights_renderer=../models/renderer_best_state.t7 --syn_train_dataset_dir=syn_train --real_train_dataset_dir=real_train --train_renderer
  5. Finetune the decomposer using the full pipeline.
python train.py --weights_decomposer=../models/decomposer_best_state.t7 --dataroot=../data/ --weights_translator=../models/translator_best_state.t7 --weights_renderer=../models/renderer_best_state.t7 --syn_train_dataset_dir=syn_train --real_train_dataset_dir=real_train --train_discriminator --train_net

Reference

If you find our work useful in your research, please consider citing our paper:

@inproceedings{shafique2023creating,
  title={Creating a Forensic Database of Shoeprints From Online Shoe-Tread Photos},
  author={Shafique, Samia and Kong, Bailey and Kong, Shu and Fowlkes, Charless},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={858--868},
  year={2023}
}

Questions

Please feel free to email me at (sshafiqu [at] ics [dot] uci [dot] edu) if you have any questions.

Acknowledgements

This work was funded (or partially funded) by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreements 70NANB15H176 and 70NANB20H019 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, Duke University, University of California Irvine, University of Virginia, West Virginia University, University of Pennsylvania, Swarthmore College and University of Nebraska, Lincoln.