# improved-precision-and-recall-metric-pytorch

Improved Precision and Recall Metric for Assessing Generative Models - Unofficial PyTorch Implementation
## Usage
- Given two directories containing real and fake images:

  ```bash
  python improved_precision_recall.py [path_real] [path_fake]
  ```
- To pre-compute the real manifold and save it to a file:

  ```bash
  python improved_precision_recall.py [path_real] [dummy_str] --fname_precalc [filename_dest]
  ```
- If the images are already in memory:

  ```python
  ipr = IPR(args.batch_size, args.k, args.num_samples)
  ipr.compute_manifold_ref(args.path_real)  # either a directory or a pre-computed manifold file
  metric = ipr.precision_and_recall(images)
  print('precision =', metric.precision)
  print('recall =', metric.recall)
  ```
- Realism score (see the consolidated sketch after this list):

  ```python
  realism_score = ipr.realism(image_in_tensor)
  ```
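For reference, here is a minimal end-to-end sketch tying the pieces above together. The module name `improved_precision_recall`, the hyperparameter values, and the directory `./real_images` are placeholders for illustration; check the script itself for the exact tensor shape and value range that `precision_and_recall` and `realism` expect.

```python
import torch
from improved_precision_recall import IPR  # assumed module name

# hypothetical hyperparameters: batch size 50, k = 3 nearest neighbors, 10000 samples
ipr = IPR(50, 3, 10000)
ipr.compute_manifold_ref('./real_images')  # a directory of real images, or a pre-computed manifold file

# stand-in for generator output; pass actual image tensors in practice
fake_images = torch.rand(100, 3, 256, 256)

metric = ipr.precision_and_recall(fake_images)
print('precision =', metric.precision)
print('recall =', metric.recall)
print('realism =', ipr.realism(fake_images[0]))  # realism score of a single image
```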
## Discussions
### Corner case
- For A = {999 samples from uniform(0,1)} + {2} and B = {999 samples from uniform(2,3)} + {1}, precision = 1 and recall = 1.
- Outliers can be handled by estimating the quality of individual samples and pruning them out; the sketch below reproduces this corner case.
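The corner case is easy to reproduce with a standalone toy re-implementation of the paper's k-NN manifold estimate (a minimal sketch, not this repo's actual code path): each set's manifold is the union of balls centered at its points, with each radius equal to the distance to that point's k-th nearest neighbor; precision is the fraction of B inside A's manifold, and recall the fraction of A inside B's.

```python
import torch

def knn_radii(feats, k=3):
    # distance from each point to its k-th nearest neighbor within the set
    # (k + 1 because each point's nearest neighbor is itself, at distance 0)
    d = torch.cdist(feats, feats)
    return d.kthvalue(k + 1, dim=1).values

def coverage(subject, ref, ref_radii):
    # fraction of subject points that fall inside at least one reference ball
    d = torch.cdist(subject, ref)  # (n_subject, n_ref)
    return (d <= ref_radii.unsqueeze(0)).any(dim=1).float().mean().item()

A = torch.cat([torch.rand(999, 1), torch.tensor([[2.0]])])        # uniform(0,1) plus outlier at 2
B = torch.cat([torch.rand(999, 1) + 2.0, torch.tensor([[1.0]])])  # uniform(2,3) plus outlier at 1

# A's outlier at 2 gets a ball of radius ~1 (its nearest neighbors sit near 1),
# which covers all of B; symmetrically, B's outlier at 1 covers all of A
print('precision =', coverage(B, A, knn_radii(A)))  # ~1
print('recall    =', coverage(A, B, knn_radii(B)))  # ~1
```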
### Number of samples
- For A = 1000 real images from celeba_hq and B = 4 images among A, precision = 1 and recall = 0.638. Wow, 4 images cover 64% of 1000 images!
- The manifold estimate becomes inaccurate when the number of samples is small; the snippet below shows the same effect on random features.
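Reusing `knn_radii` and `coverage` from the corner-case sketch above, the small-sample inflation shows up even on random stand-in features (an illustrative assumption, not actual celeba_hq features): with only 4 points and k = 3, each ball's radius is the distance to the farthest of the other three points, so the four balls cover a large part of the reference set.

```python
A = torch.rand(1000, 64)         # stand-in for features of 1000 real images
B = A[torch.randperm(1000)[:4]]  # only 4 samples, all drawn from A

print('precision =', coverage(B, A, knn_radii(A)))  # exactly 1.0: B is a subset of A
print('recall    =', coverage(A, B, knn_radii(B)))  # far above the 0.4% of A that B contains
```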
### Not getting close to 1 given two sets of real images
- For A = 1000 real images from celeba_hq and B = another 1000 real images from the same dataset, precision = 0.639 and recall = 0.661. They are not close to 1 even though both sets are sampled from the same distribution (= dataset).
- This happens because image data is, in general, extremely sparse; the toy experiment below shows the same behavior.
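The same toy helpers illustrate the sparsity effect: two independent samples from the identical distribution still fail to fully cover each other's manifolds once the dimension is moderately high, because each k-NN ball covers only a thin neighborhood of a sparse sample (random Gaussian features are an illustrative assumption, not real embeddings; the exact values depend on dimension and sample size).

```python
X = torch.randn(1000, 64)  # two independent samples from the same distribution
Y = torch.randn(1000, 64)

print('precision =', coverage(Y, X, knn_radii(X)))  # below 1
print('recall    =', coverage(X, Y, knn_radii(Y)))  # below 1
```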
We thank Tuomas for the enjoyable discussion.
## Link to official repo

https://github.com/kynkaat/improved-precision-and-recall-metric