Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

Michael Kirchhof, Enkelejda Kasneci, Seong Joon Oh


Contrastively trained encoders have recently been proven to invert the data-generating process: they encode each input, e.g., an image, into the true latent vector that generated the image (Zimmermann et al., 2021). However, real-world observations often have inherent ambiguities. For instance, images may be blurred or only show a 2D view of a 3D object, so multiple latents could have generated them. This makes the true posterior for the latent vector probabilistic with heteroscedastic uncertainty. In this setup, we extend the common InfoNCE objective and encoders to predict latent distributions instead of points. We prove that these distributions recover the correct posteriors of the data-generating process, including its level of aleatoric uncertainty, up to a rotation of the latent space. In addition to providing calibrated uncertainty estimates, these posteriors allow the computation of credible intervals in image retrieval. They comprise images with the same latent as a given query, subject to its uncertainty.

Link: https://arxiv.org/abs/2302.02865


Installation

This code was tested on Python 3.8. Use the commands below to create a suitable conda environment.

conda create --name probcontrlearning python=3.8
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install tqdm scipy matplotlib argparse
pip install wandb tueplots
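
Optionally, you can verify that PyTorch was installed with working GPU support before running any experiments (this quick check is not part of the repository itself):

# Optional sanity check: confirm the installed PyTorch build can see a CUDA GPU.
import torch

print(torch.__version__)
print(torch.cuda.is_available())  # should print True on a CUDA-capable machine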

If you want to run the experiments on CIFAR-10H, you need to download the pretrained ResNet18 and the CIFAR-10H labels. Download the ResNet18 weights, unzip the archive, and place the file at models/state_dicts/resnet18.pt. Then, download the CIFAR-10H labels and place them at data/cifar10h-probs.npy. The CIFAR-10 data itself is downloaded automatically.
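
To confirm the label file ended up in the right place, a small sketch like the following can help (it assumes the .npy file stores one row of human label probabilities per CIFAR-10 test image, which is our reading of the CIFAR-10H format):

# Quick sanity check (not part of the repository): load the CIFAR-10H soft labels.
import numpy as np

probs = np.load("data/cifar10h-probs.npy")
print(probs.shape)     # expected: (number of test images, 10)
print(probs[0].sum())  # each row should sum to (approximately) 1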


Reproducing Paper Results

The experiment_scripts folder contains shell scripts to reproduce all of our results. Here's an example:

python main.py --loss MCInfoNCE --g_dim_z 10 --g_dim_x 10 --e_dim_z 10 \
--g_pos_kappa 20 --g_post_kappa_min 16 --g_post_kappa_max 32 \
--n_phases 1 --n_batches_per_half_phase 50000 --bs 512 \
--l_n_samples 512 --n_neg 32 --use_wandb False --seed 4

The meaning of each of these flags, along with descriptions of all other parameters, is documented in parameters.py.


Applying MCInfoNCE to Your Own Problem

If you want to obtain probabilistic embeddings for your own contrastive learning problem, you need two things (a minimal code sketch follows after this list):

  1. Copy-paste the MCInfoNCE() loss from utils/losses.py into your project. The most important hyperparameters to tune are kappa_init and n_neg. We found 16 to be a solid starting value for both.

  2. Make your encoder output both

    1. mean (your typical penultimate-layer embedding, normalized to an $L_2$ norm of 1) and
    2. kappa (scalar value indicating the certainty).

    You can use an explicit network head to predict kappa, as in models/encoder.py, or implicitly parameterize it via the norm of your embedding, as in models/encoder_resnet.py. The latter has been confirmed to work plug-and-play with ResNet and VGG architectures. We'd be happy to learn whether it also works on yours.
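
As a starting point, here is a minimal sketch of such a probabilistic encoder head. The class and variable names are placeholders, and the MCInfoNCE() call at the end is only indicated in comments because its exact signature may differ; utils/losses.py is the authoritative reference.

# Minimal sketch (not the repository's exact code): a head that turns backbone
# features into an L2-normalized mean embedding and a positive certainty kappa.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbabilisticHead(nn.Module):
    """Placeholder head: predicts (mean, kappa) from backbone features."""
    def __init__(self, feat_dim: int, embed_dim: int):
        super().__init__()
        self.mean_layer = nn.Linear(feat_dim, embed_dim)
        self.kappa_layer = nn.Linear(feat_dim, 1)

    def forward(self, features: torch.Tensor):
        mean = F.normalize(self.mean_layer(features), dim=-1)  # unit-norm embedding
        kappa = F.softplus(self.kappa_layer(features))         # scalar certainty > 0
        return mean, kappa

# Hypothetical usage (check utils/losses.py for the actual MCInfoNCE signature):
#   loss_fn = MCInfoNCE(kappa_init=16, n_neg=16)
#   mean, kappa = head(backbone(images))
#   loss = loss_fn(mean, kappa, ...)

If you prefer the implicit variant from models/encoder_resnet.py, drop the kappa layer and derive kappa from the norm of the unnormalized embedding instead.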


How to Cite

@article{kirchhof2023probabilistic,
  author={Kirchhof, Michael and Kasneci, Enkelejda and Oh, Seong Joon},
  title={Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs},
  journal={arXiv preprint arXiv:2302.02865},
  year={2023}
}