Approximate full Conformal Prediction

This repository contains the Python implementation of Approximating Full Conformal Prediction at Scale via Influence Functions.

Overview

Approximate full Conformal Prediction (ACP) outputs a prediction set that contains the true label with probability at least the level specified by the practitioner. On large datasets, ACP inherits the statistical power of the highly efficient full Conformal Prediction. The method works as a wrapper for any differentiable ML model.
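To make the idea of a prediction set concrete, here is a minimal sketch of how a conformal predictor turns nonconformity scores into a set of labels. It is a generic illustration, not the code in this repository: the function names `conformal_p_value` and `prediction_set` are hypothetical, and the scores are assumed to be precomputed (in full CP they come from refitting the model for each candidate label, which is the step ACP approximates).

```python
import numpy as np


def conformal_p_value(train_scores, candidate_score):
    """Share of scores (training ones plus the candidate's own) that are
    at least as large as the candidate's nonconformity score."""
    train_scores = np.asarray(train_scores, dtype=float)
    n = train_scores.size
    return (np.sum(train_scores >= candidate_score) + 1) / (n + 1)


def prediction_set(scores_per_label, epsilon):
    """Keep every candidate label whose p-value exceeds epsilon.

    scores_per_label maps label -> (train_scores, candidate_score); the
    training scores may differ per label, as they do in full CP.
    """
    return [
        label
        for label, (train_scores, candidate_score) in scores_per_label.items()
        if conformal_p_value(train_scores, candidate_score) > epsilon
    ]


if __name__ == "__main__":
    # Toy scores, made up for illustration only.
    toy = {
        0: ([0.1, 0.2, 0.3, 0.4], 0.25),  # label 0 conforms well
        1: ([0.1, 0.2, 0.3, 0.4], 0.90),  # label 1 looks unusual
    }
    print(prediction_set(toy, epsilon=0.25))
```

A smaller `epsilon` asks for higher coverage and therefore keeps more labels in the set.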

Contents

This repository is organized as follows. In the folder src/acp you can find the following modules:

The folder src/third_party/ contains additional third-party software.

Third-party software

We include the following third-party packages for comparison with ACP:

Usage

Requirements

For experiments.py and ACP_Tutorial.ipynb:

Installation

ACP is available as a standalone pip package. Install it by running the following command in a terminal:

pip install approx-cp

To use ACP with your own models, add the following import to your file:

from acp.wrapper import ACP_D, ACP_O  # deleted scheme (ACP_D) and ordinary scheme (ACP_O)

Alternatively, you can clone this repo by running:

git clone https://github.com/cambridge-mlg/acp
cd acp

Then install the ACP Python package in a conda environment of your choice:

conda create -n myenv python=3.9
conda activate myenv
pip install --upgrade pip
pip install -e .

Now, just include the import:

from acp.wrapper import ACP_D, ACP_O
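After either installation route, a quick way to confirm that the package is visible to your interpreter is to look it up without importing it. The snippet below is a small sketch that uses only the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "acp" is the top-level package used in the import statements above.
    if is_installed("acp"):
        print("acp is installed")
    else:
        print("acp was not found; check that the right environment is active")
```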

Constructing prediction sets with ACP

ACP works as a wrapper for any PyTorch model with .fit() and .predict() methods. Once you instantiate your model, you can generate tight prediction sets that contain the true label with a specified probability. Here is an example with synthetic data:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from acp.models import NeuralNetwork
from acp.wrapper import ACP_D

# Synthetic binary classification data: 1000 training and 100 test points.
X, Y = make_classification(n_samples=1100, n_features=10, n_classes=2,
                           n_clusters_per_class=1, n_informative=3, random_state=42)
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y, test_size=100, random_state=42)

# Model to be wrapped: two hidden layers with L2 regularization.
model = NeuralNetwork(input_size=10, num_neurons=[20, 10], out_size=2, seed=42, l2_reg=0.01)

# Wrap the model with the deleted scheme and build prediction sets.
# epsilon is the significance level (the tolerated error rate).
ACP = ACP_D(Xtrain, Ytrain, model, seed=42, verbose=True)
sets = ACP.predict(Xtest, epsilon=0.1, out_file="results/test")
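Once prediction sets are available, two standard summaries are the empirical coverage (how often the true label falls inside the set) and the average set size (smaller is tighter). The helper below is a sketch under an assumption: it expects a list with one iterable of labels per test point. The format returned by `ACP.predict` is not described here, so you may need to convert `sets` first. The name `coverage_and_size` is hypothetical.

```python
def coverage_and_size(pred_sets, y_true):
    """Return (empirical coverage, average set size).

    pred_sets: one iterable of labels per test point (assumed format).
    y_true:    the true label of each test point, in the same order.
    """
    pred_sets = [set(s) for s in pred_sets]
    y_true = list(y_true)
    if len(pred_sets) != len(y_true):
        raise ValueError("pred_sets and y_true must have the same length")
    if not y_true:
        raise ValueError("need at least one test point")

    n = len(y_true)
    hits = sum(y in s for s, y in zip(pred_sets, y_true))
    avg_size = sum(len(s) for s in pred_sets) / n
    return hits / n, avg_size


if __name__ == "__main__":
    # Toy sets, made up for illustration only.
    toy_sets = [[0], [0, 1], [1], []]
    toy_labels = [0, 1, 0, 1]
    print(coverage_and_size(toy_sets, toy_labels))
```

With `epsilon = 0.1`, a valid conformal predictor should reach coverage of roughly 0.9 or more on exchangeable test data, up to sampling noise.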

Tutorial Notebook

For a tutorial on how to use ACP and how to create the plots in the paper, see the notebook ACP_Tutorial.ipynb.

Experiments

To run experiments with ACP and the methods it is compared against, use python3 experiments.py <function> <dataset> <model>.

The first argument, <function>, specifies the CP function. It should be one of the following: full_CP, ACP_D, ordinary_full_CP, ACP_O, SCP, RAPS, APS, CV_plus, or JK_plus.

The second argument, <dataset>, specifies the dataset. It should be one of the following: synthetic, MNIST, US_Census, or CIFAR-10.

The third argument, <model>, specifies the model: Neural Network A, B, or C, LR, or CNN.

For all options, see python3 experiments.py --help:

usage: experiments.py [-h] [--reg REG] [--seed SEED] [--test TEST] [--dir DIR] 
                      [--embedding_size EMBEDDING_SIZE] [--validation_split VALIDATION_SPLIT] 
                      [--epsilon EPSILON] function dataset model

positional arguments:
  function              CP function to run (full_CP, ACP_D, ordinary_full_CP, ACP_O, SCP, RAPS, APS, CV_plus, JK_plus)
  dataset               dataset (synthetic, MNIST, US_Census, CIFAR-10)
  model                 Neural Network A, B, C, LR or CNN

optional arguments:
  -h, --help            show this help message and exit
  --reg REG             value l2 regularization term
  --seed SEED           initial seed
  --test TEST           test set size
  --dir DIR             output dir
  --embedding_size EMBEDDING_SIZE
                        embedding size for the autoencoder
  --validation_split VALIDATION_SPLIT
                        split for calibration set in SCP
  --epsilon EPSILON     value of epsilon for RAPS, APS, JK+ or CV+
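To compare several CP functions on the same dataset, it can help to generate the command lines programmatically. The sketch below only builds the argument lists from the positional arguments and flags shown in the help output; it does not run anything. The helper `build_commands` is hypothetical, and the model value is left as a placeholder because the exact tokens accepted for <model> are not listed here.

```python
import shlex

# CP function names as listed in the help output above.
CP_FUNCTIONS = (
    "full_CP", "ACP_D", "ordinary_full_CP", "ACP_O",
    "SCP", "RAPS", "APS", "CV_plus", "JK_plus",
)


def build_commands(dataset, model, functions=CP_FUNCTIONS, epsilon=None, seed=None):
    """Build one experiments.py command (as an argument list) per CP function."""
    commands = []
    for function in functions:
        cmd = ["python3", "experiments.py", function, dataset, model]
        if epsilon is not None:
            cmd += ["--epsilon", str(epsilon)]
        if seed is not None:
            cmd += ["--seed", str(seed)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Replace "<model>" with a model value accepted by experiments.py.
    for cmd in build_commands("synthetic", "<model>", functions=("ACP_D", "SCP"), seed=42):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run` once the placeholder is replaced with a real model value.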

Reference

J. Abad Martinez, U. Bhatt, A. Weller and G. Cherubin. Approximating Full Conformal Prediction at Scale via Influence Functions. Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI), 2023.

BibTeX:

@inproceedings{martinez2023approximating,
  title={Approximating Full Conformal Prediction at Scale via Influence Functions},
  author={Martinez, Javier Abad and Bhatt, Umang and Weller, Adrian and Cherubin, Giovanni},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={37},
  number={6},
  pages={6631--6639},
  year={2023}
}