Home

Awesome

Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees

What is done in this repo

đź”— Requirements

Python 3.7+

OpenTURNS is a C++ library made, hence one can need to install gcc to be able to run the library

Ubuntu:

$ sudo apt update
$ sudo apt install build-essential

OSX:

$ brew install gcc

Windows: Install MinGW (a Windows distribution of gcc) or Microsoft’s Visual C

Install the required packages:

$ pip install -r requirements.txt
$ conda install -f environment.yml

đź›  Installation

Clone the repo and run the following command in the conformalized_gp directory to install the code

$ pip install .

⚡️ Quickstart

Here is a @quickstart to use the Jackknife+GP method on any regression dataset. Here, the goal is the compare visually the results given by the standard Jackknife+ method, the Credibility Intervals and our methodology. The notebook from which this quickstart is inspired can be found here

We first start to import the necessary packages

import matplotlib.pyplot as plt
import numpy as np
import scipy
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split

from mapie.conformity_scores.residual_conformity_scores import GPConformityScore
from mapie.regression import MapieRegressor

BLUE = np.array([[26, 54, 105]]) / 255
ORANGE = np.array([[223, 84, 49]]) / 255
YELLOW = np.array([[242, 188, 64]]) / 255

$$g(x) = 3x\sin(x) - 2x\cos(x) + \frac{x^3}{40} - \frac{x^2}{2} - 10x$$

def g(x):
    return (3 * x * np.sin(x) - 2 * x * np.cos(x) + ( x ** 3) / 40 - .5 * x ** 2 - 10 * x)

x_mesh = np.linspace(-40, 60, 5000)
plt.plot(x_mesh, g(x_mesh))
plt.xlabel("$x$")
plt.ylabel("$g(x)$")

toy function

X_train, X_test, y_train, y_test = train_test_split(x_mesh, g(x_mesh), test_size=.98, random_state=42)
X_train = X_train.reshape(-1, 1)
X_test = X_test.reshape(-1, 1)
gp = GaussianProcessRegressor(normalize_y=True)
gp.fit(X_train, y_train)
mapie_j_plus_gp = MapieRegressor(
    estimator=gp,
    cv=-1,
    method="plus",
    conformity_score=GPConformityScore(),
    model_has_std=True,
    random_state=42
)

mapie_j_plus = MapieRegressor(
    estimator=gp,
    cv=-1,
    method="plus",
    conformity_score=None,
    model_has_std=False,
    random_state=42
)


mapie_j_plus_gp.fit(X_train, y_train)
mapie_j_plus.fit(X_train, y_train)
ALPHA = .1

_, y_pss_j_plus_gp = mapie_j_plus_gp.predict(x_mesh.reshape(-1, 1), alpha=ALPHA)
_, y_pss_j_plus = mapie_j_plus.predict(x_mesh.reshape(-1, 1), alpha=ALPHA)

y_mean, y_std = gp.predict(x_mesh.reshape(-1, 1), return_std=True)

q_alpha_min = scipy.stats.norm.ppf(ALPHA / 2)
q_alpha_max = scipy.stats.norm.ppf(1 - ALPHA / 2)

f, ax = plt.subplots(1, 1, figsize=(20, 10))
ax.scatter(X_train, y_train, c=BLUE)


ax.plot(x_mesh, g(x_mesh), c=BLUE)
ax.plot(x_mesh, y_mean, c=YELLOW)
ax.fill_between(
        x_mesh,
        y_mean + y_std * q_alpha_min,
        y_mean + y_std * q_alpha_max,
        alpha=0.3,
        color=YELLOW,
        label=r"$\pm$ 1 std. dev.",
    )


ax.fill_between(
        x_mesh,
        y_pss_j_plus_gp[:, 0, 0],
        y_pss_j_plus_gp[:, 1, 0],
        alpha=.6,
        color=ORANGE,
        label=r"$\pm$ 1 std. dev.",
    )

ax.fill_between(
        x_mesh,
        y_pss_j_plus[:, 0, 0],
        y_pss_j_plus[:, 1, 0],
        alpha=.3,
        color="g",
        label=r"$\pm$ 1 std. dev.",
    )
ax.legend(
    [
        "Training Points",
        "True function", "Mean of posterior GP",
        "Posterior GP Credibility Interval",
        "Prediction Interval J+GP",
         "Prediction Interval J+", 
    ]
)
ax.set_xlabel("$x$")
ax.set_ylabel("$g(x)$")

toy function intervals

🔌 Plug OpenTURNS GP into MAPIE

If you wish to use our code with an OpenTURNS model, we have implemented a simple wrapper around the model so that it can be used very easily:

from wrappers import GpOTtoSklearnStd

nu = 5/2  # Hyperparameter of the Matérn Kernel
noise = None  # Standard deviation of the nugget effect. If None, no nugget effect is applied.
gp_estimator = GpOTtoSklearnStd(scale=1, amplitude=1, nu=nu, noise=None)

This estimator is now fully compatible with MAPIE as it comes with it .fit and .predict methods.