Conformal prediction

Conformal prediction (CP) methods are applied to uncertainty quantification in distribution-free, data-agnostic settings, specifically to regression problems involving both exchangeable and time-series data. This work was carried out as part of an MSc thesis (UB, 2023).

The related media are available in the author's corresponding repository, which deploys a GitHub page where the thesis and the presentation are publicly accessible.

Exchangeable data

Toy problem

Before dealing with more complex datasets, a toy problem is proposed, following this Kaggle discussion.
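The toy problem from the discussion is not reproduced here. As a rough illustration of what split conformal prediction does on a small regression task with exchangeable data, the sketch below uses a synthetic 1-D linear dataset, a hand-written least-squares fit, and absolute residuals as conformity scores. The data, the helper names `fit_line` and `conformal_quantile`, and the choice `alpha = 0.1` are all assumptions made for illustration, and only the standard library is used so that it runs without extra dependencies.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on 1-D inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


# Synthetic data (an assumption, not the toy problem from the discussion).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(1500)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

# Disjoint training, calibration, and test splits.
x_train, y_train = xs[:500], ys[:500]
x_cal, y_cal = xs[500:1000], ys[500:1000]
x_test, y_test = xs[1000:], ys[1000:]

# Fit on the training split only.
a, b = fit_line(x_train, y_train)

# Conformity scores: absolute residuals on the held-out calibration split.
scores = [abs(y - (a * x + b)) for x, y in zip(x_cal, y_cal)]
alpha = 0.1
q_hat = conformal_quantile(scores, alpha)

# Prediction interval for a new x is [a*x + b - q_hat, a*x + b + q_hat].
covered = sum(abs(y - (a * x + b)) <= q_hat for x, y in zip(x_test, y_test))
coverage = covered / len(x_test)
print(f"half-width: {q_hat:.3f}, empirical coverage: {coverage:.3f}")
```

Under exchangeability, the empirical coverage on the test split should land close to the nominal `1 - alpha`, whatever the quality of the fitted model; a worse model only yields wider intervals.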

Regression problem

The dataset is the same as in MAPIE's CQR tutorial: the California Housing dataset built into scikit-learn.

It was chosen because it is simple and reproducible; in particular, no feature engineering is needed. It consists of 20,640 samples with the following 8 features: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup, Latitude, and Longitude.
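For context, the calibration step of conformalized quantile regression (CQR) can be sketched as follows. This is not the tutorial's code: the function name `cqr_intervals` and the numbers in the usage lines are hypothetical, and the lower and upper quantile predictions are assumed to come from any quantile regressor fitted on a separate training split. The score for each calibration point is `max(lower - y, y - upper)`, and the intervals are widened (or shrunk, when the quantile is negative) by the conformal quantile of these scores.

```python
import math


def cqr_intervals(cal_lower, cal_upper, cal_y, test_lower, test_upper, alpha):
    """Conformalize quantile-regression intervals (CQR calibration step).

    cal_* are predictions and targets on the calibration split; test_* are
    the raw quantile predictions to be adjusted.
    """
    # Positive score: y fell outside [lower, upper]; negative: y was inside.
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(cal_lower, cal_upper, cal_y)
    )
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points the interval is unbounded.
    q_hat = scores[rank - 1] if rank <= n else math.inf
    intervals = [(lo - q_hat, hi + q_hat) for lo, hi in zip(test_lower, test_upper)]
    return intervals, q_hat


# Hypothetical calibration predictions and targets, for illustration only.
cal_lower = [1.0, 2.0, 3.0, 4.0]
cal_upper = [2.0, 3.0, 4.0, 5.0]
cal_y = [1.5, 3.5, 2.75, 4.5]

intervals, q_hat = cqr_intervals(
    cal_lower, cal_upper, cal_y, test_lower=[2.0], test_upper=[3.0], alpha=0.25
)
print(q_hat, intervals)
```

Unlike a fixed-width residual interval, the adjusted interval keeps the input-dependent width of the underlying quantile predictions, which is the usual motivation for CQR on heteroscedastic data.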

Non-exchangeable data

Time series problem

The dataset is the same as in MAPIE's time series tutorial: the Victoria electricity demand dataset, used in the book “Forecasting: Principles and Practice” [1].

It contains a total of 1340 samples and poses an electricity demand forecasting problem that not only features daily and weekly seasonality but is also affected by temperature. Thus, apart from the electricity demand lagged up to 7 days (and other time features), temperature is used as an exogenous variable.
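The feature construction described above can be sketched as follows. This is a minimal illustration, not the thesis code: the helper `make_lagged_rows` is hypothetical, one observation per time step is assumed (so a lag of `k` means `k` steps back), the series are made up, and the other time features are omitted.

```python
def make_lagged_rows(demand, temperature, max_lag=7):
    """Build (features, target) pairs from aligned demand and temperature series.

    Each feature row holds demand at t-1, ..., t-max_lag followed by the
    temperature at time t; the target is the demand at time t.
    """
    if len(demand) != len(temperature):
        raise ValueError("demand and temperature must have the same length")
    rows, targets = [], []
    # The first max_lag steps are dropped because their lags are undefined.
    for t in range(max_lag, len(demand)):
        lags = [demand[t - k] for k in range(1, max_lag + 1)]
        rows.append(lags + [temperature[t]])
        targets.append(demand[t])
    return rows, targets


# Made-up series, for illustration only.
demand = [float(v) for v in range(10)]
temperature = [20.0 + v for v in range(10)]

X, y = make_lagged_rows(demand, temperature, max_lag=7)
print(len(X), X[0], y)
```

Because the rows are ordered in time, any train, calibration, and test split built on top of them should respect that order rather than shuffle, since the samples are not exchangeable.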

[1] Hyndman, R.J. and Athanasopoulos, G. Forecasting: Principles and Practice. OTexts, 2014. ISBN: 9780987507105.