This repository provides the code for reproducing all experiments in the paper "Online Missing Value Imputation and Change Point Detection with the Gaussian Copula". All synthetic and real-world datasets used in the paper are also provided.

Software preparation

For Python, run pip install gcimpute==0.0.1 to install the gcimpute package that implements all Gaussian copula algorithms.
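As a quick sanity check after installation, the pinned version can be confirmed from Python. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper names `installed_version` and `check_pin` are hypothetical and not part of gcimpute.

```python
from importlib import metadata

# Version pinned in the install command above.
PINNED_VERSION = "0.0.1"


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(package="gcimpute", expected=PINNED_VERSION):
    """Return a one-line status message for the pinned package (hypothetical helper)."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed; run: pip install {package}=={expected}"
    if found != expected:
        return f"{package} {found} is installed, but {expected} is expected"
    return f"{package} {expected} is installed"


print(check_pin())
```

If the message reports a different version, reinstalling with the pinned command is the safest route, since the experiment scripts were written against that release.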

For Matlab, add the current folder and all its subfolders to your working path. The online KFMC code and grouse code are slightly modified from the code provided by their authors.

The Bayesian online change point detection implementation requires the R package ocp.

Reproduction

To reproduce the results for offline synthetic experiments, use EM_sim_offline.py and grouse_KFMC_sim_offline.m.

To reproduce the results for online synthetic experiments, use EM_sim_online.py and grouse_KFMC_sim_offline.m.

To reproduce the results for offline real-data experiments, use EM_movielens.py and grouse_KFMC_movielens.m.

To reproduce the results for online real-data experiments, use EM_stocks.py and grouse_KFMC_stocks.m.
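The Python half of the mapping above can be collected into a small launcher. This is only a sketch: it assumes the scripts take no command-line arguments and are run from the repository root, neither of which is stated here, and the names `PYTHON_SCRIPTS`, `build_command`, and `run_experiment` are hypothetical. The Matlab scripts still need to be run from Matlab.

```python
import subprocess
import sys

# Python entry points named in this section, keyed by (setting, data kind).
PYTHON_SCRIPTS = {
    ("offline", "synthetic"): "EM_sim_offline.py",
    ("online", "synthetic"): "EM_sim_online.py",
    ("offline", "real"): "EM_movielens.py",
    ("online", "real"): "EM_stocks.py",
}


def build_command(setting, data):
    """Return the argv list that runs the script for one experiment."""
    script = PYTHON_SCRIPTS[(setting, data)]
    # Assumption: the script needs no arguments and sits in the working directory.
    return [sys.executable, script]


def run_experiment(setting, data, dry_run=True):
    """Run one experiment, or only return its command when dry_run is True."""
    cmd = build_command(setting, data)
    if not dry_run:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: list the commands without executing anything.
for key in PYTHON_SCRIPTS:
    print(" ".join(run_experiment(*key)))
```

Calling `run_experiment("online", "real", dry_run=False)` would then launch EM_stocks.py with the current interpreter.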

Datasets

All datasets used are saved in SimData and RealData. The simulation data can be regenerated by running python write_data.py in the terminal. The stocks data can be regenerated by running the R script get_stocks_data.R.
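Before running an experiment, it can help to confirm that both data directories are present. The sketch below assumes SimData and RealData sit directly under the repository root, which this section implies but does not state; `missing_dirs` is a hypothetical helper.

```python
from pathlib import Path

# Directory names taken from this section.
DATA_DIRS = ("SimData", "RealData")


def missing_dirs(root=".", names=DATA_DIRS):
    """Return the names in `names` that are not directories under `root`."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


missing = missing_dirs()
if missing:
    print("Missing data directories:", ", ".join(missing))
else:
    print("All data directories found.")
```

If SimData is reported missing, rerunning write_data.py as described above should recreate the simulation data.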

Results

All results are saved under the directory Results. Users can replicate these results by following the instructions in the Reproduction section. The notebook plots.ipynb reproduces the plots in the paper.