Code for the multi-omics benchmark study by Herrmann et al.
This repo allows you to:
- download the (preprocessed) data used in the study
- reproduce the results (table, figures etc.) presented in the paper
- rerun the entire benchmark experiment
If you use the code or data, please cite:
Moritz Herrmann, Philipp Probst, Roman Hornung, Vindi Jurinovic, Anne-Laure Boulesteix, Large-scale benchmark study of survival prediction methods using multi-omics data, Briefings in Bioinformatics, Volume 22, Issue 3, May 2021, bbaa167, https://doi.org/10.1093/bib/bbaa167
To download the data:
- The preprocessed data (described in the study) is available via OpenML
- The OpenML dataset IDs can be found in data/datset_ids.txt or data/datset_ids.RData
- Note that each dataset had to be split into two or three parts in order to be uploaded to OpenML
- R users can use the code in R/bench_experiment.R (lines 44-81) to download the data directly (and convert it to mlr tasks)
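Because each dataset is stored on OpenML in several parts, the parts have to be joined again after download. The following is only a minimal sketch of that step, not the code from R/bench_experiment.R. It assumes that the OpenML R package is installed, that the parts of one dataset are column-wise splits with identical row order, and that no column is repeated across parts. The helper names fetch_openml_part and combine_parts are hypothetical.

```r
# Hypothetical helper: download one part of a dataset from OpenML by its ID.
# Uses getOMLDataSet() from the OpenML R package; the data frame is in $data.
fetch_openml_part <- function(id) {
  OpenML::getOMLDataSet(data.id = id)$data
}

# Hypothetical helper: fetch all parts of one dataset and join them.
# Assumption: the parts are column-wise splits of the same observations.
combine_parts <- function(ids, fetch = fetch_openml_part) {
  parts <- lapply(ids, fetch)
  n_rows <- vapply(parts, nrow, integer(1))
  # All parts must describe the same number of observations
  stopifnot(length(unique(n_rows)) == 1L)
  do.call(cbind.data.frame, parts)
}
```

Calling combine_parts() with the two or three IDs listed for one dataset in data/datset_ids.txt would then return a single data frame. If the parts turn out to be split differently, the cbind step needs to be adapted accordingly.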
To reproduce the results (in R):
- to reproduce only the tables, figures etc. displayed in the paper without rerunning the benchmark experiments, use reproduce_table-and-figures.Rmd
- to rerun the full experiments (this takes several days or weeks, depending on the available resources), use R/bench_experiment.R
  - see the instructions in R/packages.R!
  - make sure the required packages are installed
  - make sure to use the correct package versions via checkpoint
  - not all packages are covered by checkpoint; this is specifically relevant for mlr (see R/packages.R)!
- to merge the benchmark results, use R/merge_bmr_results.R
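Before starting a run that takes days, it can help to check up front which packages are missing. The following is a small sketch of such a check, not part of the repo. The function name missing_pkgs and the package names in the usage note are illustrative assumptions; the authoritative list and the required versions are in R/packages.R.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
missing_pkgs <- function(pkgs) {
  # requireNamespace() returns FALSE instead of failing if a package is absent
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}
```

For example, missing_pkgs(c("mlr", "checkpoint", "OpenML")) returns the subset of these three packages that still has to be installed. The check only tests for presence; it does not verify that the installed versions match those used in the study.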
Note that mlr has since been deprecated (https://github.com/mlr-org/mlr). Its successor is the new framework mlr3 (https://mlr3.mlr-org.com/).