Warpformer
This is the official implementation of Warpformer: A Multi-scale Modeling Approach for Irregular Clinical Time Series.
UPDATE (2024/06/09)
If you ran this script before 2024/06/09 and were unable to reproduce the results from the paper on the PhysioNet dataset, please follow these steps:
- Remove the /path/to/datasets/physionet/ folder.
- Set --task in run.sh to physio.
- Re-run the run.sh script.
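The first step above can also be done from Python. The following is a minimal sketch, not part of this repository: the helper name remove_cached_dataset is hypothetical, and the datasets root is whatever you use in place of /path/to/datasets/.

```python
import shutil
from pathlib import Path


def remove_cached_dataset(datasets_root, name="physionet"):
    """Delete <datasets_root>/<name> if it exists.

    Hypothetical helper: returns True when a folder was removed and
    False when there was nothing to remove.
    """
    target = Path(datasets_root) / name
    if not target.is_dir():
        return False
    shutil.rmtree(target)  # removes the folder and everything inside it
    return True
```

Calling remove_cached_dataset("/path/to/datasets") before re-running run.sh should leave the PhysioNet data to be regenerated from scratch.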
Setup
Requirements
Your local system should have the following executables:
Create conda environment
All instructions below should be executed from a terminal.
- Clone this repository and run cd Warpformer.
- Create an environment named warpformer and activate it:
conda create -n warpformer python=3.7
conda activate warpformer
- Install the required Python modules using the file requirements.txt:
pip install -r requirement.txt
Tip: If psycopg2 fails to install, first run sudo apt-get install libpq-dev, then run pip install psycopg2.
Download Clinical Data & Task Building
Our benchmark includes a diverse set of clinical tasks covering different clinical scenarios, with 61 common physiology signals and 42 widely used interventions in intensive care units. The following table summarizes the statistics of these tasks.
Task (Abbr.) | Type | # Train | # Val. | # Test | Clinical Scenario |
---|---|---|---|---|---|
In-hospital Mortality (MOR) | BC | 39,449 | 4,939 | 4,970 | Early warning |
Decompensation (DEC) | BC | 249,045 | 31,896 | 30,220 | Outcome pred. |
Length Of Stay (LOS) | MC | 249,572 | 31,970 | 30,283 | Outcome pred. |
Next Timepoint Will Be Measured (WBM) | ML | 223,867 | 28,754 | 27,038 | Treatment recom. |
Clinical Intervention Prediction (CIP) | MC | 223,913 | 28,069 | 27,285 | Treatment recom. |
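As a quick sanity check on the table, the split proportions can be computed directly from the sample counts. This is only an illustrative sketch: the counts are copied from the table, while the names TASK_SPLITS and split_fractions are hypothetical and do not appear in the repository.

```python
# Sample counts copied from the task table: (train, val, test).
TASK_SPLITS = {
    "MOR": (39449, 4939, 4970),
    "DEC": (249045, 31896, 30220),
    "LOS": (249572, 31970, 30283),
    "WBM": (223867, 28754, 27038),
    "CIP": (223913, 28069, 27285),
}


def split_fractions(counts):
    """Return each split's share of the total number of samples."""
    total = sum(counts)
    return tuple(n / total for n in counts)


for task, counts in TASK_SPLITS.items():
    train, val, test = split_fractions(counts)
    print(f"{task}: train={train:.3f} val={val:.3f} test={test:.3f}")
```

Each task comes out at roughly an 80/10/10 train/validation/test split.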
I. Access to MIMIC-III data
- First, you need access to the MIMIC-III dataset, which can be requested via PhysioNet (https://physionet.org/content/mimiciii/1.4/). The database version we used is v1.4.
- Download the MIMIC-III Clinical Database and place it, as either .csv or .csv.gz files, somewhere on your local computer.
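Before running the preprocessing, it can help to confirm that the downloaded tables are present in one of the two accepted forms. The sketch below is an assumption-laden illustration: find_missing_tables is a hypothetical helper, and EXPECTED_TABLES is only an example subset of MIMIC-III table names; the tables the preprocessing scripts actually read may differ.

```python
from pathlib import Path

# Example subset of MIMIC-III table names (an assumption for illustration;
# check the preprocessing scripts for the tables they really need).
EXPECTED_TABLES = ("ADMISSIONS", "PATIENTS", "ICUSTAYS", "CHARTEVENTS")


def find_missing_tables(mimic_dir, tables=EXPECTED_TABLES):
    """Return the tables found neither as <name>.csv nor as <name>.csv.gz."""
    root = Path(mimic_dir)
    missing = []
    for table in tables:
        candidates = (root / f"{table}.csv", root / f"{table}.csv.gz")
        if not any(path.is_file() for path in candidates):
            missing.append(table)
    return missing
```

An empty list from find_missing_tables("/path/to/mimic-iii") means every listed table was found as either a .csv or a .csv.gz file.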
II. Generate datasets
- Modify the mimic_data_dir variable in the ./preprocess/preprocess_mimic_iii_large.py file to the path of the MIMIC-III folder, and run:
cd preprocess
python preprocess_mimic_iii_large.py
- (Optional) If you want to build the CIP task, you first need to obtain the file all_hourly_data.h5. Otherwise, you can skip this step.
  - Start by referring to the MIT-LCP repository to create a database where you can host the MIMIC-III data.
  - Next, generate the file all_hourly_data.h5 using MIMIC_Extract. Once generated, place the all_hourly_data.h5 file in the save_data_folder/tmp/ directory. Locate the interv_outPath variable in the split_data_preprocessing_large.py file and ensure that it points to the correct destination, i.e., the save_data_folder/tmp/ folder.
- Modify the data_root_folder and mimic_data_dir variables in the split_data_preprocessing_large.py file to the MIMIC-III folder path, and run the following command to generate the downstream tasks:
python split_data_preprocessing_large.py
Run the Model
Please refer to the script run.sh for specific configurations used to obtain the main experimental results in the paper. To reproduce the results, please uncomment the corresponding commands and run:
bash run.sh
Example:
python Main_warp.py --task {task_name} \
--data_path {path_to_data_folder} \
--log {log_path} \
--save_path {save_path} \
--epoch {epoch} \
--seed {seed} \
--lr 0.001 \
--batch {batch_size} \
--warp_num {warp_num} \
--dp_flag
- task: the downstream task name, selected from [mor, decom, cip, wbm, los, active, physio].
- seed: the seed for parameter initialization.
- warp_num: customize $L^{(n)}$ for each warp layer. The first layer is always 0, and layers are separated by '_', e.g., '0_12_6' (unnormalized version) or '0_0.2_1.2' (normalized version).
- dp_flag: whether to use DataParallel for training.
For more details, please refer to run.sh and Main_warp.py.
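To make the warp_num format concrete, here is a minimal sketch of how such a string could be split into one value per warp layer. It is not the parser used in Main_warp.py: parse_warp_num is a hypothetical name, and the sketch only checks the documented rule that the first layer is 0. It does not model how the unnormalized and normalized versions are interpreted by the model.

```python
def parse_warp_num(spec):
    """Split a warp_num string such as '0_12_6' into one number per layer.

    Hypothetical helper: integers stay integers and values containing a
    decimal point become floats.
    """
    parts = spec.split("_")
    if any(part == "" for part in parts):
        raise ValueError(f"empty layer entry in warp_num: {spec!r}")
    # int() and float() raise ValueError themselves on non-numeric entries.
    values = [float(part) if "." in part else int(part) for part in parts]
    if values[0] != 0:
        raise ValueError("the first warp layer is expected to be 0")
    return values


print(parse_warp_num("0_12_6"))     # [0, 12, 6]
print(parse_warp_num("0_0.2_1.2"))  # [0, 0.2, 1.2]
```

Validating the string up front like this gives a clear error for a malformed value such as '12_6' before any training starts.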
License
The original MIMIC-III database is hosted and maintained on PhysioNet under the PhysioNet Credentialed Health Data License 1.5.0, and is accessible at https://physionet.org/content/mimiciii/1.4/.
Our code in this repository is licensed under the MIT license.