CPLLM: Clinical Prediction with Large Language Models

This repository contains the code and resources for the paper titled "CPLLM: Clinical Prediction with Large Language Models."

If you use CPLLM or find this repository useful for your research or work, please cite the paper:

@article{shoham2023cpllm,
  title={CPLLM: Clinical Prediction with Large Language Models},
  author={Shoham, Ofir Ben and Rappoport, Nadav},
  journal={arXiv preprint arXiv:2309.11295},
  year={2023}
}

Getting Started

To get started with CPLLM, follow these steps:

1. Install Conda Environment

Use the provided environment.yml file to create a Conda environment with the necessary dependencies, then activate it:

conda env create -f environment.yml
conda activate cpllm-env
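
As an optional sanity check, the short Python sketch below tests whether the expected environment is active. It relies on the CONDA_DEFAULT_ENV variable that Conda sets on activation, and it assumes the environment is named cpllm-env as in the command above. The helper names are illustrative and are not part of this repository.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active Conda environment, or None if none is active."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


def is_cpllm_env_active(environ=None, expected="cpllm-env"):
    """Return True when the active environment matches the expected name.

    The default name "cpllm-env" is taken from the activate command above.
    """
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    if is_cpllm_env_active():
        print("cpllm-env is active")
    else:
        print("cpllm-env is not active; run: conda activate cpllm-env")
```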

2. Data Extraction

Use the provided Jupyter notebooks to create the data required for fine-tuning the model. There are two notebooks, one per prediction task:

2.1) Data Extraction for Next Diagnosis Prediction

Use the medbert-fine-tuning-data-extraction-eicu_crd.ipynb notebook to extract data for next diagnosis prediction.

2.2) Data Extraction for Next Visit Diagnosis Prediction

Use the medbert-fine-tuning-data-extraction-mimic-iv.ipynb notebook to extract data for next visit diagnosis prediction.
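
If you prefer to run the notebooks from a script rather than interactively, one option is jupyter nbconvert with its --execute flag. The sketch below is an assumption about workflow, not something this repository provides: it assumes jupyter and nbconvert are installed in the active environment, that the notebooks sit in the current directory, and that any dataset paths inside them have already been set. The helper names and the "-executed" output suffix are hypothetical.

```python
import subprocess
from pathlib import Path

# Notebook file names as given in the steps above.
NOTEBOOKS = [
    "medbert-fine-tuning-data-extraction-eicu_crd.ipynb",
    "medbert-fine-tuning-data-extraction-mimic-iv.ipynb",
]


def nbconvert_command(notebook):
    """Build the nbconvert command that executes a notebook and saves a copy."""
    # Hypothetical naming choice: keep the original notebook untouched.
    output_name = Path(notebook).stem + "-executed.ipynb"
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output_name,
    ]


def run_notebook(notebook):
    """Execute one notebook; raises CalledProcessError if execution fails."""
    subprocess.run(nbconvert_command(notebook), check=True)


if __name__ == "__main__":
    for name in NOTEBOOKS:
        run_notebook(name)
```

Writing the executed copy to a separate file leaves the original notebooks unchanged, which makes it easier to rerun extraction after adjusting paths.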

3. Fine-Tuning

After extracting the required data, you can fine-tune the CPLLM model. Make sure to modify the configuration variables in the cpllm.py code to suit your specific use case.

Then start training:

python cpllm.py