CPLLM: Clinical Prediction with Large Language Models
This repository contains the code and resources for the paper titled "CPLLM: Clinical Prediction with Large Language Models."
If you use CPLLM or find this repository useful for your research or work, please cite the paper:
@article{shoham2023cpllm,
title={CPLLM: Clinical Prediction with Large Language Models},
author={Shoham, Ofir Ben and Rappoport, Nadav},
journal={arXiv preprint arXiv:2309.11295},
year={2023}
}
Getting Started
To get started with CPLLM, follow these steps:
1. Install Conda Environment
Use the provided environment.yml file to create a Conda environment with the necessary dependencies. Run the following commands to create and activate the environment:
conda env create -f environment.yml
conda activate cpllm-env
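Before running the notebooks or the training script, it can help to confirm that the environment is actually active. The snippet below is a minimal sketch, not part of the repository: it reads the CONDA_DEFAULT_ENV variable that Conda sets on activation, and the helper name active_conda_env is an illustrative assumption.

```python
import os


def active_conda_env() -> str:
    """Return the name of the active Conda environment, or '' if none is active."""
    # Conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return os.environ.get("CONDA_DEFAULT_ENV", "")


# "cpllm-env" is the environment name used in the activation command above.
if active_conda_env() != "cpllm-env":
    print("Warning: cpllm-env does not appear to be active; run 'conda activate cpllm-env'.")
```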
2. Data Extraction
You can use the provided Jupyter notebooks to create the data required for fine-tuning the model. We have two notebooks for data extraction:
2.1) Data Extraction for Next Diagnosis Prediction
Use the medbert-fine-tuning-data-extraction-eicu_crd.ipynb notebook to extract data for next diagnosis prediction.
2.2) Data Extraction for Next Visit Diagnosis Prediction
Use the medbert-fine-tuning-data-extraction-mimic-iv.ipynb notebook to extract data for next visit diagnosis prediction.
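To make the shape of a next visit diagnosis prediction sample concrete, here is a rough sketch. It is not the notebooks' actual logic: it assumes each patient is a chronological list of visits, each visit is a list of diagnosis codes, and the label says whether a target code appears in the final visit. The function name build_next_visit_example and the codes in the example are hypothetical.

```python
from typing import List, Tuple


def build_next_visit_example(
    visits: List[List[str]], target_code: str
) -> Tuple[List[str], int]:
    """Split a patient's visits into (history codes, binary label).

    Assumption: the last visit is the outcome visit and all earlier
    visits form the history the model sees.
    """
    if len(visits) < 2:
        raise ValueError("need at least one history visit and one outcome visit")
    *history_visits, outcome_visit = visits
    # Flatten earlier visits into one chronological list of codes.
    history = [code for visit in history_visits for code in visit]
    # Label is 1 if the target diagnosis occurs in the outcome visit, else 0.
    label = int(target_code in outcome_visit)
    return history, label


# Hypothetical diagnosis codes, for illustration only.
patient = [["I10", "E11.9"], ["I10"], ["N18.3", "I10"]]
history, label = build_next_visit_example(patient, target_code="N18.3")
```

The real notebooks may filter, order, or encode diagnoses differently, so treat this only as a picture of the input/label split.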
3. Fine-Tuning
After extracting the required data, you can fine-tune the CPLLM model. Make sure to modify the configuration variables in cpllm.py to suit your specific use case.
To start training, run:
python cpllm.py