Awesome

Simulator

<a target="_blank" href="https://colab.research.google.com/github/OpenBioLink/SimulateGPT/blob/main/SimulateGPT.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>

Computational simulation of biological processes can be a valuable tool in accelerating biomedical research, but usually requires a high level of domain knowledge and extensive manual adaptations. Recently, large language models (LLMs) such as GPT-4 have proven surprisingly successful in solving complex tasks across diverse fields by emulating human language generation at a very large scale. Here we explore the potential of leveraging LLMs as simulators of biological systems. We establish a proof of concept of a text-based simulator, SimulateGPT, that leverages LLM reasoning. We demonstrate good prediction performance across diverse biomedical use cases without explicit domain knowledge or manual tuning. Our results show that LLMs can be used as versatile and broadly applicable biological simulators.

graphical abstract

Citation

If you find our work useful in your research, please cite:

Moritz Schaefer*, Stephan Reichl*, Rob ter Horst*, Adele M. Nicolas, Thomas Krausgruber, Francesco Piras, Peter Stepper, Christoph Bock#, Matthias Samwald#. (2024). GPT-4 as a biomedical simulator. Computers in Biology and Medicine, 178, 108796. doi: 10.1016/j.compbiomed.2024.108796.

BioRxiv Preprint (2023)

Moritz Schaefer*, Stephan Reichl*, Rob ter Horst*, Adele M. Nicolas, Thomas Krausgruber, Francesco Piras, Peter Stepper, Christoph Bock#, Matthias Samwald#. (2023). Large language models are universal biomedical simulators. doi: 10.1101/2023.06.16.545235v1

Repository structure

Folders:

Experiments

Each experiment is kept in a separate folder containing:

Using Snakemake to run experiments

Run snakemake -c1 -k --config experiment_name=<your_experiment_name>. The -c1 flag uses one core, and -k continues with independent jobs if a job fails. To use the provided conda environment, add --use-conda.

The pipeline generates the files according to the schema indicated above.

Run all experiments

To run all experiments, call snakemake once per folder in experiments:

for dir in experiments/*/; do snakemake -c1 --config experiment_name="$(basename "$dir")"; done
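
The same loop can also be driven from Python, which makes it easier to add flags per run or to collect exit codes. The following is a minimal sketch and not part of the repository: the helper names (list_experiments, build_snakemake_command, run_all) are hypothetical, and it assumes that snakemake is on the PATH and that every non-hidden subfolder of experiments is one experiment.

```python
import subprocess
from pathlib import Path


def list_experiments(root="experiments"):
    """Return the names of all non-hidden experiment folders under `root`, sorted."""
    return sorted(
        p.name
        for p in Path(root).iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def build_snakemake_command(experiment_name, cores=1, keep_going=False, use_conda=False):
    """Assemble the snakemake call described above as an argument list."""
    cmd = ["snakemake", f"-c{cores}"]
    if keep_going:
        cmd.append("-k")  # continue with independent jobs if one job fails
    if use_conda:
        cmd.append("--use-conda")
    # --config accepts one or more KEY=VALUE pairs, so it is placed last
    cmd += ["--config", f"experiment_name={experiment_name}"]
    return cmd


def run_all(root="experiments", **options):
    """Run snakemake once per experiment folder and return {name: exit code}."""
    results = {}
    for name in list_experiments(root):
        completed = subprocess.run(build_snakemake_command(name, **options))
        results[name] = completed.returncode
    return results
```

Calling run_all(keep_going=True) from the repository root would then run each experiment in turn; a non-zero value in the returned dictionary marks an experiment whose pipeline did not finish cleanly.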

Code files

src/utils.py

The top-level utils file provides 'everything you need' to run your prompts in an automated fashion. The functions are simple, documented and reflect the defined repository structure.

We streamlined our API access using Snakemake.

Make sure to provide your private OpenAI API key as an argument (api_key), as an environment variable (OPENAI_API_KEY), or in the password store.
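
To illustrate how these three sources could be combined, here is a minimal sketch of a key lookup. It is not the code in src/utils.py: the function name resolve_api_key, the lookup order (argument, then environment variable, then password store), the use of the pass command-line tool as the password store, and the entry name openai/api_key are all assumptions made for this example.

```python
import os
import subprocess


def resolve_api_key(api_key=None, pass_entry="openai/api_key"):
    """Return an OpenAI API key from the first source that provides one."""
    # 1. Explicit argument wins
    if api_key:
        return api_key

    # 2. Environment variable
    env_key = os.environ.get("OPENAI_API_KEY")
    if env_key:
        return env_key

    # 3. Password store, assumed here to be the `pass` CLI.
    #    The entry name is hypothetical; adjust it to your own store layout.
    try:
        result = subprocess.run(
            ["pass", "show", pass_entry],
            capture_output=True,
            text=True,
            check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        result = None  # `pass` is not installed or the entry does not exist

    if result is not None:
        lines = result.stdout.splitlines()
        if lines and lines[0].strip():
            return lines[0].strip()  # `pass` keeps the secret on the first line

    raise RuntimeError(
        "No OpenAI API key found: pass api_key, set OPENAI_API_KEY, "
        "or add an entry to the password store."
    )
```

Failing with an explicit error when no source yields a key keeps a missing credential from surfacing later as an opaque authentication failure in the middle of a pipeline run.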

Notebook

The Simulator.ipynb notebook is configured to run in Google Colab, but it also works on a local installation.

Human/Input prompt guidelines