

Saluki: Predicting mRNA half-life from mRNA sequence

This repository is intended to accompany our publication, primarily to enhance the reproducibility of our results. For more information please refer to:

Agarwal V, Kelley D. The genetic and biochemical determinants of mRNA degradation rates in mammals. 2022. bioRxiv.

Each folder is organized as follows:

If you find our code or predictions to be helpful for your work, please cite the paper above.

Dependencies for running entire pipeline:

Instructions for use

For the R code to work properly, please copy the contents of the .Rprofile in this folder into your local .Rprofile.
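The copy step above can be done by hand; as a minimal sketch, it could also be scripted as follows. The function name `append_rprofile` and both path arguments are illustrative assumptions, not part of this repository. The sketch appends rather than overwrites, so any settings already in your local .Rprofile are kept.

```python
from pathlib import Path


def append_rprofile(repo_rprofile, user_rprofile):
    """Append the repository's .Rprofile to the user's .Rprofile, at most once.

    Returns True if text was appended, False if nothing needed to be written.
    Both arguments are hypothetical paths supplied by the caller.
    """
    repo_text = Path(repo_rprofile).read_text()
    if not repo_text.strip():
        return False  # nothing to copy

    user_path = Path(user_rprofile)
    existing = user_path.read_text() if user_path.exists() else ""

    # Skip the write if the same block was already copied earlier.
    if repo_text.strip() in existing:
        return False

    # Make sure the appended block starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with user_path.open("a") as handle:
        handle.write(separator + repo_text)
    return True
```

A typical call, assuming you run it from the base directory of the repository, would be `append_rprofile(".Rprofile", Path.home() / ".Rprofile")`.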

Users are advised to read the code closely and modify the commented sections as appropriate to obtain the desired output in their own environment. For example, you will need to install all of the additional R library and Python module dependencies for the code to work. That said, if you find that crucial files are missing, making the code unusable, or if you identify a major problem in the code, please raise a GitHub issue.

For each figure, change directories into its folder and read the file "runme.sh" first, as it provides a general overview of the commands that were run sequentially to pre-process the data and generate the figures.
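As a rough sketch of that workflow, the helpers below locate each "runme.sh" and return its first lines for reading before anything is executed. The function names, and the assumption that each figure folder sits directly under the repository root, are illustrative and may not match the actual layout.

```python
from pathlib import Path


def find_runme_scripts(repo_root):
    """Return every runme.sh found one level below repo_root, sorted by path.

    Assumes (hypothetically) that figure folders are direct children of repo_root.
    """
    return sorted(Path(repo_root).glob("*/runme.sh"))


def preview_script(script_path, max_lines=20):
    """Return up to max_lines lines of a script, without executing it."""
    with Path(script_path).open() as handle:
        lines = handle.read().splitlines()
    return lines[:max_lines]
```

For example, `for script in find_runme_scripts("."): print(script, *preview_script(script), sep="\n")` prints each script's location followed by its opening commands. The scripts are only read here, never run, since the commands are meant to be reviewed and adapted first.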

To train the Saluki model, please visit this link within the basenji repo.

OPTIONAL: For full functionality and to fix symbolic links, download the associated datapack and save the "datasets" folder in the base directory of this GitHub repository:


The figures will link to this folder accordingly. Some of the files need to be decompressed, and not all files are provided, in order to minimize the package size. If you need additional files that are not provided for the purpose of reproduction, please contact Vikram Agarwal (vagar {at} calicolabs {dot} com).
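The decompression step could be sketched as below. This assumes the compressed files are gzip archives ending in ".gz", which the text above does not confirm; other formats would need a different tool. The function name `decompress_gz_files` is illustrative.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz_files(datasets_dir):
    """Decompress every *.gz under datasets_dir and return the paths written.

    Assumption: the files are gzip-compressed. Originals are left in place,
    and any target that already exists is skipped rather than overwritten.
    """
    written = []
    for gz_path in sorted(Path(datasets_dir).rglob("*.gz")):
        target = gz_path.with_suffix("")  # drops only the trailing ".gz"
        if target.exists():
            continue
        with gzip.open(gz_path, "rb") as src, target.open("wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written
```

Calling `decompress_gz_files("datasets")` from the base directory would then expand the files next to their compressed copies. Keeping the originals is a deliberate choice here, so that the datapack stays intact if a script expects the compressed form.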