<p align="center"> <img src="https://raw.githubusercontent.com/BorgwardtLab/proteinshake/main/docs/images/logo_subtitle.png#gh-light-mode-only" width="60%"> </p> <p align="center"> <img src="https://raw.githubusercontent.com/BorgwardtLab/proteinshake/main/docs/images/logo_subtitle_dark.png#gh-dark-mode-only" width="60%"> </p> <div align="center">


</div> <p align="center">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://borgwardtlab.github.io/proteinshake/#quickstart">Quickstart</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://borgwardtlab.github.io/proteinshake">Website</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://proteinshake.readthedocs.io/en/latest/?badge=latest">Documentation</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://proceedings.neurips.cc/paper_files/paper/2023/file/b6167294ed3d6fc61e11e1592ce5cb77-Paper-Datasets_and_Benchmarks.pdf">Paper</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://proteinshake.readthedocs.io/en/latest/notes/contribution.html">Contribute</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://borgwardtlab.github.io/proteinshake/#leaderboard">Leaderboard</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://proteinshake.readthedocs.io/en/latest/notebooks/dataset.html">Tutorials</a> </p> <div align="center">

ProteinShake provides one-liner imports of large-scale, preprocessed protein structure datasets and tasks for various model types and frameworks.

We provide a collection of preprocessed and cleaned protein 3D structure datasets from RCSB and AlphaFoldDB, including annotations. Structures are easily converted to graphs, voxels, or point clouds and loaded natively into PyTorch, TensorFlow, NumPy, JAX, PyTorch Geometric, DGL and NetworkX. The task API enables standardized benchmarking across a variety of tasks at the protein and residue level.
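
To make the three representations concrete, here is a minimal, library-free sketch of what a point cloud, an epsilon-neighborhood graph, and a voxel grid could look like for a toy chain. It is an illustration only, not ProteinShake's implementation or API: the helper names `epsilon_graph` and `voxelize`, the toy coordinates, and the cutoff and voxel sizes are all assumptions made for this example.

```python
import math


def epsilon_graph(coords, eps):
    """Return edges (i, j) with i < j for point pairs closer than eps."""
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < eps:
                edges.append((i, j))
    return edges


def voxelize(coords, voxel_size):
    """Return the set of integer grid cells occupied by at least one point."""
    # Shift the grid so that the smallest coordinate on each axis maps to cell 0.
    origin = [min(c[k] for c in coords) for k in range(3)]
    return {
        tuple(int((c[k] - origin[k]) // voxel_size) for k in range(3))
        for c in coords
    }


# Hypothetical C-alpha coordinates (in angstroms) for a four-residue toy chain.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

point_cloud = toy_coords                          # point cloud: the raw coordinates
graph_edges = epsilon_graph(toy_coords, eps=8.0)  # graph: residues within 8 angstroms
occupied = voxelize(toy_coords, voxel_size=4.0)   # voxels: occupied 4-angstrom cells
```

In this sketch every residue is one point, one graph node, and one contribution to a voxel cell. A real pipeline would also carry per-residue annotations alongside the coordinates.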

Find more information on the <a href="https://borgwardtlab.github.io/proteinshake">Website</a> and the <a href="https://proteinshake.readthedocs.io/en/latest/?badge=latest">Documentation</a>, or check out the <a href="https://proteinshake.readthedocs.io/en/latest/notebooks/dataset.html">Tutorials</a>. The results of the paper and the baseline models can be found in the <a href="https://github.com/BorgwardtLab/ProteinShake_eval">Evaluation Repository</a>. If you would like to create your own release, see the <a href="https://github.com/BorgwardtLab/proteinshake_release">Release Repository</a>.

</br>

Installation:

<div align="center">
This is a pre-release version. There may be unannounced changes to the API and datasets.
We expect some bugs as well. Please open an issue if you find one.
</div> <div align="center">
COMING SOON: Customization and transforms.
Build your own representations, splitters, and preprocessors!
</div> <div align="center">
pip install proteinshake
</div> </br> <div align="center">

Code in this repository is licensed under BSD-3; the dataset files on Zenodo are licensed under CC-BY-4.0.

To build ProteinShake, we obtained and modified data from various sources. Please see the documentation of the respective dataset classes for a reference to the original data, license, and paper.

</div>