

<div align="center"> <p align="center"> <a href="https://aphp.github.io/eds-scikit/"> <img src="https://github.com/aphp/eds-scikit/raw/main/docs/_static/scikit_logo_text.png" width="30%" onerror="this.style.display='none'"> </a> </p>

<p align="center">


</p> </div>

eds-scikit is a tool to assist data scientists working on AP-HP's Clinical Data Warehouse. It specifically targets OMOP-standardized data. Its main goals are to:


This library is developed and maintained by the core team of AP-HP’s Clinical Data Warehouse (EDS) with the strong support of Inria's SODA team.

How to use

Please check the online documentation for more information. You will find


eds-scikit stands on the shoulders of Spark 2.4, which requires:


You can install eds-scikit via pip:

pip install "eds-scikit[aphp]"

:warning: If you get an error during installation, please try downgrading pip via `pip install -U "pip<23"` before installing eds-scikit.

:warning: If you don't work in AP-HP's ecosystem (EDS), please install via:

pip install eds-scikit

You can now import the library via:

import eds_scikit
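If the import fails, it can help to first check whether the package is visible in the current environment. The snippet below is a minimal sketch that uses only the Python standard library; the helper name `is_importable` is hypothetical and not part of eds-scikit.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for illustration; not part of eds-scikit.
    """
    return importlib.util.find_spec(module_name) is not None


if is_importable("eds_scikit"):
    print("eds_scikit is available in this environment")
else:
    # Same install command as given above for use outside AP-HP's ecosystem.
    print("eds_scikit was not found; try: pip install eds-scikit")
```

Note that the module name uses an underscore (`eds_scikit`) while the package name on PyPI uses a hyphen (`eds-scikit`).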


Please check our contributing guidelines.


If you use eds-scikit, please cite us as below.

    author = {Petit-Jean, Thomas and Remaki, Adam and Maladière, Vincent and Varoquaux, Gaël and Bey, Romain},
    doi = {10.5281/zenodo.7401549},
    title = {eds-scikit: data analysis on OMOP databases},
    url = {https://github.com/aphp/eds-scikit}


We would like to thank the following funders: