Cell Painting Gallery

This page provides a guide to the datasets that are available in the Cell Painting Gallery, hosted by the AWS Registry of Open Data (RODA): https://registry.opendata.aws/cellpainting-gallery

Citation/license

All data are released under the CC0 1.0 Universal (CC0 1.0) license. Still, professional ethics require that you cite the appropriate resources/publications, listed below, when using individual datasets, along with our Nature Methods publication announcing the Cell Painting Gallery (Weisbart et al., 2024). For example:

We used the dataset cpg0000 (Chandrasekaran et al., 2022), available from the Cell Painting Gallery (Weisbart et al., 2024) on the Registry of Open Data on AWS (https://registry.opendata.aws/cellpainting-gallery/).

Documentation

Please see our documentation for extensive supporting information.

Available datasets

All datasets are generated using the Cell Painting assay unless indicated otherwise. Several updates to that protocol exist (Cell Painting wiki).

Each dataset is stored under the prefix given by its dataset name. For example, the first dataset is located at s3://cellpainting-gallery/cpg0000-jump-pilot and can be listed with the AWS CLI command aws s3 ls --no-sign-request s3://cellpainting-gallery/cpg0000-jump-pilot/ (note the trailing /). See browsing data in our documentation for more information on viewing the gallery in a browser and for examples of how to list files using the AWS CLI or boto3.
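
The listing command above can also be assembled programmatically, which helps avoid forgetting the trailing slash. The following is a minimal sketch, not part of the gallery's own tooling: the helper names dataset_uri, list_command, and list_dataset are hypothetical, and actually running the listing assumes the AWS CLI is installed and that network access is available.

```python
import subprocess

# Bucket name taken from the S3 URI quoted in the text.
BUCKET = "cellpainting-gallery"


def dataset_uri(dataset_name: str) -> str:
    """Return the S3 URI of a dataset prefix, always ending in '/'."""
    return f"s3://{BUCKET}/{dataset_name.strip('/')}/"


def list_command(dataset_name: str) -> list[str]:
    """Build the AWS CLI command that lists a dataset without credentials."""
    return ["aws", "s3", "ls", "--no-sign-request", dataset_uri(dataset_name)]


def list_dataset(dataset_name: str) -> str:
    """Run the listing and return its text output.

    Assumes the AWS CLI is on PATH and the machine has network access.
    """
    result = subprocess.run(
        list_command(dataset_name), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the command here; call list_dataset() to actually run it.
    print(" ".join(list_command("cpg0000-jump-pilot")))
```

Calling list_dataset("cpg0000-jump-pilot") would run the same command shown in the text; building the command as a list avoids shell quoting issues.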

A dataset's accession number is the first seven characters of its dataset name. For example, the accession number of the first dataset is cpg0000.
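
As a small illustration of this rule, the accession number can be sliced off a dataset name in Python. The function name accession_number and the format check (the letters cpg followed by four digits, inferred from the dataset names listed on this page) are assumptions made for this sketch.

```python
import re

# Assumed pattern, inferred from the dataset names on this page:
# the letters "cpg" followed by four digits.
ACCESSION_PATTERN = re.compile(r"cpg\d{4}")


def accession_number(dataset_name: str) -> str:
    """Return the first seven characters of a dataset name.

    Raises ValueError if those characters do not look like an accession number.
    """
    accession = dataset_name[:7]
    if not ACCESSION_PATTERN.fullmatch(accession):
        raise ValueError(f"unexpected dataset name: {dataset_name!r}")
    return accession


if __name__ == "__main__":
    print(accession_number("cpg0000-jump-pilot"))
```

The check is deliberately strict so that a mistyped name fails early instead of producing a misleading prefix.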

| Dataset name | Description | Publication to cite | Associated repositories | Total size | Images size | Numerical data size | Cell Painting protocol | Other aliases |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| cpg0000-jump-pilot | 300+ compounds and 160+ genes (CRISPR knockout and overexpression) profiled in A549 and U2OS cells, at two timepoints | (Chandrasekaran et al., 2024) Publication, Preprint, Description of Cell Painting v2.5. | data | 12.3 TB | 6.1 TB | 6.1 TB | v2.5 | |
| cpg0001-cellpainting-protocol | 300+ compounds profiled in U2OS cells using several different modifications of the Cell Painting protocol | (Cimini et al., 2022) Publication, Preprint, Description of Cell Painting v3. | data | 40.3 TB | 18.7 TB | 21.6 TB | v3 and experiments | |
| cpg0002-jump-scope | 90 compounds (JUMP-MOA plate) profiled in U2OS using different microscopes and settings | (Tromans-Coia and Jamali et al., 2023) Publication, Preprint | data, analysis | 16.7 TB | 12.5 TB | 4.2 TB | v2.5 | |
| cpg0003-rosetta | 28,000+ genes and compounds profiled in Cell Painting and L1000 gene expression | (Haghighi et al., 2022) Publication, Preprint | data | 8.5 GB | 0 | 8.5 GB | | |
| cpg0004-lincs | 1,571 compounds across 6 doses in A549 cells | (Way et al., 2022) Publication, Preprint | data | 65.7 TB | 61.9 TB | 3.8 TB | v2 | idr0125 |
| cpg0010-caie-drugresponse | MCF-7 breast cancer cells treated with 113 small molecules at eight concentrations | (Caie et al., 2010) Publication | | 239.2 GB | 98.4 GB | 140.8 GB | other variation | BBBC021 |
| cpg0011-lipocyteprofiler | Variety of lipocytes in different metabolic states and with genetic and drug perturbations | (Laber and Strobel et al., 2023) Publication, Preprint, Description of Cell Painting lipocyte variant. | analysis | 1.2 TB | 1.2 TB | 16 MB | lipocyte | |
| cpg0012-wawer-bioactivecompoundprofiling | 30,000 compound dataset in U2OS cells | (Wawer et al., 2014) Publication, Description of Cell Painting v1; (Bray et al., 2017) Publication, Description of Cell Painting v2 | data | 10.7 TB | 3.1 TB | 7.6 TB | v1 | idr0016, CDRP, BBBC036, BBBC047 |
| cpg0015-heterogeneity | 2,200+ compounds and 200+ genes profiled in U2OS cells | (Rohban et al., 2019) Publication | data | 204 GB | 0 | 204 GB | | idr0016, idr0036, idr0033 |
| cpg0016-jump | 116,000+ compounds and 16+ genes (CRISPR knockout and overexpression) profiled in U2OS cells. Over 8 million images (>126 TB), over 1.5 billion cells of numerical data (>126 TB), for over 250 TB data in total. | (Chandrasekaran et al., 2023) Preprint | resource | 358.4 TB | | | v3 | |
| cpg0017-rohban-pathways | 323 genes overexpressed in U2OS cells. Original images re-profiled in 2023 | (Rohban et al., 2017) Publication, Preprint | re-profiled data, original data | 321 GB | 189 GB | 132 GB | v1 | BBBC037, TA-ORF |
| cpg0018-singh-seedseq | U2OS cells treated with each of 315 unique shRNA sequences | (Singh et al., 2013) Publication | | 247.1 GB | 247.1 GB | 0 | | |
| cpg0019-moshkov-deepprofiler | 8.3 million single cells from 232 plates, across 488 treatments from 5 public datasets, used for learning representations | (Moshkov et al., 2022) Preprint | data, software | 522 GB | 482 GB | 40 GB | dataset dependent | |
| cpg0021-periscope | 30 million cells with 20,000 single-gene knockouts in pooled format. A549 cells and HeLa cells in two growth media | (Ramezani, Bauman, Singh, and Weisbart et al., 2023) Preprint, Description of Cell Painting pooled variant. | analysis, data, data | 56.0 TB | 45.0 TB | 11.0 TB | pooled | |
| cpg0022-cmqtl | 297 iPSC lines | (Tegtmeyer et al., 2024) Publication, Preprint | data | 3.7 TB | 2.8 TB | 945 GB | v2.5 | |
| cpg0028-kelley-resistance | Bortezomib resistant HCT116 clones | (Kelley et al., 2023) Publication | data | 4.1 TB | 1.9 TB | 2.2 TB | | |
| cpg0030-gustafsdottir-cellpainting | U2OS cells treated with each of 1600 known bioactive compounds. Description of Cell Painting v1. | (Gustafsdottir et al., 2013) Publication | | 234 GB | 234 GB | 0.3 GB | v1 | BBBC022, idr0036 |
| cpg0031-caicedo-cmvip | ORF over-expression of 596 alleles of 53 genes in A549 cells | (Caicedo et al., 2023) Publication, Preprint | original data, re-profiled data | 2.2 TB | 605 GB | 1.6 TB | v1 | BBBC043, LUAD |
| cpg0036-EU-OS-bioactives | 2464 compounds from EU-OPENSCREEN Bioactive compound set, four imaging sites, two cell lines (HepG2 & U2OS) | (Wolff et al., 2024) Preprint | aggregated profiles, analysis scripts, compound information | 3.5 TB | 3.5 TB | | v1 | Bioactives, EU-OS-Bioactives |