📦 cytocoarsening.py

We want to identify cell types that are enriched for a phenotype (e.g. cell phenotype) and that also relate to external information. Graph-based approaches for identifying these modules can struggle in the single-cell setting because an extremely large number of cells is profiled per sample, and studies often profile multiple samples across different experimental conditions or timepoints. The Cytocoarsening source code is available on GitHub: https://github.com/ChenCookie/cytocoarsening

Overview

Installation

To install from PyPI, run the pip command below on your command line. The two lines after it are the alternative of cloning the source repository from GitHub:

pip install cytocoarsening
git clone https://github.com/ChenCookie/cytocoarsening.git
cd cytocoarsening

If cytocoarsening raises errors or warnings, check that the installed versions of scipy and networkx are:

scipy==1.6.2
networkx==2.6.2

To reinstall these specific package versions:

pip install --force-reinstall scipy==1.6.2
pip install --force-reinstall networkx==2.6.2
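
Before force-reinstalling, it can help to see which versions are actually installed. The following is a minimal sketch using only the Python standard library. The helper name find_version_mismatches is hypothetical and not part of cytocoarsening; the pinned versions are the ones listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions recommended in this README
PINNED = {"scipy": "1.6.2", "networkx": "2.6.2"}


def find_version_mismatches(pinned):
    """Return {package: (installed_or_None, expected)} for packages that differ."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (installed, expected)
    return mismatches


# Report every package whose installed version differs from the pinned one
for name, (installed, expected) in find_version_mismatches(PINNED).items():
    print(f"{name}: installed {installed}, expected {expected}")
```

If nothing is printed, both packages already match the pinned versions and no reinstall is needed.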

Data access

Taking the preeclampsia dataset as an example, to list all of the publicly available files for download:

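from lxml import html
import numpy as np
import requests

# Fetch the Zenodo record page and collect every link on it
r = requests.get('https://zenodo.org/record/6779483#.Yrygu-zMJhF')
content = html.fromstring(r.content)
hrefs = content.xpath('//a/@href')
# Keep only the direct-download links and drop duplicates
files = [i for i in hrefs if i.endswith('?download=1')]
files = np.unique(files)
print(files)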

To download any file of the preeclampsia dataset from Zenodo, for example the FCS file list:

curl 'https://zenodo.org/record/6779483/files/Han-FCS_file_list.xlsx?download=1' --output Han-FCS_file_list.xlsx
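
The same download can be done from Python. This is a minimal sketch using only the standard library; the helper names zenodo_file_url and download_zenodo_file are hypothetical, and the URL pattern is simply copied from the curl command above rather than taken from any Zenodo API documentation.

```python
import shutil
import urllib.request
from urllib.parse import quote


def zenodo_file_url(record_id, filename):
    """Build a download URL following the pattern of the curl command above."""
    return f"https://zenodo.org/record/{record_id}/files/{quote(filename)}?download=1"


def download_zenodo_file(record_id, filename, dest=None):
    """Stream one file of a Zenodo record to disk and return the local path."""
    dest = dest or filename
    url = zenodo_file_url(record_id, filename)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest


# Example (requires network access):
# download_zenodo_file(6779483, "Han-FCS_file_list.xlsx")
```

Streaming with shutil.copyfileobj avoids holding the whole file in memory, which matters for the larger files of a dataset.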

Parameter Explanation

The function can be executed in one line.

coarsening_group,group_edge,result_dicts=cytocoarsening(cell_data,cell_label,multipass,k_nearest_neighbors)
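
The toy example further down passes a 2-D array of shape (cells, features), a 1-D label array with one entry per cell, and two positive integers. Assuming those are the expected input shapes (this is inferred from the README's example, not confirmed against the library), a small pre-flight check might look like the sketch below. The helper check_inputs is hypothetical and not part of cytocoarsening.

```python
import numpy as np


def check_inputs(cell_data, cell_label, multipass, k_nearest_neighbors):
    """Validate inputs against the shapes used in this README's toy example."""
    cell_data = np.asarray(cell_data)
    cell_label = np.asarray(cell_label)
    if cell_data.ndim != 2:
        raise ValueError("cell_data should be 2-D: (n_cells, n_features)")
    if cell_label.ndim != 1 or cell_label.shape[0] != cell_data.shape[0]:
        raise ValueError("cell_label should be 1-D with one label per cell")
    for name, value in (("multipass", multipass),
                        ("k_nearest_neighbors", k_nearest_neighbors)):
        if not isinstance(value, (int, np.integer)) or value < 1:
            raise ValueError(f"{name} should be a positive integer")
    # Assumption: a k-nearest-neighbor search needs more cells than neighbors
    if k_nearest_neighbors >= cell_data.shape[0]:
        raise ValueError("k_nearest_neighbors should be smaller than the number of cells")
    return cell_data, cell_label
```

Calling check_inputs(cell_data, cell_label, multipass, k_nearest_neighbors) before cytocoarsening turns a shape mismatch into an immediate, readable error.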

input

output

Toy Example

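from cytocoarsening.cytocoarsening import cytocoarsening
import numpy as np
import random

# 4500 cells, each with 33 random features
cell_data=[[random.random() for i in range(33)] for j in range(4500)]
cell_data=np.array(cell_data)

# One binary label per cell: 1000 cells labeled 0 and 3500 labeled 1, in random order
cell_label = np.array([0] * 1000 + [1] * (3500))
np.random.shuffle(cell_label)

# multipass=3, k_nearest_neighbors=5
group,edge,dicts=cytocoarsening(cell_data,cell_label,3,5)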
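
The toy inputs above differ on every run because neither random nor np.random is seeded. If reproducible inputs are wanted, one option is NumPy's Generator API, sketched below. The function name make_toy_inputs and the seed value are arbitrary choices for illustration, not part of cytocoarsening.

```python
import numpy as np


def make_toy_inputs(n_label0=1000, n_label1=3500, n_features=33, seed=0):
    """Build seeded toy inputs with the same shapes as the example above."""
    rng = np.random.default_rng(seed)
    n_cells = n_label0 + n_label1
    # Uniform random features in [0, 1), one row per cell
    cell_data = rng.random((n_cells, n_features))
    # Binary labels with fixed class counts, shuffled reproducibly
    cell_label = rng.permutation(np.repeat([0, 1], [n_label0, n_label1]))
    return cell_data, cell_label


cell_data, cell_label = make_toy_inputs()
print(cell_data.shape, cell_label.shape)
```

The resulting arrays can be passed to cytocoarsening(cell_data, cell_label, 3, 5) exactly as in the example above.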