Dimensionality Reduction Metrics

This repository contains R functions to evaluate the quality of projections produced by dimensionality reduction techniques. An associated nextjournal notebook uses the functions described in this README file to evaluate the quality of a molecular map of lung neuroendocrine tumors produced with the UMAP algorithm.

Sequence difference view (SD) metric

SD metric calculation for one sample: compute_SD

Description

This function computes the sequence difference (SD) view metric for a single sample (i), following equation 3 of Martins et al. (2015). This dissimilarity metric compares the k-neighborhood of the sample in two spaces of different dimensionality. The lower the SD value, the better the neighborhood is preserved.
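The comparison described above can be sketched in a few lines of base R. This is a minimal sketch, not the repository's compute_SD: the helper name sd_one_sample and its inputs are hypothetical, and it assumes the SD definition weights each neighbor's rank displacement by (k - rank), once over the neighborhood in each space. Equation 3 of Martins et al. remains the authoritative form.

```r
# Hypothetical helper for illustration; not the repository's compute_SD.
# d_ref, d_proj: distances from sample i to every other sample (sample i
# itself excluded), given in the same order for both spaces.
sd_one_sample <- function(d_ref, d_proj, k) {
  stopifnot(length(d_ref) == length(d_proj), k <= length(d_ref))
  rank_ref  <- rank(d_ref,  ties.method = "first")  # 1 = nearest neighbor
  rank_proj <- rank(d_proj, ties.method = "first")
  rank_gap  <- abs(rank_ref - rank_proj)            # displacement of each sample
  nn_ref  <- rank_ref  <= k   # k-neighborhood in the reference space
  nn_proj <- rank_proj <= k   # k-neighborhood in the projected space
  # Assumed weighting: close neighbors (small rank) count more than distant ones.
  0.5 * sum((k - rank_proj[nn_proj]) * rank_gap[nn_proj]) +
    0.5 * sum((k - rank_ref[nn_ref]) * rank_gap[nn_ref])
}

# Toy distances: the two nearest neighbors swap places in the projection.
d_ref  <- c(a = 1, b = 2, c = 3, d = 4)
d_proj <- c(a = 2, b = 1, c = 3, d = 4)
sd_swapped   <- sd_one_sample(d_ref, d_proj, k = 2)
sd_identical <- sd_one_sample(d_ref, d_ref,  k = 2)
```

With identical neighbor orderings every rank displacement is zero, so the value is zero; any reordering inside the k-neighborhood pushes it up.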

Usage

compute_SD(dist_space1, dist_space2, k)

Arguments

Value

A numeric value corresponding to the SD value is returned.

SD metric calculation for all samples: compute_SD_allSamples

Description

This function computes the SD metric for all samples included in the dimensionality reduction. It compares one or more reduced spaces to the reference space. The SD values are computed for several values of k (the number of neighbors to consider).
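The loop over samples and k values can be sketched as follows. This is a sequential, base R illustration under stated assumptions, not the repository's compute_SD_allSamples: the names sd_one_sample and sd_all_samples are hypothetical, a single projection is compared to the reference, and the per-sample formula assumes a (k - rank) weighting of rank displacements that should be checked against Martins et al.

```r
# Hypothetical helpers for illustration; not the repository's implementation.
sd_one_sample <- function(d_ref, d_proj, k) {
  rank_ref  <- rank(d_ref,  ties.method = "first")
  rank_proj <- rank(d_proj, ties.method = "first")
  gap     <- abs(rank_ref - rank_proj)
  in_ref  <- rank_ref  <= k
  in_proj <- rank_proj <= k
  0.5 * sum((k - rank_proj[in_proj]) * gap[in_proj]) +
    0.5 * sum((k - rank_ref[in_ref]) * gap[in_ref])
}

# dist_ref, dist_proj: square distance matrices with samples in the same order.
sd_all_samples <- function(dist_ref, dist_proj, k_values) {
  stopifnot(all(dim(dist_ref) == dim(dist_proj)))
  res <- expand.grid(sample = seq_len(nrow(dist_ref)), k = k_values)
  res$SD <- mapply(function(i, k) {
    # Drop column i so that a sample is never its own neighbor.
    sd_one_sample(dist_ref[i, -i], dist_proj[i, -i], k)
  }, res$sample, res$k)
  res
}

set.seed(1)
high <- matrix(rnorm(20 * 5), nrow = 20)  # toy data in 5 dimensions
low  <- high[, 1:2]                       # toy "projection": first two columns
sd_df <- sd_all_samples(as.matrix(dist(high)), as.matrix(dist(low)),
                        k_values = c(3, 5))
head(sd_df)
```

Several projections could be handled by calling sd_all_samples once per projection, for instance with lapply over a list of distance matrices.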

Usage

compute_SD_allSamples(distRef, List_projection, k_values, colnames_res_df, threads = 2)

Arguments

Value

Visualizing the SD metric in a two-dimensional map: SD_map_f

Description

This function displays, on a two-dimensional projection, each sample's SD value averaged over the different values of k (the number of neighbors considered when computing the SD metric).
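The averaging step and the resulting map can be sketched in base R. This is an illustration with made-up toy values, not the repository's SD_map_f: the data frame layouts, the column names and the helper plot_sd_map are assumptions, and base graphics stand in for whatever plotting library the package uses.

```r
# Toy inputs with hypothetical column names: one SD value per sample and per k,
# and the two-dimensional coordinates of each sample.
sd_long <- data.frame(
  sample = rep(c("s1", "s2", "s3"), times = 2),
  k      = rep(c(5, 10), each = 3),
  SD     = c(2, 0, 6, 4, 2, 10)
)
coords <- data.frame(sample = c("s1", "s2", "s3"),
                     x = c(0, 1, 2), y = c(1, 0, 2))

# Average the SD values of each sample over the k values, then attach coordinates.
sd_mean <- aggregate(SD ~ sample, data = sd_long, FUN = mean)
map_df  <- merge(coords, sd_mean, by = "sample")

# Hypothetical plotting helper: darker points have a higher mean SD,
# meaning a less well preserved neighborhood.
plot_sd_map <- function(df) {
  shades <- gray(1 - 0.8 * df$SD / max(df$SD))
  plot(df$x, df$y, pch = 21, bg = shades, cex = 2,
       xlab = "dim 1", ylab = "dim 2", main = "Mean SD per sample")
}
if (interactive()) plot_sd_map(map_df)
```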

Usage

SD_map_f(SD_df, Coords_df, legend_pos = "right")

Arguments

Value

A list containing:

Spatial autocorrelation

Moran's Index (MI) computation: moran_I_knn

Description

This function computes Moran's Index (MI), a spatial autocorrelation coefficient, for a given feature used in the dimensionality reduction. MI is computed for several values of the parameter k, the number of samples that define each sample's neighborhood. The MI values are computed using the Moran.I function from the R package ape.
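To make the role of k concrete, the sketch below builds a k-nearest-neighbor weight matrix and evaluates Moran's Index by hand in base R, so that it runs without extra packages. It is not the repository's moran_I_knn, which relies on ape::Moran.I; the helper names knn_weights and moran_index, the binary neighbor weights and the row normalisation are assumptions made for illustration.

```r
# Hypothetical helper: binary weight matrix flagging the k nearest neighbors
# of every sample in the projected space.
knn_weights <- function(coords, k) {
  d <- as.matrix(dist(coords))
  diag(d) <- Inf                            # a sample is not its own neighbor
  w <- matrix(0, nrow(d), ncol(d))
  for (i in seq_len(nrow(d))) {
    w[i, order(d[i, ])[seq_len(k)]] <- 1    # flag the k closest samples
  }
  w
}

# Hypothetical helper: Moran's Index of feature x for weight matrix w,
# with weights normalised so that each row sums to 1 (assumed convention).
moran_index <- function(x, w) {
  rs <- rowSums(w)
  rs[rs == 0] <- 1
  w <- w / rs
  y <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(y, y)) / sum(y^2)
}

# Toy projection: ten samples on a line.
coords <- cbind(x = 1:10, y = 0)
smooth_feature      <- 1:10               # varies gradually along the map
alternating_feature <- rep(c(-1, 1), 5)   # flips between adjacent samples

mi_smooth      <- moran_index(smooth_feature,      knn_weights(coords, k = 2))
mi_alternating <- moran_index(alternating_feature, knn_weights(coords, k = 2))

# One MI value per neighborhood size.
mi_by_k <- sapply(c(2, 3, 4), function(k) {
  moran_index(smooth_feature, knn_weights(coords, k))
})
```

A feature that varies smoothly across the map gives a positive index, while one that alternates between neighboring samples gives a negative index.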

Usage

moran_I_knn(expr_data, spatial_data, listK)

Arguments

Value