When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this “coupled clustering” problem as an optimization problem, and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single cell RNA-seq and single cell ATAC-seq data.
Biological samples of interest in clinical or experimental studies are often heterogeneous mixtures of different types of cells. Suppose we have two single cell data sets, each providing information on a different feature of the cellular state, and each generated on a different sample from this mixture. Then the clustering of cells in the two samples should be coupled, because both clusterings reflect the underlying cell types in the same mixture. This “coupled clustering” problem is a new problem not covered by existing clustering methods. In this paper we develop an approach for its solution based on the coupling of two nonnegative matrix factorizations. The method should be useful for integrative single cell genomics analysis tasks such as the joint analysis of single cell RNA-seq and single cell ATAC-seq data.
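To make the idea of coupling two factorizations concrete, the following is a minimal sketch and not the released software. It assumes one plausible objective: each data matrix gets its own factorization (`X1 ≈ W1 H1`, `X2 ≈ W2 H2`), and a hypothetical nonnegative linking matrix `A` (rows indexed by features of `X2`, columns by features of `X1`) rewards agreement between the two sets of cluster profiles through the term `-lam * trace(W2.T @ A @ W1)`. The function names, the parameters `lam` and `mu`, the ridge penalty, and the multiplicative update rules are all assumptions made for illustration.

```python
import numpy as np

def coupled_objective(X1, X2, A, W1, H1, W2, H2, lam, mu):
    """Assumed objective: two reconstruction errors, a coupling reward, a ridge penalty."""
    fit1 = 0.5 * np.linalg.norm(X1 - W1 @ H1) ** 2
    fit2 = 0.5 * np.linalg.norm(X2 - W2 @ H2) ** 2
    coupling = lam * np.trace(W2.T @ A @ W1)
    ridge = 0.5 * mu * (np.linalg.norm(W1) ** 2 + np.linalg.norm(W2) ** 2)
    return fit1 + fit2 - coupling + ridge

def coupled_nmf_sketch(X1, X2, A, k, lam=1.0, mu=1.0, n_iter=200, seed=0):
    """Illustrative coupled NMF with multiplicative updates (hypothetical helper).

    X1: (p1, n1) nonnegative matrix, features x cells, first sample.
    X2: (p2, n2) nonnegative matrix, features x cells, second sample.
    A:  (p2, p1) nonnegative matrix linking features of X1 to features of X2.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-10  # guards against division by zero
    W1 = rng.random((X1.shape[0], k))
    H1 = rng.random((k, X1.shape[1]))
    W2 = rng.random((X2.shape[0], k))
    H2 = rng.random((k, X2.shape[1]))
    history = []
    for _ in range(n_iter):
        # Each factor is rescaled by (negative gradient part) / (positive gradient part),
        # which keeps every entry nonnegative.
        W1 *= (X1 @ H1.T + lam * (A.T @ W2)) / (W1 @ (H1 @ H1.T) + mu * W1 + eps)
        H1 *= (W1.T @ X1) / ((W1.T @ W1) @ H1 + eps)
        W2 *= (X2 @ H2.T + lam * (A @ W1)) / (W2 @ (H2 @ H2.T) + mu * W2 + eps)
        H2 *= (W2.T @ X2) / ((W2.T @ W2) @ H2 + eps)
        history.append(coupled_objective(X1, X2, A, W1, H1, W2, H2, lam, mu))
    return W1, H1, W2, H2, history

def assign_clusters(H):
    """Assign each cell (column of H) to the factor with the largest loading."""
    return np.argmax(H, axis=0)
```

Under these assumptions, `assign_clusters(H1)` and `assign_clusters(H2)` give cluster labels for the two samples. Because both factorizations share the same factor index through the coupling term, cluster `j` in one sample is intended to correspond to cluster `j` in the other.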
Release: Software Download
Any correspondence regarding Coupled Clustering should be directed to Zhana Duren (firstname.lastname@example.org) and Prof. Wing Hung Wong (email@example.com).
The Coupled Clustering NMF model was developed by researchers at Stanford University in the Wong Lab. The Wong Lab and its research contribute to Stanford's Bio-X Initiative, which aims to bring clinicians and biomedical and life science researchers together with engineers, physicists, and computational scientists to tackle the complexity of the body in health and disease.