Multi-ContrastiveVAE disentangles perturbation effects in single cell images from optical pooled screens

Jan 1, 2024·
Zitong Jerry Wang
,
Romain Lopez
,
Jan-Christian Hütter
,
Takamasa Kudo
,
Heming Yao
,
Philipp Hanslovsky
,
Burkhard Höckendorf
,
Aviv Regev
· 0 min read
Abstract
Optical pooled screens (OPS) enable unbiased and cost-effective interrogation of gene function by generating images of millions of cells across thousands of perturbations. However, the analysis of OPS data remains a hurdle because it still mainly relies on hand-crafted features, which can be difficult to deploy across complex data sets. Additionally, most unsupervised feature extraction methods based on neural networks (such as auto-encoders) have difficulty isolating the effect of perturbations from the natural variations among cells. We therefore propose a contrastive analysis framework that is more effective at disentangling the phenotypes induced by perturbation from natural cell-cell heterogeneity present in an unperturbed cell population. By analyzing a significant data set of over 30 million cells across more than 5, 000 genetic perturbations, we demonstrate that our method significantly outperforms traditional methods in generating biologically-informative embeddings and mitigating technical artifacts. Furthermore, the interpretable part of our model enables us to pinpoint perturbations that generate novel phenotypes from the ones that only shift the distribution of existing phenotypes. Our approach can be readily applied to other small-molecule and genetic perturbation data sets with highly multiplexed images, enhancing the efficiency and precision in identifying and interpreting perturbation-specific phenotypic patterns, paving the way for deeper insights and discoveries in OPS analysis.
Type
Publication
ICLR Workshop on Machine Learning for Genomics Explorations