Research | NYU Biological ML

Our main areas of research are laid out below.

Probabilistic Inference and Generative Models
Causal Structure Learning, Causal Inference and Identifiability Theory
Machine Learning for Single-Cell Omics Data Analysis
Single-cell Perturbation Data Modeling
Spatial Transcriptomics Data Analysis

Probabilistic Inference and Generative Models

We develop ML methodology for making generative models more interpretable and usable for downstream tasks such as decision-making and hypothesis testing. These models are particularly useful for handling the high-dimensional, noisy, and incomplete data typical of applied scientific research.

Lopez, R., Regier, J., Jordan, M. I., & Yosef, N. (2018). "Information constraints on auto-encoding variational Bayes." Advances in Neural Information Processing Systems.
Lopez, R., Boyeau, P., Yosef, N., Jordan, M. I., & Regier, J. (2020). "Decision-making with auto-encoding variational Bayes." Advances in Neural Information Processing Systems.
Rohbeck, M., Bunne, C., Huetter, JC., De Brouwer, E., Biton, A., Chen, K. Y., Regev*, A., & Lopez*, R. (2024). "Modeling Complex System Dynamics with Flow Matching Across Time and Conditions." International Conference on Learning Representations.

Causal Structure Learning, Causal Inference and Identifiability Theory

We develop causal machine learning approaches that can leverage high-dimensional data. Towards this goal, we are interested in tractable approaches to causal structure learning that have the potential to scale to tens of thousands of variables. Additionally, we are interested in causal representation learning, where interventions are conducted on latent variables of a deep generative model.

Lopez, R., Huetter, JC., Pritchard, J., & Regev, A. (2022). "Large-scale differentiable causal discovery of factor graphs." Advances in Neural Information Processing Systems.
Sethuraman, M. G., Lopez, R., Mohan, R., Fekri, F., & Hajiramezanali, E. (2023). "NODAGS-Flow: Nonlinear cyclic causal structure learning." International Conference on Artificial Intelligence and Statistics.
Lopez, R., Huetter, JC., Hajiramezanali, E., Pritchard, J., & Regev, A. (2024). "Towards the Identifiability of Comparative Deep Generative Models." Conference on Causal Learning and Reasoning.

Machine Learning for Single-Cell Omics Data Analysis

Our lab develops algorithms for analyzing single-cell omics data, with the goal of better characterizing cellular states and dynamics. We focus on improving methods for differential expression analysis, integration of multi-omics data, and robust modeling of cellular heterogeneity. These methods help untangle the complexity of single-cell data and support biological discovery.

Lopez, R., Boyeau, P., Regier, J., Gayoso, A., Jordan, M. I., & Yosef, N. (2018). "Deep generative modeling for single-cell transcriptomics." Nature Methods.
Gayoso*, A., Lopez*, R., Xing*, G., Boyeau, P., Wu, K., Jayasuriya, M., Regier, J., & Yosef, N. (2022). "A Python library for probabilistic analysis of single-cell omics data." Nature Biotechnology.
Boyeau, P., Regier, J., Gayoso, A., Jordan, M. I., Lopez*, R., & Yosef*, N. (2023). "An empirical Bayes method for differential expression analysis of single cells with deep generative models." Proceedings of the National Academy of Sciences.

Single-cell Perturbation Data Modeling

We study the effects of genetic and chemical perturbations at the single-cell level and develop models that predict cellular responses to these perturbations. This research helps clarify the mechanisms of action of perturbations, which in turn supports drug discovery and the design of therapeutic interventions. We aim for models that are robust, interpretable, and applicable across different biological contexts.

Lopez*, R., Tagasovska*, N., Ra, S., Cho, K., Pritchard, J. K., & Regev, A. (2023). "Learning causal representations of single cells via sparse mechanism shift modeling." Conference on Causal Learning and Reasoning.
Ryu, J., Bunne, C., Pinello, L., Regev*, A., & Lopez*, R. (2025). "Crossmodality Matching and Prediction of Perturbation Responses with Labeled Gromov-Wasserstein Optimal Transport." International Conference on Artificial Intelligence and Statistics.

Spatial Transcriptomics Data Analysis

Leveraging spatial transcriptomics, we aim to map cellular organization within tissues, combining computational biology techniques with experimental data to uncover spatial patterns and interactions at the molecular level. Our research focuses on developing robust methods for analyzing spatially resolved transcriptomics data, leading to new insights into tissue architecture and cellular function.

Lopez*, R., Nazaret*, A., Langevin*, M., Samaran*, J., Regier*, J., Jordan, M. I., & Yosef, N. (2019). "A joint model of unpaired data from scRNA-seq and spatial transcriptomics for imputing missing gene expression measurements." ICML Workshop on Computational Biology.
Lopez*, R., Li*, B., Keren-Shaul*, H., Boyeau, P., Kedmi, M., Pilzer, D., et al. (2022). "DestVI identifies continuums of cell types in spatial transcriptomics data." Nature Biotechnology.