🔮 Our Mission

Advancing Machine Learning Research for Biological Sciences: We develop next-generation machine learning tools tailored for biological research. Our focus is on improving causality, interpretability, disentanglement, uncertainty quantification, and decision-making in machine learning models to enhance their robustness and scientific utility.

Designing Innovative Computational Tools for Molecular Biology Research: We leverage cutting-edge genetic engineering and high-throughput profiling technologies, such as CRISPR and single-cell RNA sequencing, to study complex diseases and drive drug discovery. By integrating computational methods with experimental biology, particularly in immunology, we aim to make significant advances in understanding cellular processes and disease mechanisms.

🚀 Our Research

We pursue this mission through the main research areas outlined below.

Probabilistic Inference and Generative Models

We develop ML methodology that makes generative models more interpretable and more usable for downstream tasks such as decision-making and hypothesis testing. These models are particularly useful for handling the high-dimensional, noisy, and incomplete data typical of applied scientific research.

  • Lopez, R., Regier, J., Jordan, M. I., & Yosef, N. (2018). "Information constraints on auto-encoding variational Bayes." Advances in Neural Information Processing Systems.
  • Lopez, R., Boyeau, P., Yosef, N., Jordan, M. I., & Regier, J. (2020). "Decision-making with auto-encoding variational Bayes." Advances in Neural Information Processing Systems.

Causal Structure Learning, Causal Inference and Identifiability Theory

We develop causal machine learning approaches that can leverage high-dimensional data. To this end, we are interested in tractable approaches to causal structure learning that have the potential to scale to tens of thousands of variables. We are also interested in causal representation learning, where interventions act on the latent variables of a deep generative model.

  • Lopez, R., Huetter, J.-C., Pritchard, J. K., & Regev, A. (2022). "Large-scale differentiable causal discovery of factor graphs." Advances in Neural Information Processing Systems.
  • Sethuraman, M. G., Lopez, R., Mohan, R., Fekri, F., & Hajiramezanali, E. (2023). "NODAGS-Flow: Nonlinear cyclic causal structure learning." International Conference on Artificial Intelligence and Statistics.
  • Lopez, R., Huetter, J.-C., Hajiramezanali, E., Pritchard, J. K., & Regev, A. (2024). "Towards the Identifiability of Comparative Deep Generative Models." Conference on Causal Learning and Reasoning.

Machine Learning for Single-Cell Omics Data Analysis

Our lab develops algorithms for analyzing single-cell omics data to improve our understanding of cellular states and dynamics. We focus on better methods for differential expression analysis, integration of multi-omics data, and robust modeling of cellular heterogeneity. These methods help disentangle the complexity of single-cell data and support biological discovery.

  • Lopez, R., Boyeau, P., Regier, J., Gayoso, A., Jordan, M. I., & Yosef, N. (2018). "Deep generative modeling for single-cell transcriptomics." Nature Methods.
  • Gayoso*, A., Lopez*, R., Xing*, G., Boyeau, P., Wu, K., Jayasuriya, M., Regier, J., & Yosef, N. (2022). "A Python library for probabilistic analysis of single-cell omics data." Nature Biotechnology.
  • Boyeau, P., Regier, J., Gayoso, A., Jordan, M. I., Lopez*, R., & Yosef*, N. (2023). "An empirical Bayes method for differential expression analysis of single cells with deep generative models." Proceedings of the National Academy of Sciences.

Spatial Transcriptomics Data Analysis

Leveraging spatial transcriptomics, we aim to map cellular organization within tissues, combining computational biology techniques with experimental data to uncover spatial patterns and interactions at the molecular level. Our research focuses on developing robust methods for analyzing spatially resolved transcriptomics data, leading to new insights into tissue architecture and cellular function.

  • Lopez*, R., Nazaret*, A., Langevin*, M., Samaran*, J., Regier*, J., Jordan, M. I., & Yosef, N. (2019). "A joint model of unpaired data from scRNA-seq and spatial transcriptomics for imputing missing gene expression measurements." ICML Workshop on Computational Biology.
  • Lopez*, R., Li*, B., Keren-Shaul*, H., Boyeau, P., Kedmi, M., Pilzer, D., et al. (2022). "DestVI identifies continuums of cell types in spatial transcriptomics data." Nature Biotechnology.

Single-Cell Perturbation Data Modeling

We study the effects of genetic and chemical perturbations at the single-cell level, developing models that predict how cells respond to them. This research helps clarify the mechanisms of action of perturbations, which supports drug discovery and the design of therapeutic interventions. We aim for models that are robust, interpretable, and applicable across biological contexts.

  • Lopez*, R., Tagasovska*, N., Ra, S., Cho, K., Pritchard, J. K., & Regev, A. (2023). "Learning causal representations of single cells via sparse mechanism shift modeling." Conference on Causal Learning and Reasoning.
  • Tu, X., Huetter, J.-C., Wang, Z. J., Kudo, T., Regev, A., & Lopez, R. (2023). "A Supervised Contrastive Framework for Learning Disentangled Representations of Cell Perturbation Data." Machine Learning in Computational Biology.