2024 Kroc Lecture: Design for Inference in Biology

Guest Speaker: Aviv Regev, PhD

Executive Vice President, Research and Early Development, Genentech

Tuesday, January 30, 2024, Chen 100

Faculty Hosts: Barbara Wold and Mitch Guttman

Abstract:

In many of the most pressing problems in biology, chemistry, and medicine – including the role of somatic mutations in cancer, genetic variants associated with risk of common disease, non-additive genetic interactions, possible antibody or functional protein sequences, or the space of drug-like chemical compounds – there is an enormous space of possibilities, even though only very few of them are realized. In each of these cases, the number of hypothetical possibilities exceeds by many orders of magnitude what can ever be tested in a lab, clinical trial or even an entire population. Historically, this challenge was tackled by restricting the search space using prior knowledge, a practical approach that nevertheless severely limited our ability to make new discoveries and effective predictions.

The dramatic enhancements over the past decade in our ability to perform both comprehensive observations of biological systems, with massively parallel and high-resolution lab methods, and causal (genetic) interventions now offer the hope that it should be possible to tackle these challenges systematically. However, in principle, an astronomical gap remains between the scale of exhaustive experiments and the scale that can be achieved in practice. In this talk, I describe a series of new, algorithmically driven, efficient strategies to design experiments for inference in biology and drug discovery, leveraging the inherent latent structure of biological and chemical systems and the data we measure about them. In particular, I'll show how random experiments can help us build oracles that predict expression from sequence and use them for design of sequences with desired functionality and for studies of comprehensive fitness landscapes in evolution; how randomly sampled and composite (‘compressed’) perturbation screens can help predict genetic (non-additive) interactions efficiently; and how large spaces of antibody sequences, antigen sequences, or drug-like small molecules can be encoded to drive a ‘lab in a loop’ combining virtual and lab screens for new small molecule, antibody and cancer vaccine therapeutics. Our work charts a path to use computational insights for conceiving and applying experiments to decipher fundamental circuits in cells and tissues and to tackle their malfunction in disease.