I am a PhD candidate in biostatistics, with a designated emphasis in computational and genomic biology, working with Mark van der Laan and Alan Hubbard. I am a founding core developer of the tlverse, the software ecosystem for targeted learning, and a workshop instructor with Software Carpentry. At UC Berkeley, I am affiliated with the Center for Computational Biology and the NIH Biomedical Big Data initiative. I have also collaborated as a biostatistician with the Bill & Melinda Gates Foundation and the Kaiser Permanente Division of Research.
My research primarily concerns the development of robust and efficient statistical methodology at the intersection of causal inference and statistical machine learning, with the aim of facilitating flexible estimation and inference for complex data from observational studies or randomized trials. My interests further span nonparametric estimation, high-dimensional inference, targeted learning, statistical computing, survival analysis, and computational biology. Recent methodological interests include stochastic treatment regimes, causal mediation analysis, robust inference with two-phase sampling, variance moderation of semiparametric estimators, and nonparametric conditional density estimation. I am also interested in the design of open-source software and the use of automated testing practices to promote reproducible applied statistics and replicable science.
PhD in Biostatistics, with a designated emphasis in Computational and Genomic Biology, 2016-present
University of California, Berkeley
MA in Biostatistics, 2017
University of California, Berkeley
BA with a triple major in Molecular and Cell Biology (emphasis in Neurobiology), Psychology, and Public Health, 2015
University of California, Berkeley
Development of dimensionality reduction methods using contrastivity and sparsification.
Defining novel mediation effects and extensions using stochastic interventions.
Extensions and applications of causal inference based on stochastic interventions in complex settings.
Moderated variance estimators for use with semiparametric data-adaptive estimators in high-dimensional biology.
Identification of differentially methylated positions and regions based on targeted learning.
Software packages extending the R programming language.
The things that keep me from working.
Assorted notes on graduate school.
(see CV for a full list)
I am an active member of Software Carpentry and Data Carpentry, through which I engage in curriculum development, maintenance of lesson materials, and workshop delivery.
Software Carpentry: Shell, Git, and R at the Berkeley Institute for Data Science; 2019 Jan. 17-18; co-taught with S. Peterson, N. Varoquaux.
Course materials here | GitHub repository here
Software Carpentry: Shell, Git, and Python at the Berkeley Institute for Data Science; 2018 Jul. 16-17; co-taught with K. Marwaha.
Course materials here | GitHub repository here
Data Carpentry: Genomics at Lawrence Berkeley National Laboratory; 2018 May 3-4; co-taught with A. Orr.
Course materials here | GitHub repository here