Statistical machine learning provides a diverse set of flexible methods for robust and nonparametric estimation. Such techniques allow for the estimation of causal target parameters, provide an avenue for statistical inference that avoids parametric model misspecification, and constitute a class of model-free, robust procedures for estimation and inference. My interests in these areas encompass the theory and application of targeted minimum loss-based estimation, data-adaptive estimation for inference under model misspecification, cross-validation for data-adaptive parameter estimation, robust estimation methods (e.g., those based on influence functions), randomization inference, permutation testing and confidence procedures, and missing data problems.
The breakneck pace of technological development in the biomedical sciences routinely generates new classes of data that pose novel issues for statistical inference. I am interested in the development and application of robust methods for analyzing data generated by emerging biotechnologies, in particular employing causal inference for variable importance analysis and data-adaptive estimation in high-dimensional omics problems. Recently, my work has centered on the development of methodology inspired by statistical causal inference and applied to the analysis of data sets generated by methylation sequencing, RNA-seq, and CRISPR assays.
The analysis of longitudinal data has been the subject of much attention in medical research and forms one of the principal areas of intersection between medicine and statistics. I am interested in both developing and applying modern techniques for the analysis of longitudinal data (especially survival analysis), including the study of nonparametric estimators, data-adaptive parameter identification and estimation, settings with competing risks, and flexible estimation paradigms incorporating permutation testing and nonparametric regression. Recently, I have been involved in work studying the breakdown of nonparametric estimators of survival and data-adaptive extensions of targeted learning methods for the analysis of vaccine sieve effects.
While computing is an essential part of the scientific process, many best practices common in areas such as software engineering have gone all but overlooked in modern scientific practice. This failure has contributed, at least in part, to a reproducibility crisis in modern science, which in turn has made computationally reproducible workflows imperative. Consequently, I am highly interested in the development and promotion of tools for both scientific computing and reproducible research, including the use of open-source software and version control systems, publication and peer review in open access journals, and tools for collaborative and literate programming (e.g., GitHub, R Markdown, Project Jupyter).