Selected Publications

We focus on variable importance analysis in high-dimensional biological data sets with modest sample sizes, using semiparametric statistical models. We present a method that is robust in small samples yet does not rely on arbitrary parametric assumptions, in the context of studies of gene expression and environmental exposures. Such analyses face not only issues of multiple testing but also the problem of teasing out the associations of biological expression measures with exposure among numerous confounders such as age, race, and smoking. Specifically, we propose the use of targeted minimum loss-based estimation, coupled with generalizations of moderated empirical Bayes statistics, to obtain estimates of variable importance measures. The result is a data-adaptive approach that can estimate individual associations in high-dimensional data, even with relatively small samples.

origami is an R package that provides a general framework for the application of cross-validation schemes to particular functions. By allowing arbitrary lists of results, origami accommodates a range of cross-validation applications.
In JOSS, 2018

biotmle is an R package facilitating biomarker discovery by generalizing moderated statistics for use with targeted estimators of parameters with asymptotically linear representations.
In JOSS, 2017

Robust Nonparametric Inference for Stochastic Interventions Under Multi-Stage Sampling
Mon, Apr 2, 2018 4:00 PM
Efficient Estimation of Survival Prognosis Under Immortal Time Bias
Mon, Mar 12, 2018 2:15 PM
Data-Adaptive Estimation and Inference for Differential Methylation Analysis
Sat, Nov 18, 2017 11:15 AM
Finite-Sample Inference and Moderated Statistics for Asymptotically Linear Parameters
Mon, Mar 20, 2017 4:00 PM
Targeted Biomarker Discovery
Mon, Mar 20, 2017 1:00 PM


