R Packages

R/txshift

An R package for efficient estimation of, and nonparametric inference on, the effects of stochastic interventions, including in settings with multi-stage sampling designs. The package provides a flexible way to perform nonparametric variable importance analysis on continuous quantities using the Targeted Learning framework.
View the package documentation and related information here.
Joint work with David Benkeser.
[GitHub]

R/sl3

An R package providing a modern re-implementation of the Super Learner algorithm for ensemble modeling and stacked regression based on machine learning pipelines.
View the package documentation and related information here.
Joint work with Jeremy Coyle, Ivana Malenica, and Oleg Sofrygin.
[GitHub]

R/origami

An R package providing a general framework for the application of various cross-validation schemes to arbitrary functions, facilitating the extension of cross-validation procedures to numerous applications.
View the package documentation and related information here.
Joint work with Jeremy Coyle.
[GitHub] | [CRAN]

R/hal9001

An R package providing a fast and efficient implementation of the Highly Adaptive Lasso (HAL), a nonparametric regression estimator with optimality guarantees.
Joint work with Jeremy Coyle.
[GitHub]

R/survtmle

An R package providing facilities for estimation and inference in right-censored survival analysis settings with and without competing risks, including extensions for data-adaptive target parameters, using Targeted Learning.
View the package documentation and related information here.
Joint work with David Benkeser.
[GitHub] | [CRAN]

R/methyvim

An R package implementing a framework for using Targeted Learning to assess evidence for differential methylation across the genome by estimating variable importance measures at the level of CpG sites and related functional units.
View the package documentation and related information here.
Joint work with Mark van der Laan and Alan Hubbard.
[GitHub] | [Bioconductor]

R/biotmle

An R package implementing a set of techniques for discovering biomarkers from biological sequencing data using a combination of Targeted Learning and a generalization of moderated statistics for variance stabilization in finite samples.
View the package documentation and related information here.
Joint work with Alan Hubbard.
[GitHub] | [Bioconductor]

R/adaptest

An R package for data-adaptive hypothesis testing in high-dimensional settings. The approach allows effects to be discovered (“mined”) from data without sacrificing valid statistical inference, using the Targeted Learning framework.
Joint work with Weixin Cai and Alan Hubbard.
[GitHub] | [Bioconductor]

R/nima

An R package housing Nima’s personal R toolbox, largely containing miscellaneous convenience functions written to make statistical computing for research easier.
View the package documentation and related information here.
[GitHub] | [CRAN]

Publications

Talks

Ensemble (Machine) Learning with Super Learner and H2O in R
Tue, Dec 6, 2016 4:00 PM