Generally efficient nonparametric inference under two-phase sampling, with applications to stochastic interventions


The advent and subsequent widespread availability of preventive vaccines altered the course of public health in the twentieth century. Despite this overall success, vaccines are still lacking for many high-burden diseases, including HIV. An important step in developing effective vaccines is identifying immune responses that are indicative of protective efficacy. In this work, we use a causal inference framework to develop a new approach to studying immune responses in the context of vaccines. We focus on causal quantities defined by stochastic interventions, which may be more relevant than alternative approaches for describing the effects of immune responses on the risk of infection or disease. We present methodology for efficiently estimating these quantities using data from preventive vaccine trials with two-phase sampling of immune responses, and we evaluate two estimation strategies: an inverse probability weighting-based method and an augmented method. The latter is shown to be nonparametric efficient and multiply robust to misspecification of nuisance estimators. We also provide methods for constructing confidence intervals and hypothesis tests, along with an open-source software implementation of the proposed methodology. We illustrate the methods using data from a recent preventive HIV vaccine trial.
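As a rough illustration of the inverse probability weighting idea mentioned above, the following minimal sketch reweights a variable that is measured only on a second-phase subsample. It is not the estimator proposed in this work: the target here is a simple population mean rather than a stochastic-intervention quantity. The function name `ipw_mean`, the toy data, and the assumption that the second-phase sampling probabilities are known by design are all hypothetical choices made for the example.

```python
import numpy as np


def ipw_mean(s, sampled, pi):
    """Normalized (Hajek-style) IPW mean of a phase-two variable.

    s       : measurements; entries for unsampled units may be NaN
    sampled : indicator of inclusion in the second-phase sample
    pi      : second-phase sampling probability of each unit
              (assumed known by design in this sketch)
    """
    s = np.asarray(s, dtype=float)
    sampled = np.asarray(sampled, dtype=bool)
    pi = np.asarray(pi, dtype=float)
    # Weight 1/pi for sampled units, 0 for everyone else.
    w = np.where(sampled, 1.0 / pi, 0.0)
    # Replace unsampled values by 0 so that NaNs never enter the sum.
    s_filled = np.where(sampled, s, 0.0)
    return float(np.sum(w * s_filled) / np.sum(w))


# Hypothetical toy data: 2 cases (always sampled) and 4 controls
# (each sampled with probability 0.5).
pi = np.array([1.0, 1.0, 0.5, 0.5, 0.5, 0.5])
sampled = np.array([1, 1, 1, 0, 1, 0], dtype=bool)
s = np.array([1.0, 2.0, 4.0, np.nan, 6.0, np.nan])

ipw_estimate = ipw_mean(s, sampled, pi)   # weights sampled controls by 2
complete_case = float(np.mean(s[sampled]))  # ignores the sampling design
```

In this toy example the sampled controls each stand in for two participants, so the weighted mean differs from the unweighted complete-case mean, which over-represents cases.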