haldensify: Highly adaptive lasso conditional density estimation in R

Abstract

The haldensify R package serves as a toolbox for nonparametric conditional density estimation based on the highly adaptive lasso, a flexible nonparametric algorithm for the estimation of functional statistical parameters (e.g., conditional mean, hazard, density). Building upon an earlier proposal (Díaz and van der Laan, 2011), haldensify leverages the relationship between the hazard and density functions to estimate the latter by applying pooled hazard regression to a synthetic repeated measures dataset created from the input data, relying upon the framework of cross-validated loss-based estimation to yield an optimal estimator (Dudoit and van der Laan, 2005; van der Laan et al., 2004). While conditional density estimation is a fundamental problem in statistics, arising naturally in a variety of applications (including machine learning), it plays a critical role in estimating the causal effects of continuous- or ordinal-valued treatments. In such settings this covariate-conditional treatment density has been termed the generalized propensity score (Hirano and Imbens, 2004; Imai and Van Dyk, 2004), and, like its analog for binary treatments (Rosenbaum and Rubin, 1983), serves as a key ingredient in developing both inverse probability weighted and doubly robust estimators of causal effects (Díaz and van der Laan, 2012, 2018; Haneuse and Rotnitzky, 2013; Hejazi et al., 2022).
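The hazard-to-density construction described in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than R, it does not use or reproduce the haldensify API, and it makes two simplifying assumptions: there are no covariates, and the hazard in each bin is estimated by a simple empirical proportion in place of a highly adaptive lasso regression. All function names (`to_long_format`, `pooled_hazards`, `hazards_to_density`) are hypothetical and chosen only for illustration.

```python
import bisect
import random


def to_long_format(values, edges):
    """Expand each observation into one row per bin it was 'at risk' for.

    Each row is (obs_id, bin_id, in_bin); in_bin is 1 only for the bin
    that actually contains the observation.
    """
    n_bins = len(edges) - 1
    rows = []
    for obs_id, value in enumerate(values):
        # Bins are left-closed; clip so the largest value lands in the last bin.
        last = min(bisect.bisect_right(edges, value) - 1, n_bins - 1)
        for bin_id in range(last + 1):
            rows.append((obs_id, bin_id, int(bin_id == last)))
    return rows


def pooled_hazards(rows, n_bins):
    """Estimate the discrete hazard per bin from the pooled long-format rows.

    Assumption: with no covariates, the pooled regression reduces to the
    proportion of at-risk rows in each bin whose indicator equals 1.
    """
    at_risk = [0] * n_bins
    events = [0] * n_bins
    for _, bin_id, in_bin in rows:
        at_risk[bin_id] += 1
        events[bin_id] += in_bin
    return [e / r for e, r in zip(events, at_risk)]


def hazards_to_density(hazards, edges):
    """Map hazards to a density: hazard times the probability of reaching
    the bin, divided by the bin width."""
    density, survival = [], 1.0
    for k, h in enumerate(hazards):
        density.append(h * survival / (edges[k + 1] - edges[k]))
        survival *= 1.0 - h
    return density


random.seed(0)
values = [random.gauss(0.0, 1.0) for _ in range(500)]
lo, hi = min(values), max(values)
n_bins = 10
edges = [lo + (hi - lo) * k / n_bins for k in range(n_bins + 1)]

rows = to_long_format(values, edges)
hazards = pooled_hazards(rows, n_bins)
density = hazards_to_density(hazards, edges)
```

In this covariate-free special case the result coincides with the ordinary histogram density estimate, which makes the mapping easy to check. With covariates, the per-bin proportion would be replaced by a regression of the indicator on the bin and the covariates, which is the role the abstract assigns to the highly adaptive lasso.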

Publication
In Journal of Open Source Software
Nima Hejazi
Assistant Professor of Biostatistics

My research lies at the intersection of causal inference and machine learning, developing flexible methodology for statistical inference tailored to modern experiments and observational studies in the biomedical and public health sciences.