Nima Hejazi

PhD Candidate in Biostatistics

University of California, Berkeley

Biography

I will soon start as an NSF postdoctoral research fellow at Weill Cornell Medicine, working with Iván Díaz and collaborating with David Benkeser. I am completing my PhD in biostatistics at UC Berkeley, under the guidance of Mark van der Laan and Alan Hubbard. I am a founding core developer of the tlverse project, the software ecosystem for targeted learning. During my graduate studies, I enjoyed a variety of collaborations with the Bill & Melinda Gates Foundation, Fred Hutchinson Cancer Research Center, Kaiser Permanente Division of Research, SiriusXM, and Netflix.

My research interests sit at the intersection of nonparametric causal inference and machine learning, particularly the development of statistical procedures tailored for efficient estimation and robust inference in flexible statistical models. Broadly, I am motivated by methodological issues arising from high-dimensional inference, loss-based estimation, semiparametric theory, and complex study designs, usually inspired by applications in computational biology, epidemiology, and vaccine trials. I am also keenly interested in both high-performance statistical computing and the critical role that open source software plays in reproducible and replicable science.

Interests

  • Causal Inference and Censored Data Models
  • Nonparametric Estimation and Machine Learning
  • Semiparametric Theory and Robust Statistics
  • High-Dimensional and Computational Biology
  • Statistical Computing and Reproducible Research

Education

  • PhD in Biostatistics (designated emphasis in Computational & Genomic Biology), 2021

    University of California, Berkeley

  • MA in Biostatistics, 2017

    University of California, Berkeley

  • BA with a triple major in Molecular & Cell Biology (emphasis in Neurobiology), Psychology, and Public Health, 2015

    University of California, Berkeley

Recent Publications

(see CV for a full list)

Recent & Upcoming Talks

Leveraging the Causal Effects of Stochastic Interventions to Evaluate Vaccine Efficacy in Two-phase Trials
Evaluating the Causal Impacts of Vaccine-induced Immune Responses in Two-phase Vaccine Efficacy Trials
Efficient Estimation of Stochastic Intervention Effects in Causal Mediation Analysis

Teaching

current courses

I will not be teaching during the 2021-2022 academic year. Check back later.

past courses

  • Public Health 290: Biomedical Big Data Capstone Seminar (Targeted Learning in Practice), as graduate student instructor with Prof. Mark van der Laan; Spring 2021 at the University of California, Berkeley.

  • Public Health 240B / Statistics 245B: Survival Analysis and Causality, as graduate student instructor with Prof. Mark van der Laan; Fall 2020 at the University of California, Berkeley.

  • Public Health 290: Biomedical Big Data Capstone Seminar, as graduate student instructor with Prof. Alan Hubbard; Spring 2020 at the University of California, Berkeley.

  • Public Health 242C / Statistics 247C: Longitudinal Data Analysis, as graduate student instructor with Prof. Alan Hubbard; Fall 2019 at the University of California, Berkeley.

  • Public Health 290: Targeted Learning in Biomedical Big Data, as graduate student instructor with Prof. Mark van der Laan; Spring 2018 at the University of California, Berkeley.

Carpentries workshops

I am a member of Software Carpentry and Data Carpentry, through which I work on curriculum development, maintenance of lesson materials, and workshop delivery.

Software

Collected collateral damage from doing statistics research, hopefully useful to others.

Targeted Learning with the tlverse

The tlverse is an ecosystem of R packages for Targeted Learning, of which I am a co-founder and core developer. I have made significant contributions to several of its packages.

Causal Inference Meets Machine Learning

A significant part of my research program sits at the intersection of causal inference and statistical machine learning. I’ve (co-)developed R packages for a range of problems: causal mediation analysis, evaluating stochastic interventions under two-phase sampling, conditional density estimation, and survival analysis.

  • medshift: An R package for estimating the population intervention (in)direct effects based on stochastic interventions. Classical and efficient estimators are supported for the effects of incremental propensity score interventions and modified treatment policies. Joint work with Iván Díaz.
    [Docs] | [GitHub]

  • medoutcon: An R package for efficient estimation of interventional (in)direct effects subject to intermediate confounding, including one-step and targeted minimum loss estimators. Joint work with Iván Díaz and Kara Rudolph.
    [Docs] | [GitHub]

  • txshift: An R package for efficient estimation of and inference on causal effects of stochastic interventions on continuous-valued exposures. Robust estimation and efficient inference under two-phase sampling are supported. Joint work with David Benkeser.
    [Docs] | [GitHub] | [CRAN] | [Paper]

  • haldensify: An R package for nonparametric conditional density estimation based on the highly adaptive lasso, designed for estimating the generalized propensity score. Joint work with David Benkeser and Mark van der Laan.
    [Docs] | [GitHub] | [CRAN]

  • survtmle: An R package for the construction of targeted maximum likelihood estimates of marginal cumulative incidence in right-censored survival settings with and without competing risks, including estimation procedures that respect bounds. Joint work with David Benkeser.
    [Docs] | [GitHub] | [CRAN]

Computational Biology and Bioconductor

A parallel thread of my research concerns the development of novel statistical methodologies for application in high-dimensional and computational biology settings. Consequently, I have (co-)developed several R packages extending the Bioconductor Project.

Other Assorted Adventures

Contact