A Framework for Causal Segmentation Analysis with Machine Learning in Large-Scale Digital Experiments

Abstract

We present an end-to-end methodological framework for causal segment discovery that aims to uncover differential impacts of treatments across subgroups of users in large-scale digital experiments. Building on recent developments in causal inference and non- and semiparametric statistics, our approach unifies two objectives: (1) the discovery of user segments that stand to benefit from a candidate treatment based on subgroup-specific treatment effects, and (2) the evaluation of causal impacts of dynamically assigning units to a study’s treatment arm based on their predicted segment-specific benefit or harm. Our proposal is model-agnostic, can incorporate state-of-the-art machine learning algorithms into the estimation procedure, and applies to both randomized A/B tests and quasi-experiments. We also introduce sherlock, an open-source R package that implements the framework.
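
To make the two objectives concrete, here is a minimal Python sketch on toy data. It is an illustration only, not the sherlock API and not the estimator proposed in the talk: it assumes a randomized A/B test, so a plain within-segment difference in means stands in for the machine-learning-based estimation the abstract describes. The function names, segment labels, and the zero threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def segment_effects(records):
    """Treated-minus-control difference in mean outcome within each segment.

    Assumes randomized assignment; this is a stand-in for a proper
    ML-based estimator of subgroup-specific treatment effects.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for segment, treated, outcome in records:
        groups[segment][treated].append(outcome)
    return {
        seg: mean(arms[1]) - mean(arms[0])
        for seg, arms in groups.items()
        if arms[0] and arms[1]  # skip segments that are missing an arm
    }


def treatment_rule(effects, threshold=0.0):
    """Assign treatment to segments whose estimated effect exceeds the threshold."""
    return {seg: effect > threshold for seg, effect in effects.items()}


# Hypothetical toy data: (segment, treated indicator, outcome)
records = [
    ("new_user", 1, 5.0), ("new_user", 1, 7.0),
    ("new_user", 0, 3.0), ("new_user", 0, 4.0),
    ("returning", 1, 2.0), ("returning", 1, 3.0),
    ("returning", 0, 4.0), ("returning", 0, 5.0),
]

effects = segment_effects(records)  # objective (1): segment discovery
rule = treatment_rule(effects)      # objective (2): assignment by predicted benefit
print(effects)
print(rule)
```

In this toy data the "new_user" segment shows a positive estimated effect and "returning" a negative one, so the rule treats only the former. A real analysis would also need uncertainty quantification before acting on such a rule, which this sketch omits.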

Date
Fri, Nov 5, 2021 2:30 PM
Location
Boston, Massachusetts, United States (remote due to COVID-19)
Nima Hejazi
NSF Postdoctoral Research Fellow in Biostatistics

My research interests lie at the intersection of causal inference and machine learning, especially as applied to the statistical analysis of complex data from observational studies and experiments in the biomedical and health sciences.