A Framework for Causal Segmentation Analysis with Machine Learning in Large-Scale Digital Experiments

Abstract

We present an end-to-end methodological framework for causal segment discovery that aims to uncover differential impacts of treatments across subgroups of users in large-scale digital experiments. Building on recent developments in causal inference and non- and semiparametric statistics, our approach unifies two objectives: (1) discovering user segments that stand to benefit from a candidate treatment, based on subgroup-specific treatment effects, and (2) evaluating the causal impact of dynamically assigning units to a study’s treatment arm based on their predicted segment-specific benefit or harm. Our proposal is model-agnostic, can incorporate state-of-the-art machine learning algorithms into the estimation procedure, and applies to both randomized A/B tests and quasi-experiments. We also introduce sherlock, an open-source R package that implements the framework.

Date
Fri, Nov 5, 2021 2:30 PM
Location
Boston, Massachusetts, United States (remote due to COVID-19)
Nima Hejazi
Assistant Professor of Biostatistics

My research broadly concerns the intersection of causal inference and machine learning, including developing statistical methodology tailored to modern experiments and observational studies in the biomedical and public health sciences.