

[Department Colloquium] 2024, No. 40 || Optimal Design for A/B Testing in Partially Observable Time Series Experiments

Title: Optimal Design for A/B Testing in Partially Observable Time Series Experiments

Speaker: Professor Chengchun Shi (London School of Economics and Political Science)

Time: Friday, September 13, 2024, 10:00-11:00 AM

Venue: Room A404, Science Building (理科楼)

Abstract: Time series experiments, in which experimental units receive a sequence of treatments over time, are prevalent in technology companies, including ride-sharing platforms and trading companies. These companies frequently employ such experiments for A/B testing, to evaluate the performance of a newly developed policy, product, or treatment relative to a baseline control. Many existing solutions require that the experimental environment be fully observed to ensure the data collected satisfies the Markov assumption. This condition, however, is often violated in real-world scenarios. This gap between theoretical assumptions and practical realities challenges the reliability of existing approaches and calls for more rigorous investigations of A/B testing procedures.

In this paper, we study the optimal experimental design for A/B testing in partially observable environments. We introduce a controlled (vector) autoregressive moving average model to effectively capture a rich class of partially observable environments. Within this framework, we derive closed-form expressions, i.e., efficiency indicators, to assess the statistical efficiency of various sequential experimental designs in estimating the average treatment effect (ATE). A key innovation of our approach lies in the introduction of a weak signal assumption, which significantly simplifies the computation of the asymptotic mean squared errors of ATE estimators in time series experiments. We then develop two data-driven algorithms to estimate the optimal design: one utilizing constrained optimization, and the other employing reinforcement learning. We demonstrate the superior performance of our designs using a dispatch simulator and two real datasets from a ride-sharing company.

Host: 杨瑛