Conference Agenda

Session Overview
Session: MC11 - ML3: Prediction and regret
Time: Monday, 27/June/2022, MC 14:00-15:30

Session Chair: Omar Mouchtaki
Location: Forum 15


Presentations

Regret bounds for risk-sensitive reinforcement learning

Osbert Bastani (1), Jason Yecheng Ma (1), Estelle Shen (1), Wanqiao Xu (2)

(1) University of Pennsylvania, United States of America; (2) Stanford University, United States of America

Reinforcement learning is a promising strategy for data-driven sequential decision-making. In many real-world applications, it is desirable to optimize objectives that account for risk in the achieved outcomes. We prove the first regret bounds for reinforcement learning algorithms targeting a broad class of risk-sensitive objectives, including the popular conditional value at risk (CVaR) objective. Our analysis relies on novel characterizations of the risk-sensitive objective and the optimal policy.
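
For readers unfamiliar with the CVaR objective named in this abstract, the following is a minimal sketch of one common empirical definition: the average of the worst alpha-fraction of observed returns. The function name `empirical_cvar` and the lower-tail convention are assumptions made for illustration; the abstract does not specify the authors' exact formulation, and this is not their algorithm.

```python
import math


def empirical_cvar(returns, alpha):
    """Average of the worst ceil(alpha * n) returns (lower tail).

    Hypothetical helper for illustration only; the talk's exact
    definition of the risk-sensitive objective may differ.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    # Number of samples in the tail; at least 1 because alpha > 0.
    k = math.ceil(alpha * len(returns))
    # Sort ascending so the worst (smallest) returns come first.
    worst = sorted(returns)[:k]
    return sum(worst) / k
```

With alpha = 1 this reduces to the ordinary mean, so the risk-neutral objective is a special case under this convention.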



Prediction with missing data

Dimitris Bertsimas (1), Arthur Delarue (2), Jean Pauphilet (3)

(1) MIT Sloan School of Management, United States of America; (2) Georgia Institute of Technology, United States of America; (3) London Business School, United Kingdom

Missing information is inevitable in real-world data sets. While imputation is well-suited for statistical inference, its relevance for out-of-sample prediction remains unsettled. We analyze widely used data imputation methods and highlight their key deficiencies in making accurate predictions. Alternatively, we propose adaptive linear regression, a new class of models that can be directly trained and evaluated on partially observed data. We validate our findings on real-world data sets.



Data-driven newsvendor: operating in a heterogeneous environment

Omar Besbes, Will Ma, Omar Mouchtaki

Columbia University, New York

We study a newsvendor problem in which the decision-maker only observes historical demands. In contrast to the extant literature, we relax the i.i.d. assumption for past demands and assume instead that they are drawn from distributions within a distance r away from the future demand distribution. We establish an exact characterization of the worst-case regret of Sample Average Approximation (SAA). When r is small, we present a near-optimal algorithm which robustifies SAA by using fewer samples.
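
As background on the baseline this abstract analyzes, here is a minimal sketch of textbook Sample Average Approximation for the newsvendor problem: order the empirical demand quantile at the critical ratio underage / (underage + overage). The per-unit costs `underage` and `overage` and both function names are assumptions introduced for illustration; the abstract does not state a cost structure, and the robustified variant it mentions is not shown here.

```python
import math


def newsvendor_cost(q, demand, underage, overage):
    """Cost of ordering q when realized demand is `demand`."""
    return underage * max(demand - q, 0) + overage * max(q - demand, 0)


def saa_order_quantity(demands, underage, overage):
    """Smallest historical demand minimizing the average empirical cost.

    Hypothetical helper: returns the empirical quantile of the
    observed demands at the critical ratio.
    """
    if not demands:
        raise ValueError("demands must be non-empty")
    ratio = underage / (underage + overage)
    ordered = sorted(demands)
    # Index of the smallest sample whose empirical CDF reaches the ratio.
    k = max(1, math.ceil(ratio * len(ordered)))
    return ordered[k - 1]
```

For example, with past demands 10, 20, 30, 40 and costs underage = 3, overage = 1, the critical ratio is 0.75 and the sketch orders 30 units.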