Markov decision processes are an effective tool for modeling decision making in uncertain dynamic environments. The parameters of these models are often estimated from data, learned from experience, or designed by hand. It is therefore not surprising that the actual performance of a chosen strategy often differs significantly from the designer's initial expectations due to unavoidable modeling ambiguity. In this talk we address this uncertainty in the model parameters and its ramifications for decision making in dynamic environments. We start by highlighting the magnitude of the problem in a real-world, data-intensive decision problem. We then consider a methodological approach that enables the decision maker to take this uncertainty into account. By taking a Bayesian perspective, we can consider a percentile optimization approach that allows the decision maker to naturally optimize for a desired level of risk, measured in terms of percentile performance. We show that the resulting problem can be solved efficiently for some forms of this uncertainty and is NP-hard for others. We then explain how to address a very high-dimensional state space by using non-parametric statistical tools such as Gaussian processes to approximate the value function.
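To make the notion of percentile performance concrete, the following is a minimal sketch and not the method presented in the talk. It only evaluates a single fixed policy, it does not optimize over policies, and it assumes a Dirichlet posterior over each row of the transition matrix with a uniform prior. The function names, the transition counts, and the tail probability `eps` are all hypothetical and chosen for illustration.

```python
import numpy as np

def policy_value(P, r, gamma):
    """Discounted value of a fixed policy: solve (I - gamma * P) v = r."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)

def percentile_value(counts, r, gamma, start, eps, n_samples=2000, seed=0):
    """Monte Carlo estimate of the eps-quantile of the policy's value.

    Assumption (not from the talk): each row of the transition matrix has
    an independent Dirichlet posterior given observed transition counts
    and a uniform prior (hence the +1.0 below).
    """
    counts = np.asarray(counts, dtype=float)
    r = np.asarray(r, dtype=float)
    rng = np.random.default_rng(seed)
    values = np.empty(n_samples)
    for k in range(n_samples):
        # Draw one plausible transition matrix from the posterior, row by row.
        P = np.vstack([rng.dirichlet(row + 1.0) for row in counts])
        values[k] = policy_value(P, r, gamma)[start]
    # Under the sampled posterior, the value is at least this number with
    # estimated probability about 1 - eps.
    return np.quantile(values, eps), values.mean()

# Hypothetical two-state chain induced by some fixed policy.
counts = [[8, 2], [3, 7]]   # observed transitions under that policy
rewards = [1.0, 0.0]        # reward collected in each state
q10, mean = percentile_value(counts, rewards, gamma=0.9, start=0, eps=0.1)
print(f"10th-percentile value: {q10:.3f}, posterior mean value: {mean:.3f}")
```

Comparing the percentile with the posterior mean gives a rough sense of how much of the nominal performance rests on the estimated parameters being accurate. Turning this evaluation into an optimization over policies is the hard part, and that is where the tractability and NP-hardness results mentioned above apply.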
Biography

Shie Mannor graduated from the Technion with a BSc in Electrical Engineering and a BA in Mathematics (both summa cum laude) in 1996. After that he spent almost four years as an intelligence officer with the Israeli Defence Forces. He was involved in a few ventures in the high-tech industry. He earned his PhD in Electrical Engineering from the Technion in 2002, under the supervision of Nahum Shimkin. He was then a Fulbright postdoctoral associate with LIDS (MIT), working with John Tsitsiklis for two years. He joined the Department of Electrical and Computer Engineering at McGill University in July 2004, where he holds a Canada Research Chair in Machine Learning.