 191531920
 Markov Decision Theory and Algorithmic Methods
Course: 191531920
5 ECTS
Course type: Course
Language: English
Contact: dr.ir. A. Braaksma
E-mail: a.braaksma@utwente.nl
Academic year: 2020
Starting block: 1B
Registration procedure: Self-registration via OSIRIS Student
Register via OSIRIS: Yes
The aim is to become familiar with the general theory and techniques of Markov Decision Processes (MDPs) and to gain an overview of algorithmic methods for MDPs. By the end of the course, students are expected to be able to prove additional variants of the theory and to apply the results to specific examples.
Markov decision theory addresses optimal sequential decision-making in the presence of uncertainty. In particular, it provides solution methodologies for a wide range of problems concerning sequential decisions in a random environment, statistically modeled by a finite-state Markov chain. The optimal strategy is calculated by appropriate algorithms, which are derived and illustrated in the first part of the course.

The second part of the course focuses on algorithmic methods for large-scale Markov decision problems. Under some mild conditions exact results may be derived, but because of the large state space these are computationally intractable. Therefore, attention is paid to the state of the art in approximate dynamic programming, also known as reinforcement learning. This research area provides methods and techniques for approximating the optimal value and strategy of large-scale Markov decision problems.

The following subjects are discussed, amongst others:
• Discrete-time Markov decision processes with finite horizon
• Discrete-time Markov decision processes with infinite horizon and discounted or average rewards
• Large-scale Markov decision problems
• Approximate dynamic programming (also known as reinforcement learning)
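As an informal illustration of the kind of algorithm treated for discounted infinite-horizon problems, the sketch below runs value iteration on a small made-up MDP. It is not taken from the course material: the two-state example, the function name `value_iteration`, the data layout `P[a][s][t]` and `R[a][s]`, and the discount factor 0.9 are all assumptions chosen for illustration.

```python
def q_values(P, R, V, gamma):
    """Return Q[s][a] = R[a][s] + gamma * sum_t P[a][s][t] * V[t]."""
    n_actions, n_states = len(P), len(V)
    return [
        [R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
         for a in range(n_actions)]
        for s in range(n_states)
    ]


def value_iteration(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Approximate the optimal discounted value and a greedy policy.

    P[a][s][t] is the probability of moving from state s to state t
    under action a; R[a][s] is the expected one-step reward.
    """
    V = [0.0] * len(P[0])
    for _ in range(max_iter):
        # Bellman optimality update: maximise over actions in every state.
        V_new = [max(q) for q in q_values(P, R, V, gamma)]
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    # Greedy policy with respect to the final value estimate.
    Q = q_values(P, R, V, gamma)
    policy = [max(range(len(P)), key=lambda a: Q[s][a]) for s in range(len(V))]
    return V, policy


# Hypothetical example: action 0 ("stay") keeps the current state,
# action 1 ("move") switches state with probability 0.8.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.2, 0.8], [0.8, 0.2]],  # action 1: move
]
R = [
    [0.0, 1.0],  # staying pays 1 per step in state 1, nothing in state 0
    [0.0, 0.0],  # moving pays nothing
]

V, policy = value_iteration(P, R, gamma=0.9)
print("values:", V)
print("policy:", policy)
```

In this toy example the greedy policy moves out of state 0 and stays in state 1. Every iteration sweeps over all states, which hints at why such exact methods become impractical for very large state spaces.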
Recommended:
- Basic knowledge of probability theory at the level of S.M. Ross, Introduction to Probability Models, 8th edition, Academic Press, 2003 (Chapters 1-3).
- Basic knowledge of Markov chains at the level of S.M. Ross, Introduction to Probability Models, 8th edition, Academic Press, 2003 (Chapters 4-6), as covered in AM Module 8.
 Master Applied Mathematics
Book
 M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, Hoboken, 2005
 Hand-outs
Assessment: exam (70%), two assignments (10% each), and homework and presentation (10%)
