Course: 191531920
Markov Decision Theory and Algorithmic Methods
Course code: 191531920
Credits (ECTS): 5
Course type: Course
Language of instruction: English
Contact person: dr.ir. A. Braaksma
E-mail: a.braaksma@utwente.nl
Lecturers
Lecturer: dr.ir. A. Braaksma
Examiner: dr.ir. A. Braaksma
Course contact person: dr.ir. A. Braaksma
Academic year: 2021
Starting block: 1B
Registration procedure: Self-enrolment via OSIRIS Student
Registration via OSIRIS: Yes
Course objectives
The student becomes familiar with the general theory and techniques of Markov Decision Processes (MDPs) and gets an overview of algorithmic methods for MDPs. At the end of the course, students are expected to be able to prove additional variants of the theory and to apply the results to specific examples.
 
Content
Markov decision theory addresses optimal sequential decision-making in the presence of uncertainty. In particular, it provides solution methodologies for a wide range of problems concerning sequential decisions in a random environment that is modeled statistically by a finite-state Markov chain. The optimal strategy is calculated by appropriate algorithms, which are derived and illustrated in the first part of the course. The second part of the course focuses on algorithmic methods for large-scale Markov decision problems. Under some mild conditions exact results can be derived for these problems, but because of the large state space they are computationally intractable. Attention is therefore paid to the state of the art in approximate dynamic programming, also known as reinforcement learning. This research area provides methods and techniques for approximating the optimal value and strategy of large-scale Markov decision problems.

The following subjects are discussed, amongst others:
  • Discrete-time Markov decision processes with finite horizon
  • Discrete-time Markov decision processes with infinite horizon and discounted or average rewards
  • Large-scale Markov decision problems
  • Approximate dynamic programming (also known as reinforcement learning)
Prerequisites
Recommended:
- Basic knowledge of probability theory at the level of S.M. Ross, Introduction to Probability Models, 8th edition, Academic Press, 2003 (Chapters 1-3).
- Basic knowledge of Markov chains at the level of S.M. Ross, Introduction to Probability Models, 8th edition, Academic Press, 2003 (Chapters 4-6), as covered in AM Module 8.
Participating study
Master Applied Mathematics
Required materials
Book
M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, Hoboken, 2005
Handouts
Hand-outs
Recommended materials
-
Teaching methods
Lecture

Assessment
Exam, Assignments

Remark
Exam 70%, two assignments 10% each, homework and presentation 10%
