Course: 191531920 Markov Decision Theory and Algorithmic Methods
Course: 191531920
Credits (ECTS): 5
Course type: Course
Language of instruction: English
Contact person: dr.ir. A. Braaksma
E-mail: a.braaksma@utwente.nl
Lecturers
 Course contact: dr.ir. A. Braaksma. Examiner: dr.ir. A. Braaksma
Academic year: 2020
Starting block: 1B
Registration procedure: Self-registration via OSIRIS Student
Registration via OSIRIS: Yes
 Course objectives
 The student becomes familiar with the general theory and techniques of Markov Decision Processes (MDPs) and gains an overview of algorithmic methods for MDPs. At the end of the course, students are able to prove additional variants of the theory and to apply the results to specific examples.
 Content
 Markov decision theory addresses optimal sequential decision-making in the presence of uncertainty. In particular, it provides solution methodologies for a wide range of problems concerning sequential decisions in a random environment, statistically modeled by a finite-state Markov chain. The optimal strategy is calculated by appropriate algorithms, which are derived and illustrated in the first part of the course. The second part of the course focuses on algorithmic methods for large-scale Markov decision problems. Under some mild conditions, exact results may be derived; however, due to the large state space, computing them is intractable. Therefore, attention is paid to the state of the art in approximate dynamic programming, also known as reinforcement learning. This research area provides methods and techniques for approximating the optimal value and strategy of large-scale Markov decision problems.
 The following subjects are discussed, amongst others:
 • Discrete-time Markov decision processes with finite horizon
 • Discrete-time Markov decision processes with infinite horizon and discounted or average rewards
 • Large-scale Markov decision problems
 • Approximate dynamic programming (also known as reinforcement learning)
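As a small illustration of the kind of algorithm derived in the first part of the course (not part of the official course material), the following sketch implements value iteration for a discounted infinite-horizon MDP. The two-state, two-action MDP and all its numbers are made up for illustration only.

```python
# Sketch of value iteration for a discounted infinite-horizon MDP.
# P[a][s][t] is the probability of moving from state s to state t under
# action a; R[a][s] is the expected one-step reward for action a in state s.

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality operator until the value
    function changes by less than `tol` in the sup norm."""
    n_states = len(R[0])
    n_actions = len(R)
    V = [0.0] * n_states
    while True:
        V_new = [
            max(
                R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        if max(abs(V_new[s] - V[s]) for s in range(n_states)) < tol:
            return V_new
        V = V_new

# Hypothetical toy MDP: 2 states, 2 actions.
P = [[[0.8, 0.2], [0.1, 0.9]],   # transition matrix for action 0
     [[0.5, 0.5], [0.6, 0.4]]]   # transition matrix for action 1
R = [[1.0, 0.0],                 # rewards for action 0 in states 0, 1
     [0.5, 2.0]]                 # rewards for action 1 in states 0, 1

V = value_iteration(P, R)
```

Because the Bellman operator is a contraction with modulus gamma, the iteration converges to the unique optimal value function, from which an optimal stationary policy can be read off by taking the maximizing action in each state.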
Prerequisites
 Recommended:
 - Basic knowledge of probability theory at the level of: S.M. Ross, Introduction to Probability Models, 8th edition, Academic Press, 2003 (Chapters 1-3).
 - Basic knowledge of Markov chains at the level of: S.M. Ross, Introduction to Probability Models, 8th edition, Academic Press, 2003 (Chapters 4-6), as covered in AM Module 8.
 Participating study
 Master Applied Mathematics
Required materials
Book
 M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, Hoboken, 2005
Handouts
Recommended materials
-
Teaching methods
 Lecture
Assessment
 Exam. Remark: exam 70%, two assignments 10% each, homework and presentation 10%.