Course: 202100109
Reinforcement Learning
Course information
Course: 202100109
Credits (ECTS): 5
Course type: Course
Language of instruction: English
Contact person: dr.ing. A.B. Zander
E-mail: a.b.zander@utwente.nl
Lecturers
Course contact person
dr.ing. A.B. Zander
Examiner
dr.ing. A.B. Zander
Academic year: 2022
Starting block: 2B
Registration procedure: Self-enrol via OSIRIS Student
Registration via OSIRIS: Yes
Learning objectives
After completing this course, the student is able to:
  • explain the different dimensions of (value-based) reinforcement learning algorithms,
  • explain the ODE (ordinary differential equation) approach to show convergence for stochastic approximation schemes,
  • model a (real-world) sequential decision-making problem (SDMP) as a Markov decision process,
  • choose a suitable reinforcement learning algorithm to solve the SDMP, which may include the design of an appropriate approximation framework,
  • implement and test reinforcement learning algorithms using a modern software package,
  • analyze a given reinforcement learning algorithm with respect to stability, convergence, and optimality,
  • analyze, critically evaluate and explain a scientific article's reinforcement learning problem and the corresponding solution approach.
Content
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents should act in an environment to maximize a reward signal. Formally, the environment is given as a Markov decision process (MDP) for which the underlying dynamics may be known or unknown. This course introduces techniques for modeling and solving RL problems, focusing on provable performance guarantees such as convergence and optimality.
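
As an illustration of the "known or unknown dynamics" distinction, the following minimal sketch solves one small MDP in two ways: value iteration, which reads the transition probabilities directly, and tabular Q-learning, which only sees sampled transitions. The two-state MDP, the function names, and all parameter values (discount factor, step size, number of steps, seed) are hypothetical choices made for this example and are not taken from the course material.

```python
import random

# Hypothetical toy MDP (an assumption for illustration): 2 states, 2 actions.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)],
        1: [(0.9, 1, 0.0), (0.1, 0, 0.0)]},
    1: {0: [(0.9, 1, 1.0), (0.1, 0, 1.0)],
        1: [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # discount factor (assumed)


def value_iteration(P, gamma, tol=1e-10):
    """Known dynamics: apply the Bellman optimality update to Q in place."""
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    while True:
        delta = 0.0
        for s in P:
            for a in P[s]:
                new = sum(p * (r + gamma * max(Q[s2].values()))
                          for p, s2, r in P[s][a])
                delta = max(delta, abs(new - Q[s][a]))
                Q[s][a] = new
        if delta < tol:
            return Q


def step(P, s, a, rng):
    """Sample (next_state, reward); the learner only ever sees these samples."""
    u = rng.random()
    cum = 0.0
    for p, s2, r in P[s][a]:
        cum += p
        if u < cum:
            return s2, r
    return s2, r  # guard against floating-point round-off


def q_learning(P, gamma, steps=20000, alpha=0.05, seed=0):
    """Unknown dynamics: tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    s = 0
    for _ in range(steps):
        a = rng.choice(list(P[s]))
        s2, r = step(P, s, a, rng)
        target = r + gamma * max(Q[s2].values())
        Q[s][a] += alpha * (target - Q[s][a])  # stochastic approximation step
        s = s2
    return Q


def greedy(Q):
    """Greedy policy: the action with the largest Q-value in each state."""
    return {s: max(Q[s], key=Q[s].get) for s in Q}


Q_star = value_iteration(P, GAMMA)
Q_hat = q_learning(P, GAMMA)
print("greedy policy (value iteration):", greedy(Q_star))
print("greedy policy (Q-learning):     ", greedy(Q_hat))
```

The Q-learning update uses a constant step size for simplicity, so its estimates keep fluctuating around the optimal values instead of converging exactly; a decreasing step-size sequence is the usual setting in which convergence results are stated.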

The covered models and algorithms correspond to Chapters 1-10 of the textbook by Sutton and Barto, "Reinforcement Learning: An Introduction," 2nd edition. In addition to the textbook, we draw theory from Borkar's book "Stochastic Approximation: A Dynamical Systems Viewpoint" and further insights from Powell's book "Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions."

There will be two homework sets. One set will focus on theoretical exercises and the other on implementation and experimentation. In addition, students will study a recent scientific article on RL in groups of 2 or 3 and present it in class.

Assessment
  • Written exam (50%)
  • Two homework sets (30%)
  • Reading and presenting a scientific paper (20%)
Prerequisites
191531920 - Markov Decision Theory and Algorithmic Methods is a prerequisite.
Participating studies
  • Master Applied Mathematics
  • Master Computer Science
  • Master Electrical Engineering
  • Master Interaction Technology
  • Master Systems and Control
Required materials
Book
Sutton and Barto, "Reinforcement Learning: An Introduction", second edition, 2018. https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
Recommended materials
Book
Borkar, "Stochastic Approximation: A Dynamical Systems Viewpoint", 2008. https://doi.org/10.1007/978-93-86279-38-5
Articles
Powell, "A unified framework for stochastic optimization", EJOR, 275(3), 2019, 795-821. https://doi.org/10.1016/j.ejor.2018.07.014
Teaching methods
Lecture

Presentation(s)
Attendance required: Yes

Q&A session

Tutorial

Supervised self-study

Assessments
Written Exam
