Course module: 202100109
Reinforcement Learning
Course module: 202100109
Credits (ECTS): 5
Course type: Course
Language of instruction: English
Contact person: dr.ir. A.B. Zander
E-mail: a.b.zander@utwente.nl
Lecturer(s)
Contact person for the course: dr.ir. A.B. Zander
Examiner: dr.ir. A.B. Zander
Academic year: 2021
Starting block: 2B
Application procedure: You apply via OSIRIS Student
Registration using OSIRIS: Yes
Aims
After following this course, the student is able to:
  • model a real-world problem in a reinforcement learning model,
  • choose a suitable learning algorithm and design an appropriate approximation framework for that model,
  • use and extend a state-of-the-art software package for reinforcement learning,
  • validate a given algorithm with respect to convergence, stability, and optimality.
Content
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment to maximize cumulative reward. The environment is modeled as a Markov decision process (MDP) whose underlying model is unknown. This course introduces techniques for modeling and handling large-scale RL problems. The focus is on i) provable performance guarantees, such as convergence and speed of convergence, and ii) algorithmic techniques.
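
As an illustration of learning in an MDP whose model is unknown, the following is a minimal sketch of tabular Q-learning (covered in the Sutton and Barto textbook). It is not course material: the five-state chain environment, the constant step size `ALPHA`, the discount factor `GAMMA`, and the uniformly random behaviour policy are all assumptions chosen for this example. The learner only sees sampled transitions from `step` and never reads the transition rule directly.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed for this example)
ALPHA = 0.5           # constant step size (assumed for this example)


def step(state, action):
    """Environment dynamics, hidden from the learner.

    Returns (next_state, reward, done). Moving left from state 0 stays in
    state 0; reaching state 4 yields reward 1 and ends the episode.
    """
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, max_steps=100, seed=0):
    """Estimate action values from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Off-policy: explore with a uniformly random behaviour policy.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target; no bootstrap from the terminal state.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            if done:
                break
            state = next_state
    return q


q_table = q_learning()
# Greedy policy with respect to the learned values, for non-terminal states.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy problem the greedy policy derived from `q_table` should choose "right" in every non-terminal state. A constant step size suffices here because the chain is deterministic; in stochastic environments, convergence guarantees typically require decreasing step sizes, which is where stochastic approximation theory enters.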

The covered models and algorithms correspond to Chapters 1-10 in the textbook by Sutton and Barto: “Reinforcement Learning: an Introduction,” 2nd edition. In addition to the textbook, we draw theory from Borkar’s book “Stochastic Approximation, A Dynamical Systems Viewpoint” and further insights from Powell’s article “A unified framework for stochastic optimization.”

There will be two mandatory homework sets on these topics. These sets will include both theoretical exercises and exercises focused on implementation and obtaining numerical results. In addition, students will study a recent scientific paper in groups of two or three, under the supervision of a lecturer, and present it in class.

Assessment
  • Written exam (60%)
  • Two mandatory homework sets (20%)
  • Reading and presenting a scientific paper (20%)
Assumed previous knowledge
191531920 - Markov Decision Theory and Algorithmic Methods is a prerequisite.
Participating studies
  • Master Applied Mathematics
  • Master Computer Science
  • Master Electrical Engineering
  • Master Interaction Technology
  • Master Systems and Control
Required materials
Book
Sutton and Barto, "Reinforcement Learning: an Introduction", second edition, 2018. https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
Recommended materials
Book
Borkar, "Stochastic Approximation, A Dynamical Systems Viewpoint", 2008. https://doi.org/10.1007/978-93-86279-38-5
Articles
Powell, "A unified framework for stochastic optimization", EJOR, 275(3), 2019, 795-821. https://doi.org/10.1016/j.ejor.2018.07.014
Instructional modes
Lecture

Presentation(s)
Presence duty: Yes

Q&A

Self study with assistance

Tutorial

Tests
Written Exam
