Course module: 202100109
Reinforcement Learning
Credits (ECTS): 5
Course type: Course
Language of instruction: English
Contact: A.B. Zander
Contact person for the course: A.B. Zander
Examiner: A.B. Zander
Academic year: 2021
Starting block
Application procedure: You apply via OSIRIS Student
Registration using OSIRIS: Yes
After following this course, the student is able to:
  • formulate a real-world problem as a reinforcement learning model,
  • choose a suitable learning algorithm and design an appropriate approximation framework for that model,
  • use and extend a state-of-the-art software package for reinforcement learning,
  • validate a given algorithm with respect to convergence, stability, and optimality.
Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment to maximize cumulative reward. The environment is formulated as a Markov decision process (MDP) whose underlying model is unknown. This course introduces techniques for modeling and handling large-scale RL problems. The focus is on i) provable performance guarantees, such as convergence and speed of convergence, and ii) algorithmic techniques.
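
As an illustration of this setting, the sketch below runs tabular Q-learning (treated in Sutton and Barto, Chapter 6) on a small chain MDP. The environment, its rewards, the function names, and all parameter values are hypothetical choices made for this example and are not course material. The agent only observes sampled transitions and never reads the transition model directly.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical environment: reward 1 only when the last state is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, and break ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, action)
            # Bootstrapped target; no future value beyond a terminal state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Greedy action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

For this toy chain, `greedy_policy(q_learning())` selects "move right" in every non-terminal state, which is the optimal policy here. Larger problems would replace the table `q` with a function approximator.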

The covered models and algorithms correspond to Chapters 1-10 of the textbook by Sutton and Barto, “Reinforcement Learning: An Introduction,” 2nd edition. In addition to the textbook, we draw theory from Borkar’s book “Stochastic Approximation, A Dynamical Systems Viewpoint” and further insights from Powell’s article “A unified framework for stochastic optimization.”

There will be two mandatory homework sets on these topics. These sets will include both theoretical exercises and exercises focused on implementation and obtaining numerical results. In addition to these topics, students will study a recent scientific paper in groups of 2 or 3. They will study this paper under the supervision of a lecturer and present the paper in class.

  • Written exam (60%)
  • Two mandatory homework sets (20%)
  • Reading and presenting a scientific paper (20%)
Assumed previous knowledge
191531920 - Markov Decision Theory and Algorithmic Methods is a prerequisite.
Participating study
Master Applied Mathematics
Master Computer Science
Master Electrical Engineering
Master Interaction Technology
Master Systems and Control
Required materials
Sutton and Barto, "Reinforcement Learning: An Introduction", second edition, 2018.
Recommended materials
Borkar, "Stochastic Approximation, A Dynamical Systems Viewpoint", 2008.
Powell, "A unified framework for stochastic optimization", EJOR, 275(3), 2019, 795-821.
Instructional modes

Presence duty: Yes
Self study with assistance
Written Exam
