
This course addresses 12 learning objectives. The codes in brackets indicate the link with the Professional Academic Qualifications of the IEM MSc program. By the end of this course, the student will be able to:
 Explain the core concepts of reinforcement learning, the computational challenges it tackles and the potential applications [A3, B3];
 Classify reinforcement learning algorithms into one of four policy classes [A3];
 Formulate typical dynamic resource allocation problems in financial engineering and operations research as a Markov Decision Process model [A1];
 Design appropriate features to capture the most salient properties of problem state values [A2];
 Model and apply basic value function algorithms [A3];
 Model and apply basic policy function algorithms [A3];
 Explain the relevant tunable parameters that determine learning performance, including features, learning rate, exploration rate and discount rate [A3, A4, B3];
 Apply neural networks in the context of (deep) reinforcement learning, as well as other contemporary developments in the domain [A3];
 Design classical reinforcement learning problems with the use of neural networks [A2];
 Transfer classical finance problems to the context of deep reinforcement learning [A3];
 Apply reinforcement learning principles in the broader context of data science, starting with retrieving raw data and concluding with comprehensible insights [A4, B3];
 Explain and present design choices and business implications to both team members and external stakeholders [A4, B1, B2, B3].



Most business problems entail an intelligent allocation of resources over time (e.g., distributing budget over a stock portfolio, allocating parcels to trucks), requiring a strategy that maximizes profits (or minimizes costs) while dealing with uncertainties. The ability to solve such complex business problems is essential for industrial engineers. Reinforcement learning is a popular solution method for these kinds of dynamic resource allocation problems. The key objective of this course is to provide an introduction to reinforcement learning and a basis for handling such problems in practice. In addition, the course aims to let students experience the complete data science pipeline, ranging from data ingestion to providing managerial insights. The course is primarily geared towards Industrial Engineering & Management, but may be of interest to other disciplines.
The course aims to give students a global understanding of reinforcement learning in the context of financial engineering and operations research. After finishing the course, students should be able to formally describe allocation problems, and select and apply appropriate reinforcement learning algorithms within their domain. To this end, the course combines lectures, assignments, a project and a written examination.
The lectures will treat the theoretical foundations of reinforcement learning, various types of algorithms, and modeling techniques. Although we will mainly discuss the basics of reinforcement learning, more advanced techniques such as artificial intelligence, deep neural networks and multi-agent systems are also integrated in the course design. Demo models are shown in a tutorial context; these will be made available to the students for experimentation.
Hands-on experience is crucial to grasp the underlying mechanisms of reinforcement learning models. In groups, the students will tackle two assignments, in which they apply elementary reinforcement learning algorithms in Python and write a concise report. Starting models will be provided for the assignments, so that the focus can be on the algorithmic implementation.
In the project, the student groups are required to model and solve a standard problem in finance or operations research of their choosing, and select an appropriate algorithm for doing so. A satisfactory result demonstrates that they are able to critically assess a problem's key properties, transform it into a model, and work towards a suitable solution, which is what will be expected from them in practice as well. The project is closed with a presentation session.
The aim of the individual written test (exam) is to gauge the student's theoretical understanding of the course. The student should be able to formulate problems as a Markov decision model, understand the computational complexity of problems, and select appropriate solution models for the problem.
After completion of the course, the student will be able to conceptualize typical dynamic resource allocation problems as encountered in practice, transform them into formal mathematical models, and select and apply appropriate reinforcement learning algorithms to solve them. Furthermore, the understanding of the core principles and a broad outlook on solution classes form a springboard to more advanced topics and algorithms.




Assumed previous knowledge: Linear algebra, calculus, probability & statistics, Markov Decision Processes, programming
Master Industrial Engineering and Management 
Required materials
Recommended materials
Book: Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press. ISBN 9780262039246

Instructional modes
Lecture
Practical
Presentation(s) (presence duty: yes)
Project unsupervised
Tutorial

Tests: Written exam, assignments and presentation


 