The general aim is for the student to gain knowledge of recent developments in deep learning theory. At the end of the course, the student
- is able to explain the state-of-the-art of deep learning theory, including recent developments and open questions
- is able to read recent research papers and grasp the material in such a way that
  - the student can present it to their fellow students, and
  - the student can highlight the most important aspects of the paper by formulating suitable questions/assignments for the fellow students and subsequently grading them appropriately.
Although deep learning has already reached state-of-the-art performance in many machine learning tasks, our understanding of the underlying methodology is still in its infancy. Most applications rely on intuition and trial and error. So far, we have only a limited understanding of
- why we can reliably optimise non-convex objectives,
- how expressive our architectures are with respect to the class of hypotheses they describe, and
- why most complex models generalise to unseen examples even when trained on data sets orders of magnitude smaller than what classical statistical learning theory considers sufficient.
In this course, we cover the latest results in deep learning theory from different perspectives. We start by analysing the expressive power of neural networks and discuss why depth helps. We derive statistical risk bounds for different statistical settings and learn about their proof strategies as well as their limitations. We discuss optimisation-related results, analyse the energy landscape, examine the implicit bias of gradient descent, and learn about different regularisation methods.
After an initial survey of neural network architectures, the first few lectures cover fundamental tools from optimisation, statistics, and information theory. These lectures are given by the lecturer. This part comes with a weekly assignment, whose scores count towards the final grade. The second part of the course is devoted to a number of recent publications. This is the active part of the course, in which each student is responsible for presenting an assigned topic. How the student designs the session is up to them; interactive elements can be built in to get the other students to participate. Students will, of course, receive support from the instructor in preparing their presentations.
As the purpose of this class is to get students engaged with new research in deep learning theory, the majority of the credit is awarded for the presentation. Specifically, the final grade combines the quality of the student's presentation (60%), overall participation during the course (10%), and the average assignment score (30%).
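As a quick illustration, the weighted grading scheme above can be sketched as follows; the function name and the 0–100 score scale are assumptions chosen for the example, not part of the official course rules.

```python
# Sketch of the weighted grading scheme described above.
# Weights from the course description: presentation 60%,
# participation 10%, assignments 30%.
# The 0-100 score scale and function name are illustrative assumptions.

def final_grade(presentation: float, participation: float, assignments: float) -> float:
    """Combine the three component scores into a weighted final grade."""
    return 0.6 * presentation + 0.1 * participation + 0.3 * assignments

# Example: strong presentation, full participation, good assignment average.
print(final_grade(90, 100, 80))  # 0.6*90 + 0.1*100 + 0.3*80 = 88.0
```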