
After this course, the student is able to:
 Use, apply and understand several types of advanced statistical learning models
 Apply different methods to build and validate a statistical learning model, including variable selection
 Deal with missing data using appropriate methods
 Critically appraise the outcomes of such statistical models for health care or other application fields
 Use R and RStudio for applying statistical learning
 Apply the obtained knowledge to a real-life practical problem


The amount of data available in health care, for example, has grown exponentially over the last two decades. Moreover, the diversity of digital health data has increased with the rise of genomics, proteomics, diagnostics and imaging, lab tests and wearables. Statistical learning is seen as a promising route to better risk stratification, personalized medicine, and health care optimization. In other fields, too, the amount and variety of available data is rising.
Statistical learning is the combination of machine learning with statistics. Like machine learning, it studies computer algorithms that learn from data with the goal of making predictions, and it applies the same algorithms; in addition, it involves statistical models and the assessment of their uncertainty. Statistical learning is about automatically learning the structure, patterns and regularities in complicated data, and using these patterns to predict future data. Although statistical learning is applied in all kinds of fields, we focus primarily on health science; the methodology learned in this course can easily be used in other fields.
Main topics in the course will be:
 Supervised learning with regression and/or classification and tree-based statistical models
 Model selection with resampling methods (cross-validation, bootstrap) and regularization (ridge, lasso)
 Variable selection methods
 Unsupervised learning (principal components analysis and clustering methods)
 Methods for dealing with missing data
In this course you will learn how to derive sound statistical models, including variable selection and dealing with missing data, and how to assess their quality and goodness of fit using the statistical programming language R. In the first week we will start with a brief introduction to R and RStudio. In the following weeks, you will learn the theory behind the models through self-study and available online material. Every week there will be two meetings of two hours each, where you can work under supervision on the lab assignments and ask questions about the theory. During these weeks you also need to work on assignments for a health-related project (individually or with a fellow student), where the aim is to apply the models that you have learned. For this project you need to submit results every week and write a short report at the end. The final assessment will be an individual assignment on the theory and/or application of statistical learning. The focus of the course is not only on applications in health care.





Master Industrial Engineering and Management 
Required material
Recommended material
 Book: An Introduction to Statistical Learning (with Applications in R), G. James, D. Witten, T. Hastie & R. Tibshirani; Springer; ISBN 9781461471370. Available online as a PDF from the authors.

Teaching methods
 Lecture
 Practical
 Supervised project
 Unsupervised self-study

Assessment
 Project + individual assignment


 