Modern Biostatistics and Statistical Learning

Course Number
CHL5229H
Series
5200 (Biostatistics)
Course Instructor(s)
Rafal Kustra

Course Description

This course introduces students to statistical methods suitable for analysing large observational data, data constructed from multiple institutional databases, web-based data, and any data that may benefit from non-classical approaches. To make the theory more intuitive and accessible, it is presented as an extension of classical tools such as linear and logistic regression, parametric hypothesis testing, and multivariate Gaussian theory.

Course Objectives

At the end of the course students should be aware of:

  1. Distinction between, and application of, supervised and unsupervised statistical learning problems;
  2. Classification problems, and the similarities between classifiers and regression models;
  3. Non-classical regression and classification tools: loess and spline smoothing, tree-based methods, and kernel-based methods;
  4. Importance and implementation of prediction error control in statistical modelling using v-fold cross-validation and leave-one-out bootstrap; and
  5. Importance of, and tools for, complex data handling, testing, and manipulation.

Methods of Assessment

Assignments (2) 50%
Midterm 25%
Paper presentation 15%
Participation 10%