
Unit information: Statistical Methods 1 in 2021/22

Unit name Statistical Methods 1
Unit code MATHM0041
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Dr. Song Liu
Open unit status Not open
Pre-requisites None
Co-requisites None

School/department School of Mathematics
Faculty Faculty of Science

Description including Unit Aims

This unit covers the topic of prediction, from the initial consultation with the client all the way through to delivering an effective prediction algorithm and quantifying its out-of-sample performance. Prediction is an important activity in its own right, but we also use it to illustrate many of the major topics in computational statistics. These include statistical optimality, the limitations of naive approaches such as nearest-neighbour and cross-validation, the Normal Linear Model for regression, the concepts of prior, posterior, and predictive distributions, regression modelling with basis expansions, extensions to data-dependent regressors, and the treatment of more complex parameters through optimization.
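As an illustration of the prior, posterior, and predictive distributions mentioned above, the following is a minimal sketch and not the unit's own material. It assumes a deliberately simplified model: a single regressor through the origin, a known noise variance `sigma2`, and a zero-mean Normal prior with known variance `tau2`. The function name, parameter names, and data are hypothetical.

```python
def posterior_and_predictive(xs, ys, x_new, sigma2=1.0, tau2=10.0):
    """Conjugate update for y_i = beta * x_i + noise, noise ~ N(0, sigma2),
    with prior beta ~ N(0, tau2). Both variances are assumed known."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Posterior for beta is Normal: precisions add, means are precision-weighted.
    post_var = 1.0 / (1.0 / tau2 + sxx / sigma2)
    post_mean = post_var * (sxy / sigma2)

    # Plug-in predictive: treat post_mean as if it were the true beta.
    plug_in = (post_mean * x_new, sigma2)

    # Integrate-out predictive: average over the posterior of beta,
    # which adds the parameter uncertainty x_new**2 * post_var.
    integrate_out = (post_mean * x_new, sigma2 + x_new ** 2 * post_var)
    return post_mean, post_var, plug_in, integrate_out


# Hypothetical data, for illustration only.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.1, 1.9, 3.2, 3.9]
post_mean, post_var, plug_in, integrate_out = posterior_and_predictive(xs, ys, x_new=3.0)
print("posterior mean and variance:", post_mean, post_var)
print("plug-in predictive (mean, variance):", plug_in)
print("integrate-out predictive (mean, variance):", integrate_out)
```

Both predictive distributions share the same mean here, but the integrate-out version has a larger variance because it accounts for uncertainty in the parameter.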

In its most sophisticated form, the extended Normal Linear Model provides a powerful and computationally tractable platform for regression and optimal prediction, but does not provide similar benefits for classification. The later part of the unit considers the challenges presented by classification, and the various approaches that are used to approximate the predictive distribution. These approaches, including numerical optimization of penalized likelihoods and approximate numerical integration, are core tools in computational statistics and machine learning.
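To make "numerical optimization of penalized likelihoods" concrete, here is a minimal sketch of a plug-in classifier, assuming a logistic model with one regressor, a ridge penalty on the slope, and plain gradient descent. It is not the unit's own method and is not how 'glmnet' works internally; the function names, step size, penalty values, and data are all hypothetical.

```python
import math


def sigmoid(z):
    # Logistic function written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_penalized_logistic(xs, labels, lam, steps=5000, lr=0.1):
    """Minimise mean negative log-likelihood + (lam / 2) * slope**2
    by gradient descent. The intercept is left unpenalised."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss wrt the linear predictor
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * (g1 / n + lam * b1)
    return b0, b1


# Hypothetical, non-separable data: two labels are "on the wrong side".
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

weak = fit_penalized_logistic(xs, labels, lam=0.01)
strong = fit_penalized_logistic(xs, labels, lam=1.0)
print("lam=0.01 (intercept, slope):", weak)
print("lam=1.0  (intercept, slope):", strong)
print("P(class 1 | x=2) with lam=1.0:", sigmoid(strong[0] + strong[1] * 2.0))
```

A larger penalty shrinks the fitted slope towards zero, so the plug-in probabilities move towards 0.5.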

Intended Learning Outcomes

By the end of the unit students should be able to:

  • Formulate a prediction problem with a client, either regression or classification, including a discussion about an appropriate loss function, and about out-of-sample performance.
  • Demonstrate both theoretically and numerically how the curse of dimensionality undermines naive approaches to prediction.
  • Describe the purpose of a parametric model, and explain the benefits and the limitations of integrate-out versus plug-in, for the model parameters.
  • For the Normal Linear Model for regression, derive the explicit forms for the posterior and predictive (‘integrate-out’) distributions, and the marginal likelihood as a function of hyperparameters, and code these into an efficient and numerically stable prediction algorithm.
  • State the discriminative modelling framework for classification, and contrast it with regression, highlighting the additional challenges that classification brings over regression.
  • Outline numerical approximation methods which can be used for both ‘plug-in’ and ‘integrate-out’ approaches to classification, and code these into an algorithm based on existing tools, such as the Generalized Linear Model and ‘glmnet’ in R.
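
The outcome on the curse of dimensionality can be illustrated numerically. The following is a minimal sketch, assuming points drawn uniformly from the unit hypercube and Euclidean distance; the function name, sample size, and seed are hypothetical choices. As the dimension grows, the nearest neighbour of a query point becomes almost as far away as the farthest point, which is one way in which nearest-neighbour prediction loses its meaning.

```python
import math
import random


def nearest_to_farthest_ratio(dim, n_points=200, seed=0):
    """Ratio of the nearest to the farthest Euclidean distance from a random
    query point to n_points points drawn uniformly from [0, 1]^dim."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return min(dists) / max(dists)


# A ratio near 0 means the nearest neighbour is genuinely "near";
# a ratio near 1 means all points are roughly equally far away.
for dim in (2, 10, 100, 1000):
    print(dim, round(nearest_to_farthest_ratio(dim), 3))
```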

Teaching Information

Some lab-based instruction

Assessment Information

Formative: homework each week.

Summative:

  1. A personal portfolio of notes, code snippets, and vignettes, 30%.
  2. Two pieces of assessed coursework, 20% each.
  3. A group project, 30%.

Resources

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. MATHM0041).

How much time the unit requires
Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take you 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning, and assessment activity.

See the Faculty workload statement relating to this unit for more information.

Assessment
The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. If you have self-certificated your absence from an assessment, you will normally be required to complete it the next time it runs (this is usually in the next assessment period).
The Board of Examiners will take into account any extenuating circumstances and operates within the Regulations and Code of Practice for Taught Programmes.
