# Unit information: Linear and Generalised Linear Models in 2020/21

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up to date information.

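| Field | Value |
| --- | --- |
| Unit name | Linear and Generalised Linear Models |
| Unit code | MATH30013 |
| Credit points | 20 |
| Level of study | H/6 |
| Teaching block(s) | Teaching Block 1 (weeks 1 - 12) |
| Unit director | Dr. Cho |
| Open unit status | Not open |
| Pre-requisites | MATH11005 Linear Algebra and Geometry and MATH20800 Statistics 2 |
| Co-requisites | None |
| School/department | School of Mathematics |
| Faculty | Faculty of Science |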

## Description

Unit Aims

• To provide students with the definition of linear models and a theoretical treatment of least squares estimation, computed via the QR decomposition, as a basis for statistical inference.
• To provide students with the definition of generalised linear models and a theoretical treatment of maximum likelihood estimation.
• To demonstrate the procedure of model fitting, including model diagnostics, stepwise model building and interpretation of the results.
• To enable students to use 'lm', 'glm' and related functions in R to handle the computational aspects of model fitting.
• To provide students with a brief introduction to penalised least squares methods for handling 'big data'.

Unit Description

The Linear Model is the ubiquitous model in Statistics. It is used extensively in experiments to evaluate interventions (e.g. medicine and public health, toxicology assessment, agricultural field trials, experimental psychology), and also to analyse observational data and make predictions. The first half of this unit covers the theory and practice of Linear Modelling, including least squares-based estimation and computation, model building, diagnostics and hypothesis testing, together with the use of the statistical computing environment R (most notably the 'lm' function and its methods).
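As a rough illustration of least squares computed via a QR decomposition, the sketch below fits a straight line to a small data set. It is written in plain Python rather than R, purely for illustration: the unit itself works with R's 'lm' function. The helper names `qr_gram_schmidt` and `least_squares` and the toy data are assumptions made for this example, not part of the unit materials, and modified Gram-Schmidt is only one of several ways to obtain a QR factorisation.

```python
import math


def qr_gram_schmidt(X):
    """Thin QR factorisation of an n-by-p matrix X (given as a list of rows).

    Uses modified Gram-Schmidt. Returns (Q_cols, R), where Q_cols is a list
    of p orthonormal columns of length n and R is p-by-p upper triangular.
    Hypothetical helper for illustration; assumes X has full column rank.
    """
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    Q_cols = []
    R = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = cols[j][:]
        # Remove the components along the columns of Q found so far.
        for k in range(j):
            R[k][j] = sum(q * x for q, x in zip(Q_cols[k], v))
            v = [x - R[k][j] * q for x, q in zip(v, Q_cols[k])]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        if R[j][j] == 0.0:
            raise ValueError("design matrix is rank deficient")
        Q_cols.append([x / R[j][j] for x in v])
    return Q_cols, R


def least_squares(X, y):
    """Least squares coefficients obtained by solving R beta = Q^T y."""
    Q_cols, R = qr_gram_schmidt(X)
    p = len(R)
    z = [sum(q * yi for q, yi in zip(Q_cols[j], y)) for j in range(p)]
    beta = [0.0] * p
    # Back substitution, starting from the last coefficient.
    for j in reversed(range(p)):
        s = z[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / R[j][j]
    return beta


# Toy data (made up for this example): a noisy straight line.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
X = [[1.0, xi] for xi in x]  # design matrix with an intercept column

beta = least_squares(X, y)
print("intercept:", beta[0], "slope:", beta[1])
```

Working with the triangular system R beta = Q^T y avoids forming X^T X explicitly, which is the usual numerical argument for preferring QR to the normal equations.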

Linear Modelling has its limitations, notably for quantities which are discrete. In healthcare, for example, we would like to model the response of a patient to a new treatment; typically this response is binary (yes/no, presence/absence). Alternatively, we may wish to analyse count data, such as the number of occurrences of an event in a population, or for a person over a time interval. The second half of this unit provides an introduction to Generalised Linear Models, explaining how they extend the normal distribution implicitly assumed in Linear Models to the much larger Exponential Family of distributions, which includes the Binomial and the Poisson distributions, among many others. The theory and practice of Generalised Linear Models are covered, including maximum likelihood-based estimation and computation, diagnostics and hypothesis testing. The unit also covers practical aspects of fitting Generalised Linear Models in R (using the 'glm' function), including model choice, diagnostic checking and prediction. Several important applications are considered in detail: binary responses, categorical responses (i.e., more than two levels) and count data.
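To make the binary-response case concrete, the following sketch fits a logistic regression (Bernoulli response, logit link) with one covariate by Newton-Raphson, which for this canonical link coincides with Fisher scoring. It is plain Python for illustration only; in the unit this kind of fit is done with R's 'glm' function. The function name `fit_logistic` and the data are hypothetical, and the sketch assumes the data are not perfectly separable, so that the maximum likelihood estimate exists.

```python
import math


def fit_logistic(x, y, max_iter=25, tol=1e-12):
    """Maximum likelihood fit of P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))).

    Newton-Raphson on the log-likelihood, starting from b0 = b1 = 0.
    Hypothetical helper for illustration only.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Fitted probabilities and variance weights p * (1 - p).
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Score vector: derivative of the log-likelihood in (b0, b1).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Fisher information matrix [[a, b], [b, c]].
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * c - b * b
        # Newton step: solve the 2-by-2 system (information) * delta = score.
        d0 = (c * g0 - b * g1) / det
        d1 = (a * g1 - b * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data (made up for this example): the response tends to be 1 for larger x.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(x, y)
print("intercept:", b0, "slope:", b1)
```

At convergence the score equations hold, so the fitted probabilities sum to the number of observed successes; this is one quick sanity check on any logistic fit that includes an intercept.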

Relation to Other Units

This unit builds on the basic ideas of linear models introduced in Statistics 1 and Statistics 2, and extends them to deal with more general specifications. Other related units are Bayesian Modelling and Theory of Inference.

## Intended learning outcomes

Familiarity with the nature and common syntax of Linear and Generalised Linear Models, and with their use in a variety of applications.

Experience of fitting and analysing regression models in R.

## Teaching details

The unit will be taught through a combination of

• synchronous online and, if subsequently possible, face-to-face lectures
• asynchronous online materials, including narrated presentations and worked examples
• guided asynchronous independent activities such as problem sheets and/or other exercises
• synchronous weekly group problem/example classes, workshops and/or tutorials
• synchronous weekly group tutorials
• synchronous weekly office hours

## Assessment Details

• 90% Timed, open-book examination
• 10% Coursework

Raw scores on the examinations will be determined according to the marking scheme written on the examination paper. The marking scheme, indicating the maximum score per question, is a guide to the relative weighting of the questions. Raw scores are moderated as described in the Undergraduate Handbook.

If you fail this unit and are required to resit, reassessment is by a written examination in the August/September Resit and Supplementary exam period.