
Unit information: SWBio DTP: Statistics and Bioinformatics in 2016/17

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up to date information.

Unit name SWBio DTP: Statistics and Bioinformatics
Unit code BIOCM0010
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Professor Mendl
Open unit status Not open



SWBio DTP: Core Skills for Life Scientists, SWBio DTP: Science in Society, Business and Industry, SWBio DTP: Rotation Project 1, followed by SWBio DTP: Rotation Project 2

School/department School of Biochemistry
Faculty Faculty of Life Sciences


This unit aims to deliver a working knowledge and understanding of the range of statistical and bioinformatic methods commonly used in biological science research, and how such methods are deployed in analyses of data. It comprises two intensive week-long periods of classroom-based learning, each followed by a period of recommended and self-directed further reading and completion of assessment activities.

The analysis of data, and in particular of large datasets, is becoming a fundamental technique common to many areas of biological science research. It is therefore important that those entering the profession are familiar with such techniques, even if they are not directly relevant to their current research projects. The unit will provide students with a thorough grounding in the types of statistical tests that are available, an understanding of how and why each type of analysis can be deployed, and how to use R scripts to analyse data. It will include discussion of the limitations of each approach and the types of data to which each is appropriate. An appreciation of these limitations is essential if experiments are to be designed appropriately.

Bioinformatic analysis of DNA sequence and other data is also an essential skill, whether for phylogenetic studies, population genetic studies or gene expression analyses. This part of the unit will focus on how to manipulate and then analyse such datasets in a meaningful manner, and will include working in a Linux environment.

On completion, the student will be familiar with the terminology in common usage within these forms of analysis, be confident in using R and Linux in such analysis, and be able to identify the appropriate forms of analysis for their data and use these techniques to critically analyse relevant datasets.

Intended learning outcomes

To be able to:

  • Understand R and how it can be used for descriptive statistics and graphing, and in experimental design.
  • Design tests for association and difference - from basic (e.g. correlation, t-tests) to more advanced (e.g. regression, ANOVA).
  • Use statistical modelling on their experimental data and use general and generalised linear models.
  • Gain an understanding of multivariate models (e.g. ordination and cluster analysis) and more advanced modelling methods (e.g. mixed or additive models).
  • Use basic computational and programming skills to perform analyses within the Linux environment.
  • Gather and analyse moderate to large data sets.
  • Gain experience of the genomics approaches used to handle the output from massively parallel short-read sequencing.
  • Effectively communicate and collaborate with bioinformaticians in the handling, modelling, and analysis of large-scale biological data.

Teaching details

Lectures, seminars, practical activities and workshops.

Assessment Details

There will be two assessments: (1) short-answer statistical questions, to demonstrate an understanding of the conceptual and practical aspects of statistical analyses (50%), and (2) a bioinformatic practical report, to demonstrate understanding of and competency in bioinformatic analyses (50%).

Reading and References


Data Analysis with R Statistical Software: A Guidebook for Scientists. By Rob Thomas (2015)


UNIX and Perl to the Rescue!: A Field Guide for the Life Sciences (and Other Data-rich Pursuits). By Keith Bradnam and Ian Korf (2012). Cambridge University Press (19 July 2012). ISBN-10: 0521169828. ISBN-13: 978-0521169820