Unit information: SWBio DTP: Data Science and Machine Learning for the Biosciences in 2021/22

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up to date information.

Unit name SWBio DTP: Data Science and Machine Learning for the Biosciences
Unit code BIOCM0022
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Dr. Barker
Open unit status Not open
Pre-requisites

None

Co-requisites

BIOCM0010 SWBio DTP: Statistics and Bioinformatics,

BIOCM0013 SWBio DTP: Science in Society, Business and Industry,

BIOCM0021 SWBio DTP: Rotation Project 1,

BIOCM0020 SWBio DTP: Rotation Project 2

School/department School of Biochemistry
Faculty Faculty of Life Sciences

Description including Unit Aims

The key aim of this unit is to introduce doctoral students to the basics of coding, machine learning and the general principles of data science as applied to the analysis of data from the biosciences. Students are assumed to have minimal previous coding experience, although they will have made limited use of R in the co-requisite unit BIOCM0010, which is taken before this unit. By the end of the unit, students are expected to be able to complete a short coding project manipulating data relevant to their doctoral research.

The specific aims of this unit are:

  • to provide students with an introduction to programming, primarily using Python;
  • to familiarise students with data analysis using Python modules such as Pandas, NumPy and Matplotlib;
  • to give students an understanding of the process of software engineering, including design, documentation, testing and version control;
  • to introduce the basic theory and application of elementary machine learning techniques using TensorFlow, as applied to image analysis;
  • to familiarise students with the data science principles underlying model generation for deep learning applications, and with ethics in machine learning;
  • (optional) to cover intermediate to advanced programming in Python;
  • (optional) to cover parallel programming using task-based, message-passing and shared-memory models.

Intended Learning Outcomes

  1. An understanding of the process by which software is assembled and operated, as implemented within the Python platform.
  2. Familiarity with core data analysis modules implemented in Python, including Pandas, NumPy, Matplotlib and Scikit-learn.
  3. Ability to write, run and debug simple Python scripts for the analysis of biological data.
  4. An understanding of the basic principles of machine learning and their application to research tasks such as image processing.
  5. An understanding of, and demonstrated competence in, combining sequential segments of code to provide in-depth analysis of large data sets.

Teaching Information

This unit will have one intensive week of teaching, comprising lectures, workshops and practical activities, including some small-group activities. This will be followed by recommended and self-directed study to prepare students for the assessments.

Assessment Information

This is a pass/fail unit, and each assessment is marked against pass/fail criteria.

There will be 2 assessments:

(1) a short group project, to which all group members must contribute, including a verbal presentation to the whole cohort (pass/fail; must pass); and

(2) a short individual project, involving the development of simple software for elementary analysis of a large data set from the student's area of doctoral research (pass/fail; must pass).

Resources

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. BIOCM0022).

How much time the unit requires
Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the Faculty workload statement relating to this unit for more information.

Assessment
The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit. The Board considers each student's outcomes across all the units which contribute to each year's programme of study. If you have self-certificated your absence from an assessment, you will normally be required to complete it the next time it runs (this is usually in the next assessment period).
The Board of Examiners will take into account any extenuating circumstances and operates within the Regulations and Code of Practice for Taught Programmes.

Feedback