Unit information: Data Science Toolbox in 2021/22

Unit name Data Science Toolbox
Unit code MATHM0029
Credit points 20
Level of study M/7
Teaching block(s) Teaching Block 4 (weeks 1-24)
Unit director Dr. Lawson
Open unit status Not open
Pre-requisites

MATH10013 Probability and Statistics and MATH20800 Statistics 2

Co-requisites

MATHM0028 Introduction to Mathematical Cybersecurity

School/department School of Mathematics
Faculty Faculty of Science

Description

Unit Aims

The purpose of this unit is to provide all students with theoretical and (especially) practical data science literacy relevant to cybersecurity.

Unit Description

This unit will cover the following topics.

  1. Exploratory Data Analysis tools (including data summaries; regression; visualisation; clustering; statistical testing; outlier detection) using appropriate languages such as R and Python.
  2. Applied Machine Learning (including fitting Random Forests, topic models and neural networks; cross-validation; interpretation of performance metrics).
  3. Handling Big Data (including the use of command line tools; data processing algorithms, for example, Bloom filters and streaming summarisation; introduction to computational complexity; Big Data platforms, for example, Hadoop and Spark).

This unit will be partly assessed by coursework with a focus on real cybersecurity datasets.

Intended learning outcomes

By the end of the unit, students will:

  • Be able to access and process cybersecurity data into a format suitable for mathematical reasoning
  • Be able to use and apply basic machine learning tools
  • Be able to make and report appropriate inferences from the results of applying basic tools to data
  • Be able to use high-throughput computing infrastructure and understand appropriate algorithms
  • Be able to reason about problems involving real data and match them to appropriate theoretical methods and available methodology, in order to make correct inferences and decisions
  • Be able to work as part of a team to apply mathematical methods to difficult data science problems

Teaching details

The unit will be taught through a combination of

  • synchronous online and, if subsequently possible, face-to-face lectures
  • asynchronous online materials, including narrated presentations and worked examples
  • guided asynchronous independent activities such as problem sheets and/or other exercises
  • synchronous weekly group problem/example classes, workshops and/or tutorials
  • synchronous weekly group tutorials
  • synchronous weekly office hours

Assessment Details

50% Timed, open-book examination
50% Practical Assignments

Resources

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. MATHM0029).
