Unit information: Applied Data Science (Teaching Unit) in 2020/21

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up-to-date information.

Unit name Applied Data Science (Teaching Unit)
Unit code COMS30050
Credit points 0
Level of study H/6
Teaching block(s) Teaching Block 2 (weeks 13 - 24)
Unit director Professor Seth Bullock
Open unit status Not open
Pre-requisites

COMS10016 Imperative and Functional Programming and COMS10017 Object Oriented Programming and Algorithms I or equivalent

COMS10014 Mathematics for Computer Science A and COMS10013 Mathematics for Computer Science B or equivalent

COMS20011 Data-Driven Computer Science or equivalent

COMS30035 Machine Learning or equivalent

Good knowledge of machine learning.

Programming: Python or another major programming language (Java, C)

Maths: basic linear algebra, basic statistics, some calculus, some discrete maths.

Co-requisites

EITHER Undergraduate students in Year 3 must choose Assessment Unit COMS30051

OR M-level students must choose the Masters Level Assessment Unit COMSM0055

OR Interactive Artificial Intelligence CDT PhD students should choose Assessment Unit COMSM0056.

Please note: COMS30050 is the Teaching Unit for Applied Data Science. Undergraduate students can take this unit in either their third or fourth year, and must also choose the Assessment Unit for their year group. Interactive Artificial Intelligence CDT PhD students must choose the CDT Assessment Unit.

School/department School of Computer Science
Faculty Faculty of Engineering

Description including Unit Aims

This unit introduces key data science concepts and their application to support data-driven approaches to problem solving.

The aim of this unit is to allow students to acquire fundamental skills covering the full data science pipeline, including data pre-processing, manipulation, integration, storage, exploration, visualisation and privacy.

Students will study techniques to transform raw data into advanced representations that will enable a deeper understanding of the original data:

  • Data ingress and pre-processing
  • Data storage and data management
  • Data transformation and integration
  • Data exploration and visualisation
  • Data sharing, privacy and anonymisation

Students will also develop practical skills in handling structured and unstructured data, and gain hands-on experience of software tools widely used in real-world settings.

Intended Learning Outcomes

On successful completion of the unit, students will:

  • Acquire a working knowledge of practical data science, applied to real-world problems.
  • Be able to start from raw data and deliver a representation that allows a better understanding of the topics in the data.
  • Have experience of using software tools for data pre-processing and management.
  • Acquire first-hand experience in specific techniques for data storage.
  • Understand how visualisation strategies differ, in order to explore data efficiently.
  • Have learnt how to present and interpret data to/for a non-technical audience.
  • Be able to share data under privacy constraints.
  • Have practised teamwork and time management.

Teaching Information

Teaching will be delivered through a combination of synchronous and asynchronous sessions, including lectures, group work and self-directed exercises.

Assessment Information

100% coursework.

M-level students are expected to go deeper in their analysis and reflect on the process and the steps followed.

CDT students are also assessed by 100% coursework.

Reading and References

  • Leskovec, Jure, Rajaraman, Anand and Ullman, Jeffrey David, Mining of Massive Datasets (Cambridge University Press, 2011) ISBN: 978-1107015357
  • Hand, David J., Mannila, Heikki and Smyth, Padhraic, Principles of Data Mining (MIT Press, 2001) ISBN: 978-0262082907
  • Ware, Colin, Information Visualization (Morgan Kaufmann, 2012) ISBN: 978-0123814647
  • Additional reading material in the form of research papers, online resources, etc.