Unit name | Advanced Data Analytics |
---|---|

Unit code | COMSM0088 |

Credit points | 20 |

Level of study | M/7 |

Teaching block(s) | Teaching Block 2 (weeks 13 - 24) |

Unit director | Professor Nabney |

Open unit status | Not open |

Units you must take before you take this one (pre-requisite units) | |

Units you must take alongside this one (co-requisite units) | |

Units you may not take alongside this one | None |

School/department | School of Engineering Mathematics and Technology |

Faculty | Faculty of Engineering |

Visual analytics couples the visual representation of data with analytical processes to support complex decision making and understanding. A picture may be worth a thousand words, but only if it is well designed to represent data faithfully and meaningfully. This unit will enable students to create powerful analyses of data and communicate them effectively to non-specialists.

This unit extends the material taught in the co-requisite unit Introduction to Data Analytics by giving students a solid grounding in contemporary advanced machine learning. In visual analytics, such methods serve as useful tools to change the data representation (e.g. through dimensionality reduction, or as a way of analysing visual data) in a framework of statistical pattern recognition; in text analytics, they produce powerful analyses that traditional methods fail to deliver.

Machine learning topics covered by this unit include: principles of Statistical Pattern Recognition (probabilistic models for data, curse of dimensionality, generalisation error, bias-variance dilemma); linear models (Probabilistic Principal Component Analysis; Discriminant Analysis); generalised dissimilarity mappings and neighbour embedding techniques; Gaussian Processes; latent variable models (Gaussian Mixture Models, Generative Topographic Mapping and Gaussian Process Latent Variable Model); Bayesian model regularisation and combination; feature selection; challenges of large datasets and potential solutions. The text analytics methods taught include rule-based approaches, traditional machine learning techniques, and current leading techniques such as those based on deep-learning neural networks.

Throughout the unit there is a focus on understanding theory and modelling principles in order to apply them effectively to represent and analyse data.

Students will be able to:

- Apply established text analysis methods on large-scale text-data sources.
- Define the types and semantics of data.
- Build machine learning models for data and explain their operation in terms of a statistical pattern recognition framework.
- Use Bayesian regularisation and variational methods to fit models.
- Create user-focused visualisations of numerical, categorical, time series, and network data using visualisation tools such as those available in the public domain via Python and Tableau.

Problem-based learning combining lecture elements with practical individual work.

**Mid-term coursework (30%):** Design and implement a system for automated analysis of a substantial text corpus and write a report on the findings from deploying it (ILO 1).

**Final coursework (60%):** Create a visualisation of key features of a medium-sized real-world dataset, analyse and evaluate the representation through a user trial, and report on the conclusions, relating them to the theory of information visualisation (ILOs 2, 3, 4 and 5).

**Lab tests (10%):** Tests on the work, completed in class.

If this unit has a Resource List, you will normally find a link to it in the Blackboard area for the unit. Sometimes there will be a separate link for each weekly topic.

If you are unable to access a list through Blackboard, you can also find it via the Resource Lists homepage. Search for the list by the unit name or code (e.g. COMSM0088).

**How much time the unit requires**

Each credit equates to 10 hours of total student input. For example, a 20-credit unit will take you 200 hours of study to complete. Your total learning time is made up of contact time, directed learning tasks, independent learning and assessment activity.

See the University Workload statement relating to this unit for more information.

**Assessment**

The Board of Examiners will consider all cases where students have failed or not completed the assessments required for credit.
The Board considers each student's outcomes across all the units which contribute to each year's programme of study. For appropriate assessments, if you have self-certificated your absence, you will normally be required to complete the assessment the next time it runs (for assessments at the end of TB1 and TB2 this is usually in the next re-assessment period).

The Board of Examiners will take into account any exceptional circumstances and operates
within the Regulations and Code of Practice for Taught Programmes.