Unit information: Genomic Data Science in 2020/21

Please note: you are viewing unit and programme information for a past academic year. Please see the current academic year for up to date information.

Unit name Genomic Data Science
Unit code SSCM30005
Credit points 20
Level of study H/6
Teaching block(s) Teaching Block 1 (weeks 1 - 12)
Unit director Dr. Gibran Hemani
Open unit status Not open

This is part of an intercalated BSc for Medical, Veterinary or Dental students



School/department Bristol Medical School
Faculty Faculty of Health Sciences


Having been introduced to the landscape of molecular characteristics of the cell over the course of Unit 1, we now focus on how those characteristics differ between people in the population. In particular, we will learn how to infer the extent to which variation is due to genetic differences; how to identify specific positions in the genome that influence different diseases and complex traits; and how these findings can be exploited. Essential programming and data analysis skills will be taught throughout the unit to reinforce these messages and to equip students for practical work in subsequent modules and projects.

Intended learning outcomes

  1. Interpret heritability estimates and recall the heritability estimates of some key phenotypes
  2. Discuss the core objectives of genome-wide association studies and critically evaluate this study design
  3. Perform genome-wide association studies on large-scale population genetic data
  4. Interpret genetic association using a range of bioinformatic tools
  5. Critically evaluate the features and applications of different types of genetic data capture
  6. Apply Linux command line tools and the R programming language for basic data analysis

Teaching details

Methods of Teaching

This unit will adopt a blended learning approach, including a mix of interactive synchronous and asynchronous sessions. Where practical, this will include some on-campus teaching, but all material will also be available for online learning.

Student input

20 hours scheduled activities, 20 hours independent coursework, a proportion of an end-of-programme assessment, 150 hours independent study

Assessment Details

50% of the unit is assessed through an end of year assessment.

50% of the unit is assessed through a summative in-unit project.

There will be one formative assessment prior to the summative coursework.

Reading and References

Lewin – Genes

Strachan and Read – Human Molecular Genetics – 4th Edition – 2010

HAPMAP (2005) A haplotype map of the human genome. Nature 437: 1299-1320.

HAPMAP (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449: 851-861.

1000 Genomes (2015) A global reference for human genetic variation. Nature 526: 68-74.

1000 Genomes (2015) An integrated map of structural variation in 2,504 human genomes. Nature 526: 75-81.

1000 Genomes (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56-65.

1000 Genomes (2010) A map of human genome variation from population-scale sequencing. Nature 467: 1061-1073.

UK10K Consortium. The UK10K project identifies rare variants in health and disease. Nature. 2015 Oct 1;526(7571):82-90. doi: 10.1038/nature14962. Epub 2015 Sep 14.

Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nat Rev Genet. 2010 Jul;11(7):499-511. doi: 10.1038/nrg2796.

Collins, Rory (2011): UK biobank: the need for large prospective epidemiological studies. In: Journal of Epidemiology and Community Health, 65(1), pp. A37, 2011.

Collins, Rory (2012): What makes UK Biobank special? In: The Lancet, 379(9822), pp. 1173-1174, 2012.

Tyler-Smith C, Yang H, Landweber LF, Dunham I, Knoppers BM, et al. (2015) Where Next for Genetics and Genomics? PLoS Biol 13(7): e1002216. doi: 10.1371/journal.pbio.1002216

Balding, D. J. (2006) A tutorial on statistical methods for population association studies. Nat Rev Genet 7, 781–791.