The Data Analysis Group

Module Overview

Variation in genetic sequence underlies a vast range of phenotypes exhibited by living organisms and is frequently associated with the incidence of disease. Sequence variants can be identified by aligning sequence reads to a reference genome and then applying algorithms that distinguish likely true variants from technological noise, e.g. sequencing errors and misalignments. This module introduces the different categories of variant, the appropriate file formats, methods for identifying and filtering variants, and approaches to assessing the impact of those variants.

Learning Outcomes

After completing this module, participants will be able to:

Understand the different types of sequence variant
Understand the VCF and BCF file formats
Identify Single Nucleotide Variants (SNVs), Insertions/Deletions (INDELs) and Structural Variants (SVs)
Filter variants on a range of criteria
Phase variants
Determine the potential impact of variants through effect prediction
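
As a rough illustration of two of the outcomes above (reading VCF records and filtering variants on simple criteria), the following minimal sketch parses the fixed VCF columns from plain text, classifies each record as an SNV or INDEL, and keeps only records that pass a quality threshold. The names `Variant`, `parse_vcf`, `variant_type` and `filter_variants`, the example records and the `min_qual=30.0` threshold are all hypothetical and chosen for illustration; they are not taken from the module material, and real analyses would normally rely on dedicated variant-calling and filtering tools.

```python
from typing import Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    """A subset of the fixed VCF columns (hypothetical helper type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float
    filt: str


# Made-up example records; tab-separated as in a VCF body.
EXAMPLE_VCF = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
chr1\t1000\t.\tA\tG\t50.0\tPASS\t.
chr1\t2000\t.\tAT\tA\t12.5\tPASS\t.
chr2\t3000\t.\tC\tT\t99.0\tLowQual\t.
chr2\t4000\t.\tG\tGTT\t60.0\tPASS\t.
"""


def parse_vcf(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping header lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # meta-information and column header lines
        chrom, pos, _id, ref, alt, qual, filt = line.split("\t")[:7]
        if qual == ".":
            continue  # assumption: records without a quality are skipped
        yield Variant(chrom, int(pos), ref, alt, float(qual), filt)


def variant_type(v: Variant) -> str:
    """Classify by allele length; symbolic alleles fall into OTHER."""
    alts = v.alt.split(",")
    plain = all(a and set(a) <= set("ACGTN") for a in alts)
    if plain and len(v.ref) == 1 and all(len(a) == 1 for a in alts):
        return "SNV"
    if plain and any(len(a) != len(v.ref) for a in alts):
        return "INDEL"
    return "OTHER"


def filter_variants(variants: Iterable[Variant],
                    min_qual: float = 30.0) -> Iterator[Variant]:
    """Keep records at or above min_qual whose FILTER is PASS or unset."""
    for v in variants:
        if v.qual >= min_qual and v.filt in ("PASS", "."):
            yield v


if __name__ == "__main__":
    for v in filter_variants(parse_vcf(EXAMPLE_VCF.splitlines())):
        print(v.chrom, v.pos, v.ref, v.alt, variant_type(v))
```

In this sketch the low-quality deletion and the record flagged `LowQual` are discarded, leaving one SNV and one insertion. Structural variants are not handled here; they are usually represented with symbolic or breakend alleles, which this sketch simply labels `OTHER`.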

Prerequisite Modules/Knowledge

Introduction to Linux
The HPC Cluster
Introduction to NGS
Read Alignment

Course Schedule

Not currently scheduled