Module Overview
Variation in genetic sequence underlies a vast range of the phenotypes exhibited by living organisms, and is frequently associated with disease. Sequence variants can be identified by aligning sequence reads to a reference genome, then applying algorithms that determine which candidate variants are likely to be true variants and which result from technical noise, such as sequencing errors or misalignments. This module introduces the different categories of variant, the appropriate file formats, methods for identifying and filtering variants, and approaches to assessing the impact of those variants.
Learning Outcomes
- Understand the different types of sequence variant
- Understand the VCF and BCF file formats
- Identify Single Nucleotide Variants (SNVs), Insertions/Deletions (INDELs) and Structural Variants (SVs)
- Filter variants on a range of criteria
- Phase variants
- Determine the potential impact of variants through effect prediction
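As a rough illustration of the variant categories named above, the sketch below labels a variant from the reference (REF) and alternate (ALT) alleles as they would appear in a VCF record. The function name `classify_variant`, the `OTHER` label and the 50 bp size cut-off for structural variants are assumptions made for this example only. The cut-off is a commonly used convention rather than a fixed rule, and real variant callers use far more evidence than allele length.

```python
def classify_variant(ref: str, alt: str, sv_min_length: int = 50) -> str:
    """Label one REF/ALT allele pair as SNV, INDEL, SV or OTHER.

    Hypothetical helper for illustration; sv_min_length is an assumed
    convention, not a value taken from the module.
    """
    # Symbolic alleles (e.g. <DEL>) and breakend notation (contains [ or ])
    # are how VCF represents structural variants without spelling out bases.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "SV"

    length_difference = abs(len(ref) - len(alt))

    # Large length changes are treated as structural variants.
    if length_difference >= sv_min_length:
        return "SV"

    # A single base replaced by a different single base.
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"

    # Any smaller change in length is an insertion or a deletion.
    if length_difference > 0:
        return "INDEL"

    # Same length but more than one base, e.g. a multi-base substitution.
    return "OTHER"


if __name__ == "__main__":
    examples = [("A", "G"), ("AT", "A"), ("A", "ACG"), ("A", "<DEL>")]
    for ref, alt in examples:
        print(ref, alt, classify_variant(ref, alt))
```

Classifying on allele length alone keeps the example short; it says nothing about whether a variant is real, which is the job of the calling and filtering steps covered in this module.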
Prerequisite Modules/Knowledge
- Introduction to Linux
- The HPC Cluster
- Introduction to NGS
- Read Alignment