
WGS Variant Calling - Variant Calling with GATK - Part 1 - Detailed NGS Analysis Workflow

Offered By: Bioinformagician via YouTube

Tags

Bioinformatics Courses, Quality Control Courses, Bash Scripting Courses

Course Description

Overview

This tutorial walks through variant calling from whole genome sequencing (WGS) data using the GATK best practice workflow. It shows how to set up a bash pipeline on Linux that pre-processes and aligns reads and ultimately produces a VCF file. The steps covered are quality control with FastQC, alignment with BWA-MEM, marking duplicate reads, Base Quality Score Recalibration (BQSR), and variant calling with HaplotypeCaller. Each step is explained with the intuition behind it, along with runtime expectations and memory requirements. The code, data sources, and additional resources on the SAM file format, SAM flags, and the VCF file format are provided. The course is aimed at bioinformaticians and researchers who want to learn variant calling for genomic analysis.
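
Since the overview points to SAM flags as background material, here is a minimal sketch of how a FLAG value breaks down into its bits. The bit values come from the SAM specification; the function name `decode_sam_flag` and the lowercase labels are illustrative assumptions, not names used in the course.

```shell
# Decode a SAM FLAG integer into the bits that are set.
# Bit values follow the SAM specification; the labels are descriptive
# names chosen for this sketch (hypothetical, not official identifiers).
decode_sam_flag() {
  local flag="$1"
  local names=""
  local entry bit name
  for entry in \
    1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
    16:reverse_strand 32:mate_reverse_strand \
    64:first_in_pair 128:second_in_pair \
    256:secondary 512:qc_fail 1024:duplicate 2048:supplementary
  do
    bit="${entry%%:*}"    # text before the colon: the bit value
    name="${entry#*:}"    # text after the colon: the label
    if [ $((flag & bit)) -ne 0 ]; then
      names="${names:+$names,}$name"
    fi
  done
  printf '%s\n' "$names"
}

decode_sam_flag 99
decode_sam_flag 1123
```

A flag of 99 is 1 + 2 + 32 + 64, so the first call prints `paired,proper_pair,mate_reverse_strand,first_in_pair`. The second value adds bit 1024, which is the bit a duplicate-marking step sets on reads it identifies as duplicates.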

Syllabus

Intro
Aim & Intuition behind variant calling
What is GATK?
Somatic vs Germline variants
GATK best practice workflow steps
Data pre-processing steps - alignment
A note on Read Groups
Data pre-processing steps - mark duplicate reads
Data pre-processing steps - Base Quality Score Recalibrator
Variant discovery
Data used for demonstration
System requirements
Setting up directories
Download data
Download reference FASTA and known sites; create supporting files (.fai, .dict)
Setting directory paths
Step 1: Perform QC - FastQC
Step 2: Align reads - BWA-MEM
Step 3: Mark Duplicate Reads - GATK MarkDuplicatesSpark
Step 4: Base Quality Score Recalibration - GATK BaseRecalibrator + ApplyBQSR
Step 5: Post Alignment QC - GATK CollectAlignmentSummaryMetrics and CollectInsertSizeMetrics
Create multiQC report of post alignment metrics
Step 6: Call variants - GATK HaplotypeCaller
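
The numbered steps in the syllabus can be sketched as a dry-run shell script. This is a minimal sketch under stated assumptions, not the code from the video: every path, the sample name `sample1`, the thread count, and the read-group values are hypothetical placeholders, and `bwa mem -o` assumes a BWA release recent enough to support that option. By default the script only prints each command, so it runs without any of the tools installed.

```shell
# Dry-run sketch of the workflow steps listed in the syllabus.
# All paths, the sample name, and the read-group values are hypothetical.
set -eu

DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands only, 0 = execute them
ref="ref/genome.fa"             # hypothetical reference FASTA
known="ref/known_sites.vcf"     # hypothetical known-sites VCF for BQSR
sample="sample1"                # hypothetical sample name
reads="reads"
aligned="aligned"
results="results"

# Print the command in dry-run mode (shell quoting is not shown in the
# printed line); otherwise execute it as given.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

pipeline() {
  # Supporting files for the reference (.fai, .dict) and the BWA index
  run samtools faidx "$ref"
  run gatk CreateSequenceDictionary -R "$ref" -O "${ref%.fa}.dict"
  run bwa index "$ref"

  # Step 1: QC of the raw reads
  run fastqc "$reads/${sample}_1.fastq.gz" "$reads/${sample}_2.fastq.gz" -o "$reads"

  # Step 2: alignment; -R attaches the read group that GATK requires
  run bwa mem -t 4 -R "@RG\tID:${sample}\tPL:ILLUMINA\tSM:${sample}" \
    -o "$aligned/${sample}.sam" "$ref" \
    "$reads/${sample}_1.fastq.gz" "$reads/${sample}_2.fastq.gz"

  # Step 3: mark duplicate reads
  run gatk MarkDuplicatesSpark -I "$aligned/${sample}.sam" \
    -O "$aligned/${sample}.dedup.bam"

  # Step 4: build the recalibration model, then apply it
  run gatk BaseRecalibrator -I "$aligned/${sample}.dedup.bam" -R "$ref" \
    --known-sites "$known" -O "$aligned/recal.table"
  run gatk ApplyBQSR -I "$aligned/${sample}.dedup.bam" -R "$ref" \
    --bqsr-recal-file "$aligned/recal.table" -O "$aligned/${sample}.bqsr.bam"

  # Step 5: post-alignment QC metrics and a MultiQC summary
  run gatk CollectAlignmentSummaryMetrics -R "$ref" \
    -I "$aligned/${sample}.bqsr.bam" -O "$aligned/alignment_metrics.txt"
  run gatk CollectInsertSizeMetrics -I "$aligned/${sample}.bqsr.bam" \
    -O "$aligned/insert_size_metrics.txt" -H "$aligned/insert_size_histogram.pdf"
  run multiqc "$aligned" -o "$results"

  # Step 6: call variants
  run gatk HaplotypeCaller -R "$ref" -I "$aligned/${sample}.bqsr.bam" \
    -O "$results/${sample}.raw_variants.vcf"
}

pipeline
```

Running the script with `DRY_RUN=0` would execute the commands instead of printing them, provided the tools are installed, the input files exist, and the output directories have been created. Keeping the command list behind a single `run` wrapper makes it easy to review the full sequence before committing to a long alignment job.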


Taught by

Bioinformagician
