Structural variation, the large-scale structural differences in genomic DNA, contributes greater genomic diversity at the nucleotide level than any other type of genetic variation, and is linked to a range of Mendelian diseases and complex traits. Accurate identification of structural variants from whole-genome sequencing (WGS) data will therefore provide useful information for understanding human genetic diversity and its role in disease. Recently, next-generation sequencing has allowed for improved identification and characterisation of structural variation within human genomes. Compared to long-read sequencing, output from short-read sequencing technologies is less error-prone and more cost-effective. Because of these properties, short-read analyses are useful for detecting structural variation at a population-level scale. Traditionally, studies analysing short reads to detect structural variants have used a single caller. However, relying on a single caller limits sensitivity, as each caller has limitations and biases in integrating the signals needed to accurately call specific types of structural variants. Pipelines that integrate calls from multiple programs should therefore provide a more robust and accurate structural variant callset for downstream analyses.
Here, we present a pipeline that uses output from multiple callers to detect structural variation in human datasets generated using Illumina short-read WGS. The pipeline incorporates traditional callers such as LUMPY (Layer et al., 2014) and DELLY (Rausch et al., 2012), a newer program, SvABA (Wala et al., 2018), and a custom assembly-based pipeline to detect structural variation in trios sequenced as part of Phase 3 of the 1000 Genomes Project. Once sufficiently robust, this pipeline will be used to detect structural variants within Indigenous Australian genomes at a population-level scale. Such analyses will contribute to the generation of a set of Indigenous Australian-specific reference genomes as part of the National Centre for Indigenous Genomics' Reference Genome Project.
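The abstract does not specify how calls from the different programs are combined. As a minimal sketch of one common approach, and not the method used in this pipeline, the Python below groups calls of the same type on the same chromosome by reciprocal overlap and keeps only groups supported by at least two distinct callers. The `SVCall` record, the function names, and the thresholds (`min_overlap=0.5`, `min_callers=2`) are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal record for one structural variant call."""
    caller: str   # e.g. "lumpy", "delly", "svaba"
    chrom: str
    start: int
    end: int
    svtype: str   # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Smaller of the two fractions of each interval covered by the overlap."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def consensus_calls(
    calls: List[SVCall],
    min_overlap: float = 0.5,   # assumed threshold, not from the abstract
    min_callers: int = 2,       # assumed threshold, not from the abstract
) -> List[List[SVCall]]:
    """Greedily cluster calls, then keep clusters seen by enough callers."""
    clusters: List[List[SVCall]] = []
    ordered = sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start, c.end))
    for call in ordered:
        for cluster in clusters:
            rep = cluster[0]  # compare against the cluster's first member
            if (
                rep.chrom == call.chrom
                and rep.svtype == call.svtype
                and reciprocal_overlap(rep, call) >= min_overlap
            ):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return [
        cluster
        for cluster in clusters
        if len({c.caller for c in cluster}) >= min_callers
    ]


example = [
    SVCall("lumpy", "chr1", 1000, 2000, "DEL"),
    SVCall("delly", "chr1", 1050, 1990, "DEL"),
    SVCall("svaba", "chr1", 5000, 6000, "DUP"),
]
supported = consensus_calls(example)
```

In this toy input, the two overlapping deletions form one supported cluster, while the duplication reported by a single caller is dropped. A real integration step would also need to handle breakpoint confidence intervals, insertions with no reference span, and translocations, none of which this sketch attempts.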
2B9 - Building 2 GSA2018_APCC6 GSACC62018@canberra.edu.au