Rating 0 out of 5 (0 ratings in Udemy)
What you'll learn
- Genomic selection signature bioinformatics pipeline
- Linux bash scripting for bioinformatics
- Pooled Heterozygosity statistical methodology
- R integration in bioinformatics data analysis
Description
In this course, you will learn a complete bioinformatics pipeline for detecting genomic selection signatures from pooled (pooled-seq) whole-genome sequencing data with the help of Linux and R. Basic bash command-line and R scripting are used to run the different steps of the pipeline. Many statistical methods exist for detecting positive selective sweeps in a genome, and the right choice depends entirely on the type of omics data you have. We used the pooled heterozygosity statistic (Hp), which is robust for pooled-seq datasets, with a sliding-window approach.
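To make the sliding-window Hp idea concrete, here is a minimal Python sketch. It assumes the commonly used definition Hp = 2·ΣnMAJ·ΣnMIN / (ΣnMAJ + ΣnMIN)², where the sums of major and minor allele read counts are taken over all SNPs in a window, followed by a Z-transformation of the window values. The function names, the input layout, and the use of the population standard deviation are illustrative assumptions. This is not the course's Ruby script.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(sum_major, sum_minor):
    """Hp for one window from summed major and minor allele read counts."""
    total = sum_major + sum_minor
    if total == 0:
        return None  # no reads in this window, Hp is undefined
    return 2.0 * sum_major * sum_minor / (total * total)


def sliding_window_hp(snps, window_size, step):
    """Compute Hp in sliding windows along one chromosome.

    snps: list of (position, count_allele_a, count_allele_b) tuples
          (hypothetical layout chosen for this sketch).
    Returns a list of (window_start, window_end, hp); windows that
    contain no SNPs are skipped.
    """
    if not snps:
        return []
    last_pos = max(pos for pos, _, _ in snps)
    results = []
    start = 0
    while start <= last_pos:
        end = start + window_size
        sum_major = 0
        sum_minor = 0
        for pos, a, b in snps:
            if start <= pos < end:
                # The major allele is the one with more reads at this SNP.
                sum_major += max(a, b)
                sum_minor += min(a, b)
        hp = pooled_heterozygosity(sum_major, sum_minor)
        if hp is not None:
            results.append((start, end, hp))
        start += step
    return results


def z_transform(values):
    """Standardize window Hp values: Z = (Hp - mean) / standard deviation."""
    mu = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]
```

Windows with strongly negative Z-transformed Hp values have unusually low heterozygosity and are the usual candidates for selective sweeps. Setting `step` smaller than `window_size` gives overlapping windows.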
Trimming of the sequences was performed with Trimmomatic, followed by indexing of the reference genome and mapping with BWA-MEM. Sorting and duplicate marking were performed with Picard, and SAMtools was then used to make the pileup files. Next, PoPoolation2 was used to generate the .rc files and to synchronize them. Finally, an in-house Ruby Hp script was used to find regions of the genome under positive selection pressure (genetic hitchhiking). A brief commentary is also provided on the Hp statistic adopted here. The data were then prepared for normalization and visualized in R: Manhattan plots, Q-Q plots, density plots, and histograms were generated. Finally, Tajima's D, a classical method based on the site frequency spectrum, was applied to the same data.
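As a rough illustration of the Tajima's D statistic mentioned above, the following Python sketch implements the classical formula for a sample of n individually sequenced chromosomes. The function names and the input format are assumptions made for this example. Pooled data generally need pool-aware corrections for sequencing depth and pool size, which this sketch does not include.

```python
from math import sqrt


def nucleotide_diversity(minor_counts, n):
    """Sum over sites of the expected pairwise difference, 2k(n-k) / (n(n-1)).

    minor_counts: minor allele count k at each segregating site
                  (hypothetical input format for this sketch).
    n: number of sampled chromosomes.
    """
    return sum(2.0 * k * (n - k) / (n * (n - 1)) for k in minor_counts)


def tajimas_d(n, seg_sites, pi):
    """Classical Tajima's D from sample size, segregating sites S, and pi."""
    if n < 2 or seg_sites == 0:
        return None  # D is undefined without variation
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    # D compares pi with Watterson's estimator S / a1.
    return (pi - seg_sites / a1) / sqrt(variance)
```

A negative D indicates an excess of rare alleles, as expected after a selective sweep or a population expansion. A positive D indicates an excess of intermediate-frequency alleles.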