Software and Workshops
Alongside my empirical work, I am interested in developing rigorous and user-friendly software for evolutionary genomics. In this highly specialized and technical era of scientific progress, I strongly believe that demystifying challenging technical concepts, with the goal of training the next generation of diverse scientists, is the only way we will continue to generate scientific breakthroughs. To that end, I actively participate in the development of open-source software and training materials, so that motivated investigators from any background can come to deeply understand the processing pipelines and statistical decisions involved in generating evolutionary insights from genomic data.
Software
SNPfiltR
As part of my PhD work on the genomics of wild bird populations, I identified a gap in the available pipelines for analyzing next-generation sequencing data (i.e., Illumina short-read sequences mapped to a reference genome). Specifically, there was no software designed to filter single nucleotide polymorphism (SNP) datasets interactively, with built-in visualization tools. As a result, users typically fell back on previously published sets of “standard filters”, rather than optimizing filtering parameters based on the idiosyncrasies of their specific dataset. To address this gap I developed the R package SNPfiltR, which visualizes key parameters such as genotype quality and missing data proportion. This allows users to quickly and interactively design an optimized set of filtering parameters for their SNP dataset, without the hassle and potential for mistakes that come with writing unique, homebrewed visualization and filtering scripts for each dataset. A manuscript announcing the package was published in Molecular Ecology Resources, and the package is freely available for download via CRAN and GitHub. The package website https://devonderaad.github.io/SNPfiltR/ contains detailed tutorial-style vignettes that walk through the filtering process for real empirical datasets.
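To make the idea of threshold-based filtering concrete, here is a minimal Python sketch on a toy dataset. It is not SNPfiltR code and does not use the package's API: the function names, the data layout (one row per SNP, one column per sample), and the thresholds are all hypothetical and chosen only for illustration. It masks genotypes below a genotype quality threshold, computes the proportion of missing data per SNP, and reports how many SNPs survive at several missing-data cutoffs, which is the kind of comparison that interactive visualization is meant to support.

```python
from typing import List, Optional

# Hypothetical encoding: alternate-allele count (0, 1, 2), or None if missing.
Genotype = Optional[int]
Matrix = List[List[Genotype]]


def mask_low_quality(genotypes: Matrix, quals: List[List[int]], min_gq: int) -> Matrix:
    """Set every genotype whose quality score is below min_gq to missing (None)."""
    return [
        [g if (g is not None and q >= min_gq) else None for g, q in zip(g_row, q_row)]
        for g_row, q_row in zip(genotypes, quals)
    ]


def missing_proportion_per_snp(genotypes: Matrix) -> List[float]:
    """Return the fraction of samples with a missing genotype, one value per SNP."""
    return [sum(g is None for g in row) / len(row) for row in genotypes]


def filter_by_missingness(genotypes: Matrix, max_missing: float) -> Matrix:
    """Keep only SNPs whose missing-data proportion is at most max_missing."""
    props = missing_proportion_per_snp(genotypes)
    return [row for row, p in zip(genotypes, props) if p <= max_missing]


# Toy data (assumed for illustration): 3 SNPs (rows) x 4 samples (columns).
genotypes: Matrix = [
    [0, 1, 2, 0],
    [1, None, 1, 0],
    [2, 2, 0, 1],
]
quals = [
    [40, 35, 50, 45],
    [10, 0, 12, 38],
    [33, 8, 41, 60],
]

masked = mask_low_quality(genotypes, quals, min_gq=30)

# Compare several candidate cutoffs instead of committing to one "standard" value.
for cutoff in (0.2, 0.5, 0.8):
    kept = filter_by_missingness(masked, max_missing=cutoff)
    print(f"max missing = {cutoff}: {len(kept)} of {len(masked)} SNPs retained")
```

Looping over several cutoffs, rather than hard-coding one, mirrors the motivation above: the appropriate threshold depends on the dataset, so it helps to see how each candidate value changes what is retained.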
bgchm
I am a co-author of the R package bgchm, which updates Bayesian approaches to genomic and geographic cline analysis by using Hamiltonian Monte Carlo (HMC), a more efficient form of Markov chain Monte Carlo (MCMC) than the samplers traditionally used, to sample posterior distributions. This update makes sampling faster and more accurate, allowing the package to handle genome-scale datasets. The manuscript describing the details of the package update can be found here.
Workshops
RADseq analysis workshop
In Fall 2022 I led a group of graduate students through a semester-long weekly workshop on analyzing RADseq data from start to finish. The workshop included hands-on training in understanding file directory structure, navigating the command-line interface, using a high performance computing cluster (HPCC), using RStudio to interactively filter SNP datasets with SNPfiltR, and performing standard descriptive phylogenetic and population genetic analyses. The materials for this workshop are publicly available and documented in the following GitHub repository. A visualization of the workshop’s structure is here:
SNPfiltR workshop (AOS 2023)
At the American Ornithological Society meeting in Ontario, Canada (August 2023), I led a half-day workshop designed to take a group of investigators through the process of filtering their SNP datasets. The materials are based heavily on the SNPfiltR website, and the exact details can be found in this publicly available GitHub repository.