Next-generation sequencing technologies have opened a new era of research in population genetics. Following these new sequencing opportunities, the use of restriction enzyme-based genotyping techniques, such as restriction site-associated DNA sequencing (RAD-seq) or double-digest RAD-sequencing (ddRAD-seq), has dramatically increased in the last decade. From DNA sampling to SNP calling, the laboratory and bioinformatic parameters of enzyme-based techniques have been investigated in the literature. However, the impact of those parameters on downstream analyses and biological results remains less well documented. In this study, we investigated the effects of several pre- and post-sequencing settings on ddRAD-seq results for two biological systems: a complex of butterfly species (Coenonympha sp.) and several populations of common beech (Fagus sylvatica). Our results suggest that pre-sequencing parameters (i.e., DNA quantity, number of PCR cycles during library preparation) have a significant impact on the number of recovered reads and SNPs, on the number of unique alleles, and on individual heterozygosity. Similarly, we found that post-sequencing settings (i.e., clustering and minimum coverage thresholds) influenced loci reconstruction (e.g., number of loci, mean coverage) and SNP calling (e.g., number of SNPs, heterozygosity) but had only a marginal impact on downstream analyses (e.g., measure of genetic differentiation, estimation of individual admixture, and demographic inferences). In addition, replication analyses confirmed the reproducibility of the ddRAD-seq procedure. Overall, this study assesses the degree of sensitivity of ddRAD-seq data to pre- and post-sequencing protocols, and illustrates the robustness of the method for population genetic studies.