Summary: Microbiome studies continue to provide tremendous insight into the importance of microbial populations to the macroscopic world. High-throughput DNA sequencing technology (i.e., next-generation sequencing), combined with bioinformatic tools capable of identifying microbial taxa and calculating the diversity and composition of biological and environmental samples, has enabled cost-effective, rapid assessment of microbial populations. Ribosomal RNA gene sequencing, in which 16S and 18S rRNA gene sequences are used to identify prokaryotic and eukaryotic species, respectively, is one of the most widely used techniques in microbiome analysis. Before these sequences are analyzed, trimming parameters must be set so that the sequence information retained after trimming is maximized while expected errors in the sequences are minimized. In this application note, we present FIGARO: a Python-based application designed to maximize read retention after trimming and filtering for quality. FIGARO was designed specifically to increase reproducibility and minimize trial and error in trimming parameter selection for a DADA2-based pipeline. It will likely also be useful for optimizing trimming parameters and minimizing sequence errors in other pipelines where paired-end overlap is required.
Availability and implementation: The FIGARO application is freely available as source code at https://github.com/Zymo-Research/figaro
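
The trade-off described in the Summary can be illustrated with a minimal sketch. It is not FIGARO's implementation. It assumes the standard definition of expected errors as the sum of per-base error probabilities derived from Phred scores, and a simple overlap check for paired-end reads. The function names, the example quality profile, and the amplicon length are hypothetical and chosen only for illustration.

```python
def expected_errors(phred_scores, trim_length):
    """Sum of per-base error probabilities over the first trim_length bases.

    Uses the standard Phred relation: P(error) = 10 ** (-Q / 10).
    """
    return sum(10 ** (-q / 10) for q in phred_scores[:trim_length])


def overlap_after_trimming(forward_trim, reverse_trim, amplicon_length):
    """Number of bases shared by the two trimmed reads.

    A negative value means the trimmed reads no longer span the amplicon.
    """
    return forward_trim + reverse_trim - amplicon_length


# Hypothetical read: 200 high-quality bases (Q30) followed by a
# 50-base low-quality tail (Q10).
read_quality = [30] * 200 + [10] * 50

# Trimming at 200 keeps only the high-quality bases; trimming at 250
# keeps the tail and accumulates far more expected errors.
ee_short = expected_errors(read_quality, 200)
ee_long = expected_errors(read_quality, 250)

# Hypothetical 350-base amplicon: shorter trims reduce expected errors
# but can remove the overlap needed to merge the read pair.
overlap_ok = overlap_after_trimming(200, 180, 350)
overlap_lost = overlap_after_trimming(150, 150, 350)
```

In this toy case, keeping the low-quality tail raises the expected errors from roughly 0.2 to roughly 5.2, while trimming both reads to 150 bases leaves a gap instead of an overlap. Selecting trimming positions means balancing these two effects across all reads in a run.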