Paper No. 23-7
Presentation Time: 10:20 AM
EVALUATING THE SENSITIVITY OF SUBSURFACE MICROBIAL METAGENOME ASSEMBLED GENOME PROPERTIES AS A FUNCTION OF METAGENOMIC SHOTGUN SEQUENCING DEPTH
Subsurface microbial communities are often taxonomically and metabolically diverse. The organisms that constitute these communities influence geochemical cycles, but most remain difficult to characterize because they resist isolation in pure culture. The emergence of next-generation sequencing (NGS), combined with algorithms for metagenomic binning, has provided a tool for identifying the metabolic capabilities of microbes within these communities. When designing sampling and sequencing plans for metagenomic binning, it is not clear how the number and quality of bins relate to sequencing depth and to the ecological complexity of the environment. Under-sequencing samples may lead to overlooking genes of metabolic significance or marker genes used in taxonomic assignment. Over-sequencing samples should, in theory, yield diminishing returns in the recovery of these genes while increasing sequencing cost per sample. Here, a pipeline for generating metagenome-assembled genomes (MAGs) was developed to explore the Illumina metagenomic shotgun sequencing depth necessary for characterizing different subsurface microbial communities. The pipeline randomly sub-sampled input Illumina reads in triplicate to 100%, 95%, 90%, 80%, 60%, 40%, 20%, and 10% of the original data set size as a means of simulating different sequencing depths. The sub-sampled reads were then assembled into contigs and clustered into MAGs using BinSanity in an automated process. The resulting MAGs were evaluated for the total number of bins retrieved, MAG completeness and contamination, and relative abundance of phyla as a function of sequencing depth. The pipeline was validated against the MBARC-26 mock community. After validation, publicly available Illumina runs for prairie soil, tundra soil, and lake sediment were downloaded from NCBI's Sequence Read Archive (SRA) and analyzed with the MAG pipeline.
Results indicated that the number of MAGs, their completeness, and the species richness retrieved with BinSanity responded linearly to sequencing depth.
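The random sub-sampling step described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation: the abstract does not name the sub-sampling tool, so the function names (subsample, subsampling_plan), the seeding scheme, and the in-memory list of reads are all assumptions made for illustration. Only the eight fractions and the use of triplicates come from the abstract.

```python
import random

# Fractions of the original data set named in the abstract.
FRACTIONS = [1.00, 0.95, 0.90, 0.80, 0.60, 0.40, 0.20, 0.10]
REPLICATES = 3  # "in triplicate"


def subsample(reads, fraction, seed):
    """Return a random subset of `reads`, drawn without replacement.

    Hypothetical helper: the original input order is preserved so that
    downstream tools see reads in a stable order.
    """
    rng = random.Random(seed)
    k = round(len(reads) * fraction)
    keep = sorted(rng.sample(range(len(reads)), k))
    return [reads[i] for i in keep]


def subsampling_plan(reads, fractions=FRACTIONS, replicates=REPLICATES, base_seed=0):
    """Map (fraction, replicate) to a subsample, using a distinct seed for each.

    The seeding scheme is an assumption; any scheme that gives every
    replicate its own reproducible seed would serve the same purpose.
    """
    plan = {}
    for i, fraction in enumerate(fractions):
        for rep in range(replicates):
            seed = base_seed + i * replicates + rep
            plan[(fraction, rep)] = subsample(reads, fraction, seed)
    return plan


if __name__ == "__main__":
    # Toy stand-in for a read set; real input would be FASTQ records.
    toy_reads = [f"read_{n}" for n in range(1000)]
    plan = subsampling_plan(toy_reads)
    for (fraction, rep), subset in sorted(plan.items(), reverse=True):
        print(f"{fraction:.0%} replicate {rep + 1}: {len(subset)} reads")
```

For paired-end Illumina data, each element of the list would need to represent a read pair, so that both mates are kept or dropped together. Each of the 24 resulting subsets would then be assembled and binned independently.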