Paper No. 8
Presentation Time: 8:00 AM-12:00 PM
RELATING MICROBIAL GENE SEQUENCES TO ENVIRONMENTAL FACTORS: CUTTING THROUGH THE TREES
Microorganisms shape the environment in which they live and affect many geochemical and ecological processes. Understanding the interactions between microbes and the environment is a central goal for geomicrobiology, biogeochemistry, and microbial ecology. For lack of better tools, the simplest approach to evaluating microbial communities and microbial processes has often been to treat microbes as a 'black box'. With recent advances in genomic technologies, new microbial processes have been identified, thereby unraveling components of the microbial 'black box' and significantly altering our view of the natural world. However, organizing and interpreting gene libraries, often a central component of microbial ecology studies, still poses many potential problems. Common practice has relied on constructing complex evolutionary models to describe gene survey data, which results in a variety of phylogenetic trees. Creating meaningful trees requires expertise on the part of the researcher, and extracting the relevant information from the trees is often intricate and frustrating. Moreover, it has been difficult to relate environmental and geochemical factors to phylogenies. We contend that it is time for the forest to be thinned out, or for the trees to be chopped down altogether, if we want to address biogeochemical problems with gene sequence information. Using a variety of large sequence databases that were constructed with relevant geochemical/ecological data, we applied alignment-independent methods to reveal significant correlations not seen using phylogenies. In addition, by using the word frequencies of short sequence fragments, we were able to implement discriminant analysis routines to find variables that effectively identified microbial communities by environmental type (e.g., microbial source tracking).
One advantage of these methods is that individual- and community-level correlations with environmental factors were differentiated and cross-validated without complex evolutionary models and assumptions. These findings imply that geochemical similarities of unknown microbial environments may be inferred directly from sequence libraries.
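
The word-frequency idea mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the word length, the normalization, or the software used, so the function names (`kmer_frequencies`, `euclidean_distance`), the default word length `k=3`, and the choice of Euclidean distance are all assumptions made for illustration. The sketch only shows how a sequence can be turned into a fixed-length numeric vector without any alignment or tree.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k=3):
    """Relative frequency of every DNA word of length k in seq.

    Hypothetical helper: k=3 is an assumed default, not a value
    taken from the abstract.
    """
    seq = seq.upper()
    # All 4**k possible words over A, C, G, T, in a fixed order.
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean_distance(a, b):
    """Distance between two frequency vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Made-up toy sequences, for illustration only.
    seq_a = "ACGTACGTGGCATTAGC"
    seq_b = "ACGTTCGTGGCATAAGC"
    d = euclidean_distance(kmer_frequencies(seq_a), kmer_frequencies(seq_b))
    print(f"alignment-free distance: {d:.4f}")
```

Vectors of this kind, one per sequence or pooled per library, could then serve as the input variables for a discriminant analysis in a statistics package, with environmental type as the class label.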