GSA Annual Meeting in Indianapolis, Indiana, USA - 2018

Paper No. 253-5
Presentation Time: 9:00 AM-6:30 PM

MINING INFORMATION FROM COLLECTIONS OF PAPERS: ILLUSTRATIVE ANALYSIS OF GROUNDWATER AND DISEASE


ZHANG, Yiding, Environmental Science Graduate Program, The Ohio State University, 125 South Oval Mall, Columbus, OH 43210, JI, Xiaonan, Environmental Science Graduate Program, The Ohio State University, 125 South Oval Mall, Columbus, OH 43210, IBARAKI, Motomu, School of Earth Sciences, The Ohio State University, 125 South Oval Mall, Columbus, OH 43210-1308 and SCHWARTZ, Franklin W., School of Earth Sciences, The Ohio State University, 125 South Oval Mall, Columbus, OH 43210

The academic world is driven by scholarly research and publications. Yet in many fields, the volume of published research and the associated knowledge base have been expanding exponentially for decades. The result is that scientists are drowning in data and information. Computer-based strategies and approaches could help with this problem. The goal of this paper is to demonstrate the power of approaches such as data mining and machine learning for evaluating large collections of papers. The objective is to conduct a systematic analysis of research in the emerging area of groundwater-related diseases. More specifically, the analysis of information from the database of papers examines systematics in the research topics; the inter-relationships among multiple diseases, contaminants, and groundwater; and the styles of research associated with groundwater and disease. The analysis uses 426 papers (1971–2017) retrieved from PubMed, a MEDLINE bibliographic database, using the search terms “groundwater” and “disease”. We developed tools that handle the necessary text-processing steps, which lead naturally to clustering and visualization techniques that depict the published research. The resulting 2D article map shows how the collection of papers is subdivided into 11 article clusters. The cluster topics were determined by analyzing keywords or common words contained in the articles’ titles, abstracts, and keyword lists. We found that research on groundwater-related disease focuses primarily on two types of contaminants: chemical compounds and pathogens. Cancer and diarrhea are the two major diseases associated with groundwater contamination. According to the systematic analysis, research in this area is still growing.
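The abstract does not name the specific text-processing or clustering algorithms used, so the following is only a minimal sketch of one common way such a pipeline could look: TF-IDF weighting of article text followed by k-means clustering on cosine similarity. The choice of TF-IDF and k-means, the stopword list, the function names, and the four toy titles are all assumptions made for illustration; they are not taken from the study, and the 2D map projection is omitted.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption, not the study's).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "with", "from", "for"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def tfidf_vectors(docs):
    """Return one unit-length sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency, one common variant.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors

def cosine(a, b):
    """Dot product of two sparse vectors; equals cosine similarity at unit length."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def mean_direction(members):
    """Sum the member vectors and rescale the result to unit length."""
    total = defaultdict(float)
    for vec in members:
        for t, v in vec.items():
            total[t] += v
    norm = math.sqrt(sum(v * v for v in total.values())) or 1.0
    return {t: v / norm for t, v in total.items()}

def kmeans(vectors, k, iterations=20):
    """k-means on cosine similarity with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Seed with the vector least similar to its closest existing centroid.
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_direction(members)
    return labels

# Hypothetical article titles, invented for illustration only.
titles = [
    "Arsenic in groundwater and skin cancer risk",
    "Groundwater arsenic exposure and bladder cancer",
    "Pathogens in well water and diarrhea outbreaks",
    "Diarrhea outbreaks linked to pathogens in groundwater wells",
]

labels = kmeans(tfidf_vectors(titles), k=2)
for label, title in zip(labels, titles):
    print(label, title)
```

On this toy input the two arsenic and cancer titles fall into one cluster and the two pathogen and diarrhea titles into the other. A real analysis of several hundred abstracts would more likely rely on an established library and a principled choice of the number of clusters.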
Handouts
  • groundwater_disease_poster_GSA.2018.pdf (3.5 MB)