Paper No. 0
Presentation Time: 10:45 AM
DATA MINING IN GEOSCIENCE RESEARCH
We studied bibliometric data to uncover patterns in how research in hydrology is used, and the findings were troubling. Citation data for key journals (e.g., Water Resources Research, Ground Water, and Journal of Hydrology) showed that although many competent scientific papers have been produced, most are only minimally cited. This paper extends that work by applying data mining for text classification to examine whether research topic influences impact, as measured by citations. Specifically, we studied research papers in five sub-fields of hydrology (precipitation, unsaturated zone, groundwater, river/lake, and estuary/ocean). Bibliometric information for articles in Water Resources Research is available on ISI's Web of Science. These data were correlated with a classification derived by data mining the entire text of the journal articles. The full contents of all articles published after 1990 were accessed and downloaded in digital format through the American Geophysical Union website. The bibliometric analysis showed broad variability among these sub-fields in the impact of articles, measured in terms of citations. Our preliminary results for Water Resources Research from 1990 through 1996 show that articles focused on the unsaturated zone and groundwater were cited more often than articles in the other sub-fields. We also found a relationship between the major topical areas covered by the journal and the citations to papers in those areas. We are developing a framework for reviewing past research in water science and predicting future research trends in the discipline. Ultimately, we are seeking ways to guide researchers in creating greater impact for their work.
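The abstract does not state which classification method was used. As a minimal sketch of the general idea only, the Python below assigns each article to one of the five sub-fields by counting keyword matches in its text, then averages citation counts per sub-field. The keyword lists, the function names `classify` and `mean_citations_by_subfield`, and the toy articles with their citation counts are all hypothetical illustrations, not the authors' method or data.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical keyword lists for the five sub-fields named in the abstract.
SUBFIELD_KEYWORDS = {
    "precipitation": {"rainfall", "precipitation", "storm"},
    "unsaturated zone": {"unsaturated", "vadose", "infiltration"},
    "groundwater": {"aquifer", "groundwater", "well"},
    "river/lake": {"river", "lake", "streamflow"},
    "estuary/ocean": {"estuary", "ocean", "tidal"},
}


def classify(text):
    """Return the sub-field whose keywords occur most often, or None if no keyword matches."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = {
        field: sum(word in keywords for word in words)
        for field, keywords in SUBFIELD_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)  # ties go to the first sub-field listed
    return best if counts[best] > 0 else None


def mean_citations_by_subfield(articles):
    """Average citation counts per sub-field for (full_text, citation_count) pairs."""
    grouped = defaultdict(list)
    for text, citations in articles:
        field = classify(text)
        if field is not None:  # skip articles that match no sub-field
            grouped[field].append(citations)
    return {field: mean(values) for field, values in grouped.items()}


# Toy input: invented texts and citation counts, for illustration only.
toy_articles = [
    ("Infiltration through the vadose zone under unsaturated conditions", 12),
    ("Aquifer drawdown near a pumping well", 8),
    ("Groundwater recharge to a confined aquifer", 10),
    ("Tidal mixing in an estuary", 3),
]

for field, average in mean_citations_by_subfield(toy_articles).items():
    print(field, average)
```

A keyword count is the simplest possible stand-in for a classifier; a study on full article text would more plausibly use a trained classifier and would need to normalize citation counts for article age before comparing sub-fields.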