2008 Geoinformatics Conference (11-13 June 2008)

Paper No. 16
Presentation Time: 4:20 PM

GEOINFORMATICS ON THE FRONT LINES: PURDUE'S INAUGURAL GEOINFORMATICS COURSE


MILLER, C.C., Libraries, Purdue University, 2215E Earth & Atmospheric Sciences (EAS) Library, 550 Stadium Mall Dr, West Lafayette, IN 47907 and FOSMIRE, Michael, PSET Libraries, Purdue University, Physics, West Lafayette, IN 47907, ccmiller@purdue.edu

In 2007 the National Science Foundation's (NSF) Cyberinfrastructure Council published its "Cyberinfrastructure Vision for 21st Century Discovery." In it, the NSF lays out a plan for directing and funding initiatives designed to take the many rapid advances in (mostly) large-scale computing architectures, communication and data-transfer protocols, and very large data storage capabilities, and to build from them a coordinated, integrated, interoperable cyberinfrastructure (CI) that, among other virtues, "serves as an agent for broadening participation and strengthening the nation's workforce in all areas of science and engineering" (The Council, 2007, p. 6). Much of what the NSF proposes is grid-level, at the scale and scope one would expect from an NSF response to a revolution. Yet the Council has taken care to acknowledge in its plan that although e-infrastructure is an increasingly vital component of scientific research, these machines still require engineers of a sort: scientists who can care for the data drawn from and fed into them. "In the future," writes the Council, "U.S. international leadership in science and engineering will increasingly depend upon our ability to leverage this reservoir of scientific data captured in digital form" (The Council, 2007, p. 22). As such, "ongoing attention must be paid to the education of the professionals who will support, deploy, develop, and design current and emerging cyberinfrastructure" (The Council, 2007, p. 38).

In other words, just as high-end computing architectures must be relied upon to process, manipulate, and share data, an equally important measure of human finesse goes into the preparation, consumption, interpretation, and care of those data. To this end, faculty from Purdue Libraries and Earth & Atmospheric Sciences cooperated to offer an inaugural course in Spring 2008 entitled "Geoinformatics." The course was designed as a discipline-agnostic overview of emerging trends and issues in geoscience that fall within the purview of geoinformatics, and it was intended to fit into that niche within the emerging universe of cyberinfrastructure that will "both demand and support a new level of technical competence in the science and engineering workforce and in our citizenry at large" (The Council, 2007, p. 37). The instructors intended the course to occupy the hotspot where CI, the semantic web, data futures, and rapidly developing, increasingly geospatially savvy technologies converge: a place where we could push the "workforce" component of the NSF plan (The Council, 2007, ch. 5) and begin teaching the next generation of scientists about the stores of data available in online systems, the power and limitations of those networked tools and data structures, and the importance of good data hygiene.

Data are only as interoperable as the scientists behind them are willing to understand the technologies, meet the standards, and share their work with others. The provenance of data (who collected them, how they were collected, and whether and how they have been verified) is certainly an important factor for researchers to weigh before incorporating external data into their analyses, but it matters at least as much when the time comes to fold data and results back into the community via the growing CI-based sharing and dissemination systems and digital libraries. The instructors therefore emphasized data hygiene and sharing mechanisms throughout the course and contextualized all coursework within the greater world of geodata and geodata issues. Course modules ranged from data collection and massaging (GPS, statistics, and databases) to the more semantic concerns of metadata, ontological structures, and systems designed to assist with data stewardship and sharing.

In this presentation, one of the instructors of Purdue's "Geoinformatics" course, a GIS librarian, will discuss the factors that informed the development of the course; briefly describe its modules, assignments, and technologies; and address the successes and failures of an attempt to introduce the gargantuan world of geoinformatics to a body of 13 students diverse in both background and technical skill.

Infrastructure is only as good as the societies, cultures, and development built upon it. Likewise, future scientists will work within the bothersome strictures that standards and interoperability demand only if they have been trained in them and convinced of the benefits of doing so. To state it in analog terms one last time: even the fantastic technology of the book was lost on the illiterate, and the literacy skills required of scientists in the interdisciplinary, high-grade CI future proposed by the NSF and others are perhaps more nascent than the revolution itself. This is geoinformatics on the front lines. The lessons learned by students will ideally prepare them to move further into their respective domains with eyes open to the opportunities that exist (or will exist) for advancing disciplinary or interdisciplinary research, and the lessons learned by the course instructors speak to the difficulty, and the importance, of moving geoinformatics itself into the future.

REFERENCES CITED

National Science Foundation (U.S.), & Cyberinfrastructure Council (National Science Foundation), 2007, Cyberinfrastructure vision for 21st century discovery: Arlington, VA, National Science Foundation Cyberinfrastructure Council, available online at http://purl.access.gpo.gov/GPO/LPS80410. (Accessed January 12, 2008).

ACRONYMS

NSF – National Science Foundation

CI – cyberinfrastructure