GSA Annual Meeting in Denver, Colorado, USA - 2016

Paper No. 280-1
Presentation Time: 8:15 AM

TRANSDISCIPLINARY RESEARCH IN THE GEOSCIENCES AND BEYOND REQUIRES ACCESS TO ALL DATA BOTH ‘BIG’ AND ‘SMALL’ (Invited Presentation)


WYBORN, Lesley A.I. and EVANS, Benjamin J.K., National Computational Infrastructure, Australian National University, 56 Mills Road, Acton, 2600, Australia, lesley.wyborn@anu.edu.au

In the Geosciences it is generally perceived that ‘Big Data’ is the domain of those who collect large volumes of observational or computed data (e.g., global sensor networks, satellites, airborne surveys). Increasingly, ‘Small Data’ is also gaining recognition as a highly valuable asset: individual surface observations can be used to calibrate ‘Big’ remotely sensed data collections, and multiple small datasets can be concatenated into high-value, standardized ‘Big Data’ sets. Thus, all data, regardless of how it was collected, has the potential to contribute to the new data-rich world and reshape scientific practice.

When modern compute infrastructures (Cloud, HPC) are combined with either centralized or network-accessible domain repositories, it is theoretically possible to integrate diverse ‘Big’ and ‘Small’ datasets to address global and societal challenges. But the management and operation of diverse and often distributed data collections is complex. Ensuring that users can find, evaluate and then utilise data requires benchmarking against emerging frameworks and standards that are designed to make data more discoverable, accessible and usable. Despite decades of effort and some success, technical barriers still exist between data from different geoscience disciplines, making interdisciplinary research extremely difficult, whilst combining geoscience data with data from other domains is too high a barrier for most researchers to realistically overcome.

Addressing modern global and societal challenges requires a transdisciplinary approach. Data collections, both ‘Big’ and ‘Small’, need to be ‘born connected’ across discipline boundaries and beyond academia: this process needs to start at the conception of any data collection program. Researchers across the science disciplines, including the data generators as well as those in secondary data analysis, need to work together to mature the frameworks and standards required to create integrated data platforms that interoperate across discipline boundaries. Such platforms would enable access by a diversity of users, from high-end researchers to undergraduates and policy makers. Further, to engage with the general public, this inclusive activity should include humanities and social science researchers to ensure even further value can be extracted from the data.