LIDAR-IN-A-BOX: SERVING LIDAR DATASETS VIA COMMODITY CLUSTERS

NANDIGAM, Viswanath¹, BARU, Chaitan², CHANDRA, Sandeep¹ and FRANK, Efrat¹, (1)San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, (2)San Diego Supercomputer Center, Univ of California, San Diego, La Jolla, CA 92093-0505, viswanat@sdsc.edu

The Geosciences Network (GEON, www.geongrid.org) is an NSF-funded project to create an IT infrastructure that facilitates collaborative, interdisciplinary science in the earth sciences. GEON supports registration, ingestion, and integration of a range of geoscience data types, including LiDAR (Light Detection And Ranging) data. LiDAR datasets can be used to create high-quality digital earth surface models, which are useful in a variety of geoscience and geospatial applications. The recent, rapid increase in the acquisition rate and popularity of these datasets far outpaces the resources available to most geoscientists for processing and using these data.

GEON provides a novel approach for processing and distributing LiDAR datasets and derived products using a high-performance backend database machine, a portal as the front-end user interface, and the Kepler scientific workflow system for managing the computations. Currently, the LiDAR datasets are stored in an IBM DB2 database running on one of the nodes of DataStar, an IBM supercomputer system that is one of the computational resources in the TeraGrid. This node is linked to a large disk subsystem via a high-end Fibre Channel link. This machine configuration is well suited to handling the massive amounts of LiDAR data, which frequently exceed several million data points per dataset.

We propose a new approach to hosting LiDAR data based on commodity clusters, which can provide a better price/performance solution. This is achieved by taking advantage of DB2's "partitioned database" feature. In this approach, the LiDAR database tables would be partitioned across multiple machines or "nodes". Each partition is managed by its own independent database manager and has its own data, configuration files, indexes, and transaction logs. This architecture provides better scalability: new machines can be added to the complex and the database can be expanded across them. In this paper, we describe this new, parallel database architecture for hosting LiDAR data. We refer to this system as "LiDAR-in-a-Box", since one of the benefits of this approach is that individual researchers will be able to deploy such a system at their own sites. We will describe the approach that will be used to make LiDAR-in-a-Box easily deployable.
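The partitioning idea can be illustrated with a minimal sketch. It is not the GEON implementation and does not use DB2: it assumes each point row carries a hypothetical point ID used as the distribution key, substitutes CRC32 for whatever hash function the database applies internally, and models each node as an in-memory list. It shows rows being spread across nodes and a bounding-box query being answered by having every node filter only its own rows before the results are merged.

```python
import zlib
from collections import defaultdict

# Illustrative only: a row is (point_id, x, y, z). The point_id column and
# the CRC32 hash are assumptions for this sketch, not details from DB2.

def node_for(key: str, num_nodes: int) -> int:
    """Map a distribution key to a node number by hashing it."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes

def distribute(points, num_nodes):
    """Assign every row to exactly one node based on its point_id."""
    nodes = defaultdict(list)
    for pid, x, y, z in points:
        nodes[node_for(str(pid), num_nodes)].append((pid, x, y, z))
    return nodes

def bbox_query(nodes, xmin, ymin, xmax, ymax):
    """Each node filters its own rows; the partial results are then merged."""
    hits = []
    for rows in nodes.values():
        hits.extend(
            r for r in rows
            if xmin <= r[1] <= xmax and ymin <= r[2] <= ymax
        )
    return sorted(hits)

if __name__ == "__main__":
    # Synthetic points, not real LiDAR data.
    points = [(i, float(i), float(2 * i), 0.0) for i in range(100)]
    four = distribute(points, 4)
    eight = distribute(points, 8)  # "adding machines": same rows, more nodes
    print(len(bbox_query(four, 0, 0, 10, 20)))
    print(bbox_query(four, 0, 0, 10, 20) == bbox_query(eight, 0, 0, 10, 20))
```

In this sketch, growing the cluster only changes which node holds each row; query results are unchanged, which is the property that lets the database be expanded across newly added machines.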

Session No. 3--Booth# 2

Geoinformatics Poster Session

Thursday, 17 May 2007: 2:30 PM-4:30 PM

© Copyright 2007 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.


Geoinformatics 2007 Conference (17–18 May 2007)
