Paper No. 2
Presentation Time: 10:00 AM
EFFECTIVE FUTURE USE OF CURRENT REMOTELY SENSED DATA SETS TO STUDY LONG-TERM CLIMATE CHANGES WILL BENEFIT FROM ALTERATION OF PRESENT DATA MANAGEMENT APPROACHES AND DATA PRODUCTION STANDARDS
Determining how a geophysical parameter changes over time requires a way to estimate its value at both ends of the time range under consideration. Measurements being made now will form one end of the data record for studies to be done in the future. Unfortunately, present standards for archiving remotely sensed data sets do not cover all of the information that future scientists will need to understand and effectively use the current data, and, even more unfortunately, the scientists who made the current data sets will be long gone and unable to answer the resulting questions.

The problem is that any given measurement system has both a relative accuracy, or precision, and an absolute accuracy, and the precision of a "validated" measurement system is typically much better known than its absolute accuracy. Trends can be estimated from precise measurements as long as the same measurement system remains in use and the fundamental geophysical situation does not change from the one present when the data set was validated. However, no instrument lasts forever, and improved measurement techniques often lead to fundamentally different ways of measuring the same parameter. Over long time periods, geophysical values that were assumed as inputs to a given measurement system (entered as static climate values rather than measured quantities) may also change. This might not be a great problem if the absolute accuracy of both systems is known and if the trend to be investigated involves changes substantially larger than the difference between the absolute accuracies of the current and future measurement systems; if, for example, each system is absolutely accurate only to ±0.5 units, a cross-system change of 0.3 units cannot be distinguished from a calibration offset between the two systems. Where this condition does not hold, a future scientist could still estimate the size of these changes and their impact on the resulting long-term trend if (but only if) he or she knew exactly how the original data set was made.

Current practice, however, does not require archival of this information. Journal articles, subject to page limitations, describe the approach but not the details of the implementation. Algorithm Theoretical Basis Documents, as currently produced for many of NASA's measurement systems, are longer and more detailed, but they are written before launch and not updated at the end of the mission to reflect the differences between what was intended and what was finally done. The only thing that is sure to be consistent with the archived measurements is the source code that was used to make them. If that code is augmented with specific identification and archival of all of the inputs to the processing (for every production run) and of the exact details of the processing system itself, it becomes possible to replicate and understand the results.

We have developed and are currently running a production system, described in an accompanying presentation, that automates the collection and storage of all of the provenance information described above; a minimal sketch of this kind of per-run provenance record appears below. For a future scientist to do sensitivity studies of the impact of various input and algorithm changes, several more things are necessary, including details of how the tables used in the algorithm were developed and an explanation of what the code was doing at every step of the process. There are structural changes in the current data management approach, and best-practice procedures, that can minimize the above problems; these will be discussed. They are not without cost.
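For concreteness, the following is a minimal sketch, in Python, of the kind of per-run provenance record described above. The ProvenanceRecord class, its field names, and the use of Git commit hashes and SHA-256 content hashes are illustrative assumptions, not a description of the actual production system presented in the accompanying talk.

```python
import hashlib
import json
import subprocess
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


def sha256_of_file(path: str) -> str:
    """Content hash, so an archived input can be matched to the exact file used."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def current_git_commit() -> str:
    """Identify the exact source revision; assumes the processing code lives in Git."""
    return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()


@dataclass
class ProvenanceRecord:
    """What one production run would archive to be replicable later (illustrative)."""
    run_id: str
    started_utc: str
    code_version: str                                       # e.g. a Git commit hash
    input_files: dict = field(default_factory=dict)         # input path -> content hash
    static_parameters: dict = field(default_factory=dict)   # assumed climate values, table versions

    def add_input(self, path: str) -> None:
        self.input_files[path] = sha256_of_file(path)

    def write(self, path: str) -> None:
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=2)


# Hypothetical usage for a single production run:
record = ProvenanceRecord(
    run_id="run-0001",
    started_utc=datetime.now(timezone.utc).isoformat(),
    code_version=current_git_commit(),
    static_parameters={"assumed_surface_emissivity": 0.98,   # a static climate input
                       "calibration_table": "cal_v07"},      # version of a lookup table
)
record.add_input("level1_input.dat")                         # hashed and recorded
record.write("run-0001.provenance.json")
```

Archiving such a record with every production run, together with the exact source code and the tables it references, is what allows a result to be replicated; the sensitivity studies discussed above would additionally require documentation of how the tables and static parameters were derived and of what the code was doing at each step.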
However, since it will be impossible for future scientists to go back in time to make new measurements of past conditions, decisions about implementing the necessary changes and best practices should be made as the result of conscious study rather than by defaulting to the current approach.