Content Innovations
 
NGDA Data Format Registry Findings Overview
   

Our findings are detailed in the Report to National Geospatial Digital Archive Regarding Geospatial Data Treatment in Data Format Registry Efforts.

The project and its challenges are further described in Assessing the Utility of Current Format Registry Efforts for Geospatial Formats, presented at the Society for Imaging Science and Technology (IS&T) Archives 2009 Conference by Nancy Hoebelheinrich (Stanford University Libraries, Stanford, CA, USA) and Natalie K. Munn (Content Innovations, LLC, San Francisco, CA, USA).

About the Project: The National Geospatial Digital Archive (NGDA) project, funded by the Library of Congress’ National Digital Information Infrastructure and Preservation Program (NDIIPP), has been investigating whether existing or planned format registry efforts support, or would support, the often quite complex geospatial data formats that NGDA and other institutions are collecting for long-term preservation.

As members of one of the principal nodes in the network and partners in the NDIIPP-funded project, the NGDA team at the Stanford University Libraries has engaged in research to identify the information that is considered important to gather in order to archive geospatial data over time. Initial research resulted in An Investigation into Metadata for Long-Lived Geospatial Data Formats, a paper by Nancy Hoebelheinrich, et al. documenting the results of an investigation into the need for preservation metadata for geospatial resources. The initial investigation found an assumption that the use of format registries was an implicit and important part of the metadata strategy for most archiving and preservation institutions. Yet, from a cursory review, it was unclear how comprehensively geospatial data could be documented within burgeoning data format registry efforts in the US and the UK. The NGDA team therefore decided to build a wiki-based format registry as a temporary measure until research could be done on the treatment of geospatial data in select data format registry efforts. Content Innovations, LLC & GeoData Analytics LLC were contracted to conduct this research under the direction of Stanford NGDA staff.

Content Innovations & GeoData Analytics researched the treatment of 23 geospatial data formats & 13 format subtypes in key format registries and registry-related efforts such as PRONOM, the Global Digital Format Registry (GDFR), and the Library of Congress’ sustainability factors planning matrix. We analyzed sample data targeted for ingest into NGDA and examined how the target formats were represented in these key registries.

In Appendix C: NGDA Format Registries' Field Map, we compare format registry data models and map common fields and features across the registry efforts of NGDA, PRONOM, GDFR, and the Library of Congress. This effort will aid NGDA either in finalizing its data model for the NGDA format registry or in deciding which format registry is best suited for documenting the geospatial data being archived in the NGDA.

 

 

Content Innovations, LLC
655 Montgomery Street, 5th Floor
San Francisco, CA 94111

v 415.550.0650
f 415.837.3204
email: info@contentinnovations.com