A mechanism to post NADM-related data models and implementations on the NADM web site will be established in order to record NADM-related activities and to provide a basis for NADM evolution. All contributions will be welcome, provided they: (1) supply brief summary documentation and diagrams, and (2) develop in conjunction with DMDT a short comparison of the contribution with NADM 4.3. An initial list of contributions will be developed and posted, followed by a public solicitation for more contributions.
A requirements document is well in progress and will be completed by July 15.
A draft geologic map query language was designed for possible use in tools.
Development of a common lithologic vocabulary is well in progress. Common rock names and their associated descriptions are being identified. Some unresolved data model issues have arisen from this work.
Several existing and new Canadian initiatives are utilizing NADM.
NADM facilitates knowledge construction and representation in geolibraries.
HTML-based interoperability of NADM databases has been successfully prototyped.
An object-oriented NADM variant is prototyped in GE SmallWorld.
An Australian-based XML encoding for geoscience data is well underway.
A mechanism to post NADM-related data models and implementations on the NADM web site will be established in order to record NADM-related activities and to provide a basis for NADM evolution. All contributions will be welcome, provided they: (1) supply brief summary documentation and diagrams, and (2) develop in conjunction with DMDT a short comparison of the contribution with NADM 4.3. A template for the comparison document will be developed by DMDT. An initial list of contributions will be developed and posted, followed by a public solicitation for more contributions. Each contribution will generate a discussion thread on the NADM web site; Peter Schweitzer will aid in posting contributions and adapting the current web site. Initial contributions possibly include:
In order to collect requirements information for a geologic spatial database, geologists from the USGS and state geological surveys from around the country were requested to submit lists of 20 questions they would like to be able to answer using such a system.
Activity
The questions were compiled by Jonathan Matti and organized into a hierarchy of 84 categories. Bruce Johnson reviewed the queries and classified them into groups based on the degree to which the queries could be resolved using the Johnson et al. [1998] data model 4.3 architecture. These categories were:
1 | Queries that can be answered with data in the current Model |
2 | Queries that can be answered with current model, if the user re-classifies rock units |
3 | Queries that could be answered by data in the current model in conjunction with additional data themes |
4 | Queries best answered with additional attributes or minor modifications to the current data model |
5 | Queries that require a separate application +/- additional data (3) |
6 | Queries that require data that is not normally available (10) |
7 | Queries that are not within the purview of the NADM (14) |
8 | Unclear queries (4) |
Within groups 1-5, the queries were classified into one of 48 topical categories, with some hierarchical arrangement of the categories.
Jerry Weisenfluh (JAW) developed an Excel spreadsheet to categorize the queries. Lists of concepts, query forms, classifications, descriptions and spatial concepts necessary to answer the queries were compiled from the list of queries. The spreadsheet includes 566 queries, some of which are compound. Each query includes Bruce Johnson's classification and a classification by JAW that identifies the kind of query according to the type of activity required. Table 1 lists the classes used.
Table 1. Jerry Weisenfluh's query classes
SQ | Simple query: Find spatial objects according to criteria from a single property |
CQ | Compound query: Find spatial objects according to criteria from more than one property |
CALC | Calculation: Perform a calculation on a set of spatial objects |
MD | Metadata: Return metadata information pertaining to a set of objects |
SA | Spatial analysis: Evaluate a question by performing a spatial comparison |
MC | Map classification: Reclassify map objects in order to generalize or differentiate |
CF | Complex function |
AM | Ambiguous query |
?? | Not quite sure what the user wants |
Stephen Richard (SMR) transferred the compilation of queries by Jonathan Matti (20_queries_master_1.pdf) into a Microsoft Access 2000 database, constructed to allow multiple categories to be related to each query. The queries were then reviewed sequentially and analyzed for their component categories. The categories are meant to classify the queries according to the sort of information required to answer the query. After this analysis, the categorized queries were reviewed to generate a distilled list of queries that typify the kinds of information requests. At the same time, a list of classifications, descriptions, relationships and operations required to address the queries was compiled. These lists were presented to the DMDT committee at the meeting.
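The many-to-many relation between queries and categories described above can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module rather than Access; the table names, column names and sample categories are assumptions made for the example and are not taken from the SMR database.

```python
import sqlite3


def build_query_db():
    """Create an in-memory database where each query can have many categories."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE queries (
            query_id   INTEGER PRIMARY KEY,
            query_text TEXT NOT NULL
        );
        CREATE TABLE categories (
            category_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL UNIQUE
        );
        -- Link table: one row per (query, category) pair.
        CREATE TABLE query_categories (
            query_id    INTEGER NOT NULL REFERENCES queries(query_id),
            category_id INTEGER NOT NULL REFERENCES categories(category_id),
            PRIMARY KEY (query_id, category_id)
        );
        """
    )
    return conn


def add_query(conn, query_text, category_names):
    """Insert a query and relate it to each named category (created if missing)."""
    query_id = conn.execute(
        "INSERT INTO queries (query_text) VALUES (?)", (query_text,)
    ).lastrowid
    for name in category_names:
        conn.execute("INSERT OR IGNORE INTO categories (name) VALUES (?)", (name,))
        (category_id,) = conn.execute(
            "SELECT category_id FROM categories WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO query_categories (query_id, category_id) VALUES (?, ?)",
            (query_id, category_id),
        )
    return query_id


def queries_in_category(conn, name):
    """Return the text of every query related to the named category, in insert order."""
    rows = conn.execute(
        """
        SELECT q.query_text
        FROM queries q
        JOIN query_categories qc ON qc.query_id = q.query_id
        JOIN categories c ON c.category_id = qc.category_id
        WHERE c.name = ?
        ORDER BY q.query_id
        """,
        (name,),
    )
    return [row[0] for row in rows]
```

With this layout, a compound query simply gets several rows in the link table, so no query text has to be duplicated to record multiple categories.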
Summary of SMR analysis: 848 total queries, 16 of which I generated during the course of the analysis. 209 of the queries in the JAW table were not identical to queries in the SMR table because one or the other of us edited the text of the query. 496 queries were analyzed into categories based on the sort of information required to answer the query. In 29 of these cases the queries were ambiguous, and significant interpretation of the intention of the query was needed in order to do the analysis. 352 queries were deemed duplicates for the purposes of this analysis. At the end of the analysis, 92 (variously consistent) categories had been identified.
Summary
At this point we have 4 semi-independent analyses of the queries, distilled to varying degrees. The list distributed by SMR at the technical team meeting will be circulated sequentially (order: JAW, Canadians, Jim McDonald, Ron Wahl/Jordan Hastings, Peter Schweitzer, Bruce Johnson, BMB/SMR) to all committee members for review, and should be compared against the analyses made by Weisenfluh, Johnson, and Matti. The review will remove duplication and add any omitted query types, information elements or operations, with the goal of keeping the list as brief as possible while remaining complete. The final list will include:
Circulation of document complete by June 30. SMR will coordinate the circulation. July 15: submit to NADMSC with completed summary attached.
Plan
These lists will serve as requirements for a NADM database. A few pages of explanatory material will be prepared as an introduction to these lists by JAW and SMR (and whoever else wants to contribute), and the text, lists, and the complete list of queries will be presented to the NADM steering committee by July 15, 2001, as a recommendation for a requirements document that provides criteria for evaluating the present and future data models.
A preliminary syntax for the SLTT queries was developed and described in BNF notation. The syntax could be applied in user interfaces for querying geological maps.
Two unresolved data model issues have arisen from SLTT work: (1) the lithologic description of some rocks is non-unique; and (2) multiple rock names may be assigned to the same rock description.
The study of classifications for metamorphic and igneous rocks by SLTT suggests that a rock name vocabulary requires two components: (1) one (or more) hierarchical arrangements of rock names, and (2) a catalog of common rock descriptions that the names refer to. The rock descriptions would consist of salient attributes (e.g. fabric, texture, composition), with each attribute itself drawing from a (hierarchical) list of appropriate terms. The SLTT working groups would then be tasked to
This scheme is beneficial in that normalizing rock descriptions, and segregating them from rock names, permits specific naming conventions to be accommodated while minimizing confusion about rock name meanings, because their definitions are clearly stated. The approach is challenged by the possibility that agreement on the commonality of many rock descriptions may not be achieved (i.e. non-uniqueness of rock descriptions).
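The two-component vocabulary outlined above (a hierarchy of rock names plus a separate catalog of rock descriptions) might be represented roughly as follows. This is only a sketch under stated assumptions: the class names, the description identifier `D1`, and the sample names and attribute terms are hypothetical and do not come from the SLTT work.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RockDescription:
    """One catalog entry: salient attributes, each drawn from a term list."""
    description_id: str
    attributes: Dict[str, str]  # e.g. {"texture": "phaneritic"}


@dataclass
class RockName:
    """A node in a rock-name hierarchy that points at catalog descriptions."""
    name: str
    parent: Optional["RockName"] = None
    description_ids: List[str] = field(default_factory=list)

    def lineage(self) -> List[str]:
        """Return the names from the root of the hierarchy down to this name."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]


def names_for_description(names: List[RockName], description_id: str) -> List[str]:
    """List every rock name that refers to the given description."""
    return sorted(n.name for n in names if description_id in n.description_ids)


# Hypothetical sample data, for illustration only.
catalog = {
    "D1": RockDescription("D1", {"composition": "felsic", "texture": "phaneritic"}),
}
igneous = RockName("igneous rock")
plutonic = RockName("plutonic rock", parent=igneous)
granite = RockName("granite", parent=plutonic, description_ids=["D1"])
granitic = RockName("granitic rock", parent=plutonic, description_ids=["D1"])
all_names = [igneous, plutonic, granite, granitic]
```

Because names only reference descriptions by identifier, two naming conventions can share one description, which is exactly the situation raised as unresolved issue (2); `names_for_description(all_names, "D1")` makes such cases easy to list.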
Background
The Federal (GSC), Provincial and Territorial surveys all want to capitalize on the Internet as a vehicle for raising public awareness of geoscience, and for disseminating geoscience data and information to both traditional and new audiences. Their association, the National Geological Surveys Committee (NGSC), has endorsed the idea of a collaborative initiative, the Canadian Geoscience Knowledge Network (CGKN), to share the development effort required and to provide a consistent interface to the surveys' individual information holdings.
Guidelines for development of CGKN
Priority data types
Current Projects
Metadata
Bedrock Geology
Surficial Geology
Geochemistry
This project has been underway for two years to develop a "standard" data model for geochemical data (see http://geochem.gsc.nrcan.gc.ca/). It continues this year to develop web-accessible tools to facilitate the input of Canadian geochemical data into databases based on the data model. The ultimate goal is to make the geochemical data holdings of NGSC agencies available on line.
Geophysics and Mineral deposits
Preliminary work is underway to assemble inter-agency project teams to develop requirements for data models.
XML
A proposal has been submitted for funding to develop XML transfer standards for mineral occurrence, geochemical and geophysical data. If approved, a contractor will start by developing UML models for the mineral occurrence and geochemical data sets of provincial/territorial agencies, and potential field geophysical data in GeoSoft grid format. From the UML models the contractor will develop XML schemas, and test them against real data. Finally, tools will be developed to generate XML encoded files for data transfer.
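The last step of that plan, generating XML-encoded files for data transfer, could look something like the sketch below. It uses Python's standard `xml.etree.ElementTree` module. The element and attribute names (`MineralOccurrence`, `commodity`, `location`, and so on) are invented for illustration, since the proposed schemas do not exist yet.

```python
import xml.etree.ElementTree as ET


def encode_occurrence(record: dict) -> str:
    """Serialize one mineral occurrence record to an XML string.

    The element names are hypothetical placeholders, not a proposed standard.
    """
    root = ET.Element("MineralOccurrence", attrib={"id": str(record["id"])})
    ET.SubElement(root, "name").text = record["name"]
    ET.SubElement(root, "commodity").text = record["commodity"]
    ET.SubElement(
        root,
        "location",
        attrib={"lat": str(record["lat"]), "lon": str(record["lon"])},
    )
    return ET.tostring(root, encoding="unicode")


def decode_occurrence(xml_text: str) -> dict:
    """Parse the XML produced by encode_occurrence back into a record."""
    root = ET.fromstring(xml_text)
    location = root.find("location")
    return {
        "id": int(root.get("id")),
        "name": root.findtext("name"),
        "commodity": root.findtext("commodity"),
        "lat": float(location.get("lat")),
        "lon": float(location.get("lon")),
    }
```

Round-tripping a record through both functions is a simple way to test a candidate encoding against real data before committing to a schema.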
Coordination
The overall approach is to develop discipline-specific models that can be linked rather than a single monolithic model. To ensure that individual data models are compatible, and there is minimal duplication between them, oversight for the whole process is essential. The way this is to be done is evolving, and includes:
Other data modelling activities
The Public Petroleum Data Model (PPDM) is used by some agencies for sub-surface (well) information. There is overlap with NADM in some types of information, such as stratigraphy and lithology. For the latter, NADM's SLTT is well ahead, but the latest version of PPDM (3.5) has a comprehensive model for stratigraphic information that might benefit NADM. PPDM has also started work on the 3D spatial enabling of well data.
Conclusion
Significant in-kind and actual funding has been committed to developing the CGKN data infrastructure for 2001/02. Further funding has been requested from Geoconnections to accelerate the building of the geoscience component of the Canadian Geospatial Data Infrastructure (CGDI). Progress is accelerating on several fronts. Coordination of the many components is recognized as a major challenge that remains to be fully addressed.
The conceptual foundations and rationale behind the CordLink digital library were presented.
GSC Québec had the opportunity to test an interoperable implementation of two NADM 5.2 databases. The implementation is very preliminary and has limited functionality, but it offers a quick-win solution to an immediate problem. The requirement was to allow relatively autonomous databases to be queried using a central concept tree (COA). The system works with a central database of COA items tagged with globally unique IDs. Each local database instance maps its own local COA tree to the global tree and correlates its local concepts with globally defined concepts. Since most NADM attributes can be related in some way to a COA, pieces of information can be extracted by referencing a commonly known COA. This mechanism gives local database instances some flexibility to expand the global tree to fit their own needs.
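The mapping of local concepts to a shared, globally identified concept tree can be illustrated with a small sketch. It is not the GSC Québec implementation: the identifiers (`coa:001` and so on), class and function names, and sample records are all assumptions made for the example, and the global tree is flattened to a lookup table for brevity.

```python
from typing import Dict, List, Tuple

# Central COA: globally unique id -> concept label (illustrative values only).
GLOBAL_COA: Dict[str, str] = {
    "coa:001": "lithology",
    "coa:002": "sedimentary rock",
    "coa:003": "limestone",
}


class LocalDatabase:
    """An autonomous database whose local concepts are mapped to global ids."""

    def __init__(
        self,
        name: str,
        concept_map: Dict[str, str],      # local concept -> global COA id
        records: List[Tuple[str, str]],   # (record id, local concept)
    ) -> None:
        unknown = set(concept_map.values()) - set(GLOBAL_COA)
        if unknown:
            raise ValueError(f"unmapped global ids: {sorted(unknown)}")
        self.name = name
        self.concept_map = concept_map
        self.records = records

    def find(self, global_id: str) -> List[str]:
        """Return ids of records whose local concept maps to the global id."""
        local_ids = {
            local for local, glob in self.concept_map.items() if glob == global_id
        }
        return [rec_id for rec_id, local in self.records if local in local_ids]


def federated_query(
    databases: List[LocalDatabase], global_id: str
) -> Dict[str, List[str]]:
    """Ask every local database for records matching one global concept."""
    return {db.name: db.find(global_id) for db in databases}
```

A local database can add finer-grained concepts of its own (for example two local limestone subtypes) and map both to the same global id, which is one way to read the flexibility described above.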
An object-oriented (OO) variant of the NADM 4.3 data model was developed and implemented using the GE SmallWorld GIS and map information from the Kentucky GS. This work represents a prototype approach to the development of the US National Geologic Map Database. Benefits related to the object-oriented design and its implementation were observed. Benefits realized from the object-oriented design include:
Benefits related to the GE SmallWorld implementation include:
The XMML project is developing a general-purpose XML encoding for geoscience data. XMML is an extension/specialisation of GML, the Geography Markup Language developed by the OpenGIS Consortium. It is based on an object model for geospatial features, compatible with modern GIS concepts and ISO. Because the encoding uses XML, it is highly compatible with the web infrastructure and generic B2B systems. Because it is based on OGC standards, it will be compatible with WFS and GML conformant servers and clients.
The project is being coordinated by CSIRO, and is sponsored by several mining-industry organisations, and also by the geological surveys in Australia. We are pursuing active collaboration with additional organisations from the geoscience sector. This will be particularly important for some vocabularies. The current sponsors have determined the priority order for the development of specific features, starting with samples and drillholes. However, the patterns used are being designed to support general applications in the geosciences, and implementation of additional specialised feature types is straightforward. The model is designed to be application neutral, rather than directly mapping onto the internal data model of any specific processing system. It is likely to be best used as a transfer or archive format, reflecting a "snapshot" of an extract from a datastore.
For more information, see http://www.ned.dem.csiro.au/XMML/.