Generic issues

Fri May 5 13:43:47 2000

A comment by Jonathan C. Matti about

Some generic issues to consider

by Jonathan C. Matti

Response by Jonathan C. Matti to the Generic Concepts memo

(a) can we really develop common science-language standards on a
continent-wide basis?

**Yes, but only through free and open communication within the map-making and
map-using community.

Several SLTT participants have cautioned that "a standard is not a law", and
the standard does not necessarily have to be adhered to. This raises some
intriguing questions:

(1) For whom is the standard intended? Those who make map data bases for the
USGS? AASG? GSC? Canadian Provincial Surveys? Academia? Industry?
(2) If for public-agency use (data origination), do the agencies ENFORCE the
standard through peer review, digital review, and editorial review? Or is
adherence to standards left to map-maker discretion?
(3) In the case of the US, will science-language standards be a necessary
filter for archiving digital-map products in the National Geologic Map Data
Base? If so, then does the "standard" become, in fact, not just a standard
of reference but a requirement?
(4) Do we inevitably raise the specter of "big brother" modulating how we
express our science?
(5) Perhaps the answer lies in the notion of "data base" versus "data". It
seems to me that as soon as we enter the world of systematically populated,
managed, and distributed data bases, then we necessarily give up some of the
freedom that formerly allowed us to disseminate "data" using the conventions
we were taught or that we developed experientially. In return for the
benefit of exchanging data sets efficiently and consistently, we probably
will have to give up some of the freedom that we formerly had to express
ourselves scientifically. (But this would apply only in the data-base
sense, not in the science-analysis sense.)

--------------------------------------------------------------------------------------------------

(b) can we really do this at a level deeper than "granite versus basalt" or
"glacial versus deltaic" or "geologic contact versus fault", etc.?

**Yes

In discussions with many of my USGS colleagues, I have observed surprising
consensus on fairly deep levels of rock and structure classification,
terminology, hierarchy, descriptive versus interpretive factors, etc. This
consensus occurred on subjects ranging from the mundane (grain-size
classifications, bedding-thickness classifications, petrographic
terminology) to the complex and interpretive (language for tectonic and
depositional environments, geologic structures, etc.).

--------------------------------------------------------------------------------------------------

(c) what role do regional geologic differences and geologic-mapping
traditions play in the development of science-language standards?

**This is an interesting question, because if regional traditions and
purviews hold sway over national or international uniformity and
consistency, then the notion of a continental standard is violated. The
SLTT is populated by public-agency geoscientists from a broad geographic
distribution and from diverse backgrounds and cultures. We will have to see
what level of disparity develops due to this diversity.

I have talked about this with colleagues, and each of them has a different
take on the role of regional tradition.

Consider, for example, the terms "arkose" and "wacke". I learned these
terms as an undergraduate in the 1960s. However, in graduate school they
were hammered out of me by strong-minded sedimentary geologists who pointed
out the inconsistency of usage (arkose) and the contested origin of
between-grain "matrix" (one hallmark of wacke). Nevertheless, geologists more
enlightened than myself may continue to use these terms. Should they be
incorporated into the rock-classification standard in lieu of other terms,
or should they be eliminated from the standard in deference to more
up-to-date or more widely accepted classification of feldspar-rich
sedimentary rock (arkose) or "matrix"-bearing sandstone (wacke)?

Moreover, who is to say that the petrographic sandstone classification of
Folk is more enlightened than that of McBride or Pettijohn or Friedman or
Dickinson?

The IUGS commission on the nomenclature of igneous rocks came to some sort
of closure on igneous classification, and much of the world now uses this
scheme. But what about "charnockite" and "mangerite" and "jotunite"? Are
these terms abandoned? I believe they actually are, in terms of the IUGS
standard, unless some sort of equivalency table is linked to the IUGS
classification.

Unfortunately, an IUGS Commission has not been established for each of the
geoscience-terminology arenas that we must enter.

We are it, for the moment.

So, with charnockite as an example, once we get deep enough in the hierarchy
of any single terminology tree, we may have to establish equivalency tables
that allow regional traditions to flourish in accompaniment to the
continental standard.
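
As a minimal sketch of what such an equivalency table might look like in a
data base, the Python fragment below maps regional terms onto terms in a
continental standard and back again. The table name, the function names, and
the specific mappings are assumptions made for illustration; the equivalents
shown are placeholders that a petrologist would need to confirm, not a vetted
IUGS correlation.

```python
# Hypothetical equivalency table: regional term -> term in the standard.
# The mappings are illustrative placeholders, not a vetted correlation.
EQUIVALENTS = {
    "charnockite": "orthopyroxene granite",
    "mangerite": "orthopyroxene monzonite",
    "jotunite": "orthopyroxene monzonorite",
}


def to_standard(term):
    """Return the standard term for a regional term, or the term unchanged."""
    key = term.strip().lower()
    return EQUIVALENTS.get(key, key)


def regional_synonyms(standard_term):
    """Return the regional terms that map onto a given standard term."""
    return sorted(r for r, s in EQUIVALENTS.items() if s == standard_term)
```

With a table of this kind, a map maker could keep entering "charnockite"
while a continent-wide query on the standard term still finds the unit, and
the regional term is never lost.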

--------------------------------------------------------------------------------------------------

(d) should there be one single terminology standard, or multiple standards
linked by translators and equivalency tables?

**To me, the deeper we go down a particular classification tree, the greater
the potential for unresolvable usage traditions and shadings of usage. I
suggest we start at the top of each tree, and work our way down until
conflicts start to develop, then see what happens. If we do our job
correctly, we should be able to develop consensus within our SLTT group, and
ultimately within the geoscience community that reviews our efforts. If no
consensus can be reached, then do we make arbitrary judgments and decisions
that may fly in the face of regional traditions?

--------------------------------------------------------------------------------------------------

(e) what kinds of scientific queries should be supported by standard
terminologies at the National, Regional, and Local levels, and should a
single science-language structure support each and all levels?

**This is a restatement of (d), essentially. The queries posed to date by
our 20-questions exercise (posted at this web-conference site) are ones that
can be posed at local through continental levels. The resultant scientific
terminology supporting these queries needs to be as consistent and uniform
as possible to the deepest level possible.

--------------------------------------------------------------------------------------------------

(f) To what audience(s) will the data-model science language speak on behalf
of our various agencies? Technical only? Hybrid technical and
non-technical? One language for technical, a second language for
non-technical?

**To me, our primary audiences should include the following (highest
priority to lowest priority): (1) the experienced Master's-level geologist
working in coordination with the land-use manager (having biologic,
hydrologic, geographic, engineering, mineral-resource, and geologic-hazard
background); (2) the geotechnical-consultant community; (3) the Ph.D.-level
research community; (4) lay managers and program types. In my opinion, the
lay public (interested in where they might build a house or learn about
earthquakes or threatened-tortoise habitat) should have access to our data
bases, but through a separate portal.

To achieve this audience diversity and ordering, I believe the content,
richness, and detail of our geologic-map language standard should be pitched
toward the descriptive side, should be fairly high-end in terms of detail,
and should cater to solving applied geoscience problems while at the same
time supporting more curiosity-driven searches and applications.

We might want to consider two data-base portals: (1) the principal one
leading the informed expert down a pathway toward progressively more
technical detail and interpretation, (2) the other leading the informed lay
audience down a pathway that answers general questions not requiring much
geoscience background. For this pathway, outreach specialists can design
user-friendly translator tables and generalizing tables. Users could cross
back and forth between lexicons as their interests or skills permit.
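
A translator or generalizing table of this kind could be sketched as follows.
This is only an illustration under stated assumptions: the parent links, the
lay wording, and the function name are all hypothetical, and a real table
would be designed by outreach specialists. A technical term is walked up its
classification tree until a term with lay wording is found.

```python
# Hypothetical parent links in a classification tree (child -> parent).
PARENT = {
    "quartz sandstone": "sandstone",
    "sandstone": "sedimentary rock",
    "granite": "igneous rock",
}

# Hypothetical lay-audience wording for a few higher-level terms.
LAY_TERMS = {
    "sedimentary rock": "rock formed from layers of sediment",
    "igneous rock": "rock formed from cooled molten material",
}


def generalize(term):
    """Walk up the tree until a term with lay wording is found."""
    current = term
    while current is not None:
        if current in LAY_TERMS:
            return LAY_TERMS[current]
        current = PARENT.get(current)  # None once the top of the tree is passed
    return term  # no lay wording available; fall back to the technical term
```

Because the lay wording hangs off the same tree as the technical terms, the
lay portal remains a derivative of the expert lexicon and not a second,
independently maintained vocabulary.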

In my view, the lay-audience portal is a derivative product. But it is an
essential product, because the politicians of the world are part of the lay
audience and they must be persuaded to appreciate the power of our
geologic-map data bases.

--------------------------------------------------------------------------------------------------

(g) What does each map-producing agency expect to query (search for and
retrieve) from geologic-map data bases produced by the data model? (agency
point of view)

**Agencies that support geoscience in the public interest should be pleased
by the range of queries in the 20-questions exercise. I believe complete
overlap exists between the kinds of data-base queries that agencies would
require and those we as individual geoscientists would require (see h,
below). Do any of you see any differences? Are we missing something here?
How about those of you who have had agency-management experience: does the
20-questions exercise reflect programmatic requirements within your agency
or constituency?

--------------------------------------------------------------------------------------------------

(h) What kind of geologic information will the typical geologist expect to
put INTO the data model and retrieve FROM it? (geologist point of view)

**The range of queries in the 20-questions exercise indicates the diverse
interests that we SLTT participants represent. We as individuals are the
builders (and users) of the data bases, so I guess our answers speak for
themselves.

--------------------------------------------------------------------------------------------------

(i) What kinds of interdisciplinary science should be incorporated into the
data model science language? Or, put differently, how should the data model
be structured and populated to ensure its utility to the geophysics,
geo-engineering, earthquake, geochemical, and hydrogeologic communities?

**In my opinion, the sub-disciplines of geoscience (geophysics,
geochemistry, geo-engineering, hydrogeology, biogeology) should be able to
extract from our science-language structure the basic kinds of data they
require in order to at least begin their use of digital geologic-map data.
Some of us on the SLTT in fact are geophysicists and geochemists, and
several have geo-engineering strengths. Some of their data-base queries
reflect their needs and requirements.

To what degree is the SLTT obligated to develop common language to support
these communities? In my opinion, fully obligated. It becomes a matter of
finding out what the geologic communities need to extract from our data
bases. And that is what the SLTT is about, in my opinion.

For example: do we restrict penetration-resistance data to only those data
generated by standardized procedures (the SPT procedures championed by
H. Bolton Seed and his colleagues), or do we incorporate data generated by any
kind of penetration test? Or only those data in which the testing
procedures are described adequately (thereby eliminating some
early-generation data)?
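
The three alternatives in this question can be written down as admission
policies, which makes the consequence of each choice easy to see. The sketch
below is hypothetical throughout: the record fields, the policy names, and
the function are assumptions for illustration, not part of any existing data
model.

```python
from dataclasses import dataclass


@dataclass
class PenetrationRecord:
    site: str
    standardized: bool          # produced by a standardized procedure such as SPT
    procedure_described: bool   # testing procedure documented adequately


def admit(records, policy):
    """Return the records that a given (hypothetical) policy lets in."""
    if policy == "standardized_only":
        return [r for r in records if r.standardized]
    if policy == "described_only":
        return [r for r in records if r.procedure_described]
    if policy == "any":
        return list(records)
    raise ValueError(f"unknown policy: {policy}")
```

Under "described_only", an early-generation test with no documented procedure
drops out even though it may be the only data available at that site, which
is the trade-off the question raises.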

How about geochemical language?

What are our responsibilities here?

--------------------------------------------------------------------------------------------------

(j) What kinds of feature-level locational-accuracy issues should be
addressed by our science language, as these bear on agency accountability?

(k) What kinds of feature-level scientific-confidence issues should be
addressed by our science language, as these bear on agency accountability?

(l) What kinds of feature-level data-origination issues should be addressed
in our science language, as these bear on agency accountability?

**As respondents have indicated, these three questions relate to "truth in
advertising": what is the quality of the geologic-map data? I haven't
detected a trend in the responses yet, but it seems that some of you believe
this is the purview of the data-base builder (map maker) and the data-base
model people, not the purview of science language.

But look at some of the queries that came forward: there are many that link
specific geologic questions with feature-level "metadata" questions. I
believe our deliberations must include discussions about feature-level
metadata, not in a technical FGDC sense but in the sense of major categories
of data quality.

These include:

(1) X/Y positional accuracy
(2) Scientific confidence
(3) Method of feature identification
(4) Source of compiled information

I propose that we spend some time discussing the language of these metadata
issues, because they are part of our feature-level science.

Consider, for example, the following terms:

Contact, certain
Fault, certain
Fold axis, certain
Fault, approximate
Fold axis, approximate
Fault, inferred

What does "certain" mean in each of these cases? "Approximate"? "Inferred"?

I, for one, would never want to attribute a geologic-map data base or its
derivative plot with the appellation "Fault, certain". Lawyers would have a
field day with that nomenclature, no matter how carefully I defined the term.

Does "certain" refer to the fault's "existence", or to its "location"?
Does "inferred" refer to "locational integrity" or to "scientific integrity"?
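
One way to avoid the ambiguity is to stop packing existence and location into
a single word and to carry the four categories listed above as separate
attributes on each feature. The record below is a minimal sketch under that
assumption; the class names, field names, and confidence levels are
hypothetical and are not drawn from any existing data model or FGDC standard.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "high"
    MODERATE = "moderate"
    LOW = "low"


@dataclass
class FeatureQuality:
    feature_type: str                  # e.g. "fault", "contact", "fold axis"
    existence_confidence: Confidence   # scientific confidence that it exists
    positional_accuracy_m: float       # X/Y positional accuracy, in meters
    identification_method: str         # how the feature was identified
    source: str                        # source of compiled information

    def label(self):
        """Build a plot label that states existence and location separately."""
        return (
            f"{self.feature_type}: existence {self.existence_confidence.value}, "
            f"located within {self.positional_accuracy_m:g} m"
        )
```

A label built this way says what is known about existence and what is known
about location, without asserting that anything is "certain".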

I believe we need to spend some time considering these feature-level
metadata issues, because ultimately, data-quality and data-origin issues are
as important as data-meaning issues (definitions).

No matter how controversial and conflictual such subjects are, I believe we
map-making and map-using geoscientists have a responsibility to recommend to
our agencies what kinds of data-quality issues and their attendant language
are appropriate to protect agency accountability. If we don't make these
recommendations, who will?
