The authors of geographic information standards at European (CEN), American (FGDC) and international (ISO) levels, and the implementation of these standards in the Universal Geospatial Metadata Manager (GeM+).
Other international initiatives:
The OpenGIS Consortium
Comments on the OGC Abstract specification Topic 11: Metadata.
Implementation of the OpenGIS Abstract Specifications in GeM+.
Other (Spanish) national initiatives:
MIGRA v.1: (Mecanismo de Intercambio de Información Geográfica Relacional formado por Agregación).
MIGRA metadata.
Implementation of MIGRA in GeM+.
The Comité Européen de Normalisation (CEN) has the role of promoting standards in all fields within Europe. It is constituted by the national standards authorities of: Austria, Belgium, The Czech Republic, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Italy, Luxembourg, The Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom, as well as representatives of the following industry associations:
At the end of October 1991 the CEN created the Technical Committee CEN/TC 287 with responsibility for standardisation in the field of geographic information. The European Norm (EN) produced by this Technical Committee will replace the previous norms in the above-mentioned countries.
The Technical Committee 287 (TC 287) known as Geographic Information, has published a total of 12 documents in pre-normative form (ENV or experimental norms). These documents are not public. To obtain a copy it is necessary to request it from the corresponding national authority - in Spain AENOR (Asociación Española de Normalización y Certificación) - and to pay the established price. The documents are not available in digital form or through the Internet. It is foreseen that, once these standards become national norms (UNE) they will be translated from English (to Spanish, for instance), but MiraMon has been developed with reference to the English versions.
The Universal Geospatial Metadata Manager (GeM+) gathers together specifications contained in the following documents:
Other norms that, because of their content, also affect the normalisation of the GeM+ metadata are:
Outlines of the contents of each document produced by the Comité are available.
STRUCTURE OF THE CEN METADATA DOCUMENT
The European Prestandard (ENV) 12657, Geographic Information - Data description - Metadata, was approved by the CEN in October 1998 initially with a validity of 3 years. After 2 years the members of the CEN are invited to express their opinions, particularly in relation to the conversion of this pre-normative document into a European norm. This was expected in October 2000, and the initial period of validity ended in October 2001.
The document is structured in four chapters and three annexes. The metadata list is contained in chapter 4, section 4.6 Elements of metadata for describing geographic datasets. Annex A, which is normative, is the description of the metadata in the EXPRESS language. This language is defined in ISO 10303-11:1994 and was created to define entities through their attributes, as well as through the relations between them. Annexes B and C are informative. The first is another description of the metadata but in EXPRESS-G, that is in diagram form, whilst Annex C is the complete list of fields with their descriptions, their limitations (whether or not they are obligatory), their cardinality and their type. The latter is limited to distinguishing between strings and dates. (It is not meant to be exhaustive).
During the development of GeM+ special attention was given to clarifying the real utility of the EXPRESS language, as it seemed to be excessively complicated. The trend, led by ISO, to use the Unified Modelling Language (UML) for the conceptual schemas and XML as the interchange format has been followed.
The document begins with a simple and classical definition of Metadata ('data about datasets') and establishes, as an aim, the specification of the data that will document geographic datasets. These include information about the content, the representation, the extension (geographic and in time), the spatial reference, the quality and administrative information. It also defines the minimum content (mandatory) that metadata should have and it includes a glossary with definitions of the terms used.
It does not provide either instructions or techniques for applying the scheme since, as explained, this is not the aim of the document. Moreover, it defines itself as being designed mainly for describing digital data, while stating that its principles are also applicable to describing geographic datasets in other formats, such as paper.
There are two types of metadata: obligatory (or mandatory) and optional. The obligatory are indicated by being preceded by the word "shall" whilst the optional are preceded by "may".
For a metadata set to conform to this norm, it should include the keys defined as mandatory in the following chapters:
and it should optionally provide information about the contents in the following:
All the dates given in the text follow the EN 28601 norm, i.e., Year-month-day (YYYY-MM-DD).
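As a minimal sketch of how an application might check this date format, the following Python function accepts only strings written as YYYY-MM-DD that are also real calendar dates. The function name is_valid_metadata_date is a hypothetical helper chosen for illustration; it is not taken from the CEN document or from GeM+.

```python
import re
from datetime import datetime

# Exactly four digits, two digits, two digits, separated by hyphens.
_DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_metadata_date(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    # The pattern check rejects unpadded forms such as "1998-1-5",
    # which strptime alone would accept.
    if not _DATE_PATTERN.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        # Correct shape but impossible date, e.g. month 13 or 30 February.
        return False
    return True
```

For instance, is_valid_metadata_date("1998-10-31") is accepted, whereas "98-10-31" and "1998-02-30" are rejected.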
The language used by the texts in the dataset descriptions is to be identified by the codes defined in ISO 639. Where required these may be repeated in other languages.
GeM+ is based on the European standard since it incorporates as "obligatory" all of the contents considered as such by the CEN, but it also follows the American standard since it also incorporates as obligatory everything defined as either Mandatory or Mandatory if applicable by the FGDC.
Consult a summary table showing the equivalences between GeM+ and the CEN and FGDC standards.
The FEDERAL GEOGRAPHIC DATA COMMITTEE
The Federal Geographic Data Committee (FGDC) is the branch of the US Federal Government charged with developing the National Spatial Data Infrastructure in those areas related to the distribution of geographic data. In June 1994 the Content Standard for Digital Geospatial Metadata was approved, the latest revision of which is available at http://www.fgdc.gov/metadata/csdgm/. The version used for reference when developing GeM+ can also be consulted in PDF format.
Organization of the Standard
The standard is organized in a hierarchy of data elements and compound elements that define the information content for metadata to document a set of digital geospatial data. The starting point is "metadata" (section 0). The compound element "metadata" is composed of other compound elements representing different concepts about the data set. Each of these compound elements has a numbered section in the standard. In each numbered section, these compound elements are defined by other compound elements and data elements. The section "contact information" is a special section that specifies the data elements for contacting individuals and organizations. This section is used by other sections, and is defined once for convenience.
*Other general sections like this are the Citation Information and the Time Period Information.*
Compound Elements
A compound element is a group of data elements and other compound elements. All compound elements are described by data elements, either directly or through intermediate compound elements. Compound elements represent higher-level concepts that cannot be represented by individual data elements. The form for the definition of compound elements is:
Compound element name -- definition.
Type: compound
Short Name:
The type of "compound" uniquely identifies the compound elements in the lists of terms and definitions. Short names consisting of eight alphabetic characters or less are included to assist in implementation of the standard.
Data Elements
A data element is a logically primitive item of data. The entry for a data element includes the name of the data element, the definition of the data element, a description of the values that can be assigned to the data element, and a short name for the data element. The form for the definition of the data elements is:
Data element name -- definition.
Type:
Domain:
Short Name:
The information about the values for the data elements includes a description of the type of the value, and a description of the domain of the valid values. The type of the data element describes the kind of value to be provided. The choices are "integer" for integer numbers, "real" for real numbers, "text" for ASCII characters, "date" for day of the year, and "time" for time of the day.
The domain describes valid values that can be assigned to the data element. The domain may specify a list of valid values, references to lists of valid values, or restrictions on the range of values that can be assigned to a data element.
The standard categorizes elements as being mandatory, mandatory-if-applicable, or optional as follows:
The optionality of a section or compound element always takes precedence over the elements that it contains. Once a section or compound element is recognized by the data set producer as applicable, then the optionality of its subordinate elements is to be interpreted.
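The precedence rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Element class, the missing_elements function and the example_child_* names are hypothetical, and the presence of a section in the record is taken to mean that the producer has recognised it as applicable. Only the two section names are taken from the standard.

```python
from dataclasses import dataclass, field

MANDATORY = "mandatory"
IF_APPLICABLE = "mandatory-if-applicable"
OPTIONAL = "optional"


@dataclass
class Element:
    name: str
    optionality: str
    children: list = field(default_factory=list)  # empty for data elements


def missing_elements(element, record, path=""):
    """List the mandatory elements that are absent from a nested dict."""
    full_name = f"{path}/{element.name}" if path else element.name
    if element.name not in record:
        # An absent element is reported only if it is itself mandatory.
        # Its subordinate elements are not examined at all.
        return [full_name] if element.optionality == MANDATORY else []
    missing = []
    for child in element.children:
        missing.extend(missing_elements(child, record[element.name], full_name))
    return missing


# Illustrative schema: child names are placeholders, not FGDC elements.
SCHEMA = Element("metadata", MANDATORY, [
    Element("Identification Information", MANDATORY, [
        Element("example_child_1", MANDATORY),
        Element("example_child_2", MANDATORY),
    ]),
    Element("Data Quality Information", IF_APPLICABLE, [
        Element("example_child_3", MANDATORY),
    ]),
])

RECORD = {"metadata": {"Identification Information": {"example_child_1": "some value"}}}
print(missing_elements(SCHEMA, RECORD))
```

In this sketch the mandatory child of the omitted "Data Quality Information" section is not reported as missing, because the optionality of the section takes precedence over that of its contents.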
Production Rules
A production rule specifies the relationship between a compound element, and data elements and other (lower-level) compound elements. Each production rule has a left side (identifier) and a right side (expression) connected by the symbol "=", meaning that the term on the left side is replaced by or produces the term on the right side. Terms on the right side are either other compound elements or individual data elements. By making substitutions using matching terms in the production rules, one can explain higher-level concepts using data elements. The symbols used in the production rules have the following meaning:
= is replaced by, produces, consists of
+ and
[|] selection - select one term from the list of enclosed terms (exclusive or). Terms are separated by "|"
m{}n iteration - the term(s) enclosed is(are) repeated from "m" to "n" times
() optional - the term(s) enclosed is(are) optional
Only for terms bounded by parentheses does the producer have the discretion of deciding whether or not to provide the information.
The variation among the ways in which geospatial data are produced and distributed, the fact that all geospatial data does not have the same characteristics, and the issue that all details of data sets that are in work or are planned may not be decided, caused the need to express the concept of "mandatory if applicable." This concept means that if the data set exhibits (or, for data sets that are in work or planned, it is known that the data set will exhibit) a defined characteristic, then the producer shall provide the information needed to describe that characteristic. This concept is described by the production rule:
0{ term }1
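To make the substitution idea more concrete, here is a minimal Python sketch that stores a few production rules, writes them out in the notation described above, and expands a compound element into the data elements it ultimately consists of. The rule names (dataset_description, contact, and so on) and the functions render and expand are hypothetical and chosen only for illustration; they are not taken from the FGDC standard. Selection and parenthesised optional terms are left out for brevity.

```python
# Each rule maps a compound element to the terms on its right side.
# A term is (name, m, n): the term is repeated from m to n times.
# (1, 1) is a plain term; (0, 1) expresses "mandatory if applicable".
RULES = {
    "dataset_description": [("title", 1, 1), ("contact", 1, 1), ("keywords", 0, 1)],
    "contact": [("person", 1, 1), ("phone", 1, 3)],
}


def render(name, rules):
    """Write one production rule using the '=', '+' and m{}n symbols."""
    parts = []
    for term, low, high in rules[name]:
        if (low, high) == (1, 1):
            parts.append(term)
        else:
            parts.append(f"{low}{{{term}}}{high}")
    return f"{name} = " + " + ".join(parts)


def expand(name, rules):
    """Substitute rules recursively until only data elements remain."""
    if name not in rules:
        return [name]  # no rule for this name, so it is a data element
    leaves = []
    for term, _low, _high in rules[name]:
        leaves.extend(expand(term, rules))
    return leaves


print(render("dataset_description", RULES))
print(expand("dataset_description", RULES))
```

Here render produces "dataset_description = title + contact + 0{keywords}1", and expand replaces "contact" by its own right side, giving the data elements title, person, phone and keywords.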
The content is structured in the following way:
Identification Information | Mandatory |
Data Quality Information | Mandatory if applicable |
Spatial Data Organization Information | Mandatory if applicable |
Spatial Reference Information | Mandatory if applicable |
Entity and Attribute Information | Mandatory if applicable |
Distribution Information | Mandatory if applicable |
Metadata Reference Information | Mandatory |
The International Organization for Standardization (ISO) is the international body responsible for standardization. Its name is a word play on the Greek prefix meaning "the same in all aspects". It has its headquarters in Geneva and it integrates the national standards authorities of 80 countries, including the US ANSI and the AENOR in Spain, to give just two well-known examples.
ISO has created a Technical Committee, TC 211 Geographic information/Geomatics, whose work is based on the standards previously developed by the FGDC, the CEN and the OpenGIS Consortium.
The projects forming part of TC 211 in May 2000 were:
19101 (15046-1): Geographic information - Reference model
19102 (15046-2): Geographic information - Overview
19103 (15046-3): Geographic information - Conceptual schema language
19104 (15046-4): Geographic information - Terminology
19105 (15046-5): Geographic information - Conformance and testing
19106 (15046-6): Geographic information - Profiles
19107 (15046-7): Geographic information - Spatial schema
19108 (15046-8): Geographic information - Temporal schema
19109 (15046-9): Geographic information - Rules for application schema
19110 (15046-10): Geographic information - Feature cataloguing methodology
19111 (15046-11): Geographic information - Spatial referencing by coordinates
19112 (15046-12): Geographic information - Spatial referencing by geographic identifiers
19113 (15046-13): Geographic information - Quality principles
19114 (15046-14): Geographic information - Quality evaluation procedures
19115 (15046-15): Geographic information - Metadata
19116 (15046-16): Geographic information - Positioning services
19117 (15046-17): Geographic information - Portrayal
19118 (15046-18): Geographic information - Encoding
19119 (15046-19): Geographic information - Services
19120 (15854): Geographic information - Functional standards
19120/Amendment 1: Geographic information - Functional standards - Amendment 1
19121 (16569): Geographic information - Imagery and gridded data
19122 (16822): Geographic information/Geomatics - Qualifications and Certification of Personnel
19123 (17753): Geographic information - Schema for coverage geometry and functions
19124 (17754): Geographic information - Imagery and gridded data components
19125: Geographic information - Simple feature access - SQL option
19126: Geographic information - Profile - FACC Data Dictionary
19127: Geographic information - Geodetic codes and parameters
A summary of the content of each of these projects can be found at http://www.statkart.no/isotc211/scope.htm
The work schedule of TC 211 has evolved from one working meeting to another. According to the FGDC web page, during the last plenary meeting, which took place between 6 and 10 March 2000 in Cape Town, South Africa, the following forecasts for the ISO Metadata Standard were established:
3rd Committee Draft Issued: May 2000 (restricted access)
Draft International Standard: October 2000
Final Draft International Standard: March 2001
International Standard: July 2001
The draft documents are not accessible by the general public. Unless the ISO policy changes, the documents that are accessible are only available in paper form, and must be paid for at prices set by the respective national standards authorities such as AENOR in Spain. Only the publications index is available through the Internet, as well as the possibility of requesting documents. Curiously, U.S. users are provided with a password that gives them free access to these documents.
With regard to telematics standards, the ISO has its equivalent in the ITU-CCITT (ITU, the International Telecommunication Union; CCITT, Comité Consultatif International Télégraphique et Téléphonique) whose standards are freely available through the Internet. In some cases, draft versions of documents are also available in this way.
Despite the problems of access to ISO documents, there are a number of addresses providing various documents and some ISO norms, like, for instance, ftp.uni-erlangen.de.
Future versions of GeM+ will incorporate the ISO norms once they are published. This is not expected to greatly affect the substance of GeM+, since it already implements the European and American standards on which the ISO work is based.
The OpenGIS Consortium (OGC) is an international organisation founded in 1994. It currently has 210 members, including prestigious universities and research centres, as well as software and telecommunications companies (Autodesk, Inc., Bentley Systems, Inc., Intergraph Corporation, MapInfo Corporation, Microsoft Corporation and many more). US organisations with responsibility for producing cartography, such as the US Army ERDC, the US Census Bureau, the US Federal Geographic Data Committee, the US National Aeronautics and Space Administration-NASA, the US National Imagery and Mapping Agency and many others are also active participants.
The OpenGIS Consortium participates, at the same time, in the various standardisation initiatives, notably the TC 211.
The aim of the OGC is to put an end to problems originating in the exchange of information between operating systems, between DCPs (Distributed Computing Platforms) and between different user groups with their own specific data models.
The Technical Committee (TC) is charged with developing the so-called OpenGIS Interface Specifications. It is organised in Special Interest Groups (SIGs), which are, in turn, subdivided in two broad categories according to the tasks they perform: Domain Technologies and Core Technologies. Their functions are briefly explained:
Feature SIG: Develops consensus on feature geometry, identity, and relationships.
Coordinate Transformation SIG: Develops strategies for encoding of coordinate systems and transformations between coordinate systems.
The following text is illustrative of the work carried out by the SIGs (Special Interest Groups) and the subsequent process:
Each SIG develops a white paper describing the SIG's purpose and scope of work. SIGs develop "use cases," or detailed scenarios which 1) identify a range of actors (people and systems) and services (for example, in imaging: rectification, orthorectification, registration, differencing, feature detection, image-to-image comparison, synthetic photo generation, etc) and 2) identify where services appear in the scenario, and where value is generated. Interface requirements are captured in a formal software modelling language (UML) and become part of the OpenGIS Abstract Specification, a high level specification independent of computing platforms.
The OpenGIS(r) Abstract Specification is a living document subject to changes and additions at each OGC Technical Committee Meeting. Only members of OGC can formally propose changes and additions.
Specifications can be downloaded from: http://www.opengis.org./techno/specs.htm
It is worth repeating that these specifications may be subject to change but that such changes can only be initiated by members of the OGC.
Some examples of Abstract Specification topics:
Topic 1 - Feature Geometry version 4
Topic 2 - Spatial Reference Systems
Topic 11 - Metadata
On publication of the OGC Abstract Specifications there follows the Request for Proposals (RFP), which are the intermediate step towards the Implementation Specifications:
OpenGIS(r) Simple Features Specification for CORBA, Revision 1
OpenGIS(r) Simple Features Specification for SQL
The OGC also publishes Recommendations:
To date only one recent (12 May 2000) recommendation has been published, with the title Geography Markup Language (GML).
Comments on the OGC Abstract specification Topic 11: Metadata
The version currently linked from the OGC website is dated 3 March 1998. The latest modification dates from May 1999. This document defines the Essential Model and the Abstract Model for metadata.
The final selection of which metadata entities and elements to associate with each Feature and Feature Collection is left to each data producer and/or geospatial information community. This selection of metadata elements and entities will often be somewhat different for different feature types and different Feature Collection purposes. This OGC document leaves the definition of standardized metadata elements and entities to other organizations, including ISO/TC 211, Federal Geographic Data Committee (FGDC), and more specific geospatial information community organizations.
This document is subject to change until its contents allow the Implementation Specification to be produced.
Implementation of the OpenGIS Abstract Specifications in GeM+
The metadata work of the OGC is based entirely on the parallel developments of the ISO which, in turn, inherit the work carried out in the context of the CEN and the FGDC.
Other national initiatives
MIGRA is an initiative created by the Spanish national standards authority AENOR (Asociación Española de Normalización y Certificación) and the National Committee for Cooperation between Public Administrations in the realm of IT, the COAXI (Comisión Nacional para la Cooperación entre las Administraciones Públicas en el campo de los Sistemas y Tecnologías de la Información), which brings together the Ministry of Public Administrations (MAP) and the Federation of Municipalities and Provinces (FEMP).
MIGRA v.1 was developed from January to September 1996 and since January 1997 it has been an experimental national standard (UNE exp.), which should be revised after three years. (It should have been revised in January 2000, but at the time of writing there was no news of this revision.) This revision should consider replacing the norm with the European or international standard available at that time. A summary note can be found at http://www.map.es:80/csi/pg5m51.htm.
Principal Characteristics:
MIGRA is a digital geographic data transfer standard that includes:
It only applies to vector data. It makes no references to raster data except in the introduction where it states that existing commercial formats are sufficiently well-known for there to be no problems when exchanging data.
Indirect reference systems are not considered either; it deals only with reference systems based on coordinates.
MIGRA does not deal with queries in GIS nor the transfer of possible query responses, nor mechanisms for transferring updates.
It uses the EXPRESS language (ISO 10303-11:1994), as adapted by the CEN for the definition of conceptual data models. The choice of EXPRESS is related to maintaining coherence with other European norms even though, as noted by its authors in the introduction to the EXPRESS conceptual model, it is not the most suitable language for disseminating and implementing the MIGRA standard.
The MIGRA data model defines four levels of structure:
The norm includes an example of each level of structuring.
The MIGRA catalogue of elements is made up of:
This gathers together the classes of elements used by the National Geographic Institute (Instituto Geográfico Nacional - IGN) and the Directorate General of the Cadastre (CGC), as well as those added by the Federation of Municipalities and Provinces (FEMP).
The coding criteria follow a hierarchical model with three levels: Theme, group and subgroup.
The MIGRA physical model specifies that all files use fixed-length records and fields, except the metadata, and that the relational tables are in Third Normal Form. The latter is intended to ensure that the tables do not contain redundant data, and that anomalies do not arise when updating, deleting or inserting data.
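As a rough illustration of what reading a fixed-length record involves, the Python sketch below slices a line into fields by position. The layout (field names and widths) is entirely hypothetical and is not taken from the MIGRA norm; only the general technique is shown.

```python
# Hypothetical layout: (field name, width in characters).
# These names and widths are assumptions made for this example only.
LAYOUT = [("code", 6), ("name", 20), ("x", 12), ("y", 12)]


def parse_record(line, layout):
    """Split one fixed-length record into a dict of stripped field values."""
    expected_length = sum(width for _name, width in layout)
    if len(line) != expected_length:
        raise ValueError(
            f"record has {len(line)} characters, expected {expected_length}"
        )
    fields = {}
    position = 0
    for name, width in layout:
        # Each field occupies a fixed slice; padding spaces are removed.
        fields[name] = line[position:position + width].strip()
        position += width
    return fields


sample = "010203" + "River".ljust(20) + "430000.00".rjust(12) + "4580000.00".rjust(12)
print(parse_record(sample, LAYOUT))
```

Because every field has a known width, no separators are needed, but a record of the wrong length cannot be interpreted safely, which is why the sketch rejects it.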
The contents of the fields in MIGRA format archives may be of the following types:
Should a field contain either of the last two types, then the following rule is applicable:
The difference between MIGRA and the CEN standard is that MIGRA does define a format for the metadata files. In this format there are two classes of records:
Although it does not say so explicitly, this format, like the GeM+ format, faithfully follows the Windows INI format, with just two exceptions:
The abbreviations NA and ND are used whenever they apply and no variable should have a blank space as its value.
MIGRA does not specify whether or not sections and variables are obligatory, or what type of cardinality they are organised into. The sections are the following:
[PRODUCTOR](producer)
[DISTRIBUIDOR](distributor - if different from the producer)
[DATOS](data)
[CONTENIDO](content)
[NOTAS](notes)
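Since the format is described as closely following the Windows INI format, a file of this kind could in principle be read with a standard INI parser. The Python sketch below uses the standard library configparser module for this. The section names come from the list above, but the variable names (nombre, telefono, titulo, escala) and their values are hypothetical, and the two exceptions to the INI format mentioned earlier are not handled here, so real MIGRA metadata files may need additional treatment.

```python
import configparser

# Hypothetical sample: only the section names are taken from MIGRA.
SAMPLE = """\
[PRODUCTOR]
nombre = Example Mapping Agency
telefono = ND

[DATOS]
titulo = Example dataset
escala = NA
"""


def load_metadata(text):
    """Parse INI-style metadata text into {section: {variable: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names exactly as written
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def blank_variables(metadata):
    """Return (section, variable) pairs whose value is empty."""
    return [
        (section, name)
        for section, variables in metadata.items()
        for name, value in variables.items()
        if value.strip() == ""
    ]


metadata = load_metadata(SAMPLE)
print(metadata["PRODUCTOR"]["telefono"])
print(blank_variables(metadata))
```

The blank_variables check reflects the rule that no variable should be left blank and that the abbreviations NA or ND are used instead.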
Within the [CONTENIDO] (content) section can be found information referring to the thematic and quality attributes. These are, in summary:
Thematic attributes
Two subsections are detailed:
[FICHERO_DE_ATRIBUTOS_1](attribute file 1)
This is the physical description of the attribute file: length and number of records, size in bytes, etc.
[ATRIBUTOS_DEL_FICHERO_1](file attributes 1)
This is the specific information on each attribute:
Quality
The quality criteria applicable for data exchange are, according to MIGRA, those referring to:
The criteria established for data quality are: