Note: This document must be considered as a development document only endorsed by the authors.
- Preface
- Introduction
- Scope
- Normative References
- Terms and Definitions
- Compliance Encoding Rule
- Schema Conversion Rule
- Application Schemas
- Types
- Feature types
- Data Types
- ISO 19103
- ISO 19107 - Geometry types
- ISO 19108 - Temporal types
- ISO 19115 - Metadata types
- ISO 19139 - Metadata XML implementation types
- Other types
- Flattening of nested structures
- Replace abstract types in properties by Identifier
- Union Types
- Enumerations and Code lists
- Voidable
- Alternate Structures for specific types or type hierarchies
- Properties
- Association Roles
- Naming modification
- Dataset Encoding Rule
- Schema Conversion Rule
- Flattened Dataset Encoding Rule
This document is based on an outcome of Action 2017.2 on alternative encodings for INSPIRE data. Specifically, it is based on the initial template for proposing new encodings and the ongoing work in the INSPIRE UML-to-GeoJSON encoding rule.
A GeoPackage is an open, standards-based, platform-independent, portable, self-describing, compact format for transferring geospatial information within a a SQLite database. The GeoPackage Encoding Standard describes a set of conventions for storing the following within a SQLite database:
- vector features
- tile matrix sets of imagery and raster maps at various scales
- attributes (non-spatial data)
- extensions
The GeoPackage Encoding Standard has been developed by the Open Geospatial Consortium. The current version is 1.2.1 (2018-09-06). The GeoPackage Encoding Standard standard defines the schema for a GeoPackage, including table definitions, integrity assertions, format limitations, and content constraints. The required and supported content of a GeoPackage is entirely defined in the standard. These capabilities are built on a common base and the extension mechanism provides implementers a way to include additional functionality in their GeoPackages.
Since a GeoPackage is a database container, it supports direct use. This means that the data in a GeoPackage can be accessed and updated in a native storage format without intermediate format translations. GeoPackages that comply with the requirements in the standard and do not implement vendor-specific extensions are interoperable across all enterprise and personal computing environments.
Within INSPIRE, this encoding rule can be used to encode large datasets from several, complex themes without information loss in a single, portable file. Any tool able to access to SQLite database can access and update attribute data within a GeoPackage. ArcGIS, QGIS, FME, GeoTools are desktop and server tools that implement the GeoPackage Encoding Standard and provide access to GeoPackages. GDAL (C/C++/Python) and NGA GeoPackage Libraries (Java/Objective-C/JavaScript) and SDK (Android/iOS/Web) can be used for storing and access data from GeoPackages in Desktop, server and mobile environments.
This sections describes the scope of the INSPIRE UML-to-GeoPackage encoding rule.
This encoding rule has two parts:
- The Compliance encoding rule, intended for INSPIRE compliance.
- The Flattened Dataset encoding rule, intended for data usability.
The Compliance encoding rule can be used for:
- Creating a GeoPackage file with a INSPIRE schema for being used as template for implementers (Schema Conversion Rule).
- Encoding a dataset in a GeoPackage file using such template (Dataset Encoding Rule).
The Flattened Dataset encoding rule can be used for flattening a dataset encoded with the Compliance encoding rule without loss of relevant information.
The Compliance encoding rule specifically addresses INSPIRE compliance. This rule intended to ease INSPIRE data compliance checking procedures.
The Flattened Dataset encoding rule specifically addresses data usability in desktop client software, such as ArcMap and QGIS. This rule optimizes usage of a INSPIRE dataset for mapping and geoprocessing in such applications.
For each theme to be covered, an empty GeoPackage template is provided. The GeoPackage template is a GeoPackage instantiated with the schema for a specific INSPIRE theme.
These can reside in subfolders in the same repository as this general encoding rule, but can also be maintained in other places.
This encoding rule addresses specific technical issues that have been problematic when using the default encoding:
- Most GIS software cannot fully make use of non-simple attributes and nested structures for styling, processing and filtering;
- Multiple values per (complex) properties cannot be used fully in ArcGIS and other GIS tools;
- References to other features often cannot be resolved by GIS tools; Properties of referenced features cannot be used in styling or for filtering;
- Abstract geometry types for an object mean that a wide range of different geometries can be used for any single feature class;
- Mixed geometry types in a FeatureCollection are usually not supported.
The encoding uses the OGC® GeoPackage Encoding Standard - with Corrigendum version 1.2.1 with the following extensions:
- GeoPackage Non-Linear Geometry Types
- Metadata
- Schema
- Related Tables Extension
- WKT for Coordinate Reference Systems
TIN and 3D geometries are not supported yet, however they can be stored as BLOB
.
Raster (native, as tile matrix) and coverage (extension) data support has not been tested yet.
The Implementing Rules on interoperability of spatial data sets and services (Commission Regulation (EU) No 1089/2010) lays down the following requirements for encoding rules:
Article 7 -- Encoding
- Every encoding rule used to encode spatial data shall conform to EN ISO 19118. In particular, it shall specify schema conversion rules for all spatial object types and all attributes and association roles and the output data structure used.
- Every encoding rule used to encode spatial data shall be made available.
D2.7 specifies more detailed requirements and recommendations for encoding rules. The following list lists the requirements from that document and shows which ones are also met in this encoding rule:
- Requirement 12: ... documents shall be required to be encoded using UTF-8 as character encoding.
D2.7 also contains a relevant recommendation:
- Recommendation 3: Encoding rules should be based on international, preferably open, standards.
This section contains references to standards documents and related resources.
- OGC® GeoPackage Encoding Standard - with Corrigendum version 1.2.1
- [OGC GeoPackage Related Tables Extension][geopackage-related]
- INSPIRE Drafting Team Data Specifications. D2.7: Guidelines for the encoding of spatial data, Version 3.3
Terms and Definitions can be found in the Glossary document.
INSPIRE defines the conceptual model using UML. The default encoding rule maps this UML model to XML Schema.
In the GeoPackage encoding rule, we take a two-step approach, where we apply model transformations on the level of the conceptual model to transform INSPIRE constructs into conceptual GeoPackage constructs, and then we materialize these into a GeoPackage template per theme.
Theme-specific GeoPackage Encoding Rules may provide a formal definition of the encoded data structures as normative SQL statements and abstract test suites.
Executable test suites may use these to test the correctness of data elements (tables, columns, or values) and SQL constructs (views, constraints, or triggers).
For GeoPackage, SQL queries to GeoPackage metadata tables (e.g. gpkg_contents
, gpkg_spatial_ref_sys
) can be used to perform simple validation on GeoPackages.
A GeoPackage file
may encode one or more <<applicationSchema>>
packages related to a spatial data theme.
Deriving a GeoPackage file
from application schemas implies the creation of SQL statements for feature tables
, attribute tables
, user-defined mapping tables or views
and related metadata.
The application schemas to be encoded may include references to types, enumerations and code lists defined in other packages in attributes and navigable associations, which in turn may include additional references.
The process of inclusion of references is recursive and requires a stop criteria.
The logical boundary of the encoding of an application schema is defined by the set of <<featureTypes>>
defined in other packages that can be reached from items defined in the application schema.
This criterium defines a tangible boundary where the GeoPackage file makes sense as transport file of a dataset defined by one or more application schema.
Given a package A
with the stereotype <<applicationSchema>>
, its encoding as GeoPackage file
must include:
- Concrete spatial object types.
- Concrete data types.
- Enumerations and code lists.
- Types, enumerations and code lists reachable recursively except those that can only be reached through a
<<featureType>>
defined in an<<applicationSchema>>
different fromA
.
References to <<featureTypes>>
in the logical boundary are replaced by the type Identifier
.
This approach is similar to the role of XLink
in the GML
specification to implement referencing.
Model transformation rule:
Given a set
root
of<<applicationSchema>>
packages, this rule determines which types are candidates to be encoded in aGeoPackage file
.
Let
GeoPackage
be the set that contains all<<featureType>>
,<<dataType>>
,<<union>>
,<<codeList>>
and<<enumeration>>
found in packages ofroot
.Let
Boundary
be an empty set.Let
Frontier
be the set of the types referenced by means of attributes and associations, local or inherited, of elements of the set ofGeoPackage
not included in the setGeoPackage
.If
Frontier
is not empty:
- Add the elements of the
Frontier
bearing the stereotype<<featureType>>
toBoundary
.- Add the remaining elements of
Frontier
toGeoPackage
.- Go to step
3
.For each
item
inGeoPackage
do:
- Replace each reference of
item
to a type inFrontier
by the typeIdentifier
.The set
GeoPackage
contains the items to be encoded using this rule.
The GeoPackage encoding standard mandates in Requirement 30 that feature tables
(or views) shall only have a single geometry column.
This is advantageous because allowing multiple (or none) geometries per feature table would compromise GeoPackage's position in the GIS application ecosystem.
However, there are features types in INSPIRE models with none or multiple geometries.
Therefore, all concrete types that have the stereotype <<featureType>>
can be converted to GeoPackage content with the following considerations:
- Where a class has one attribute with a geometry type and a maximum multiplicity of
1
, the class will be mapped to afeature table
and the type of this attribute will be mapped to a GeoPackage geometry type (ISO 19107 - Geometry types). - Where a class has no geometry attributes (for example
Addresses::ThoroughfareName
), the class will be mapped to aattribute table
(see Data types). A GeoPackagefeature table
shall have one geometry column. - Where a class has several attributes with a geometry type, all with a maximum multiplicity of
1
, and only one is not<<voidable>>
, the class will be mapped to afeature table
, all<<voidable>>
geometry attributes are moved to newfeature type
classes named as the name of the original class plus an underscore (_
) plus the name of the attribute (e.g.CadastralParcel_referencePoint
,Borehole_downholeGeometry
) and an 1:1 association role is added between each new class and the original. [1] - Otherwise, i.e. a class has more than one non voidable geometry attribute or the maximum multiplicity is greater than
1
, a theme-specific profile has to define which geometry attribute is the default geometry and how to map the additional geometry attributes. A GeoPackagefeature table
shall have only one explicit geometry column.
[1] Since a GeoPackage is a relational database, it is recommended for associating values relating to the same feature in different feature tables
the use of the same row id
to link them.
All concrete types that have the stereotype <<dataType>>
can be converted to GeoPackage content with the following considerations:
- Where a class has no geometry attributes (for example
Addresses::ThoroughfareNameValue
), the class will be mapped to a GeoPackageattribute table
if, after applying the all the encoding rules, the class is yet used as type in some property, i.e. the class participates in at least onerelation table
derived from a property. - Otherwise, the rules for Feature types must be applied because
attribute tables
are intended for data that do not have an explicit geometry attribute.
All property types are transformed to the simple types that GeoPackage knows about. The exact mapping from the UML model to the GeoPackage datatypes is outlined in the following table:
Table: ISO 19103 to GeoPackage datatype mapping.
UML Model property type | GeoPackage datatype | SQLite storage class | Conversion Notes |
---|---|---|---|
CharacterString |
TEXT |
TEXT |
Unicode [1] |
URI |
TEXT |
TEXT |
Unicode |
Boolean |
BOOLEAN |
INTEGER |
0 for false and 1 for true |
Integer |
INTEGER |
INTEGER |
|
Real |
REAL |
REAL |
|
Number |
REAL |
REAL |
Alternative encoding [2] |
Decimal |
REAL |
REAL |
|
Date |
DATE |
TEXT |
ISO8601 date strings YYYY-MM-DD . |
DateTime |
DATETIME |
TEXT |
ISO-8601 date/time in UTC strings YYYY-MM-DDTHH:MM:SS.SSSZ |
UnitOfMeasure |
TEXT |
TEXT |
Constraint enum [3] |
[1] SQLite stores all text as Unicode characters encoded using either UTF-8 or UTF-16 encoding. The SQLite API contains functions which allow passing and retrieving text using either UTF-8 or UTF-16 encoding. These functions can be freely mixed and proper conversions are performed transparently when necessary.
[2] Implementers may use TEXT
if a dataset needs it; values stored in SQLite as TEXT
can be used in SQL CAST
expressions to convert them into INTEGER
and REAL
values. If the text:
- Looks like an integer and the value is small enough to fit in a 64-bit signed integer, then the result will be
INTEGER
. - Looks like floating point (there is a decimal point and/or an exponent) and can be losslessly converted back and forth between IEEE 754 64-bit float and a 51-bit signed integer, then the result is
INTEGER
. - Any other case yields the closest
REAL
result.
Any UML Model property whose Integer
or Real
values may overflow the limits of the respective SQLite storage class (i.e. INTEGER
, REAL
) should be mapped to TEXT
, with specific rules being defined on a case-by-case basis in each theme profile.
[3] The syntax and value space is the same of gml:UomIdentifier
.
The legal values of the new property are the literals of the data column constraint enum GML_UomIdentifier
.
This restriction may be enforced by SQL triggers or by code in applications that update GeoPackage data values.
Any other UML Model property type is to be mapped to TEXT
, with specific rules being defined on a case-by-case basis in each theme profile.
The unit of measurement attribute (uom
) on any property x
has to be retained.
It is transformed to a new property with the name x_uom
that will be mapped to a column of type TEXT
with the enum constraint GML_UomIdentifier
.
ISO 19107 defines a set of Geometry types, which need to be mapped to the types available in GeoPackage.
In a GeoPackage, features are geolocated using a geometry subset of the SQL/MM (ISO 13249-3).
The supported geometry model is as follows:
GEOMETRY
(abstract) subtypes arePOINT
,CURVE
(abstract),SURFACE
(abstract) andGEOMCOLLECTION
.CURVE
(abstract) subtypes areLINESTRING
,CIRCULARSTRING
andCOMPOUNDCURVE
.SURFACE
(abstract) subtype isCURVEPOLYGON
.CURVEPOLYGON
subtype isPOLYGON
.GEOMETRYCOLLECTION
subtypes areMULTIPOINT
,MULTICURVE
andMULTISURFACE
.MULTICURVE
subtype isMULTILINESTRING
.MULTISURFACE
subtype isMULTIPOLYGON
.
The correspondence of concepts of this geometry subset with concepts of the geometry model of ISO 19107 are described in ISO 19125-1:2004.
The tables below contain a the mapping of ISO 19107 used in INSPIRE UML models.
Table: ISO 19107 types used in models and its standard GeoPackage datatype mapping.
ISO 19107 type | GeoPackage datatype |
---|---|
GM_Curve |
CURVE |
GM_MultiCurve |
MULTICURVE |
GM_MultiSurface |
MULTISURFACE |
GM_Object |
GEOMETRY |
GM_Point |
POINT |
GM_MultiPoint |
MULTIPOINT |
GM_Polygon |
POLYGON |
GM_Surface |
SURFACE |
Table: ISO 19107 types used in models with non standard datatype mapping.
Application Schema | ISO 19107 type | GeoPackage datatype | Conversion Notes |
---|---|---|---|
Building 3D | GM_Solid |
BLOB |
Add the column geometrySolid_content_type . Store the 3D data in the column of type BLOB and specify the format in geometrySolid_content_type . [1] |
ElevationTIN | GM_Tin |
BLOB |
Store the TIN encoded in WKB. [2] |
Environmental Monitoring Facility | GM_Boundary |
POLYGON |
Used as type in EnvironmentalMonitoringActivity.boundingBox |
Hydro - Physical Waters, Hydrogeology | GM_Primitive |
GEOMETRY |
Restricted to POINT , CURVE or SURFACE and their subtypes. |
[1] The GeoPackage specification does not support 3D data, but it is a established practice in GeoPackage to store any kind of content along its content type.
The proposed mapping may imply the use of an attribute table
instead of a feature table
if the feature type has no additional geometry columns.
[2] The GeoPackage specification does not mention Triangle (GM_Triangle
), PolyhedralSurface (GM_PolyhedralSurface
) or TIN (GM_Tin
).
However the GeoPackage geometry blob format is based on ISO WKB, and hence can represent these geometric objects.
It is assumable that a future GeoPackage extension may include them as supported geometry types.
The proposed mapping may imply the use of an attribute table
instead of a feature table
if the feature type has no additional geometry columns.
If a specific dataset requires a geometry type that cannot be mapped to an instantiable type in the GeoPackage geometry model (either the Core model or an extension such as the Non-linear geometry type extension) and a standard encoding for the geometry is available, use the same strategy applied for GM_Solid
for such dataset.
In the GeoPackage 1.2.1 specification, Requirement 32 feature table
geometry columns shall contains geometries of the type or assignable for the type specified for the column in the gpkg_geometry_columns
table.
However, it is under consideration for the GeoPackage 1.3 specification (see here) that feature table
geometry columns shall contains only geometries of the type specified for the column in the gpkg_geometry_columns
table.
There will be two exceptions:
GEOMETRY
, then the feature table geometry column may contain geometries of any allowed geometry type.GEOMETRYCOLLECTION
then the feature table geometry column may contain zero or more geometries of any allowed geometry type.
This future change, when officially adopted, may lead to changes in this encoding rule.
For types from ISO 19108 used in INSPIRE schemas, suitable mappings need to be found on a case-by-case basis.
For types from ISO 19115 used in INSPIRE schemas, suitable mappings need to be found on a case-by-case basis.
Table: ISO 19115 to GeoPackage datatype mapping.
ISO 19115 type | Attribute | GeoPackage datatype | Constraints | Conversion Notes |
---|---|---|---|---|
URL |
TEXT |
Table: ISO 19139 to GeoPackage datatype mapping.
ISO 19139 type | Attribute | GeoPackage datatype | Constraints | Conversion Notes |
---|---|---|---|---|
URI |
TEXT |
Table: Other types to GeoPackage datatype mapping.
Type | GeoPackage datatype | Constraints | Conversion Notes |
---|---|---|---|
Short |
INTEGER |
||
Long |
INTEGER |
The complex structure of model elements are reduced by applying a recursive flattening method. The principle of the flattening is to derive a flat model structure by moving the nested child elements to its parent. The elements are renamed to represent the former element path in the name of the resulting element and to avoid naming conflicts. The cardinality of the derived elements should be calculated from the cardinalities of the former element path. The remaining properties of the original element must be copied to the derived elements.
This model transformation rule does not handle cardinalities greater than 1.
Model transformation rule:
This rule flatten properties with maximum occurrence of 1 which whose type has not been already mapped to a GeoPackage type (e.g. ISO 19107 geometric types), it is not a
<<union>>
,<<codeList>>
or<<enumeration>>
, or is the base typeIdentifer
.Recursively go down through the complex model structure as follows. Given a candidate property
x
of typeB
in a classA
that can be flattened, it is replaced by the properties of the classB
as follows:
- Each property
y
ofB
is copied into classA
and then:
- The new property is renamed to
x_y
.- The minimum multiplicity mew property is the minimum of the minimum multiplicities of
x
andy
.- The remaining characteristics of the property
x
are copied to the new property.- Finally, the property
x
is removed from typeA
.This process is performed recursively until the rule can no longer be applied.
Where an abstract <<featureType>>
with multiple concrete sub-types is used as a property type, the property is renamed to provide a hint and the type is replaced by Identifier
.
This approach is similar to the role of XLink
in the GML
specification to implement referencing.
Model transformation rule:
This rule modifies properties that references an abstract
<<featureType>>
.
- For each property
x
ofA
whose type is an abstract<<featureType>>
B
.
- The property
x
type is changed toIdentifier
.- The property
x
is renamed tox_B
.
This rule flattens union
types.
A union
type is structured data type without identity where exactly one of the properties of the type is present in any instance.
The property x
in A
is replaced with the properties of the union
B
, i.e the union
type B
is flattened when:
- the value type of property
x
of typeA
isB
. - the maximum multiplicity of the property
x
is1
.
The process is as follows:
- Each property of type
B
(i.e.y
) is copied into typeA
and then:- Its renamed to
x_y
. - Its minimum multiplicity is set to
0
. - The remaining characteristics of
x
are copied to it.
- Its renamed to
- Next, the property
x
is removed from typeA
.
The restriction that exactly one of the properties of the type is present in any instance may be may be enforced by code in applications that update GeoPackage data values.
The use of enumeration values and references to code lists are common in the INSPIRE data specifications.
An enumeration is a fixed list of possible values.
A code list is a extendable list of possible values.
They are represented using properties whose type is the class with the stereotype <<enumeration>>
and <<codeList>>
.
The schema extension provides a means to describe the columns of tables in a GeoPackage with more detail than can be captured by the SQL table definition directly.
The gpkg_data_columns
table contains descriptive information about columns in tables.
The gpkg_data_column_constraints
table contains data to specify restrictions on basic data type column values.
This encoding rule uses the above extension and creates a column enum constraint associated to each list in the conceptual model, records its allowed values as column constraints, provides a link to an external definition, and uses human readable values in property values.
Model transformation rule:
This rule transforms classes with the stereotype
<<codeList>>
or<<enumeration>>
to GeoPackage data column constraints table of typeenum
.Given a class with the stereotype
<<enumeration>>
, a row is added to thegpkg_data_column_constraints
for each possible value of the code list. This row contains:
- In the column
constraint_name
, the name of the class in lowercase.- In the column
constraint_type
, the string"enum"
.- In the column
value
, the value.- In the column
description
, a URI that points to a description of the value, usually"http://inspire.ec.europa.eu/enumeration/<class_name>/<value>"
.Given a class with the stereotype
<<codeList>>
, a row is added to thegpkg_data_column_constraints
for each possible value of the code list. This row contains:
- In the column
constraint_name
, the name of the class in lowercase.- In the column
constraint_type
, the string"enum"
.- In the column
value
, the value.- In the column
description
, a URI that points to a description of the value, usually"http://inspire.ec.europa.eu/codelist/<class_name>/<value>"
.Given a property with maximum cardinality of 1 that references an enumeration or code list.
- The type of the property is mapped to
TEXT
.- The corresponding row of the
gpkg_data_columns
that describes the column that implements such property (identified by the coordinatestable_name
,column_name
) has in the columnconstraint_name
the name of the enumeration or the code list in lowercase.The encoding of higher cardinalities is discussed in extract primitive array.
This rule extends MT008 - Simplified Codelist Reference to enumerations.
If a property has exactly a cardinality of 1
but has the stereotype <<voidable>>
, it is mapped to a column
that allows null
values.
Remaining uses of the stereotype <<voidable>>
in properties are ignored.
Authoritative descriptions of the reasons for void values in the INSPIRE Registry (VoidReasonValue code list) are encoded using the metadata extension as follows:
Table: Records in gpkg_metadata.
id |
md_scope |
md_standard_uri |
mime_type |
metadata |
---|---|---|---|---|
1 |
attribute |
http://www.isotc211.org/2005/gmd |
text/xml |
Content of http://inspire.ec.europa.eu/codelist/VoidReasonValue/Unknown/Unknown.en.iso19135xml |
2 |
attributeType |
http://www.isotc211.org/2005/gmd |
text/xml |
Content of http://inspire.ec.europa.eu/codelist/VoidReasonValue/Unpopulated/Unpopulated.en.iso19135xml |
3 |
attribute |
http://www.isotc211.org/2005/gmd |
text/xml |
Content of http://inspire.ec.europa.eu/codelist/VoidReasonValue/Withheld/Withheld.en.iso19135xml |
4 |
attributeType |
http://www.isotc211.org/2005/gmd |
text/xml |
Content of http://inspire.ec.europa.eu/codelist/VoidReasonValue/Withheld/Withheld.en.iso19135xml |
The package Cultural and linguistic adaptability of ISO 19139 is an informative package (i.e. concepts are not implementable as-is) that defines types for representing localized texts (LocalisedChacterString
, PT_FreeText
, PT_Locale
, PT_LocaleContainer
).
The GeoJSON encoding rule regarding to ISO 19103 basic types maps LocalisedChacterString
value to a string and keeps languageCode
as a separate property.
The GeoPackage encoding rule proposes a simplified way to representing localized texts similar but more expressive with the new <<type>>
SimpleLocalisedCharacterString
that has two properties:
- a property that holds the text of type
CharacterString
. - a property named
locale
of type<<codelist>>
Locale
.
The value space of Locale
is defined by PT_Locale
. The recommended syntax is a URI or the string language[_country][.characterEncoding]
similar to POSIX locales.
characterEncoding
should be considered optional and, if used, it should be the same as the encoding of the GeoPackage (UTF-8 or UTF-16).
The new type SimpleLocalisedCharacterString
should replace the use of LocalisedChacterString
and PT_FreeText
.
When the new type replaces the use of PT_FreeText
in an attribute, the upper bound of the attribute must be set to unbounded, so we can capture the intended purpose of the association textGroup
.
Geographical Names are re-used throughout more than 20 INSPIRE themes overall.
For many existing data sets, the <<data type>>
GeographicalName
is over specified, with very little information being unique to each instance.
GeographicalName
is easy to implement in GeoPackage after applying Flattening of nested structures.
However, the one-to-many relationship of GeographicalName
and SpellingOfName
, where is the key text
property, may cause usability problems.
SpellingOfName
has tree properties, the text
which is the way the name is written, the <<voidable>>
script
which is the set of graphic symbols employed in writing the name, and the <<voidable>>
optional transliterationScheme
.
GeoPackage uses Unicode for storing text, so the script
value is implicitly encoded in the text
for most of the cases (see comparison of ISO 15924 script codes and Unicode).
The transliterationScheme
is only relevant in a linguistic context.
The general transformation rule MT005 Simplified Geographical Name proposes a simplified alternative representation for GeographicalNames
when it is used by other INSPIRE themes. This transformation only retains two properties: language
and spelling.text
.
The simplified geographical name adds minimal information with the new <<datatype>>
SimpleGeographicalName
that has two properties:
spelling_text: CharacterString
, which is encoded in Unicode.language: <<voidable>> CharacterString [0..1]
.
The rationale for adopting for the GeoPackage encoding this model transformation is to ease the use of names.
Model transformation rule:
This rule replaces the uses of the class
GeographicalName
in application schemas different from Geographical Names.
- Create the
<<dataType>>
SimpleGeographicalName
in the packageBase Types 2
.- Substitute existing uses of
GeographicalName
in application schemas different from Geographical Names.
If an implementer requires more information, it may:
- document defaults in the dataset metadata if it is homogeneous.
- extend the type
SimpleGeographicalName
with additional columns mapped to existing properties of the originalGeographicalName
type if it is heterogeneous.
Citations are used in many different places throughout the INSPIRE data specifications. ISO 19115 defines CI_Citation, which has a very deep structure. INSPIRE introduced two base types to simplify citations, LegislationCitation and DocumentCitation.
CI_Citation
, LegislationCitation
and DocumentCitation
are easy to implement in GeoPackage after applying Flattening of nested structures and Extract primitive array.
The general transformation rule MT007 Simple Citation proposes a simplified alternative representation for CI_Citation
, LegislationCitation
and DocumentCitation
.
The rationale of MT007 Simple Citation is the overhead generated by deep structure of the citation structures.
The simplified citation is based on a link to an external publication and adds minimal information as the new <<datatype>>
SimpleCitation
with five properties:
name: CharacterString
, the official name of the documenttype: SimpleCitationType
, which is one ofCI_Citation
,LegislationCitation
andDocumentCitation
and indicates what kind of citation is representedlevel: LegislationLevelValue [0..1]
, the level at which the legislative document is adopteddate: Date [0..1]
, date of creation, publication or revision of the documentlink: URL [0..1]
, link to an online version of the document
The rationale for adopting for the GeoPackage encoding this model transformation is the unification of the citation structures.
Model transformation rule:
This rule replaces the classes
CI_Citation
,LegislationCitation
andDocumentCitation
by the classSimpleCitation
.
- Create the
<<dataType>>
SimpleCitation
in the packageBase Types 2
.- Substitute existing uses of
CI_Citation
,LegislationCitation
andDocumentCitation
bySimpleCitation
.
Instance transformation rule:
The
type
value in an instance must be consistent with the replaced original type.
If an implementer requires more citation information, it may:
- extend the type
SimpleCitation
with additional columns mapped to existing properties of the original citation types. - implement
CI_Citation
,LegislationCitation
andDocumentCitation
and create a relation from theSimpleCitation
to the corresponding citation.
If a property has an maximum cardinality of 1
, it is mapped to a column
.
The column name is the same as the property.
The column will allow null
values if and only if if the minimum cardinality of the property is 0
or it the property is <<voidable>>
.
If a property has a cardinality greater than 1
, a suitable mapping needs to be found on a case-by-case basis.
If a property represents values from code lists and enumerations the type of the column is TEXT
,
and the values from the code lists and enumerations are encoded using the literal of the value.
Allowed values are encoded as a GeoPackage data column constraints of type enum
.
Example: The example shows the SQL statements that encodes that an AdministrativeBounday
has a technicalStatus
property with the value notEdgeMatched
that is a value of the TechnicalStatusValue
enumeration registered in the INSPIRE Enumeration Registry.
INSERT INTO gpkg_data_column_constraints(constraint_name, constraint_type, value, description)
VALUES ('technicalstatusvalue',
'enum',
'notEdgeMatched',
'http://inspire.ec.europa.eu/enumeration/TechnicalStatusValue/notEdgeMatched');
INSERT INTO gpkg_data_columns(table_name, column_name, constraint_name)
VALUES ('AdministrativeBoundary',
'technicalStatus',
'technicalstatusvalue');
INSERT INTO AdministrativeBoundary(..., technicalStatus)
VALUES (..., 'notEdgeMatched');
The values of a simple high-cardinality property can be encoded by concatenating the values in a string separated by a delimiter.
Model transformation rule:
This rule transforms properties with maximum occurrence greater than 1 that has a known mapping to any GeoPackage data type except
BLOB
or geometry type.Given a candidate property
x
of typeB
in a classA
, it is updated as follows:
- The type of the property
x
is updated toTEXT
.- The maximum multiplicity is updated to 1.
- Remove of the property
x
, if exists, anyGeoPackage data column constraint
associated.- The name of the property
x
is pluralized (e.g.type
totypes
,activity
toactivities
).
Instance transformation rule:
Values are encoded as a JSON array. The SQLite json1 extension allows applications to manage JSON content stored in a GeoPackage.
The pluralization of the name of the property is a marker to remember the expected content.
Content constraints in extracted primitive arrays may be implemented with SQL triggers in Extended GeoPackages.
N:M
association roles are implemented as GeoPackage Related Tables
.
The mapping table of an N:M
association role is implemented as a table.
The mapping table of 1:N
, N:1
and 1:1
association roles is implemented as a view.
The base_id
of the mapping table is the id
column of the parent member of the association.
The related_id
of the mapping table is a column of type INTEGER
with the name of the child property of the relationship on the child member of the association.
For example:
CREATE VIEW 'AU_AdministrativeBoundary_inspireId' (base_id, related_id) AS
SELECT id, inspireId FROM AU_AdministrativeBoundary;
Creates the mapping table of the 1:1
relationship between AdministrativeBoundary
and Inspire
.
This rule can be used to modify the names of specific model elements in the application schema.
The name of the feature table
is the name of the type (e.g. AdministrativeUnit
).
The name of the attribute table
is the name of the type (e.g. SpellingOfName
).
The name of the constraint enum
is the name of the code list or the enumeration (e.g. AdministrativeHierarchyLevel
).
The name of the relation table
is as follows:
- If the relation is not mandatory for both sides, the name is the name of one of the types of the relation (e.g.
AdministrativeBoundary_upperLevelUnit
) plus an underscore (_
) plus the name of the role on that side of the relation (admUnit
). - If the relation is mandatory only for one side, this side provides the name of the relation (e.g
Condominium_admUnit
). - If the relation is mandatory for both sides, the name is the roles of the relation separated by an underscore (
_
) (e.g.boundary_admUnit
).
These names may be prefixed to avoid name collision when multiple application schemas are stored in the same GeoPackage. The prefix is the namespace prefix to be used as short form of the XML namespace of the application schema in uppercase, with hyphens (-
) replaced by underscores (_
) plus an underscore added at the end (e.g. TN_RO_
is the prefix for contents from the application schema Transport Network Roads
because tn-ro
is its namespace prefix).
Other entities encoded in the GeoPackage that have an implementation in a well known XML namespace may use a similar strategy (e.g. GMD_
for contents that have an implementation in Geographic MetaData (gmd
)) .
This section describes additional rules when a specific dataset is encoded into the model converted to GeoPackage.
The character encoding of all TEXT
encoding in GeoPackage shall be UTF-8 or UTF-16.
SQLite stores all text as Unicode characters encoded using either UTF-8 or UTF-16 encoding. The SQLite API contains functions which allow passing and retrieving text using either UTF-8 or UTF-16 encoding. These functions can be freely mixed and proper conversions are performed transparently when necessary.
A GeoPackage shall include a gpkg_spatial_ref_sys
table that contains CRS definitions to relate data in user tables to location on the earth.
The CRS definitions conforms to:
- OpenGIS® 01-009 Implementation Specification: Coordinate Transformation Services Revision 1.0 (a.k.a WKT1) (column
definition
specified in 1.1.2. Spatial Reference Systems) - OGC® 12-063r5 Geographic information - Well-known text representation of coordinate reference systems (a.k.a WKT2) (column
definition_12_063
specified in F.10. WKT for Coordinate Reference Systems)
The gpkg_spatial_ref_sys
table contain at a minimum the WGS-84 as defined by EPSG:4326 and two additional reference systems for undefined Cartesian and geographic coordinate reference systems.
The INSPIRE Data Specification on Coordinate Reference Systems – Technical Guidelines D2.8.1 proposes the use of coordinate reference system parameters and identifiers shall be managed in one or several common registers for coordinate reference systems. Only identifiers contained in a common register shall be used for referring to the coordinate reference systems. Additionally, lists in Table 1 a list of coordinate reference systems registered in the EPGS Geodetic Parameter Registry that shall be used for referring to the coordinate reference systems used in a dataset.
The following table explains how to encode default CRS from the Table 1 in the gpkg_spatial_ref_sys
table.
Table: Mapping to gpkg_spatial_ref_sys
.
Column Name | Column Type | Column Description | Value from Table 1 |
---|---|---|---|
srs_name |
TEXT |
Human readable name of this SRS | Text from Short name |
srs_id |
INTEGER |
Unique identifier for each SRS within a GeoPackage | Any value (e.g. EPSG numeric ID) |
organization |
TEXT |
Case-insensitive name of the defining organization | "EPSG" |
organization_coordsys_id |
INTEGER |
Numeric ID of the SRS assigned by the organization | EPSG numeric ID |
definition |
TEXT |
WKT representation | WKT1 representation of the SRS |
description |
TEXT |
Human readable description of this SRS | Text from Coordinate reference system |
definition_12_062 |
TEXT |
WKT representation | WKT2 representation of the SRS |
WKT1 and WKT2 representations can be obtained with the use of tools such as gdalsrsinfo
.
The gpkg_spatial_ref_sys
table of an INSPIRE GeoPackage may contain all the default CRS from the Table 1.
Void reason value information may be encoded as metadata and related to specific features in a GeoPackage.
The first component of GeoPackage metadata is the gpkg_metadata
table that MAY contain metadata in MIME encodings structured in accordance with any authoritative metadata specification. The GeoPackage interpretation of what constitutes "metadata" is a broad one. We can use the second component of GeoPackage metadata, the gpkg_metadata_reference
table, to link metadata in the gpkg_metadata
table to columns and row/columns in table.
implementers must provide authoritative descriptions of the reasons of null
values in when a correct value may exists as follows:
Table: Records in gpkg_metadata_reference giving a reason and scope.
Reason | Scope | reference_scope |
table_name |
column_name |
row_id |
md_file_id |
---|---|---|---|---|---|---|
Unknown |
attribute |
row/col |
table name | column name | row id | 1 |
Unpopulated |
attributeType |
column |
table name | column name | NULL |
2 |
Unpopulated |
attributeType |
table |
related table name | NULL |
NULL |
2 |
Withheld |
attribute |
row/col |
table name | column name | row id | 3 |
Withheld |
attributeType |
column |
table name | column name | NULL |
4 |
Example:.
Unpopulated
is encoded in gpkg_metadata
in the row with id
2
.
The dataset A
does not maintain information about the lifecycle of the feature type AdministrativeBoundary
and hence the values of endLifespanVersion
columns are null
values. As endLifespanVersion
is <<voidable>>
and optional, we can use gpkg_metadata_reference
to state without ambiguity that the dataset A
does not contain information about the `endLifespanVersion.
INSERT INTO gpkg_metadata_reference VALUES (
'column',
'AdministrativeBoundary',
'endLifespanVersion',
NULL,
'2019-11-17T14:49:32.932Z',
2,
null
)
The dataset B
maintains information about the lifecycle of the feature type AdministrativeBoundary
and hence the values of endLifespanVersion
columns are normally not null
. As endLifespanVersion
is <<voidable>>
and optional, we can use gpkg_metadata_reference
to state without ambiguity when a null
value means that we don't know the end data of the administrative boundary with id
15
.
INSERT INTO gpkg_metadata_reference VALUES (
'row/col',
'AdministrativeBoundary',
'endLifespanVersion',
15,
'2019-11-17T14:49:32.932Z',
1,
null
)
Note: The content of this section is unstable.
This section describes additional rules that allows to transform a specific dataset encoded with the rules of the previous section into a flattened version.
Feature tables and views with 0 rows must be dropped. The dependent attribute tables and views must be also dropped.
This implies also to remove rows that reference them in gpkg_contents
, gpkg_geometry_columns
, gpkg_data_columns
, gpkg_metadata_reference
, gpkgext_relations
and gpkg_extensions
.
Table relations tagged in gpkg_metadata_reference
as unpopulated
at table
level must be dropped.
This implies also to remove rows that reference them in gpkg_contents
, gpkg_data_columns
, gpkg_metadata_reference
, gpkgext_relations
and gpkg_extensions
.
Columns tagged in gpkg_metadata_reference
as unpopulated
at column
level must be dropped.
This implies also to remove rows that reference them in gpkg_data_columns
and gpkg_metadata_reference
.
This rule is performed iteratively until no more candidates can be transformed.
The steps are:
- Candidates are table relations that in the dataset behaves effectively as a 0:1 or 1:1 relationship instead of 1:N or N:M where one side was a
<<dataType>>
in the conceptual model and it is a simple table (henceforth, source table). - Source tables are flattened following the Flattening types rule as if the maximum cardinality of the relation is
1
. The description of the columns affected ingpkg_data_columns
is updated with to reflect the newtable_name
and newcolumn_name
. Content is copied from the columns of source table to the columns target table using the mapping information of the table relation. - Candidates are now table relations that in the dataset behaves effectively as a 0:1 or 1:1 relationship instead of 1:N or N:M where one side was a
<<dataType>>
in the conceptual model and it is now a simple table after a flattening. - If there are candidates, go to the step
2
. - Otherwise, there are no more tables that can be flattened. Next the candidate relations and source tables used during the iterations must be dropped if they only participated in the flattened relations.
This implies also to remove rows that reference them in
gpkg_contents
,gpkg_data_columns
,gpkgext_relations
andgpkg_extensions
.