Geospatial data validation in Statistics Portugal

Jan 20, 2022


Principle 1, quality control, GML INSPIRE, crosscheck


Validating geospatial data before use is an important part of the quality assurance process. This use case briefly introduces some of the mechanisms Statistics Portugal uses to validate geospatial data.

As part of managing its Spatial Data Infrastructure, Statistics Portugal runs a set of data quality control routines during the editing process. These routines are mostly internal GIS-based processes that identify topological and attribute errors in the following spatial datasets:

  1. the Enumeration Areas (areas) – blocks with a three-level structure – sections, subsections and localities – integrated with the official administrative boundaries;
  2. the Road Segments Network (lines) – street line coverage at national and local level, edited with geometric and alphanumeric data from the municipalities and used for the delineation of the Geographic Information Referencing Base;
  3. the Buildings Geographic Database (points), Geographic Information Referencing Base.
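As an illustration of what topological and attribute checks on area features can look like, here is a minimal sketch in plain Python. It is not Statistics Portugal's actual routine: the feature layout (a dict with a `code` and a `ring` of coordinates), the function names and the specific rules are assumptions chosen for the example.

```python
from collections import Counter

def ring_area(ring):
    """Signed area of a closed ring of (x, y) tuples (shoelace formula)."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))

def check_areas(features):
    """Return (code, problem) pairs for hypothetical area features.

    Attribute checks: the code must be present and unique.
    Topological checks: the ring must be closed and enclose a non-zero area.
    """
    problems = []
    code_counts = Counter(f["code"] for f in features)
    for f in features:
        code, ring = f["code"], f["ring"]
        if not code:
            problems.append((code, "missing code"))
        elif code_counts[code] > 1:
            problems.append((code, "duplicate code"))
        if len(ring) < 4 or ring[0] != ring[-1]:
            problems.append((code, "ring not closed"))
        elif ring_area(ring) == 0:
            problems.append((code, "zero area"))
    return problems

# Made-up sample data, for illustration only.
features = [
    {"code": "A1", "ring": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]},
    {"code": "A1", "ring": [(1, 0), (2, 0), (2, 1), (1, 1)]},   # duplicate code, open ring
    {"code": "",   "ring": [(0, 0), (1, 1), (2, 2), (0, 0)]},   # missing code, collinear ring
]
for code, problem in check_areas(features):
    print(repr(code), problem)
```

A production GIS workflow would typically also test for overlaps and gaps between neighbouring areas, which this sketch leaves out.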

Additionally, the data provided by the municipalities, namely the X and Y geographic coordinates from the Indicators System of Urban Operations (SIOU) for building and dwelling permits and completed construction work, which the municipalities maintain dynamically, go through a spatial cross-check routine. This routine evaluates the quality of the location data and its spatial and attribute (code) accuracy before a building point is inserted into the Buildings Geographic Database. The addresses of these permits are also harmonized, cross-checked against existing addresses in the National Dwelling Register, and validated before they are loaded into it.
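A spatial cross-check of this kind can be sketched as a point-in-polygon test: the area code declared on a permit is compared with the code of the area that actually contains its X and Y coordinates. The example below is a simplified illustration in plain Python, not the routine used by Statistics Portugal; the record fields (`x`, `y`, `area_code`), the function names and the sample data are all hypothetical.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the closed ring.

    Points exactly on the boundary are not handled specially.
    """
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def cross_check(permit, areas):
    """Compare the declared area code with the area containing the point."""
    for code, ring in areas.items():
        if point_in_ring(permit["x"], permit["y"], ring):
            return "ok" if code == permit["area_code"] else "code mismatch"
    return "outside all areas"

# Made-up areas and permit records, for illustration only.
areas = {
    "A1": [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    "A2": [(10, 0), (20, 0), (20, 10), (10, 10), (10, 0)],
}
permits = [
    {"id": 1, "x": 5, "y": 5, "area_code": "A1"},
    {"id": 2, "x": 15, "y": 5, "area_code": "A1"},
    {"id": 3, "x": 50, "y": 5, "area_code": "A2"},
]
for permit in permits:
    print(permit["id"], cross_check(permit, areas))
```

Records flagged as a mismatch or as falling outside every area would be held back for review rather than inserted as building points.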

At a technical level, Statistics Portugal is strongly engaged in INSPIRE and has developed important skills in validating INSPIRE-harmonized GML data against the directive's data specifications.
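Full INSPIRE validation means checking GML files against the relevant application schemas with a schema-aware validator, which is beyond a short example. As a much lighter illustration, the sketch below uses only the Python standard library to check that a GML document is well-formed XML and that every `gml:posList` holds complete, numeric coordinate tuples. The function name, the two-dimensional assumption and the sample geometry are hypothetical.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"  # GML 3.2 namespace

def check_pos_lists(gml_text, dimension=2):
    """Return a list of problems found in the gml:posList elements."""
    try:
        root = ET.fromstring(gml_text.strip())
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    for i, pos_list in enumerate(root.iter(f"{{{GML_NS}}}posList")):
        try:
            values = [float(v) for v in (pos_list.text or "").split()]
        except ValueError:
            problems.append(f"posList {i}: non-numeric coordinate")
            continue
        # Every coordinate tuple must be complete (x and y when dimension=2).
        if not values or len(values) % dimension != 0:
            problems.append(f"posList {i}: incomplete coordinate tuples")
    return problems

# Made-up polygon whose interior ring is missing one coordinate value.
sample = f"""
<gml:Polygon xmlns:gml="{GML_NS}" gml:id="p1">
  <gml:exterior><gml:LinearRing>
    <gml:posList>0 0 1 0 1 1 0 0</gml:posList>
  </gml:LinearRing></gml:exterior>
  <gml:interior><gml:LinearRing>
    <gml:posList>0 0 1 0 1</gml:posList>
  </gml:LinearRing></gml:interior>
</gml:Polygon>
"""
print(check_pos_lists(sample))
```

A check like this only catches structural slips early; conformance with the INSPIRE data specifications still requires schema validation.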