Introduction
This report describes the work executed by Statistics Portugal for the calculation of SDG indicator 11.3.1, within the framework of the GEOSTAT 3 project work package 2.
Statistics Portugal has been involved with this indicator in the context of Working Group B (Data Integration) of UN-GGIM Europe. Earlier this year a report was published describing the current practices and available data in Portugal[1]. Parts of that report were used in preparing this GEOSTAT report.
This report describes the implementation of indicator 11.3.1 in the following ways, for the municipalities (LAU1) of mainland Portugal and at national level:
- Using official land cover cartography of the Directorate-General for Territory (NMCA Portugal) and official population estimates for the municipalities from Statistics Portugal
- Using the built-up surface 38 X 38 m resolution from GHS-BU and population data from Statistics Portugal and GHSL
- Using the built-up surface 250 X 250 m resolution from GHS-BU and population data from Statistics Portugal and GHSL
Data status
The following data sources have been used:
- Land Use and Land Cover Map (COS) from the Directorate-General for Territory (NMCA Portugal). The data series is available for the following reference years: COS 1995, COS 2007, COS 2010 and COS 2015. This national dataset consists of polygon maps that represent homogeneous land use/cover units. COS is based on a vector data model with a minimum mapping unit of 1 hectare, a minimum distance between lines of 20 metres, and each unit having at least 75% of its area in a given land use/cover thematic class. The COS thematic classification is a hierarchical system with 5 levels of classes. The built-up area concept corresponds to megaclass 1 of the COS nomenclature, “artificial land”.
- Built-up land according to the Global Human Settlement Layer (GHSL)[2], at resolutions of 38 m and 250 m. The data contain a multi-temporal information layer on built-up presence, derived from Landsat image collections for four epochs (GLS1975, GLS1990, GLS2000, and an ad-hoc Landsat 8 collection 2013/2014). In the 38 m data each pixel is classified according to a binary scheme as non-built-up (value 1) or built-up (value 101). JRC also provides a complementary layer describing the confidence of the classification at pixel level. Data for the following years are used in this test:
- 1975
- 1990
- 2000
- 2014
- Official population estimates (resident population) by municipality (LAU1) for the years 2010 and 2015. These population estimates are disseminated within six months (mid-June) of the end of the reference year.
- Official population data from the 1991 and 2001 Census.
- Population data according to GHSL. This spatial raster dataset depicts the distribution and density of population, expressed as the number of people per 250 m cell. Each cell holds a residential population estimate for the target years 1975, 1990, 2000 and 2015. Only the data for 1975 were used.
Currently there are two territorial classifications for urban delimitation:
- Classification of urban areas (TIPAU 2014), which classifies each parish (LAU 2) in one of three categories: Predominantly urban areas, Medium urban areas and Predominantly rural areas. This classification isn’t adequate as input data for the creation of this indicator.
- Classification of Statistical Cities, which is based on the 2011 Census tract geography.
Processes
The steps to calculate the indicator differ between the official land cover cartography and the GHSL data. The following figure gives an overview of how the indicators are calculated with the different types of data.
The indicator was produced for mainland Portugal and for its municipalities, represented by the official administrative boundaries of 2015, which are applied to all years of the indicator. The islands of the Azores and Madeira were excluded because COS 2010 and COS 2015 have no data for them, whereas the GHSL data do cover the islands.
The following sections describe each phase in detail.
Delimitation of built-up area
The percentage of current total urban land that was newly developed (consumed) is used as a measure of the land consumption rate, when calculating the indicator. This fully developed area is also sometimes referred to as built up area.
The delimitation of built-up area wasn’t an issue when using the GHSL data, because this data is produced fully automatically from satellite imagery.
However, the definition of which areas represent built-up area was important when using the official land cover cartography (COS). As mentioned above, COS has 5 primary classes of land cover, and the built-up area corresponds to megaclass 1 of the COS nomenclature, “artificial land”. One type of land cover within this class wasn’t included: the areas under construction[3] (class 1.3.3).
Calculation size of land area
The COS data are vector data and the GHSL data are raster data, so the size of the built-up area for each municipality and for mainland Portugal was calculated differently for each source.
Once the areas representing built-up land had been identified in the COS data (2010 and 2015), the following steps were carried out in ArcGIS:
- Selection of the built-up areas into a separate dataset
- Intersection of this dataset with the polygons of the municipalities
- Summary of the area values by municipality
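The selection and summary steps can be sketched in Python as follows. This is only an illustration, not the ArcGIS workflow that was actually used. It assumes that the intersection of the COS polygons with the municipality boundaries has already been done and that each resulting piece is available as a record with a municipality name, a dotted COS class code and an area in hectares. The record layout, the names and all sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical records after intersecting COS polygons with municipalities:
# (municipality, COS class code, area in hectares). All values are invented.
pieces = [
    ("Municipality A", "1.1.1", 120.0),
    ("Municipality A", "1.3.3", 15.0),   # area under construction, excluded
    ("Municipality A", "2.1.1", 300.0),  # outside megaclass 1
    ("Municipality B", "1.2.1", 40.5),
    ("Municipality B", "1.1.2", 9.5),
]

EXCLUDED_CLASS = "1.3.3"  # areas under construction


def is_built_up(cos_code: str) -> bool:
    """True for megaclass 1 (artificial land), except class 1.3.3 and its subclasses."""
    in_megaclass_1 = cos_code == "1" or cos_code.startswith("1.")
    is_excluded = (
        cos_code == EXCLUDED_CLASS or cos_code.startswith(EXCLUDED_CLASS + ".")
    )
    return in_megaclass_1 and not is_excluded


def built_up_area_by_municipality(records):
    """Sum the area of the built-up pieces for each municipality."""
    totals = defaultdict(float)
    for municipality, cos_code, area_ha in records:
        if is_built_up(cos_code):
            totals[municipality] += area_ha
    return dict(totals)


built_up = built_up_area_by_municipality(pieces)
print(built_up)
```

Running the same function on the 2010 and 2015 pieces would give the two built-up area totals per municipality that the indicator needs.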
Processing the GHS-BU datasets involves some more work. Before describing the process to obtain the areas for each municipality, some general remarks have to be made about the differences between the datasets.
The GHS-BU 250 m data store a percentage of built-up area for each cell. The GHS-BU 38 m data store the value 1 or 101 for each cell, where 101 indicates that the full pixel represents built-up area. The projections of the two datasets also differ: Mollweide for the 250 m data and WGS84 for the 38 m data. The GHS-BU 250 m data is a single tiff file, whereas the GHS-BU 38 m dataset consists of several tiff files and an associated pyramids file. Another issue with both datasets, especially the 38 m data, is that they are quite big (several GB), which complicates the data processing.
To obtain areas for each municipality the following has been done:
- Download of the datasets for all the years from the GHSL website (38 m and 250 m)
- Clipping of the data with the “CLIP RASTER” function in ArcGIS, using a national mask with the same projection as the GHSL input data. QGIS doesn’t offer a really good alternative because of the size of the GHSL datasets.
- Reprojection of the data in ArcGIS with the “Project Raster” function to a common projection, the ETRS-LAEA projection.
- Reclassification of the GHS-BU 38 m data. Since the cell values of this dataset don’t represent the actual built-up area of each cell, the data have to be reclassified before they can be summarized for each municipality. We used the “raster calculator” in QGIS to replace the value 101 with the area of the cell and the value 1 with 0.
- Calculation of the areas for each municipality with a zonal statistics function. Like the raster calculator, the zonal statistics function in ArcGIS is only available with the Spatial Analyst extension. The same functionality exists as a QGIS plugin, which has been used for this purpose[4]. The plugin produces, for each municipality, a total of the built-up area (GHS-BU 38 m) or an average of the percentage of built-up area (GHS-BU 250 m).
- Import of the results into an ArcGIS Personal Geodatabase, where several queries were used to calculate the total built-up area for each year. This also makes it easy to create a table for mapping the results.
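The reclassification and zonal statistics steps above can be illustrated with a small Python sketch on tiny in-memory grids. It is not the QGIS or ArcGIS processing that was used. It assumes that the rasters have already been clipped, reprojected to an equal-area projection and aligned with a grid of municipality identifiers, that the cells keep their nominal size of 38 m and 250 m, and that the 250 m values are percentages from 0 to 100. The conversion of the average percentage into an area (average share times number of cells times cell area) is our own assumption about how the totals could be derived. All function names and sample values are hypothetical.

```python
from collections import defaultdict

CELL_38 = 38.0 * 38.0      # nominal cell area in square metres (assumption)
CELL_250 = 250.0 * 250.0   # nominal cell area in square metres (assumption)


def reclassify_38m(grid):
    """Replace value 101 (built-up) with the cell area and any other value with 0."""
    return [[CELL_38 if value == 101 else 0.0 for value in row] for row in grid]


def zonal_sum(values, zones):
    """Sum the cell values per zone id; zones is a grid aligned with values."""
    totals = defaultdict(float)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            totals[zone] += value
    return dict(totals)


def zonal_count(zones):
    """Count the cells in each zone."""
    counts = defaultdict(int)
    for zone_row in zones:
        for zone in zone_row:
            counts[zone] += 1
    return dict(counts)


# Invented example: two municipalities (ids 1 and 2) on a 2 x 3 grid.
zones = [[1, 1, 2],
         [1, 2, 2]]

# GHS-BU 38 m style values: 101 = built-up, 1 = not built-up.
bu_38 = [[101, 1, 101],
         [1, 101, 101]]
area_38 = zonal_sum(reclassify_38m(bu_38), zones)

# GHS-BU 250 m style values: percentage of built-up area per cell.
bu_250_pct = [[50.0, 0.0, 100.0],
              [25.0, 20.0, 30.0]]
counts = zonal_count(zones)
pct_sums = zonal_sum(bu_250_pct, zones)
mean_pct_250 = {zone: pct_sums[zone] / counts[zone] for zone in counts}
area_250 = {
    zone: mean_pct_250[zone] / 100.0 * counts[zone] * CELL_250 for zone in counts
}

print(area_38, mean_pct_250, area_250)
```

In a real workflow the zones would come from the municipality polygons and the grids from the GHSL tiff files, so the two example grids here would not share one zone layout.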
The following table summarizes the reference years of the built-up area datasets used to calculate this indicator:
COS | GHS-BU 38 m | GHS-BU 250 m |
2015 | 2014 | 2014 |
2010 | 2000 | 2000 |
- | 1990 | 1990 |
- | 1975 | 1975 |
Calculation of population
To calculate the indicators, population data are necessary at municipal and national level for the years 2015, 2014, 2010, 2000, 1990 and 1975.
Different kinds of sources for the population of each municipality were not tested. The best source available was always used. The following table shows how the population at municipal level was obtained for each year:
Year | Method |
2015 | Estimated population by municipality from Statistics Portugal |
2014 | Estimated population by municipality from Statistics Portugal |
2010 | Estimated population by municipality from Statistics Portugal |
2000 | Population from 2001 Census |
1990 | Population from 1991 Census |
1975 | Population from GHS-POP 250 m, obtained for each municipality with a treatment similar to the one used for the built-up area data |
The population from the 1991 and 2001 censuses isn’t available for the 2015 geography of the municipalities. Over time new municipalities have been created and administrative boundaries have changed. Therefore the census geography (subsections) was intersected with the new boundaries and totals were calculated for each municipality. When a subsection falls in more than one municipality, its population was allocated to each municipality in proportion to the share of its area.
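A minimal Python sketch of this area-weighted allocation is given below. It assumes the intersection has already produced, for every census subsection, the area that falls in each municipality. The data structure, the function name and the numbers are hypothetical and serve only to show the arithmetic.

```python
from collections import defaultdict


def allocate_population(subsections):
    """Allocate subsection population to municipalities by share of area.

    Each subsection is a tuple (population, {municipality: intersected area}).
    """
    totals = defaultdict(float)
    for population, parts in subsections:
        total_area = sum(parts.values())
        if total_area == 0:
            continue  # nothing to allocate for a subsection without area
        for municipality, area in parts.items():
            totals[municipality] += population * area / total_area
    return dict(totals)


# Invented example: the second subsection straddles two municipalities.
subsections = [
    (200, {"Municipality A": 5000.0}),
    (100, {"Municipality A": 3000.0, "Municipality B": 1000.0}),
    (50, {"Municipality B": 2000.0}),
]

population_2015_geography = allocate_population(subsections)
print(population_2015_geography)
```

In the example, three quarters of the straddling subsection lies in Municipality A, so 75 of its 100 inhabitants are assigned there and 25 to Municipality B.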
It should be noted that the NMCA ran some simulations to investigate the impact of considering the population and its growth within the actual urban areas only, instead of the whole territory. These simulations showed no real differences in the indicator values. The results of that analysis are not presented in this report.
Calculation of the indicator values
The indicators were calculated in an Excel spreadsheet from the input data on built-up area and population.
We used the formula for Land Use Efficiency (LUE) recommended by JRC [5]. To ensure that results for periods of different length are comparable, the values are normalized to an average 10-year change. The formula is:
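As we understand the JRC user guide [5], the normalized indicator can be written as LUE = ((Y_t - Y_t+n) / Y_t) x (10 / n), where Y = BU / POP is the built-up area per inhabitant, t and t+n are the start and end years, and n is the number of years between them. This expression is our reading of the guide and should be checked against it. The Python sketch below implements that reading. The function name, parameter names and input values are hypothetical.

```python
def land_use_efficiency(bu_start, pop_start, bu_end, pop_end, years):
    """10-year normalized LUE, as we read the JRC user guide [5] (assumption).

    Y = built-up area per inhabitant
    LUE = (Y_start - Y_end) / Y_start * 10 / years
    """
    if bu_start <= 0 or pop_start <= 0 or pop_end <= 0 or years <= 0:
        raise ValueError("areas, populations and period length must be positive")
    y_start = bu_start / pop_start
    y_end = bu_end / pop_end
    return (y_start - y_end) / y_start * 10.0 / years


# Invented example: built-up area grows from 1000 ha to 1200 ha in 5 years
# while the population stays at 50 000 inhabitants.
example_lue = land_use_efficiency(1000.0, 50000.0, 1200.0, 50000.0, 5)
print(round(example_lue, 3))
```

With this sign convention a negative value means that the built-up area per inhabitant increased over the period, and a value of 0 means that it stayed the same.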
Results
The available data allowed calculating indices for the following periods:
- COS: 2010 to 2015
- GHS-BU 38 m: 1975-1990; 1990-2000; 2000-2014
- GHS-BU 250 m: 1975-1990; 1990-2000; 2000-2014
It would also have been possible to combine COS and GHSL data to create values for other periods, for example 2000-2010, but we decided that this wouldn’t add any value to this study.
The national values for these indices are:
COS | LUE |
2010-2015 | -0.100 |
GHS-BU 38 m | LUE |
1975-1990 | -0.565 |
1990-2000 | -0.219 |
2000-2014 | -0.173 |
GHS-BU 250 m | LUE |
1975-1990 | -0.566 |
1990-2000 | -0.220 |
2000-2014 | -0.172 |
At national level the values tend towards 0 over time. The values obtained with the 38 m and the 250 m cells hardly differ. This isn’t surprising, since both are derived from the same Landsat source data, although some effect of the bigger cell size would be expected. At national level it doesn’t seem to make much difference. The differences at municipal level are more interesting.
These differences can be seen on the maps in figure 1, which show the values for the 2 datasets, GHS-BU 38 m and GHS-BU 250 m, and the 3 periods 1975-1990, 1990-2000 and 2000-2014.
Some conclusions can be drawn from these maps:
- The indicator values differ between the 2 GHSL datasets.
- The values change between the periods. The LUE values are more extreme for the period 1975 to 1990, when some municipalities have negative values smaller than -2.
- The values differ across the territory.
The 3 maps in figure 2 give a better view of the differences between the 2 GHSL datasets. The differences become smaller over time. Most likely this is because the amount of built-up land within each municipality increases, so the effect of the bigger cell size diminishes.
The map in figure 3 shows the LUE values of the COS data for 2010 to 2015. In this map you can observe the difference between coastal Portugal and the interior.
Figure 1: LUE values GHS-BU 38 m and GHS-BU 250 m for the periods 1975-1990, 1990-2000 and 2000-2014
Figure 2: Differences between LUE values GHS-BU 38 m and GHS-BU 250 m
Figure 3: LUE 2010-2015 for COS dataset
Evaluation
This report shows the result of calculating indicator 11.3.1, namely the Land Use Efficiency (LUE) version published by JRC, using different datasets.
Statistics Portugal intends to publish values for this indicator in the future using the COS dataset from the NMCA. Regular land cover data are essential: the high precision and the regular update interval of the COS dataset allow Statistics Portugal to produce indicator values at municipal level.
There are some conceptual issues with this indicator. For this study we used the LUE from JRC, but alternatively one can use the formula recommended by UN-Habitat, the ratio of land consumption rate to population growth rate (LCRPGR). The definition of which territory represents built-up land is also important.
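For comparison, the UN-Habitat ratio can be sketched in Python. The sketch assumes the logarithmic formulation from the SDG indicator metadata as we recall it: the land consumption rate is ln(Urb_t+n / Urb_t) / y, the population growth rate is ln(Pop_t+n / Pop_t) / y, and LCRPGR is the first divided by the second. This formulation was not used in this study, and the function name and input values are hypothetical.

```python
import math


def lcrpgr(urb_start, urb_end, pop_start, pop_end, years):
    """Ratio of land consumption rate to population growth rate (assumed form).

    LCR = ln(urb_end / urb_start) / years
    PGR = ln(pop_end / pop_start) / years
    """
    if min(urb_start, urb_end, pop_start, pop_end) <= 0 or years <= 0:
        raise ValueError("areas, populations and period length must be positive")
    land_consumption_rate = math.log(urb_end / urb_start) / years
    population_growth_rate = math.log(pop_end / pop_start) / years
    if population_growth_rate == 0:
        raise ZeroDivisionError("the ratio is undefined when the population is constant")
    return land_consumption_rate / population_growth_rate


# Invented example: built-up land grows by 21% and population by 10% in 10 years.
example_ratio = lcrpgr(100.0, 121.0, 1000.0, 1100.0, 10)
print(round(example_ratio, 3))
```

Unlike the LUE, this ratio is undefined when the population doesn’t change, which is one of the practical differences between the two formulations.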
This study shows that the GHSL data is an alternative dataset for the dynamics of the built-up area. At national level there aren’t many differences between using the 38 m or the 250 m cell size. The use of other GHSL datasets, like the confidence layer, might be interesting to study.
The indicator can be calculated without scripting. Some data processing steps are needed, but none of them is really difficult. It is also possible to use open source GIS software such as QGIS. We would also like to mention the GRASS GIS software, which is especially suited to image analysis. The learning curve for this software isn’t steep, but some training is needed.
It would be interesting to investigate other sources for built-up area, for example CORINE Land Cover, which has data for 1990 and 2000 and would thus allow the same index to be calculated with both the CORINE and the GHSL datasets.
More information
Contact information
Bart-Jan Schoenmakers, Statistics Portugal, bart.schoen@ine.pt
[1] Report available on Basecamp from EuroGeographics: https://eurogeographics.basecamphq.com/
[2] https://ghsl.jrc.ec.europa.eu/datasets.php
[3] COS class 1.3.3 “áreas em construção”
[4] Zonal Statistics Plugin: https://docs.qgis.org/2.18/en/docs/user_manual/plugins/plugins_zonal_statistics.html
[5] https://ec.europa.eu/jrc/en/publication/eur-scientific-and-technical-research-reports/lue-user-guide-tool-calculate-land-use-efficiency-and-sdg-113-indicator-global-human