Introduction
The following report describes the workflow used by Statistics Norway to calculate indicator 11.3.1 within the framework of the GEOSTAT 3 project, work package 2.
We have calculated indicators based on different data and concepts:
- Urban settlement delimitation by national sources and methodology
- GEOSTAT Urban clusters
- GHSL built-up for the whole country
We have utilised the formula as defined by JRC and the original formula in the UN metadata on the indicator.
Data status
- Population data from the National Registry[1] geocoded to address point location: The National Registry, under the responsibility of the Norwegian Tax Administration, can be geo-enabled by use of geocoded authoritative address and/or building data from the NMCA (Kartverket). Statistics Norway has a “statistical copy” of the National Registry, updated daily, so geocoded population can be obtained for any point in time.
- Geographic delimitation of urban areas (localities – urban settlements) following national methodology, produced by Statistics Norway. Data on localities is national authoritative data available under open data licenses. (2000-2017)
- Urban cluster grid from Eurostat based on the GEOSTAT population grids. Data for the years 2006 and 2011 have been used.
- GHSL – Global human settlement layer. Data on built-up and population based on satellite imagery and other sources. (1975, 1990, 2000, 2014)
Processes
We have tried out different methodologies in addition to the national proxy indicator. In part this is to highlight differences and similarities between data sources.
In the testing for Norway of indicator 11.2.1 we saw that population figures for urban clusters were very similar to those for the nationally delimited urban settlements (>= 5 000 residents), but at a more detailed level the differences are bigger.
a) Geocoding population data
The “statistical copy” of the national population register at Statistics Norway includes references to address, dwelling ID and real property ID at unit record level (i.e. at the level of each individual unit). Data is collected by the Tax Administration and transferred daily to Statistics Norway. Hence, geo-enabling population to address location is quite straightforward and can be done for any point in time using the authoritative address register from the NMCA. A copy of the address register is kept at Statistics Norway, and the location of each address is stored as attribute information in the statistical databases (Oracle technology). Extracts are made regularly for unit record data with building location and address location. Address location with aggregated population by age and sex is served as a geometry table and feature class for use in desktop GIS software.
Some 99.8 percent of the population can be directly geocoded to the level of address location. For various reasons the remaining 0.2 percent cannot. The calculations consider only the population assigned to an address location.
For total population, figures are in the StatBank: https://www.ssb.no/en/statbank/table/05803/.
b) Delimitation of urban agglomerations:
In principle, this step has already been completed prior to the indicator analysis. Two different concepts/data sources have been tested:
- Classification of urban areas based on national data (Norwegian “urban settlements”).
- Classification of urban areas based on European data on urban clusters (using data from Eurostat based on the grid cluster method).
Statistics Norway has recurrently delineated the geographical extent of urban areas (“urban settlements” or “localities”) as part of the production of urban official statistics every ten years since the 1960 census. Digital boundaries have been delimitated almost every year since 2000. A locality consists of a group of buildings normally not more than 50 metres apart, and must fulfil a minimum criterion of having at least 200 inhabitants. Thus, localities include the largest cities as well as small settlements with 200 inhabitants as the lower threshold. The delimitation is conducted as an automated workflow involving high quality authoritative geospatial data from the NSDI in combination with point-based population data geocoded to the level of address location. The delimitation follows closely the land use as identified in the national land use/ land cover map; https://www.ssb.no/en/natur-og-miljo/statistikker/arealstat/aar. See also https://kartkatalog.geonorge.no/metadata/statistisk-sentralbyra/arealbruk/a965a979-c12a-4b26-90a0-f09de47dbecd.
The result of the delimitation of urban settlements is a national polygon dataset representing the urban extent of each locality (some 1 000 in Norway). Updated boundaries are now made annually. Data is available under open data license agreements (https://kartkatalog.geonorge.no/metadata/statistisk-sentralbyra/tettsteder-2017/9b4fdbcb-d682-4cea-8e10-e152bbeb481e). StatBank: https://www.ssb.no/en/statbank/list/arealbruk/
To enable comparison between national and European data, a cut-off has been applied to the national data, taking into account only those urban settlements in the national data that have 5 000 inhabitants or more.
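The core grouping rule for localities (buildings normally not more than 50 metres apart, at least 200 inhabitants in total) can be illustrated with a minimal Python sketch. This is not the production workflow at Statistics Norway, which works on building outlines, land use data and further rules. Here buildings are reduced to hypothetical points with resident counts in a metric coordinate system, and the function name, parameter names and sample data are assumptions made for illustration only.

```python
from math import hypot


def delimit_localities(buildings, max_gap=50.0, min_pop=200):
    """Group building points into localities (simplified illustration).

    buildings: list of (x, y, residents) tuples, coordinates in metres.
    Returns a list of localities, each a list of building indices.
    """
    n = len(buildings)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of buildings that lie within max_gap of each other.
    for i in range(n):
        xi, yi, _ = buildings[i]
        for j in range(i + 1, n):
            xj, yj, _ = buildings[j]
            if hypot(xi - xj, yi - yj) <= max_gap:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Collect the members of each group.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Keep only groups that reach the population threshold.
    return [
        members
        for members in groups.values()
        if sum(buildings[i][2] for i in members) >= min_pop
    ]


# Hypothetical sample: four buildings 40 m apart, plus one isolated building.
sample_buildings = [
    (0.0, 0.0, 60),
    (40.0, 0.0, 60),
    (80.0, 0.0, 60),
    (120.0, 0.0, 60),
    (1000.0, 0.0, 50),
]
sample_localities = delimit_localities(sample_buildings)
```

The pairwise distance check is quadratic in the number of buildings, so a real dataset would need a spatial index; the sketch only shows how the 50 metre and 200 inhabitant criteria interact.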
Converting Urban clusters
The clusters consist of contiguous 1 km population grid cells (2011), where contiguity includes diagonal neighbours (8 neighbours). Each cell must have at least 300 inhabitants, and the cluster as a whole at least 5 000.
The clusters can be downloaded from Eurostat: http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/population-distribution-demography/clusters.
Preprocessing included extracting data for Norway, projecting to UTM 33 ETRS89 and converting to vector format. In the end, we delimited the urban clusters anew for 2006 and 2011.
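The cluster rule described above (cells with at least 300 inhabitants, 8-neighbour contiguity, at least 5 000 inhabitants per cluster) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Eurostat or Statistics Norway implementation: the grid is represented as a hypothetical dictionary from (column, row) cell index to population, and the function name and sample values are invented for the example.

```python
from collections import deque


def urban_clusters(grid_pop, min_cell=300, min_total=5000):
    """Delimit urban clusters on a 1 km population grid (simplified).

    grid_pop: dict mapping (col, row) cell indices to population counts.
    Returns a list of (cells, total_population) tuples, cells being a set.
    """
    # Only cells that reach the per-cell threshold can belong to a cluster.
    candidates = {cell for cell, pop in grid_pop.items() if pop >= min_cell}
    seen = set()
    clusters = []

    for start in sorted(candidates):
        if start in seen:
            continue
        # Breadth-first search over the 8 neighbours of each cell.
        seen.add(start)
        queue = deque([start])
        cells = set()
        while queue:
            col, row = queue.popleft()
            cells.add((col, row))
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    neighbour = (col + dc, row + dr)
                    if neighbour in candidates and neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)

        total = sum(grid_pop[cell] for cell in cells)
        if total >= min_total:
            clusters.append((cells, total))

    return clusters


# Hypothetical sample grid: two diagonal cells form one cluster; the
# remaining cells are too small or too isolated to qualify.
sample_grid = {
    (0, 0): 3000,
    (1, 1): 2500,
    (2, 1): 250,   # below the 300 threshold, breaks the chain
    (3, 1): 400,
    (5, 5): 4000,  # dense but isolated, total below 5 000
}
sample_clusters = urban_clusters(sample_grid)
```

In the sample, the cells (0, 0) and (1, 1) touch only diagonally, so they form a cluster under 8-neighbour contiguity but would not under 4-neighbour contiguity.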
GHSL built-up
The Global Human Settlement (GHS) framework produces global spatial information about the human presence on the planet over time. This takes the form of built-up maps, population density maps and settlement maps. This information is generated with evidence-based analytics and knowledge using new spatial data mining technologies. The framework uses heterogeneous data including global archives of fine-scale satellite imagery, census data, and volunteered geographic information. The data is processed fully automatically and generates analytics and knowledge reporting objectively and systematically about the presence of population and built-up infrastructures.
The general methodology behind GHSL data introduces concepts of GHS BUILT-UP, GHS POP, and the GHS Settlement Model. The main datasets are offered for download as open and free data.
The data consists of multitemporal products that offer an insight into the human presence in the past: 1975, 1990, 2000, and 2014. See http://ghsl.jrc.ec.europa.eu/index.php.
In the testing we have used GHS BUILT-UP, which has information layers on built-up presence as derived from Landsat image collections (GLS1975, GLS1990, GLS2000, and ad-hoc Landsat 8 collection 2013/2014). A quality grid for the built-up (GHSL BUILT-UP QUALITY) has also been studied, as well as the population grids (GHSL-POP).
Although GHSL built-up exists for almost all of Norway for the different years, it has some obvious weaknesses. Some agricultural areas are classified as built-up. The same is the case for some forest areas and gravel/bare rock areas. This is noticeable in western parts of Norway as well as in parts of northern Norway. The built-up areas of the cities of Trondheim and Tromsø are not reliably represented. Despite this, we have used the data for testing purposes. Because of the quality issues we could not use the data for ordinary statistical production at present, and there exist other options for Europe (Copernicus products, although not with such a long time series).
Preprocessing included extracting data for Norway, projecting to UTM 33 ETRS89 and converting to vector format. ArcGIS ModelBuilder production lines were set up for this task.
Calculation of indexes
We discussed which indicator to use. In the test we have given figures for two indicator definitions (UN Habitat – in the metadata description, and an indicator proposed by the JRC), in addition to the basic statistics.
In Norway, the environmental authorities use a simpler indicator: urban settlement area per resident. Plotted over time, this statistic yields some of the same information as the more complicated indexes (https://www.ssb.no/en/natur-og-miljo/artikler-og-publikasjoner/byer-og-miljo–225511).
Formula recommended by UN Habitat – LCRPGR
The formula proposed by UN Habitat in the metadata description for indicator 11.3.1 is called the Ratio between the land use growth rate and population growth rate (LCRPGR)[2]. In the metadata of the indicator, the concept of land use growth includes all aspects of human exploitation, from expansion of built-up areas to use of land for agriculture, forestry or other economic activities.
The formula is described as:
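As we read the UN metadata [2], the ratio divides the land consumption rate (LCR) by the population growth rate (PGR), both expressed as annualised logarithmic growth. The symbol names below are our own shorthand and may differ from the notation in the metadata: Urb is the total areal extent of the urban area, Pop is the population within it, t and t+n are the initial and final years, and y is the number of years between the two observations.

```latex
\[
\mathrm{LCRPGR} = \frac{\mathrm{LCR}}{\mathrm{PGR}}, \qquad
\mathrm{LCR} = \frac{\ln\left(\mathrm{Urb}_{t+n} / \mathrm{Urb}_{t}\right)}{y}, \qquad
\mathrm{PGR} = \frac{\ln\left(\mathrm{Pop}_{t+n} / \mathrm{Pop}_{t}\right)}{y}
\]
```

Since both rates are divided by the same y, the ratio is undefined when the population is unchanged between the two observations (PGR = 0).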
The formula refers to the concepts “urban agglomeration” and “cities”; we have calculated figures for urban settlements (total area of the urban settlements) but also for all built-up land as represented by GHSL. The figures for GHSL will also include building activity outside of urban growth, such as roads and other major construction activity.
Formula recommended by JRC – Land Use Efficiency (LUE)[3]
The GHSL team at JRC has developed the Land Use Efficiency tool (LUE). It is designed to be used with GHSL data, but it can be adapted to other input data. The reasons for developing this alternative indicator are:
- To better capture the situation where population growth is zero or negative, or where territory is lost because of catastrophic events etc.
- To bypass the problem of missing common definitions of urban area.
LUE can be estimated for different time intervals, depending on the availability of observations. To ensure that results for different periods are comparable, it is recommended to normalise the values to a 10-year average change: the indicator is divided by n (the number of years separating the observations) and then multiplied by 10.
The formula is:
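As a hedged reconstruction based on our reading of the JRC user guide [3], and consistent with the sign convention used in the results section of this report, the indicator can be written in terms of built-up area per capita. The symbol names are our own shorthand: BU is the built-up area, POP the population, Y the built-up area per capita, and n the number of years between the observations.

```latex
\[
Y_t = \frac{\mathrm{BU}_t}{\mathrm{POP}_t}, \qquad
\mathrm{LUE}_{t,\,t+n} = \frac{Y_t - Y_{t+n}}{Y_t}, \qquad
\mathrm{LUE}^{(10)}_{t,\,t+n} = \frac{\mathrm{LUE}_{t,\,t+n}}{n} \times 10
\]
```

With this form the indicator is zero when built-up area per capita is unchanged, negative when built-up area grows faster than population, and positive when population grows faster than built-up area.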
JRC has developed a QGIS tool for calculating both LUE and the previously described LCRPGR formula. The tool produces a GeoTIFF output file, and the results of both indicators are summarised in numerical form in a .csv file. However, in this test we have used Excel to calculate the formulas.
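The same arithmetic is easy to reproduce outside Excel. The following Python sketch is a minimal illustration, not the JRC tool and not the spreadsheet used in this test. It assumes the logarithmic growth-rate form of LCRPGR from the UN metadata and a per-capita form of LUE normalised to 10 years, as we read the JRC user guide; the function names and all input figures are hypothetical.

```python
from math import log


def lcrpgr(area_t0, area_t1, pop_t0, pop_t1, years):
    """Ratio of land use growth rate to population growth rate.

    Both rates are annualised logarithmic growth rates. Raises
    ZeroDivisionError when the population is unchanged, which is one
    of the situations the LUE indicator is meant to handle better.
    """
    land_rate = log(area_t1 / area_t0) / years
    pop_rate = log(pop_t1 / pop_t0) / years
    return land_rate / pop_rate


def lue(area_t0, area_t1, pop_t0, pop_t1, years):
    """Land use efficiency, normalised to a 10-year average change.

    Based on the relative change in built-up area per capita: zero when
    area and population grow at the same pace, negative when area grows
    faster, positive when population grows faster.
    """
    per_capita_t0 = area_t0 / pop_t0
    per_capita_t1 = area_t1 / pop_t1
    change = (per_capita_t0 - per_capita_t1) / per_capita_t0
    return change / years * 10


# Hypothetical figures (area in km2, population in persons).
# Case 1: area and population both grow by 10 percent over 10 years.
equal_lcrpgr = lcrpgr(100.0, 110.0, 1000.0, 1100.0, 10)
equal_lue = lue(100.0, 110.0, 1000.0, 1100.0, 10)

# Case 2: area grows by 20 percent, population by 10 percent, over 5 years.
sprawl_lcrpgr = lcrpgr(100.0, 120.0, 1000.0, 1100.0, 5)
sprawl_lue = lue(100.0, 120.0, 1000.0, 1100.0, 5)
```

In the first case LCRPGR is 1 and LUE is 0; in the second, LCRPGR is above 1 and LUE is negative, matching the interpretation of the two indicators given in the results.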
Results
We present statistics based on national urban settlements as well as GHSL for testing purposes.
The source statistics
Statistics for urban settlements have been produced since the 1960s in conjunction with the censuses every ten years. The primary focus was on population statistics and not on the area of settlements. From 2000 onwards, once production had been established as automatic and GIS-based, Statistics Norway has delimited urban settlements (almost) every year. As new data sources became available, new methodology was developed, and adjusted statistics were published from 2013 onwards. We can see this as a drop in urban settlement area, as the delimitation follows the land use more closely.
Figure 1. Urban settlement area and population. 2000-2017
The GHSL exists for 1975, 1990, 2000 and 2014. National official statistics have been used for the total population of Norway. The GHSL built-up includes all built-up parts of the country, also outside urban settlements. On the other hand, only major and densely built-up areas are included.
Figure 2. Built-up according to GHSL and total population. 1975, 1990, 2000 and 2014. Norway
Urban cluster data was recalculated, as there seems to be some discrepancy between the downloaded data and the address-based population.
Figure 3. Urban cluster 2006 and 2011. Norway
Ratio between the land use growth rate and population growth rate (LCRPGR)
Equal land use and population growth rates result in an index value of 1. If land use growth is higher, the resulting index is above 1. The trends based on the two data sources are fairly comparable, but all GHSL values are above 1. The GHSL figures are based on the total population, whereas the urban population has grown faster.
Figure 4. LCRPGR indicator based on urban settlements 5000 (national data) and all built-up (GHSL)
Land use efficiency indicator (LUE)
The land use efficiency indicator is zero if land use growth and population growth are equal. Higher land use growth yields negative values, while higher population growth results in positive indicator values.
In the case of Norway, the trend is again somewhat similar when comparing data from GHSL and urban settlements (above 5 000 residents), see figure 5.
Figure 5. Land use efficiency indicator based on urban settlements 5000 (national data) and all built-up (GHSL)
Figures for the whole country may mask differences between regions or urban settlements. Many of the biggest urban settlements in Norway show the same trend, but with some exceptions (figure 6).
Figure 6. Land use efficiency indicator. Urban settlements with at least 50 000 residents. 2004-2017
To see the land use change in more detail one can look at single urban settlements (figure 7).
Figure 7. Population and urban settlement 2013-2017. Oslo
The indicator is best suited to looking at changes at a not too detailed level, because expansion areas (and industrial areas) will not necessarily be inhabited and thus yield no indicator value (figure 7). This will be the case to a lesser extent with GHSL data, as it incorporates only densely built-up areas and its population distribution is based on top-down methodology. The population therefore tends to be more widely distributed, populating areas that are not necessarily populated according to point-based, bottom-up figures. When looking in detail (for Norway), it might be just as relevant to produce figures for population growth in relation to the number and area of dwellings and buildings, for example.
We have also calculated the indicators based on Urban cluster data 2006 and 2011, which give one indicator figure only. It is difficult to make any trend comparison, but the indicator value is close to the national figure for urban settlements >= 5 000 for the same period.
It would be necessary to combine the built-up from GHSL with buildings/cadastre, roads and other information to calculate the figures we want. Figure 8 highlights this issue. Some residential areas will be classified as not built-up because of their lower density.
Figure 8. Built-up from GHSL (38m) compared with building outline and roads
GHSL should be combined with register and/or other map data to capture built-up areas that are not included in the GHSL built-up. If not, it will only take densely built-up areas into account. There are also fundamental problems with this data for parts of Norway. However, if this kind of building activity follows building activity overall, the indicator (LUE) will give an interpretable trend, although not the correct indicator values.
To get reliable results, it is important to have population data at a detailed level, preferably geocoded at unit record level. This is most important when publishing results at regional level or for urban clusters/urban settlements. The results for this indicator should indeed be published at a low regional level to get to grips with the causes of change.
Statistics for land use change and population change respectively should accompany the indicator figures/maps to make the indicator figures understandable. The land use efficiency indicator is about change, not status; one monitors whether the land use efficiency gets better or worse, not whether the land use is efficient or not.
[1] https://www.skatteetaten.no/en/person/national-registry/
[2] https://unstats.un.org/sdgs/metadata/files/Metadata-11-03-01.pdf
[3] https://ec.europa.eu/jrc/en/publication/eur-scientific-and-technical-research-reports/lue-user-guide-tool-calculate-land-use-efficiency-and-sdg-113-indicator-global-human