Introduction
Task given: to calculate the proportion of population that has convenient access to public transport, by sex, age and persons with disabilities.
Solution: we analysed access to public transport by sex and age. Access to public transport for persons with disabilities was not taken into account, as such data are missing. Additionally, we analysed access to public transport based on the national criterion, developed by an expert in accessibility studies in Estonia, Veiko Sepp. The national criterion adopts a policy-oriented perspective and takes into account varying mobility needs in different types of settlements.
Data status
- As public transport data, open data in the public transport register were used (reference time March 2018). The dataset is available to everyone at any time (https://www.mnt.ee/eng/public-transportation/public-transport-information-system). The open data of the public transport register consist of an extract of the data entered in the national public transport register; the structure of the data content is simplified, and descriptions, timetables and locations of stops on domestic public transport routes are included. The data conform to the Google GTFS (General Transit Feed Specification) data model. Link to download: peatus.ee/gtfs. The Estonian Road Administration updates the information once a day before 6:00 am.
- In Estonia, population statistics are based on the administrative data of the population register. The address data of the register can be geocoded to building level using the Address Data System maintained by the Estonian Land Board (NMCA – in Estonian: Maa-amet). Statistics Estonia has a “statistical copy” of the population register, made at the beginning of each year. Since 2016, the place of residence recorded in the population register has been used instead of the place of residence recorded in the census. Population data as at 01.01.2017 were used in the current analysis.
- For urban and high-density cluster data, Eurostat clusters were used, which are based on GEOSTAT 2011 grid population. Downloaded from: http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/population-distribution-demography/clusters
- We used settlement types to calculate the Estonian SDG of convenient accessibility to public transportation in urban and rural areas. The types were determined using the method of population grid data clusters (similar to Eurostat http://ec.europa.eu/eurostat/statistics-explained/index.php/Degree_of_urbanisation_classification_-_2011_revision), applying Estonian thresholds. The majority cluster type determined the type of the settlement unit. In urban areas, population density exceeds 200 people per square kilometre and the number of population exceeds 5,000. The remaining areas are rural areas. The method has been in use since 2018.
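As an illustration of the settlement-type rule described above, the following minimal Python sketch applies the two thresholds to a cluster and then gives a settlement unit the type of its majority cluster. The function names and the input layout are hypothetical, and it is an assumption here that "majority" refers to the larger population share; the actual determination was made on population grid data.

```python
from collections import defaultdict

# Thresholds quoted in the text; how they are applied per cluster is an assumption.
MIN_DENSITY = 200       # people per square kilometre
MIN_POPULATION = 5000   # people in the cluster


def cluster_type(density_per_km2, population):
    """Return 'urban' when both thresholds are exceeded, otherwise 'rural'."""
    if density_per_km2 > MIN_DENSITY and population > MIN_POPULATION:
        return "urban"
    return "rural"


def settlement_unit_type(parts):
    """parts: list of (cluster_type, population) pairs overlapping one unit.

    Assumption: the majority type is the one holding the larger population.
    """
    totals = defaultdict(int)
    for ctype, population in parts:
        totals[ctype] += population
    return max(totals, key=totals.get)
```

For example, a unit with 300 residents in an urban cluster and 120 residents outside it would be treated as urban under this assumption.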
Processes
a) Geocoding the population data
The population database in Statistics Estonia is based on the data collected into the population register, transferred to Statistics Estonia as at 1 January. The population register was interfaced with Address Data System (ADS, maintained by NMCA) in June 2014, and today about 99% of addresses have been linked with ADS data. Each unit record in the population register includes standardized address data, an address ID and address object ID. Statistics Estonia has the copy of ADS address data, which is updated daily via X-Road (https://e-estonia.com/solutions/interoperability-services/x-road). Therefore, it is easy to geocode the address data from the population register. Each unit record data from the population register are linked with ADS to get the coordinates using the address object identifier as a key. The population database in Statistics Estonia is stored in an Oracle database. The address-points geometries are kept in eGEOStat, which is Statistics Estonia’s geodatabase (Oracle database, ArcGIS Server). The data are linked in the Data Warehouse Department. The data are available to internal users in accordance with the data usage rules.
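The linking step described above is, in essence, a key lookup on the address object identifier. The sketch below shows the idea in plain Python; the field name `address_object_id` and the dictionary layout are hypothetical and do not reflect the actual Oracle or eGEOStat schema.

```python
def geocode(persons, ads_points):
    """Attach coordinates to person records via the address object identifier.

    persons: list of dicts with a hypothetical 'address_object_id' key.
    ads_points: dict mapping address object ID -> (x, y) coordinates.
    Returns two lists: records with coordinates and records without a match.
    """
    matched, unmatched = [], []
    for person in persons:
        xy = ads_points.get(person["address_object_id"])
        if xy is None:
            unmatched.append(person)
        else:
            matched.append({**person, "x": xy[0], "y": xy[1]})
    return matched, unmatched
```

Records that end up in the unmatched list correspond to the cases that need a coarser or manual match.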
In the current analysis, 01.01.2017 population data were used. About 96.7% of the population was automatically geocoded to the building level. As it is possible to have an incomplete address in the population register (place of residence data is known only at settlement unit or municipality level), about 3% of the population was geocoded only to settlement, city district or municipality level (Table 1). These people are linked to the population-weighted centroid of the corresponding unit. About 0.1% of the population was geocoded manually to building level, because a direct match was impossible due to outdated address data. In the geodatabase, each record has information about the matching type, describing the geocoding quality – whether the address has been geocoded to building, cadastral unit or municipality level. This makes it possible to exclude lower-quality records, depending on the requirements of an analysis.
In the analysis of public transport accessibility, all address-points were used. This is because most people who were geocoded to village/municipality level live in larger cities, where the distances between stops are short.
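For readers unfamiliar with the term, a population-weighted centroid is the mean of the address-point coordinates weighted by the number of residents at each point. A minimal sketch, assuming projected coordinates and a hypothetical `(x, y, population)` input layout:

```python
def weighted_centroid(points):
    """points: iterable of (x, y, population); returns the weighted mean (x, y)."""
    points = list(points)
    total = sum(pop for _, _, pop in points)
    if total == 0:
        raise ValueError("total population must be positive")
    x = sum(px * pop for px, _, pop in points) / total
    y = sum(py * pop for _, py, pop in points) / total
    return x, y
```

With one resident at x = 0 and three residents at x = 10, the centroid lies at x = 7.5, closer to the more populated point.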
Table 1: Metadata describing geocoding quality at unit record level, 01.01.2017
Quality Code | Number of people geocoded | % |
Direct match to building level | 1,272,733 | 96.7 |
Direct match to cadastral unit | 559 | 0.0 |
Direct match to city district unit | 15,659 | 1.2 |
Direct match to settlement unit | 11,236 | 0.9 |
Direct match to municipality unit | 14,730 | 1.1 |
Indirect match | 718 | 0.1 |
Total population | 1,315,635 | 100 |
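The percentages in the table can be reproduced from the counts. The short check below is only a verification sketch; the dictionary keys repeat the table labels.

```python
counts = {
    "Direct match to building level": 1_272_733,
    "Direct match to cadastral unit": 559,
    "Direct match to city district unit": 15_659,
    "Direct match to settlement unit": 11_236,
    "Direct match to municipality unit": 14_730,
    "Indirect match": 718,
}

total = sum(counts.values())
# Share of each quality code, rounded to one decimal place as in the table.
shares = {code: round(100 * n / total, 1) for code, n in counts.items()}

# People geocoded only to city district, settlement or municipality level.
coarse = sum(n for code, n in counts.items()
             if code.endswith(("city district unit", "settlement unit", "municipality unit")))
coarse_share = round(100 * coarse / total, 1)
```

The counts add up to the stated total population, and the three coarser levels together account for 41,625 people, which is the "about 3%" mentioned in the text.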
b) Delimitation of urban areas
Two different delimitations of urban areas were tested:
- Urban cluster – clusters of urban areas in Europe formed on the basis of the population density grid map with Population and Housing Census 2011 data. Data downloaded from: http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/population-distribution-demography/clusters
- Localities – localities have been calculated on the basis of the data of Population and Housing Census 2011 and represent national data.
Localities represent areas where the distance between buildings is less than 200 meters and the number of population in such building groups amounts to more than 200 persons. In Statistics Estonia, the localities were calculated for the first time on the basis of Population and Housing Census 2011 point-based data. The building data were received from Estonian Topographical Database, maintained by NMCA. The results were published in Statistics Estonia’s map application https://estat.stat.ee/StatistikaKaart/VKR#, where the data can be viewed and downloaded. Also, metadata description can be found there. After the census, the localities have not been updated.
For current analysis, localities with population over 5,000 and over 50,000 were selected. The reason for this was that the same thresholds are used for European urban clusters and high-density clusters (Figure 1), making the data comparable.
Figure 1. Urban clusters (European dataset) and localities (national)
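The locality rule described above (buildings less than 200 metres apart, more than 200 residents per group) can be sketched as a distance-based grouping. The following is a minimal illustration and not the procedure used by Statistics Estonia: it measures distances between building points rather than building outlines, compares every pair of buildings, and uses a hypothetical `(x, y, population)` input in metres.

```python
import math


def _find(parent, i):
    """Return the representative of i in a union-find forest."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i


def localities(buildings, max_gap=200.0, min_population=200):
    """Group buildings closer than max_gap and keep groups above min_population.

    buildings: list of (x, y, population) with coordinates in metres.
    Returns a list of groups, each a list of building indices.
    """
    n = len(buildings)
    parent = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dx = buildings[i][0] - buildings[j][0]
            dy = buildings[i][1] - buildings[j][1]
            if math.hypot(dx, dy) < max_gap:
                parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(n):
        groups.setdefault(_find(parent, i), []).append(i)
    return [g for g in groups.values()
            if sum(buildings[i][2] for i in g) > min_population]
```

Buildings are chained together, so two buildings 300 metres apart still fall into the same group if a third building lies between them. For the analysis, the resulting groups would then be filtered again with the 5,000 and 50,000 thresholds.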
The above-mentioned national localities and European urban clusters do not coincide exactly (Figure 2, Table 2). Even though the total population of localities exceeds the population of European urban clusters by about 53,000 inhabitants, this does not mean that localities cover the urban clusters everywhere. Differences can be found in both directions – localities exclude some areas that are included in urban clusters and vice versa. For instance, Kiviõli locality (population 5,687) has been left out of the European dataset. At the same time, Ihaste locality, with a population of 3,481, has been left out of the current analysis, as it has not merged with a neighbouring locality and the threshold for the analysis was set to 5,000. In the European dataset, however, Ihaste locality has been combined with the neighbouring areas.
Figure 2. Mismatch of national localities (population ≥ 5,000) and Eurostat’s urban clusters
Table 2. Number of population in national and European urban spatial datasets
Data source | Number of population in urban areas |
Eurostat urban clusters | 829,568 |
National localities (≥5,000 inhabitants) | 882,885 |
c) Selection and preparation of public transport stops
The transport data had to be linked, because there were 11 .txt files with different information.
Filename | Required | Defines |
agency.txt | Required | One or more transit agencies that provide the data in this feed. |
stops.txt | Required | Individual locations where vehicles pick up or drop off passengers. |
routes.txt | Required | Transit routes. A route is a group of trips that are displayed to riders as a single service. |
trips.txt | Required | Trips for each route. A trip is a sequence of two or more stops that occurs at a specific time. |
stop_times.txt | Required | Times that a vehicle arrives at and departs from individual stops for each trip. |
calendar.txt | Required | Dates for service IDs using a weekly schedule. Specify when service starts and ends, as well as days of the week where service is available. |
calendar_dates.txt | Optional | Exceptions for the service IDs defined in the calendar.txt file. If calendar.txt includes ALL dates of service, this file may be specified instead of calendar.txt. |
fare_attributes.txt | Optional | Fare information for a transit organization’s routes. |
fare_rules.txt | Optional | Rules for applying fare information for a transit organization’s routes. |
shapes.txt | Optional | Rules for drawing lines on a map to represent a transit organization’s routes. |
feed_info.txt | Optional | Additional information about the feed itself, including publisher, version, and expiration information. |
We used only the required files, except agency.txt.
We linked the relevant tables using ‘one-to-many’ connections. The final database had to contain the trip ID, departure time, stop ID, route ID, the start and end dates of the trips, and the days of the week on which the service operates.
There was some confusion about the start and end dates of trips and about which trips to include in the analysis, because some trips started as early as 2000 and others ended in 2025. At first, we thought it was only necessary to take into account trips operating from 01.01.2017 to 31.12.2017, but in that case a lot of trips were left out. After some investigation, it turned out that all the trips are relevant to 2017, and, therefore, all the trips in the transport dataset were taken into account.
For analysis, we chose Wednesday and included all the trips operating on Wednesdays.
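The linking and the Wednesday selection described above can be sketched with the Python standard library. The column names (`trip_id`, `service_id`, `route_id`, `stop_id`, `departure_time`, `wednesday`) are standard GTFS fields; the helper names are hypothetical, and the tooling actually used for the analysis is not stated in this report. In line with the decision above, the sketch ignores the start and end dates in calendar.txt.

```python
import csv
import io


def read_table(text):
    """Parse GTFS-style CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def wednesday_departures(trips, stop_times, calendar):
    """Join stop_times -> trips -> calendar and keep Wednesday services.

    Each argument is a list of row dicts from the corresponding GTFS file.
    """
    wednesday_services = {c["service_id"] for c in calendar
                          if c["wednesday"] == "1"}
    trip_by_id = {t["trip_id"]: t for t in trips}
    rows = []
    for st in stop_times:
        trip = trip_by_id.get(st["trip_id"])
        if trip is None or trip["service_id"] not in wednesday_services:
            continue  # unknown trip, or service not running on Wednesdays
        rows.append({
            "stop_id": st["stop_id"],
            "trip_id": st["trip_id"],
            "route_id": trip["route_id"],
            "departure_time": st["departure_time"],
        })
    return rows
```

With real data, each file would be read from disk and passed through `csv.DictReader` in the same way.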
The next step was to create a cross table in which each row represented a stop ID and the columns represented the operating hours. Next, we kept only the hours between 6:00 am and 8:00 pm and selected the stops with at least one trip in each of these hours. This left 2,192 of the 16,394 stops (Figure 3).
Figure 3. Stops having at least one departure per hour between 6:00 am – 8:00 pm on Wednesdays
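The stop selection can be illustrated as follows. The sketch builds the stop-by-hour table and keeps the stops that have a departure in every hourly slot; treating 6:00 am – 8:00 pm as the 14 slots starting at 6:00 through 19:00 is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict

HOURS = range(6, 20)  # hourly slots 6:00-6:59 ... 19:00-19:59 (assumed)


def stops_with_hourly_service(departures, hours=HOURS):
    """Return the stops that have at least one departure in every hour slot.

    departures: iterable of (stop_id, 'HH:MM:SS') pairs.
    """
    table = defaultdict(lambda: defaultdict(int))  # stop_id -> hour -> trips
    for stop_id, departure_time in departures:
        hour = int(departure_time.split(":")[0])
        table[stop_id][hour] += 1
    return sorted(stop for stop, by_hour in table.items()
                  if all(by_hour.get(h, 0) >= 1 for h in hours))
```

A stop with a single missing hour, for example no departure between 12:00 and 12:59, is dropped under this rule.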
d) Computation of service areas
We then exported the x,y coordinates (from transport data) to ArcMap and created points for the 2,192 stops. Service areas were calculated using Buffer tool (ArcGIS) – 500m buffers around stops were made.
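Because the buffers are circles around the stops, a point lies within the combined service area exactly when its straight-line distance to at least one stop is no more than 500 m. The sketch below expresses this test in Python as an alternative to the ArcGIS Buffer tool; it assumes that both the stops and the points are in a projected coordinate system with metre units, and the function name is hypothetical.

```python
import math


def in_service_area(x, y, stops, radius=500.0):
    """True if (x, y) lies within `radius` metres of at least one stop.

    stops: iterable of (x, y) stop coordinates in the same projected system.
    """
    return any(math.hypot(x - sx, y - sy) <= radius for sx, sy in stops)
```

Latitude and longitude values, as published in GTFS stops.txt, would need to be projected before such a distance test is meaningful.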
e) Calculation of population within service areas
Next, the building centroids (01.01.2017 population data) within the buffer zones were selected by location. After that, the selected buildings were further divided by location into those within:
1) High-density (HD) clusters
2) Urban clusters
3) Rural areas
As we geocode population data every year, we did not have to do any extra work. Sex and age are linked to geocoded population data. Therefore, it was easy to calculate age groups and summarize persons living in high-density clusters, urban clusters and rural areas with ArcMap (Spatial Join tool).
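The summarising step can be sketched as a grouped count. The example below assumes that each person record already carries an area type and a flag showing whether the building lies within a service area; the field names (`area`, `sex`, `age`, `served`) are hypothetical, and the analysis itself used the ArcMap Spatial Join tool.

```python
from collections import defaultdict

AGE_GROUPS = [(0, 14, "0–14"), (15, 24, "15–24"),
              (25, 64, "25–64"), (65, 200, "65 and over")]


def age_group(age):
    """Map an age in years to the age-group label used in the result tables."""
    for low, high, label in AGE_GROUPS:
        if low <= age <= high:
            return label
    raise ValueError(f"unexpected age: {age}")


def access_shares(persons):
    """Return {(area, sex, age_group): percentage of persons with access}."""
    served = defaultdict(int)
    total = defaultdict(int)
    for p in persons:
        key = (p["area"], p["sex"], age_group(p["age"]))
        total[key] += 1
        served[key] += bool(p["served"])
    return {key: round(100 * served[key] / total[key]) for key in total}
```

Totals by sex or by area alone follow from summing the same counters over the remaining dimensions.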
Results
Results based on GEOSTAT criteria
National level
In Estonia, 64% of the total population (844,659 persons) had convenient access to public transportation. Accessibility is lowest in the 15–24 age group, for reasons the analysis does not explain, but the difference is only a few percentage points. Accessibility is best among children aged 0–14. Convenient access in the context of the current analysis means that a stop is located within 500 m in a straight line from the house and that the frequency of public transport is at least one departure per hour between 6:00 am and 8:00 pm (Tables 1–3).
A slightly smaller share of men than women had convenient access to public transportation: 62% of men and 66% of women. Comparing men and women by age group, there are no differences in accessibility between boys and girls aged 0–14. In the older age groups, accessibility is better for women than for men.
Urban cluster and high-density urban cluster (Eurostat)
Public transportation is conveniently accessible to almost all (495,541 persons, i.e. 99%) of the city residents living in high-density urban clusters – in the city centres of Tallinn, Tartu and Narva. Adding the peripheral areas of cities and the smaller cities to those city centres, convenient accessibility declines by 7 percentage points, from 99% to 92%. This means that in urban clusters, 92% of the population (759,660 persons) live closer than 500 metres to a public transportation stop with at least one departure per hour between 6:00 am and 8:00 pm. There are no differences between age groups and sexes in terms of convenient accessibility in high-density clusters and only a 1 percentage point difference in urban clusters.
Outside the urban clusters, in areas classified here as rural, public transportation is conveniently accessible to only 17% of the population. In those areas, the difference between the sexes is the biggest – up to 3 percentage points: 14% of men aged 65 and over have convenient access to public transport, compared to 17% of women of the same age. Men aged 65 and over have the worst accessibility to public transportation.
Localities (national)
Public transport is conveniently accessible to 97% of the population living in localities with population over 50,000. The difference between sexes and age groups is not remarkable. In localities with population over 5,000, accessibility is convenient for 89–91% of people, depending on sex and age group. Accessibility is the worst for older men aged over 65.
Table 1. Proportion of population having convenient access to public transport by age
Area type | TOTAL | 0–14 | 15–24 | 25–64 | 65 and over |
TOTAL | 64% | 65% | 62% | 64% | 64% |
European classification | |||||
HD clusters | 99% | 99% | 99% | 99% | 99% |
Urban clusters | 92% | 91% | 92% | 92% | 92% |
Rural areas (outside urban clusters) | 17% | 20% | 16% | 17% | 16% |
National classification | |||||
Localities (≥50,000) | 97% | 97% | 98% | 97% | 98% |
Localities (≥5,000) | 90% | 89% | 90% | 90% | 90% |
Rural areas (outside localities) | 12% | 14% | 11% | 12% | 11% |
Table 2. Proportion of men having convenient access to public transport by age
Area type | TOTAL | 0–14 | 15–24 | 25–64 | 65 and over |
TOTAL | 62% | 65% | 60% | 61% | 61% |
European clusters | |||||
HD clusters | 99% | 99% | 99% | 99% | 99% |
Urban clusters | 91% | 91% | 91% | 91% | 91% |
Rural areas | 16% | 20% | 16% | 16% | 14% |
National classification | |||||
Localities (≥50,000) | 97% | 97% | 97% | 97% | 98% |
Localities (≥5,000) | 89% | 89% | 89% | 89% | 90% |
Rural areas (outside localities) | 11% | 14% | 11% | 11% | 10% |
Table 3. Proportion of women having convenient access to public transport by age
Area type | TOTAL | 0–14 | 15–24 | 25–64 | 65 and over |
TOTAL | 66% | 65% | 64% | 67% | 66% |
European clusters | |||||
HD clusters | 99% | 99% | 99% | 99% | 99% |
Urban clusters | 92% | 91% | 92% | 92% | 92% |
Rural areas | 19% | 20% | 17% | 19% | 17% |
National classification | |||||
Localities (≥50,000) | 97% | 97% | 98% | 97% | 98% |
Localities (≥5,000) | 90% | 89% | 90% | 90% | 91% |
Rural areas (outside localities) | 12% | 14% | 11% | 12% | 11% |
Comparison between different classifications (European and national)
Using different classifications for urban areas, the results are quite similar. With the European classification, the values obtained were a few percentage points higher than with the national classification (for example, in the total population, 92% for urban clusters compared with 90% for localities with at least 5,000 inhabitants).
Results based on national criteria of convenient accessibility
We calculated convenient accessibility to public transport using national criteria as well. This was done to test whether the method and data used in GEOSTAT3 project are suitable to start using public transport accessibility as an Estonian SDG indicator. We collaborated with Veiko Sepp, who gave an overview of the criteria for convenient access to public transport within the context of national indicators for sustainable development. As those criteria were adapted to Estonia, they were somewhat different from the criteria that were given in the GEOSTAT3 project. The national criteria adopt a policy-oriented perspective and take into account varying mobility needs in different types of settlements.
The following criteria were used to describe convenient access to public transport in the context of national indicators for sustainable development:
1) Urban areas
– at least 6 trips per hour (very good access)
– at least 2 trips per hour (good access)
– Euclidean distance
– Service area around each stop 400m
– Between 6:00 am – 8:00 pm on a working day (we picked Wednesday)
2) Rural areas
– at least 3 trips per day (at least 1 trip in range 6:00 am – 9:00 am, 3:00 pm – 6:00 pm and 6:00 pm – 8:00 pm) (good access)
– at least 6 trips per day between 6:00 am – 8:00 pm on a working day (we picked Wednesday) regardless of the time of day (very good access).
– Euclidean distance
– Service area around each stop 1,000m
– Between 6:00 am – 8:00 pm on a working day (we picked Wednesday)
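The criteria listed above can be written as two small checks per stop. The sketch below is one possible reading, not the implementation used in the analysis: it assumes that the urban thresholds must hold in every hour between 6:00 am and 8:00 pm, treats time windows as end-exclusive whole hours, and uses hypothetical function names. Each function returns a pair (good access, very good access).

```python
def urban_access(trips_by_hour, hours=range(6, 20)):
    """Check the urban criteria for one stop.

    trips_by_hour: dict mapping hour (0-23) -> number of departures.
    Assumption: the threshold has to be met in every hour of the window.
    """
    fewest = min(trips_by_hour.get(h, 0) for h in hours)
    return fewest >= 2, fewest >= 6


def rural_access(departure_hours):
    """Check the rural criteria for one stop.

    departure_hours: list of departure hours (0-23) on the chosen working day.
    """
    in_window = [h for h in departure_hours if 6 <= h < 20]
    good = (any(6 <= h < 9 for h in in_window)          # morning trip
            and any(15 <= h < 18 for h in in_window)    # afternoon trip
            and any(18 <= h < 20 for h in in_window))   # evening trip
    very_good = len(in_window) >= 6
    return good, very_good
```

Under this reading the two rural levels are not strictly nested: a stop with six departures that are all before noon meets the 6-trip criterion but not the 3-trip criterion with its time windows.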
The results are not comparable with the analysis described above. In addition to the different accessibility criteria, the criteria for the urban–rural division also differed.
Table 4. Number of stops regarding national criteria
Criterion | Stops |
Urban areas with very good access – at least 6 trips per hour | 650 |
Urban areas with good access – at least 2 trips per hour | 1,557 |
Rural areas – 3 trips per day | 6,698 |
Rural areas – 6 trips per day | 6,315 |
In city settlement regions, 457,434 persons (50%) had very good access (6 trips per hour) to public transportation stops and 682,419 persons (75%) had good access (2 trips per hour). In rural settlement regions, 316,799 persons (78%) had basic-level access (3 trips per day) to public transportation stops and 250,521 persons (62%) had convenient access (6 trips per day).
The difference between men and women in terms of convenient access to public transportation was bigger in rural areas than in urban areas. The biggest difference between men and women occurred in rural settlement regions under the criterion of 3 trips per day, where 75% of men and 81% of women had convenient access to public transportation.
Table 5. Proportion of population having convenient access to public transport by age
Criterion | Total | 0–14 | 15–24 | 25–64 | 65 and over |
Urban areas with very good access – at least 6 trips per hour | 50% | 47% | 50% | 51% | 52% |
Urban areas with good access – at least 2 trips per hour | 75% | 73% | 75% | 75% | 75% |
Rural areas – 3 trips per day | 78% | 81% | 78% | 78% | 77% |
Rural areas – 6 trips per day | 62% | 64% | 62% | 61% | 61% |
Table 6. Proportion of men having convenient access to public transport by age
Criterion | Total | 0–14 | 15–24 | 25–64 | 65 and over |
Urban areas with very good access – at least 6 trips per hour | 49% | 47% | 48% | 50% | 50% |
Urban areas with good access – at least 2 trips per hour | 74% | 73% | 74% | 75% | 75% |
Rural areas – 3 trips per day | 75% | 81% | 77% | 74% | 73% |
Rural areas – 6 trips per day | 60% | 64% | 62% | 59% | 58% |
Table 7. Proportion of women having convenient access to public transport by age
Criterion | Total | 0–14 | 15–24 | 25–64 | 65 and over |
Urban areas with very good access – at least 6 trips per hour | 51% | 47% | 51% | 52% | 52% |
Urban areas with good access – at least 2 trips per hour | 75% | 73% | 76% | 76% | 76% |
Rural areas – 3 trips per day | 81% | 82% | 78% | 82% | 80% |
Rural areas – 6 trips per day | 64% | 64% | 62% | 64% | 63% |
The results are published in the Statistics Estonia web map application.
According to the expert’s view, the single-criterion approach to public transport accessibility could be a viable option for international comparative measurement. From the national perspective, the approach is too generalizing to have any relevance for mobility/transport policy, and differentiating mobility needs at the criterion level is recommended. Therefore, the national criteria were developed, which differentiate between two types of settlements:
- Larger urban settlements, where intracity mobility needs are predominant in people’s daily mobility patterns and, due to the size of the territory, intracity public transport has a considerable share in satisfying those needs – these are the cities of Tallinn, Tartu, Narva and Pärnu and the conurbation of Kohtla-Järve and Jõhvi;
- Other settlements, including both rural areas (according to national settlement types) and small towns, where public transportation has a comparatively small share in intrasettlement mobility and the key role of public transport is providing access to other settlements (with more jobs and services), most often located at higher levels of the settlement hierarchy.
In addition, the national criteria take into account that satisfactory (and economically sensible) frequency levels for intracity public transport and for public transport between rural (incl. small towns) and urban settlements are very different (see above). It should also be noted that 3 trips per day to and from a rural settlement is a very basic level of public transport accessibility.
To sum up, the expert believes that the national criteria suggested here are suitable, for now, for the evaluation of one aspect of sustainable development in Estonia. The key challenge concerning such criteria is that viable mobility solutions are in motion as well – more flexible transport solutions (e.g. car-sharing, transport-on-demand) are spreading at various speeds in urban and rural areas, and it is increasingly difficult to draw a line between public and private transport.
Meanwhile, the national criteria suggested here would need some more specifications as well – first, in terms of defining more clearly the connections between types of settlements measured (e.g. only between a settlement and its closest higher level settlement) and second, in terms of required frequency levels between various types of settlements within settlement hierarchy (e.g. x frequency level between local and areal centres, and y frequency level between rural village and local centre, differentiated in regional spatial plans).
Evaluation
The aim of the analysis was to test the suitability and applicability of ESGF principles for the calculation of geostatistical indicators. The tested common indicator was SDG indicator 11.2.1, which measures the accessibility of public transport in cities.
The analysis confirmed that the use of ESGF principles for the collection and storage of data is reasonable and helps to calculate geostatistical indicators. In particular, the applicability of principles 1, 2, 3 and 4 was assessed. Principle 5 was outside the scope of the grant.
As some ESGF principles have already been applied in Statistics Estonia, the analysis was easy to conduct because certain stages could be skipped.
In conclusion, the main strengths are:
- Using administrative data (population register) for population statistics makes it possible to update the data annually.
- Development of the address standard (managed by the Land Board), development of various services on its basis and interfacing state registers with ADS have improved the quality of address data, due to which geocoding is easy and feasible on a yearly basis.
- Availability of open public transport data from a reliable source (Estonian Road Administration).
Contact information: