[id: 70]

Intrinsic Assessment of Positional Accuracy for OpenStreetMap Road Data

Short description: The goal of this thesis is to develop intrinsic quality indicators to assess the positional accuracy of OSM road data.

Keywords: Volunteered Geographic Information (VGI), OpenStreetMap, intrinsic data quality, road data

Topic at: TU Munich

Staff involved: Wangshu Wang (wangshu.wang@tum.de) ; Francis Andorful (Heidelberg University: francis.andorful@uni-heidelberg.de)

Description:

[Figure above: Increasing Buffer Model (IBM) for assessing positional accuracy, as illustrated by Forghani and Delavar (2014)]

Because OpenStreetMap (OSM) is crowdsourced and lacks quality control during contribution, its data quality has become a research focus (Yan et al., 2020). Understanding and addressing data quality issues helps unlock OSM's full potential for diverse applications. OSM data quality assessment methods fall into two broad categories: extrinsic and intrinsic. Extrinsic methods compare OSM data with a reference dataset (e.g., authoritative data sources), and most early research on OSM data quality took this approach (Girres & Touya, 2010; Haklay, 2010; Zielstra & Zipf, 2010). However, a reference dataset is not always available. For this reason, researchers have called for attention to intrinsic indicators of OSM data quality (Barron et al., 2014) and proposed intrinsic quality measures based on the data itself, its history, and its metadata (Barron et al., 2014; Nejad et al., 2022; Sundaram et al., 2021).

The positional accuracy of roads in OpenStreetMap (OSM) has traditionally been evaluated through extrinsic comparison with official datasets, using models such as the Increasing Buffer Model (IBM) (Forghani & Delavar, 2014; Goodchild & Hunter, 1997). In our analysis, we observed a potential correlation between the positional accuracy of a road segment and both the number of its nodes and the interval between them. “Good”, or trustworthy, road datasets seem to exhibit characteristic patterns. Because road data are subject to intrinsic topological constraints, if topologically checked data follow a certain pattern in node distribution and node interval, their positional accuracy could be approximated from these intrinsic patterns.
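To make the buffer idea concrete, the following is a minimal sketch of an increasing-buffer comparison in the spirit of Goodchild and Hunter (1997): for each buffer width around the reference line, it estimates the share of the tested line that falls inside the buffer. It is an illustration under stated assumptions, not the implementation used in the cited studies. Coordinates are assumed to be planar (projected, in metres), the share is approximated by sampling points along the tested line rather than by exact geometric intersection, and all function names and the `step` parameter are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p, line):
    """Smallest distance from point p to any segment of the polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def sample_polyline(line, step):
    """Points spaced at most `step` apart along the polyline, vertices included."""
    points = [line[0]]
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        seg_len = math.hypot(bx - ax, by - ay)
        n = max(1, math.ceil(seg_len / step))
        for i in range(1, n + 1):
            t = i / n
            points.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return points


def buffer_overlap_curve(tested_line, reference_line, widths, step=1.0):
    """Approximate share of the tested line inside each buffer width.

    Assumption: the share of sampled points within distance w of the
    reference line stands in for the share of line length inside the buffer.
    """
    samples = sample_polyline(tested_line, step)
    distances = [distance_to_polyline(p, reference_line) for p in samples]
    return {w: sum(d <= w for d in distances) / len(distances) for w in widths}


# Hypothetical example: a tested line offset by 3 m from a straight reference.
reference = [(0.0, 0.0), (100.0, 0.0)]
tested = [(0.0, 3.0), (100.0, 3.0)]
curve = buffer_overlap_curve(tested, reference, widths=[1, 2, 5, 10])
print(curve)
```

In this toy case the share stays at 0 for the 1 m and 2 m buffers and reaches 1 from the 5 m buffer onward, which is the kind of curve the IBM reads positional accuracy from. A production version would more likely rely on a geometry library for exact buffering and intersection.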

First, the student reviews relevant literature on extrinsic positional accuracy to identify common patterns and influencing factors, such as the interval between nodes. The student then examines methods for deriving these patterns from official datasets, using machine learning techniques to determine optimized variables or variable combinations for assessing positional accuracy. Based on these findings, the student proposes methods to assess positional accuracy by comparing the patterns in OSM data with those observed in official datasets. The thesis concludes with an evaluation of this approach in a case study.
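As an illustration of the kind of intrinsic variables mentioned above, the sketch below derives simple node-based descriptors (node count, mean and spread of the node interval, node density) from a single road geometry. It is a starting point under stated assumptions, not a prescribed feature set: the input is assumed to be an ordered list of (latitude, longitude) pairs in degrees, distances use the haversine formula on a spherical Earth, and the function and feature names are hypothetical.

```python
import math
from statistics import mean, pstdev

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical approximation)


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def node_features(way):
    """Node-based descriptors for one road geometry (ordered list of nodes)."""
    if len(way) < 2:
        raise ValueError("a road geometry needs at least two nodes")
    # Interval = distance between consecutive nodes.
    intervals = [haversine_m(a, b) for a, b in zip(way, way[1:])]
    length = sum(intervals)
    return {
        "node_count": len(way),
        "length_m": length,
        "mean_interval_m": mean(intervals),
        "std_interval_m": pstdev(intervals),
        # Node density; undefined for zero-length geometries.
        "nodes_per_100m": 100 * len(way) / length if length > 0 else float("inf"),
    }


# Hypothetical example: three nodes spaced 0.001 degrees of latitude apart.
features = node_features([(48.000, 11.0), (48.001, 11.0), (48.002, 11.0)])
print(features)
```

Descriptors like these could be computed for both official and OSM road segments and then passed to a machine learning model of the student's choice. Which variables actually carry signal about positional accuracy is exactly what the thesis is meant to find out.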

Skills required for the project:

Good knowledge of one programming language and an interest in machine learning are required.

Literature/references:

Barron, C., Neis, P., & Zipf, A. (2014). A comprehensive framework for intrinsic OpenStreetMap quality analysis. Transactions in GIS, 18(6), 877–895. https://doi.org/10.1111/tgis.12073
Forghani, M., & Delavar, M. (2014). A quality study of the OpenStreetMap dataset for Tehran. ISPRS International Journal of Geo-Information, 3(2), 750–763. https://doi.org/10.3390/ijgi3020750
Girres, J., & Touya, G. (2010). Quality assessment of the French OpenStreetMap dataset. Transactions in GIS, 14(4), 435–459. https://doi.org/10.1111/j.1467-9671.2010.01203.x
Goodchild, M. F., & Hunter, G. J. (1997). A simple positional accuracy measure for linear features. International Journal of Geographical Information Science, 11(3), 299–306. https://doi.org/10.1080/136588197242419
Haklay, M. (2010). How Good is Volunteered Geographical Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets. Environment and Planning B: Planning and Design, 37(4), 682–703. https://doi.org/10.1068/b35097
Nejad, R. G., Abbaspour, R. A., & Chehreghan, A. (2022). Spatiotemporal VGI contributor reputation system based on implicit evaluation relations. Geocarto International, 37(26), 12014–12041. https://doi.org/10.1080/10106049.2022.2063406
Sundaram, R. C., Naghizade, E., Borovica-Gajić, R., & Tomko, M. (2021). Can you fixme? An intrinsic classification of contributor-identified spatial data issues using topic models. International Journal of Geographical Information Science, 36(1), 1–30. https://doi.org/10.1080/13658816.2021.1893323
Yan, Y., Feng, C., Huang, W., Fan, H., Wang, Y., & Zipf, A. (2020). Volunteered geographic information research in the first decade: a narrative review of selected journal articles in GIScience. International Journal of Geographical Information Science, 34(9), 1765–1791. https://doi.org/10.1080/13658816.2020.1730848
Zielstra, D., & Zipf, A. (2010). A comparative study of proprietary geodata and volunteered geographic information for Germany. In 13th AGILE International Conference on Geographic Information Science (Vol. 2010, pp. 1–15). Guimarães, Portugal.