[id: 70]
Short description: The goal of this thesis is to develop intrinsic quality indicators to assess the positional accuracy of OSM road data.
Keywords: Volunteered Geographic Information (VGI), OpenStreetMap, intrinsic data quality, road data
Topic at: TU Munich
Staff involved: Wangshu Wang (wangshu.wang@tum.de); Francis Andorful (Heidelberg University: francis.andorful@uni-heidelberg.de)
Description:
Because OpenStreetMap (OSM) is crowdsourced and lacks quality control at contribution time, its data quality has become a research focus (Yan et al., 2020). Understanding and addressing these data quality issues helps unlock OSM's full potential for diverse applications. OSM data quality assessment methods fall into two broad categories: extrinsic and intrinsic. Extrinsic methods compare OSM data with a reference dataset (e.g., authoritative data sources), and most early research on OSM data quality took this approach (Girres & Touya, 2010; Haklay, 2010; Zielstra & Zipf, 2010). However, a reference dataset is not always available. For this reason, researchers have called for attention to intrinsic indicators of OSM data quality (Barron et al., 2014) and have proposed intrinsic quality measures based on the data itself, its history, and its metadata (Barron et al., 2014; Nejad et al., 2022; Sundaram et al., 2021).
The positional accuracy of roads in OpenStreetMap (OSM) has traditionally been evaluated through extrinsic comparisons with official datasets, using models such as the Increasing Buffer Model (IBM) (Forghani & Delavar, 2014; Goodchild & Hunter, 1997). In our analysis, we observed a potential correlation between the positional accuracy of a road segment and both the number of its nodes and the interval between them. A “good” or trustworthy dataset seems to exhibit characteristic patterns for road data. Because road data are subject to intrinsic topological constraints, if topologically checked data follow a certain pattern in node distribution and node interval, positional accuracy could be approximated from these intrinsic patterns.
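To make the two candidate indicators concrete, the following minimal Python sketch computes the node count and the node-interval statistics of a single road segment. It is an illustration under stated assumptions, not the method to be developed in the thesis: the function names, the feature names, the (lat, lon) input format, and the use of the haversine distance on a spherical Earth are all choices made here for the example.

```python
import math
from statistics import mean, pstdev

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def node_pattern_features(nodes):
    """Hypothetical intrinsic features of one road segment.

    `nodes` is the ordered list of (lat, lon) vertices of the segment.
    """
    if len(nodes) < 2:
        raise ValueError("a road segment needs at least two nodes")
    # Distance between each pair of consecutive nodes.
    intervals = [haversine_m(p, q) for p, q in zip(nodes, nodes[1:])]
    return {
        "node_count": len(nodes),
        "length_m": sum(intervals),
        "mean_interval_m": mean(intervals),
        "std_interval_m": pstdev(intervals),  # spread of the node spacing
    }
```

Features of this kind could be computed for both OSM and official road segments and then compared; which features actually relate to positional accuracy is the open question the thesis addresses.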
First, the student reviews relevant literature on extrinsic positional accuracy to identify common patterns and influencing factors, such as the interval between nodes. The student then examines methods for deriving these patterns from official datasets, using machine learning techniques to determine the optimal variables or variable combinations for assessing positional accuracy. Based on these findings, the student proposes methods to assess positional accuracy by measuring how closely the patterns in OSM data match those observed in official datasets. An evaluation of this approach with a case study concludes the thesis.
Requirements are good knowledge of one programming language and an interest in machine learning.
Literature/references: