Urban studies increasingly rely on spatial data from Volunteered Geographic Information (VGI) sources. Matching features across different VGI projects can help assess and improve the reliability and completeness of VGI data. In a recent study, we first briefly discuss the similarity measures commonly used for matching points of interest (POIs). This discussion leads to the argument that no single measure is fully effective for VGI data and that a reasonable aggregation of these measures is necessary. We then propose a matching strategy based on a graph whose nodes represent the POIs and whose edges represent their possible matching pairs. Each edge has a ‘final weight’ that can be an aggregation of the different similarities or simply the value of one of them. The matching consists of extracting all possible subsets of edges from the graph in which no node occurs more than once and then selecting the subset with the highest sum of final weights. As a first evaluation of this strategy, we conducted an experiment with food-related POIs from OpenStreetMap and Foursquare. We tested different similarity measures, and linear combinations of them, as final weights of the graph edges. The results show that (1) spatial and semantic similarities perform poorly, (2) string similarities are just above 90% accurate, and (3) the highest matching accuracy is achieved when string and spatial similarities are considered together.
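The selection step described above can be illustrated with a minimal Python sketch. It is not the implementation from the paper. The function names, the POI dictionary layout (`name`, `xy`), the 100 m distance cut-off and the equal 0.5/0.5 weighting are all assumptions made for illustration, and `difflib.SequenceMatcher` stands in for whichever string similarity the study actually used.

```python
from difflib import SequenceMatcher
from math import hypot


def string_similarity(a, b):
    """Case-insensitive name similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def spatial_similarity(p, q, max_dist=100.0):
    """Linear decay from 1 (same location) to 0 (max_dist or farther).

    Assumes projected coordinates in metres; max_dist is an assumed cut-off.
    """
    dist = hypot(p[0] - q[0], p[1] - q[1])
    return max(0.0, 1.0 - dist / max_dist)


def final_weight(poi_a, poi_b, w_string=0.5, w_spatial=0.5):
    """Linear combination of similarities; the weights are illustrative only."""
    return (w_string * string_similarity(poi_a["name"], poi_b["name"])
            + w_spatial * spatial_similarity(poi_a["xy"], poi_b["xy"]))


def best_matching(edges):
    """Return (total_weight, chosen_edges) for the best subset of edges.

    edges is a list of (node_a, node_b, weight) tuples. Every subset in
    which no node occurs more than once is visited, and the one with the
    highest sum of weights is kept. This is exhaustive and therefore
    exponential in the number of edges, so it only suits small graphs.
    """
    best_total, best_edges = 0.0, []

    def search(i, used, total, chosen):
        nonlocal best_total, best_edges
        if i == len(edges):
            if total > best_total:
                best_total, best_edges = total, list(chosen)
            return
        node_a, node_b, weight = edges[i]
        # Option 1: leave edge i out of the subset.
        search(i + 1, used, total, chosen)
        # Option 2: take edge i if neither endpoint is already matched.
        if node_a not in used and node_b not in used:
            chosen.append(edges[i])
            search(i + 1, used | {node_a, node_b}, total + weight, chosen)
            chosen.pop()

    search(0, frozenset(), 0.0, [])
    return best_total, best_edges


if __name__ == "__main__":
    # Hypothetical candidate pairs with made-up final weights.
    candidate_edges = [
        ("osm1", "fsq1", 0.90),
        ("osm1", "fsq2", 0.80),
        ("osm2", "fsq1", 0.85),
        ("osm2", "fsq2", 0.30),
    ]
    total, chosen = best_matching(candidate_edges)
    print(total, chosen)
```

In this toy graph, greedily taking the single heaviest edge first (`osm1`-`fsq1`) would force the weak `osm2`-`fsq2` pair, whereas evaluating whole subsets finds the combination with the higher total. For larger datasets, a maximum-weight bipartite matching solver would give the same result without enumerating every subset.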
More details and results will be given in the paper accepted for the AGILE 2017 conference in Wageningen.
Novack, T., Peters, R. and Zipf, A. (2017, accepted): A graph-based strategy for matching points-of-interests from different VGI sources. AGILE 2017, International Conference on Geographic Information Science. May 9-12. Wageningen, NL.
DFG Project http://www.geog.uni-heidelberg.de/gis/dfg_en.html