Several geospatial applications require comprehensive semantic information about points-of-interest (POIs). However, this information is frequently dispersed across different collaborative mapping platforms. Surprisingly, there is still a research gap concerning the conflation of POIs from this type of geo-dataset. In a recent paper (Novack et al., 2018), we focus on the matching aspect of POI data conflation by proposing two matching strategies based on a graph whose nodes represent POIs and whose edges represent matching possibilities. We demonstrate how the graph is used for
(1) dynamically defining the weights of the different POI similarity measures we consider;
(2) tackling the issue that POIs should be left unmatched when they do not have a corresponding POI in the other dataset; and
(3) detecting multiple POIs that represent the same place within one dataset and jointly matching these to the corresponding POI(s) in the other dataset.
The strategies we propose do not require the collection of training samples or extensive parameter tuning. They were statistically compared with a “naive”, though commonly applied, matching approach using POIs collected from OpenStreetMap and Foursquare for the city of London (England). In our experiments, we added each of our methodological suggestions to the matching procedure in sequence, and each one increased the accuracy relative to the preceding result. Our best matching result achieved an overall accuracy of 91%, which is more than 10% higher than the accuracy achieved by the baseline method.
It is important to point out that neither the computation of the final edge weights nor the matching strategies we propose require the time-consuming collection of training samples. Because of that, our methods can be more easily integrated into broader workflows with goals beyond the POI conflation step. Furthermore, unsupervised POI matching methods tend to be more transferable than supervised methods, which, although possibly more effective in a specific area, carry the risk of over-fitting and therefore of poor transferability.
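To make the idea of a matching graph more concrete, the following is a minimal sketch and not the implementation from the paper. It assumes POIs are plain dictionaries with hypothetical `name`, `lat` and `lon` keys, builds an edge for every cross-dataset pair within an assumed distance threshold, and scores each edge with fixed, equal weights for name and distance similarity (the paper instead defines these weights dynamically). A simple greedy selection with an assumed minimum score stands in for the proposed matching strategies and leaves POIs without a sufficiently strong edge unmatched. All names, thresholds and sample coordinates are illustrative assumptions.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def build_candidate_edges(pois_a, pois_b, max_dist_m=100.0):
    """Edges (score, i, j) between POIs of the two datasets that lie
    within max_dist_m of each other. Weights are fixed at 0.5 each,
    which is an assumption of this sketch."""
    edges = []
    for i, a in enumerate(pois_a):
        for j, b in enumerate(pois_b):
            d = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
            if d <= max_dist_m:
                score = 0.5 * name_similarity(a["name"], b["name"]) \
                    + 0.5 * (1.0 - d / max_dist_m)
                edges.append((score, i, j))
    return edges


def greedy_match(edges, min_score=0.6):
    """Pick edges from strongest to weakest; each POI is used at most once.
    POIs whose best edge scores below min_score stay unmatched."""
    used_a, used_b, matches = set(), set(), []
    for score, i, j in sorted(edges, reverse=True):
        if score < min_score:
            break
        if i in used_a or j in used_b:
            continue
        matches.append((i, j))
        used_a.add(i)
        used_b.add(j)
    return matches


# Made-up sample data, for illustration only.
osm = [
    {"name": "The Red Lion", "lat": 51.5010, "lon": -0.1250},
    {"name": "Corner Bakery", "lat": 51.5100, "lon": -0.1300},
]
fsq = [
    {"name": "Red Lion Pub", "lat": 51.5011, "lon": -0.1251},
    {"name": "City Gym", "lat": 51.5200, "lon": -0.1000},
]

matches = greedy_match(build_candidate_edges(osm, fsq))
print(matches)
```

In this toy example only the two "Red Lion" entries are close and similar enough to be linked; the remaining POIs have no candidate edge and are therefore left unmatched, which mirrors point (2) above in a very simplified form. Joint matching of several POIs for the same place, as in point (3), is not covered by this sketch.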
Novack, T.; Peters, R.; Zipf, A. (2018): Graph-Based Matching of Points-of-Interest from Collaborative Geo-Datasets. ISPRS Int. J. Geo-Inf. 2018, 7(3), 117; doi:10.3390/ijgi7030117
Selected related earlier work:
- Kuo, C.-L., T.C. Chan, I.-C. Fan, A. Zipf (2018): Efficient Method for POI/ROI Discovery Using Flickr Geotagged Photos. ISPRS Int. J. Geo-Inf. 2018, 7(3), 121; doi:10.3390/ijgi7030121.
- Westerholt, R., Steiger, E., Resch, B. and Zipf, A. (2016): Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis. PLOS ONE, 11 (9), e0162360. DOI:10.1371/journal.pone.0162360.
- Jonietz, D., Zipf, A. (2016): Defining fitness-for-use for crowdsourced points of interest (POI). ISPRS Int. J. Geo-Inf. 2016, 5(9), 149; doi:10.3390/ijgi5090149
- Rousell A. and Zipf A. (2017): Towards a landmark based pedestrian navigation service using OSM data. International Journal of Geo-Information, ISPRS IJGI, 6(3): 64.
- Steiger, E., Westerholt, R., Resch, B. and Zipf, A. (2015): Twitter as an indicator for whereabouts of people? Correlating Twitter with UK census data. Computers, Environment and Urban Systems (CEUS), 54, pp. 255–265. Elsevier. doi:10.1016/j.compenvurbsys.2015.09.007
- Steiger, E., Resch, B., Zipf, A. (2015): Exploration of spatiotemporal and semantic clusters of Twitter data using unsupervised neural networks. International Journal of Geographical Information Science (IJGIS), Taylor & Francis. doi:10.1080/13658816.2015.1099658
- Steiger, E.; Porto de Albuquerque, J.; Zipf, A. (2015): An advanced systematic literature review on spatiotemporal analyses of Twitter data. Transactions in GIS, 19(6): 809–834. Wiley. doi:10.1111/tgis.12132
- Sun, Y., Fan, H., Bakillah, M. & Zipf, A. (2013): Road-based Travel Recommendation Using Geo-tagged Images. Computers, Environment and Urban Systems (CEUS). Volume 53, Pages 110-122. Elsevier. https://doi.org/10.1016/j.compenvurbsys.2013.07.006