Mining and correlating traffic events from human sensor observations with official transport data using self-organizing-maps

Cities are complex systems in which interrelated human activities are increasingly difficult to explore. To understand urban processes and gain deeper knowledge about cities, location-based social networks such as Twitter offer a promising way to explore latent relationships in underlying mobility patterns. In a recently published paper (Steiger et al. 2016), we therefore present an approach that uses a geographic self-organizing map (Geo-SOM) to uncover and compare previously unseen patterns from social media and authoritative data. The results, which we validated with Live Traffic Disruption (TIMS) feeds from Transport for London, show that the geospatial and temporal patterns of special events (r = 0.73), traffic incidents (r = 0.59) and hazard disruptions (r = 0.41) from TIMS are correlated with traffic-related, georeferenced tweets, most strongly for special events.
Hence, we conclude that tweets can be used as a proxy indicator to detect collective mobility events and may help to provide stakeholders and decision makers with complementary information on complex mobility processes.
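To make the reported r values more concrete, the following minimal sketch computes a Pearson correlation coefficient between two count series. It assumes that tweets and TIMS disruptions have already been aggregated into the same space-time bins; the binning, the function name `pearson_r` and the sample counts are illustrative assumptions, not the procedure or data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts per space-time bin (made-up numbers, not from the paper)
tweet_counts = [2, 5, 1, 8, 3, 7]
disruption_counts = [1, 4, 0, 6, 2, 7]

r = pearson_r(tweet_counts, disruption_counts)
print(f"r = {r:.2f}")
```

A value close to 1 would indicate that bins with many traffic-related tweets also tend to contain many official disruptions; values such as 0.41 indicate a weaker, though still positive, association.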

Our results suggest that special events in particular, such as concerts, demonstrations and sports events, are well reflected on Twitter and provide complementary information about possible collective movements, since people talk about the event beforehand and follow similar mobility patterns (Steiger et al., 2015a). Such complex events are especially hard to forecast with classic detectors, so social media can be used to enrich existing information. This newly gained knowledge may support decision-makers during traffic events, with social media and official sources complementing each other.

The results answer how and when tweets should be used for extracting mobility behavior. As for the “how”, the presented SOM framework analyzes the temporal, spatial and textual dimensions of each tweet in a combined manner. Furthermore, the results can easily be compared with official data to underline the significance of social media for human mobility analysis. As for the “when”, the results show which traffic disruption categories are reflected in social media (special events, traffic incidents and hazards), demonstrating in which traffic analysis scenarios social media can serve as an additional source of information. In contrast to the geospatial-temporal distribution, the textual information from tweets can only be used marginally to semantically enrich traffic disruption information, due to the detailed resolution of traffic conditions (see the comparison of the most frequent terms in Fig. 3a). Nonetheless, the results demonstrate the effectiveness of the proposed methodology in uncovering similar characteristics and latent disruption patterns from official data and georeferenced tweets (see Fig. 5a). This implies that tweets can be used in practice to detect real-time traffic events and can add information when no official traffic data sources are available, especially for unplanned events such as demonstrations. Here, tweets help not only to detect these events but also to extract the underlying mobility pattern in order to estimate the effect (severity/intensity) on the infrastructure.
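The idea of analyzing the spatial, temporal and textual dimensions together can be illustrated with a small self-organizing map. The sketch below is a plain SOM written with the Python standard library, not the Geo-SOM variant used in the paper, and every detail in it is an assumption made for illustration: the feature encoding in `encode`, the rough London bounding box, the single "share of traffic-related terms" value standing in for the textual dimension, and the made-up sample tweets.

```python
import math
import random

# Rough bounding box for Greater London (assumed values, for scaling only)
LON_MIN, LON_MAX = -0.51, 0.33
LAT_MIN, LAT_MAX = 51.28, 51.69


def encode(lon, lat, hour, traffic_term_share):
    """Combine space, time and text into one feature vector scaled to [0, 1]."""
    angle = 2 * math.pi * hour / 24  # cyclic encoding so 23h and 0h are close
    return [
        (lon - LON_MIN) / (LON_MAX - LON_MIN),
        (lat - LAT_MIN) / (LAT_MAX - LAT_MIN),
        (math.sin(angle) + 1) / 2,
        (math.cos(angle) + 1) / 2,
        traffic_term_share,  # hypothetical textual feature in [0, 1]
    ]


def best_matching_unit(weights, x):
    """Grid position of the unit whose weight vector is closest to x."""
    return min(weights,
               key=lambda k: sum((a - b) ** 2 for a, b in zip(weights[k], x)))


def train_som(samples, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM with linearly decaying rate and neighborhood."""
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighborhood measured on the map grid, not in data space
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Made-up tweets: (lon, lat, hour of day, share of traffic-related terms)
tweets = [
    (-0.12, 51.50, 8, 0.6), (-0.13, 51.51, 9, 0.5), (-0.11, 51.50, 8, 0.7),
    (0.20, 51.60, 22, 0.1), (0.21, 51.61, 23, 0.2), (0.19, 51.60, 22, 0.1),
]
samples = [encode(*t) for t in tweets]
som = train_som(samples, rows=3, cols=3)
for t, s in zip(tweets, samples):
    print(t, "->", best_matching_unit(som, s))
```

Tweets that are similar in all three dimensions tend to land on the same or neighboring map units. Mapping official disruption records onto a map trained in the same way would then allow the two sources to be compared unit by unit, which is one way to read the comparison described above.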

Steiger, E., B. Resch, J. Porto de Albuquerque, A. Zipf (2016): Mining and correlating traffic events from human sensor observations with official transport data using self-organizing-maps. Transportation Research Part C: Emerging Technologies. Vol. 73, December 2016, pp. 91–104.

Related Work:

  • Steiger, E., Resch, B. and Zipf, A. (2015): Exploration of spatiotemporal and semantic clusters of Twitter data using unsupervised neural networks. International Journal of Geographical Information Science (IJGIS).
  • Steiger, E., Westerholt, R., Resch, B. and Zipf, A. (2015): Twitter as an indicator for whereabouts of people? Correlating Twitter with UK census data. Computers, Environment and Urban Systems (CEUS), 45, pp. 255–265.
  • Steiger, E., de Albuquerque, J. P., Zipf, A. (2015): An advanced systematic literature review on spatiotemporal analyses of Twitter data. Transactions in GIS.
  • Steiger, E., Ellersiek, T., Resch, B., Zipf, A. (2015): Uncovering latent mobility patterns from Twitter during mass events. In: Strobl, J., Blaschke, T., Griesebner, G. (eds.): GI_Forum – Journal for Geographic Information Science, 1-2015, pp. 525-534 (full paper).
  • Steiger, E., Westerholt, R., Zipf, A. (2015): Research on social media feeds – A GIScience perspective. In: C. Capineri et al. (eds.): European Handbook of Crowdsourced Geographic Information, Ubiquity Press.
  • Steiger, E., Ellersiek, T., Zipf, A. (2014): Explorative public transport flow analysis from uncertain social media data. Third ACM SIGSPATIAL International Workshop on Crowdsourced and Volunteered Geographic Information (GEOCROWD) 2014. In conjunction with ACM SIGSPATIAL 2014. Dallas, TX, USA.
  • Steiger, E., Lauer, J., Ellersiek, T., Zipf, A. (2014): Towards a framework for automatic geographic feature extraction from Twitter. Eighth International Conference on Geographic Information Science, Vienna.
  • Li, M., Westerholt, R., Fan, H. and Zipf, A. (2016): Assessing spatiotemporal predictability of LBSN: A case study of three Foursquare datasets. GeoInformatica, volume and issue pending.
  • Westerholt, R., Steiger, E., Resch, B. and Zipf, A. (2016): Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis. PLOS ONE, 11 (9), e0162360. DOI: 10.1371/journal.pone.0162360.
  • Westerholt, R., Resch, B. and Zipf, A. (2015): A local scale-sensitive indicator of spatial autocorrelation for assessing high- and low-value clusters in multi-scale datasets. International Journal of Geographical Information Science, 29 (5), 868-887. DOI: 10.1080/13658816.2014.1002499