Georeferenced user-generated datasets, such as those extracted from Twitter and similar social media feeds, are attracting growing interest from spatial analysts. Such datasets often reflect a wide array of real-world phenomena, each of which takes place at its own spatial scale. User-generated datasets are therefore multi-scale in nature. They cannot be handled properly with the most common analysis methods, because these are typically designed for single-scale datasets in which all observations are expected to reflect one single phenomenon (e.g., crime incidents). In a new paper, which will appear in IJGIS soon, we focus on the popular local G statistic and propose a modified, scale-sensitive version of it. Our approach also comprises an alternative neighborhood definition that makes it possible to extract particular scales of interest. We compared our method with the original statistic on a real-world Twitter dataset. Our experiments show that our approach is better able to detect spatial autocorrelation at specific scales when multiple scales are contained in the analysed dataset. Based on these findings, we identified a number of scale-related issues that our approach is able to overcome. Thus, we demonstrate the multi-scale suitability of the proposed solution.
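For readers unfamiliar with the original statistic, the following is a minimal sketch of the standard Getis-Ord local G (in its G_i* form) with a fixed distance band and binary weights. It is not the scale-sensitive variant proposed in the paper, and it does not implement the alternative neighborhood definition. The function name `local_g_star`, the `bandwidth` parameter, and the toy data are assumptions made purely for illustration.

```python
import numpy as np


def local_g_star(coords, values, bandwidth):
    """Standard Getis-Ord G_i* z-scores with a binary distance-band neighborhood.

    Illustrative baseline only: this is the original single-scale statistic,
    not the scale-sensitive version described in the paper.
    """
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    n = x.size

    # Pairwise Euclidean distances between all observations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Binary weights: 1 inside the distance band, 0 outside.
    # The diagonal is 1 because G_i* includes observation i itself.
    w = (dist <= bandwidth).astype(float)

    mean = x.mean()
    std = np.sqrt((x ** 2).mean() - mean ** 2)

    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)

    numerator = w @ x - mean * w_sum
    denominator = std * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))

    # The statistic is undefined where the denominator is zero
    # (e.g. every point is a neighbor of i, or all values are equal).
    z = np.full(n, np.nan)
    np.divide(numerator, denominator, out=z, where=denominator > 0)
    return z


# Toy data (hypothetical): a high-value group and a low-value group on a line.
coords = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]]
values = [10, 10, 10, 1, 1, 1]
z = local_g_star(coords, values, bandwidth=1.5)
print(z)
```

Positive z-scores indicate clusters of high values and negative z-scores clusters of low values. Note that a single fixed `bandwidth` ties the whole analysis to one scale, which is the limitation the paper addresses for multi-scale datasets.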
The corresponding paper is currently in production and has therefore not yet been published (it is expected to be available online from the end of January). A pre-production copy of the paper can be found here:
Westerholt, R., Resch, B., and Zipf, A. (2014): A local scale-sensitive indicator of spatial autocorrelation for assessing high- and low-value clusters in multi-scale datasets. International Journal of Geographical Information Science, issue pending, pp. pending. doi:10.1080/13658816.2014.1002499.