Alongside semi-automated methods and pre-processed data products, crowdsourcing is another tool that can help collect information on human settlements and complement existing data, yet its accuracy is debated. Whereas the potential of crowdsourced datasets for training machine learning algorithms has been explored recently, little work has been done on using machine learning techniques to enhance the crowdsourcing workflow itself. In recent research we investigated a novel approach that uses logistic regression to aggregate crowdsourced classifications of human settlements from the MapSwipe app. For a case study containing 941,589 mapping tasks, we analysed to what degree such an approach can improve data quality by utilizing intrinsic context factors such as user agreement, user characteristics and spatial characteristics of the results.
The results show that a logistic regression based aggregation of crowdsourced classifications produced significantly higher quality data than common approaches that use soft majority agreement. The findings suggest that the integration of machine learning techniques into existing crowdsourcing workflows can become a key element in the future development of crowdsourcing applications. However, given the limited geographic scope of this research, further validation of the automated classification and its transferability needs to be addressed in future investigations.
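To make the contrast between the two aggregation strategies concrete, the following is a minimal sketch and not the implementation used in the study. It assumes two hypothetical per-task features, the share of "yes" votes and the share of neighbouring tasks classified as settlement, and it trains on a handful of invented toy values that serve only to make the example run. The function names, the feature choice and the plain gradient descent fit are all assumptions for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def majority_share(votes):
    """Soft majority baseline: share of 'yes' votes (1) among all votes."""
    return sum(votes) / len(votes)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss.

    Returns the weight list with the bias term first.
    """
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Estimated probability that a task contains a settlement."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Invented toy data (NOT from the case study).
# Hypothetical features per task: [yes_share, neighbour_yes_share]
X = [[1.0, 0.8], [0.67, 0.9], [0.33, 0.7], [1.0, 0.5],
     [0.33, 0.0], [0.0, 0.1], [0.67, 0.0], [0.0, 0.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical reference labels

w = fit_logistic(X, y)

# A task with one 'yes' out of three votes, surrounded by settlement tiles.
votes = [1, 0, 0]
print("soft majority share:", round(majority_share(votes), 2))
print("regression estimate:", round(predict_proba(w, [majority_share(votes), 0.7]), 2))
```

The point of the sketch is the design difference: the majority share looks only at the votes for a single task, whereas the regression can weigh additional context, here the spatial neighbourhood, when estimating how likely a task is to contain a settlement.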
Intelligent crowdsourcing approaches can dynamically derive data quality indicators to improve the task allocation process. For instance, once a task reaches high credibility, no further classifications need to be collected, whereas uncertain tasks should be repeated or prioritized for validation. This could reduce the number of required crowdsourced classifications while maintaining high quality. The approach holds great potential for features for which fully automated techniques still fail to produce reasonable data quality.
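The allocation rule described above could be sketched as follows. This is an illustrative assumption rather than an existing MapSwipe feature: the function name, the credibility threshold of 0.9 and the cap of six classifications per task are hypothetical values chosen only for the example.

```python
def next_action(p_settlement, n_votes, confident=0.9, max_votes=6):
    """Decide what to do with a task given its current quality estimate.

    p_settlement: estimated probability that the task shows a settlement
    n_votes:      number of classifications collected so far
    confident:    hypothetical credibility threshold
    max_votes:    hypothetical cap on classifications per task
    """
    # Credibility is high when the estimate is close to either 0 or 1.
    certainty = max(p_settlement, 1.0 - p_settlement)
    if certainty >= confident:
        return "stop"        # credible enough, collect no further votes
    if n_votes >= max_votes:
        return "validate"    # still uncertain, hand over to validation
    return "repeat"          # uncertain, ask another volunteer

for p, n in [(0.97, 3), (0.05, 3), (0.60, 3), (0.60, 6)]:
    print(p, n, next_action(p, n))
```

Under these assumptions, clearly classified tasks leave the queue early, and volunteer effort concentrates on the ambiguous ones.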
Herfort, B., Zipf, A. (2018, accepted): Enhancing Crowdsourced Classification on Human Settlements Utilizing Logistic Regression Aggregation and Intrinsic Context Factors. VGI-ALIVE Workshop at AGILE 2018, Lund, Sweden.
MapSwipe Analytics: http://mapswipe.heigit.org/ by HeiGIT