New DFG Project on Data Integration from Location-Based Social Networks

A new project has recently been approved for funding by the Deutsche Forschungsgemeinschaft (DFG). It deals with the agent-based, quality-aware integration of geosocial network data and investigates data integration as a collaborative negotiation process.

Web-based social networks are now a significant social phenomenon. The user-generated data produced through them, including location-based data, represents an important economic asset. Most of the spatial data being generated consists of points of interest (POIs). While POIs generally have point geometries, they also carry rich semantic information. However, as of today, we lack knowledge about the spatial data generated through these networks: their characteristics, quality and potential usage. Some studies deal with crowdsourcing projects, especially OpenStreetMap (OSM), but so far there has been little focus on spatial data extracted from social networks. Preliminary investigations show that the different networks have complementary content. Methods for the fusion of spatial data extracted from geosocial networks are therefore needed to enable the integrated use of the data, whereby the richness and quality of the resulting data could be increased. In the proposed project, we will develop methods for evaluating the quality of user-generated spatial data from different geosocial networks and methods for integrating it. In contrast with conventional data sources, the quality of data extracted from geosocial networks is characterized by high spatial heterogeneity, whereas traditional data captured by authorities and companies is usually more homogeneous. New methods to incorporate data quality into data integration must therefore be developed. One key aspect that will be investigated to account for data quality, without having to systematically assess every data entry, is the contributors’ profile and behavior. Studies have demonstrated that some elements of a contributor’s profile and behavior are linked to the quality of his or her contributions. We will therefore conduct our own study to detect the relation between data quality and contributors’ profiles.
Then, the integration process will be modeled as an agent-based negotiation process based on game theory, in which agents representing contributors and their profiles negotiate the final representation of the integrated data so as to maximize the quality of the resulting data. The method will be evaluated by comparing the integrated data to conventional data sources. In summary, the project’s main outcomes and contributions include new methods for the evaluation of user-generated spatial data from geosocial networks, key findings concerning the relation between contributors’ profiles and data quality, and new methods for integrating these heterogeneous data sources using agent-based approaches. Since the quality of the different data sources changes constantly in space and time, the focus is on developing suitable methods so that the quality investigations and the integration process can be repeated at any time.

Fig. POIs from different geosocial networks in Heidelberg