Researchers at HeiGIT (Heidelberg Institute for Geoinformation Technology) have publicly released a first-of-its-kind, planet-scale dataset on road surface type (paved or unpaved). It was produced with state-of-the-art GeoAI methods applied to street-view imagery from Mapillary, and is intended to support humanitarian response, urban planning and progress towards the Sustainable Development Goals.
Road surface information plays an essential role across multiple sectors, influencing everything from transportation safety to economic growth and environmental sustainability. Knowing whether a road is paved or unpaved can impact decision-making for route planning and emergency responses. For instance, unpaved roads, especially when poorly maintained or affected by weather, increase the risk of accidents. Emergency services need accurate road surface information to choose the safest and most efficient routes, especially in regions with limited infrastructure.
Beyond safety, this data is also key to optimizing supply chains, supporting agricultural operations, and improving industrial logistics. Poor road conditions can lead to delays, higher transportation costs, and reduced efficiency, hindering economic development. Accurate road surface information is particularly crucial in rural or underdeveloped areas, where it can significantly enhance routing and logistics planning, improving both safety and economic outcomes.
However, mapping is only the first step. Adding detailed attributes, such as road surface types, enhances the accuracy and utility of OpenStreetMap (OSM), transforming it into a more powerful resource for decision-making and supporting services worldwide.
Distribution of road surface quality predicted from Mapillary data: (a) total pavedness (defined as the ratio of total paved roads to total OSM roads for each zoom 8 tile); (b) and (c) pavedness per tile, calculated for urban and rural areas, respectively.
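To make the pavedness measure in the caption above concrete, here is a minimal Python sketch of how such a per-tile ratio could be computed. It is an illustration under stated assumptions, not the team's actual pipeline: the input format (one tuple per road segment with a representative longitude/latitude, a length in kilometres and a surface label) and the function names `lonlat_to_tile` and `pavedness_per_tile` are hypothetical. The denominator here is the total length of the segments passed in for a tile, and the tile index uses the standard Web Mercator (slippy map) formula at zoom 8.

```python
import math
from collections import defaultdict

ZOOM = 8  # zoom level named in the figure caption


def lonlat_to_tile(lon, lat, zoom=ZOOM):
    """Return the (x, y) slippy-map tile index containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid tile range (relevant at the antimeridian and poles).
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def pavedness_per_tile(segments):
    """Length-weighted share of paved roads per tile.

    `segments` is an iterable of (lon, lat, length_km, surface) tuples,
    where `surface` is "paved" or "unpaved" (hypothetical input format).
    """
    paved = defaultdict(float)
    total = defaultdict(float)
    for lon, lat, length_km, surface in segments:
        tile = lonlat_to_tile(lon, lat)
        total[tile] += length_km
        if surface == "paved":
            paved[tile] += length_km
    # Skip tiles with zero total length to avoid dividing by zero.
    return {tile: paved[tile] / total[tile] for tile in total if total[tile] > 0}


# Illustrative values only, not taken from the dataset.
example = [
    (0.5, 0.5, 3.0, "paved"),
    (0.5, 0.5, 1.0, "unpaved"),
]
print(pavedness_per_tile(example))
```

Weighting by segment length rather than counting segments keeps a tile with many short unpaved tracks from outweighing one long paved highway. Running the same function separately on urban and rural segments would give the per-area variants shown in panels (b) and (c).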
Currently, only 33% of roads in OpenStreetMap (OSM) include surface type information, with even greater data gaps in developing regions. To address this, HeiGIT has released a global dataset identifying road surface types (paved or unpaved), now covering 36.7 million kilometers of roads worldwide. This expanded dataset increases global road surface coverage from 33% to about 36%, with notable improvements in North America. Despite this progress, significant gaps remain, highlighting the potential of integrating data from other open platforms like Panoramax, hosted by the French OSM community.
The dataset has been meticulously curated with the help of big data, machine learning and geospatial analysis. The team has made significant strides in improving global road surface classification using cutting-edge deep learning techniques. A major challenge was the diversity of road imagery, especially from crowdsourced platforms like Mapillary, which provides a vast number of street-view images worldwide. To produce a robust training dataset, the team organized a mapathon in July 2023 using the HeiGIT CrowdMap web application. Thirty volunteers labeled 20,000 random Mapillary images from 39 countries in sub-Saharan Africa, classifying them as “paved,” “unpaved,” or “bad imagery.” This labeling process created a reliable training set for the model.
Along the way, the team uncovered important trends in both urban and rural infrastructure related to global road surface data and Mapillary coverage. Mapillary’s global coverage remains limited, with only 3.48% of OpenStreetMap (OSM) roads covered on average. Urban areas show better coverage at 8.88%, while rural regions fall behind with just 2.65%. However, critical roads like motorways and trunks have significantly higher coverage, reaching 45%, and cities in Western Europe, North America, and Australia boast coverage rates as high as 70%.
In terms of global road surface trends, the analysis revealed notable differences between urban and rural areas. In most urban regions, paved roads make up 60-80% of the network. In contrast, rural areas, especially in Africa and Asia, display much more variety in road surface types. Paved road coverage in these regions drops below 40%, with countries like Pakistan, Nepal, Rwanda, and Mozambique standing out for their lower paved road ratios. The study also showed a strong correlation between road infrastructure and development levels, with countries that have higher Human Development Index (HDI) scores generally featuring more paved roads. Lower-HDI regions, particularly in rural areas, showed greater variation in surface quality.
The analysis also includes a visual map that illustrates road surface conditions and the extent of data available in OSM. Developed regions like North America and Europe are well-documented, with mostly paved roads, while regions in Africa, South America, and parts of Asia have more unpaved roads and less comprehensive surface data.
Road surface predictions based on Mapillary data for rural and urban areas (left/right half of the pie chart respectively) along with the percentage of roads in OpenStreetMap (OSM) with surface information (choropleth map) at country level. Paved and unpaved road information is marked by blue and orange segments and the size of the semi-circle refers to the total length of predicted roads for that country, where a larger size indicates a more extensive road network.
As road infrastructure remains a critical metric for socio-economic development, our global road surface dataset will provide valuable insights, helping to build a more connected and resilient world. The generated dataset is openly available on the Humanitarian Data Exchange, and the accompanying scientific publication is available as a preprint. This enables further analysis in geospatial applications and computer vision modeling. Ultimately, our road surface dataset is an essential resource for researchers, planners, and humanitarian organizations, bridging the gap in global infrastructure data and supporting goals related to transportation safety, economic growth, and environmental protection.
However, the utility of any dataset depends on the quality of the underlying data. Ensuring the reliability of OpenStreetMap (OSM) data is crucial, as it forms the foundation for many applications in humanitarian aid, urban planning, and economic development. Our OQAPI tool visualizes OSM data quality by analyzing intrinsic indicators (e.g., edit history) and extrinsic comparisons (e.g., external datasets), helping users assess whether OSM meets their specific requirements.
To complement this, the ohsome dashboard allows users to analyze and monitor OSM data quality over time, identifying gaps and informing where supplementary data—such as street-level imagery—can be used to improve data completeness. Better data quality in OSM also directly supports applications like openrouteservice, which relies on accurate OSM data to generate optimal routes for various mobility needs.
To keep up with future developments and releases related to mobility, humanitarian aid, climate action and data analytics, follow our social media channels and stay up to date on our blog.
Title image: Visualization of road surface classification. Warmer colors (red, orange, and yellow) denote higher intensity activation levels, whereas cooler colors (e.g., blue and green) denote lower activation levels.