Road detections from Microsoft Maps aerial imagery

Introduction

Bing Maps is releasing mined roads from around the world. We have detected 48.8M km of roads in total, of which 1165K km are missing from OpenStreetMap (OSM). Mining was performed on Bing Maps imagery from between 2020 and 2022, including imagery from Maxar and Airbus. The data is freely available for download and use under the Open Data Commons Open Database License (ODbL).

Data

Mining status

| Date | Region | All ML-derived roads ('000 km) | ML-derived roads missing from OSM ('000 km) |
|---|---|---|---|
| 20 May 2020 | United States | 9,308 | 818 |
| 21 Mar 2021 | South America | 4,480 | 98 |
| 21 Jan 2022 | Caribbean Islands | 232 | 5 |
| 03 Mar 2022 | Middle East | 3,444 | 84 |
| 05 Apr 2022 | Central Asia | 1,204 | 28 |
| 18 Apr 2022 | Northern Africa | 1,077 | 24 |
| 28 Apr 2022 | Western Africa | 982 | 32 |
| 28 Apr 2022 | Central Africa | 324 | 6 |
| 12 May 2022 | Eastern Africa | 1,151 | 31 |
| 12 May 2022 | Southern Africa | 1,506 | 40 |
| 08 Jun 2022 | Europe | 10,212 | N/A |
| 03 Jul 2022 | Oceania | 1,947 | N/A |
| 27 Jul 2022 | Central America | 1,376 | N/A |
| 03 Aug 2022 | Canada | 1,832 | N/A |
| 13 Aug 2022 | South Asia | 3,723 | N/A |
| 12 Sep 2022 | Southeastern Asia | 2,744 | N/A |
| 19 Sep 2022 | North Asia | 2,259 | N/A |
| 27 Feb 2023 | Japan | 1,105 | N/A |

FAQ

What is the GeoJson format?

GeoJSON is a format for encoding a variety of geographic data structures. For extensive documentation and tutorials, refer to the GeoJSON blog.
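
As an illustration of the format, the sketch below parses a GeoJSON Feature with a LineString geometry using only the Python standard library and estimates its length with the haversine formula. The sample coordinates and the empty `properties` object are made up for illustration; the exact layout of the released files is not described here and should be checked against the downloaded data.

```python
import json
import math

# Hypothetical sample feature. GeoJSON positions are [longitude, latitude].
SAMPLE = """
{
  "type": "Feature",
  "geometry": {
    "type": "LineString",
    "coordinates": [[-122.0, 47.0], [-122.0, 47.01], [-121.99, 47.01]]
  },
  "properties": {}
}
"""

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def linestring_length_m(feature):
    """Sum of the segment lengths of a LineString feature, in meters."""
    geom = feature["geometry"]
    if geom["type"] != "LineString":
        raise ValueError("expected a LineString geometry")
    coords = geom["coordinates"]
    return sum(haversine_m(*a, *b) for a, b in zip(coords, coords[1:]))


feature = json.loads(SAMPLE)
length_m = linestring_length_m(feature)
print(f"road length: {length_m:.0f} m")
```

Summing such lengths over all features in a region is one way to compare a download against the totals in the mining status table.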

Data generation details:

Road extraction is done in four stages. The full drop went through two stages, and the OSM-missing set went through all four:

  1. Semantic Segmentation - Recognizing road pixels on the aerial image using a Convolutional Neural Network (CNN).
  2. Geometry Generation - A series of algorithms and processes transforming output of semantic segmentation into roads in geometry format.
    • Image postprocessing
    • Thinning
    • Connectivity improvement
    • Graph construction
    • Finalizing road shapes and network quality
    • Stitching road GeoJSONs between neighboring images where needed
  3. Conflation & Cutting - Excluding roads and parts of roads that already exist in the road network (OSM).
  4. Classification - A classifier to filter out low-confidence roads and predict a road type.
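
The conflation and cutting idea in stage 3 can be sketched as follows. This is a minimal illustration, not the production pipeline: it assumes polylines in a planar coordinate system with meter units, and the `buffer_m` tolerance and all helper names are hypothetical. Detected vertices lying within the buffer of any existing road segment are treated as already mapped, and the remaining runs of vertices are kept as candidate missing roads.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b (2D tuples, same planar units)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def is_covered(p, existing_roads, buffer_m):
    """True if p lies within buffer_m of any segment of any existing road."""
    return any(
        point_segment_distance(p, a, b) <= buffer_m
        for road in existing_roads
        for a, b in zip(road, road[1:])
    )


def cut_missing_parts(detected, existing_roads, buffer_m=10.0):
    """Split a detected polyline into runs of vertices not covered by existing roads."""
    parts, current = [], []
    for p in detected:
        if is_covered(p, existing_roads, buffer_m):
            # A covered vertex ends the current run; keep it only if it forms a line.
            if len(current) >= 2:
                parts.append(current)
            current = []
        else:
            current.append(p)
    if len(current) >= 2:
        parts.append(current)
    return parts


# Made-up example: the detection follows an existing road, then branches away.
existing = [[(0.0, 0.0), (100.0, 0.0)]]
detected = [(0.0, 2.0), (50.0, 3.0), (100.0, 5.0), (150.0, 40.0), (200.0, 80.0)]
missing_parts = cut_missing_parts(detected, existing)
print(missing_parts)
```

A vertex-based cut like this is coarse. A real implementation would more likely cut at the point where the detection leaves the buffer and would use a spatial index instead of the brute-force scan.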

Neural network architecture and dataset

Our network is based on U-Net and ResNet, following these papers: [U-Net](https://arxiv.org/abs/1505.04597), [ResNet](https://arxiv.org/pdf/1512.03385.pdf), and [Res U-Net](https://arxiv.org/pdf/1711.10684.pdf). The model was trained on 512x512 images. It is fully convolutional, which allows it to process images of any size divisible by 64, constrained by GPU memory (1088x1088 in our case). The training set consists of 20,000 labeled images. The majority of the satellite images cover diverse areas all around the world. To achieve good representation, we enriched the set with samples from various areas covering mountains, glaciers, forests, deserts, beaches, coasts, etc. Images in the set are 1088x1088 pixels at 100 cm/pixel resolution. Training is done with the Keras toolkit.
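
The input size constraint can be illustrated with a small helper. This is a sketch assuming that an input is padded up to the next multiple of 64 and that anything larger than the 1088-pixel limit would need to be tiled; the function names are hypothetical, and the padding strategy is an assumption rather than a description of the actual pipeline.

```python
STRIDE = 64          # input side must be divisible by this value
MAX_SIDE = 1088      # largest side that fit in GPU memory in the setup described
RESOLUTION_M = 1.0   # 100 cm/pixel


def padded_side(side, stride=STRIDE):
    """Smallest multiple of `stride` that is greater than or equal to `side`."""
    if side <= 0:
        raise ValueError("side must be positive")
    return -(-side // stride) * stride  # ceiling division, then scale back up


def fits_model(side, stride=STRIDE, max_side=MAX_SIDE):
    """True if a square input of this side length can be processed in one pass."""
    return side % stride == 0 and side <= max_side


for side in (512, 1000, 1088, 1100):
    padded = padded_side(side)
    status = "fits" if fits_model(padded) else "needs tiling"
    print(f"{side} px -> {padded} px ({status})")

print(f"a {MAX_SIDE} px tile covers {MAX_SIDE * RESOLUTION_M:.0f} m per side")
```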

Metrics

We measure intermediate-stage metrics to track the performance of our models. The pixel metric measures the performance of the Convolutional Neural Network, and the APLS metric (Average Path Length Similarity) measures overall connectivity after the geometry generation stage.

| Metric | Precision | Recall |
|---|---|---|
| Pixel | 85.24% | 82.81% |
| APLS | 87.53% | 79.33% |
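
For reference, pixel-level precision and recall can be computed from a predicted mask and a ground-truth mask as sketched below. The tiny masks are made-up illustration data and the function name is hypothetical; the exact evaluation protocol behind the numbers above (for example, any tolerance around road centerlines) is not specified in this README.

```python
def pixel_precision_recall(pred, truth):
    """Precision and recall of road pixels for two equally sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # road predicted, road in ground truth
            elif p:
                fp += 1      # road predicted, background in ground truth
            elif t:
                fn += 1      # road missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up 4x4 masks: 1 marks a road pixel, 0 marks background.
truth = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

precision, recall = pixel_precision_recall(pred, truth)
print(f"precision={precision:.2%} recall={recall:.2%}")
```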

Data Vintage

The vintage of the roads depends on the vintage of the underlying imagery. Because Bing Imagery is a composite of multiple sources, it is difficult to know the exact dates for individual pieces of data.

How good is the data?

The OSM-missing data went through a final classifier to ensure that precision is at least 95% (currently 90% for the USA, to be updated to 95% in 2022). After the classifier filters out potentially bad roads, we remeasure precision and make sure that it is at least 95% before releasing results.
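
One way to enforce a precision target like the one described above is to pick a confidence threshold on a manually labeled validation sample. The sketch below is an assumption about how such a filter could work, not the actual classifier: `choose_threshold`, the scores, and the labels are all hypothetical.

```python
def choose_threshold(scored, target_precision=0.95):
    """Return the lowest score threshold whose kept set meets the target precision.

    `scored` is a list of (confidence, is_correct) pairs from a labeled sample.
    Roads with confidence >= threshold are kept. Returns None if no threshold
    reaches the target.
    """
    ordered = sorted(scored, key=lambda s: s[0], reverse=True)
    best = None
    correct = 0
    for kept, (score, is_correct) in enumerate(ordered, start=1):
        correct += int(is_correct)
        # Evaluate only between distinct scores so tied items stay together.
        is_boundary = kept == len(ordered) or ordered[kept][0] != score
        if is_boundary and correct / kept >= target_precision:
            best = score
    return best


# Made-up validation sample: (classifier confidence, road confirmed by a reviewer).
sample = [
    (0.99, True), (0.97, True), (0.95, True), (0.90, True),
    (0.85, False), (0.80, True), (0.60, False),
]

threshold = choose_threshold(sample, target_precision=0.95)
print("keep roads with confidence >=", threshold)
```

Choosing the lowest qualifying threshold keeps as many roads as possible while still meeting the target on the sample. Remeasuring on a fresh sample afterwards, as the text describes, guards against a threshold that only fits the validation data.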

Why is the data being released?

Microsoft has a continued interest in supporting a thriving OpenStreetMap ecosystem.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Legal Notices

Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.

Privacy information can be found at https://privacy.microsoft.com/en-us/

Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.