Geocoding millions of records

  • Question

  • I have searched the forum but have not found a topic that discusses the fastest way to geocode millions of addresses. I have several SSIS packages that use the REST service to geocode thousands of records at once, but my new data source has 85 million records to geocode. Using the REST service would take far too long. What is the fastest way to geocode a huge number of addresses at once? Any advice or links on this topic would be appreciated.

    Thank you!

    Tuesday, January 10, 2017 4:45 PM

All replies

  • You will want to use the Batch Geocoder in the Bing Spatial Data Services. This API allows you to pass in up to 200,000 rows of data per request. Here is some documentation on this API:

    https://msdn.microsoft.com/en-us/library/ff701733.aspx

    https://msdn.microsoft.com/en-us/library/gg585136.aspx

    There are limits to this API depending on the type of Bing Maps key you use.

    Basic Key limits:

    • Applicable data source and geocode jobs (see the list of applicable jobs in Get Job List) that use Basic keys from the same Bing Maps Account have the following account limits:

      • You can have a total of 2 jobs in process at the same time.

      • You can run a total of 50 jobs in a 24-hour period.

    • Data that is geocoded or uploaded to a data source must use UTF-8 encoding, and can have a maximum of 50 entities. Compressed data files are accepted.

    Enterprise Key limits:

    • Applicable data source and geocode jobs (see the list of applicable jobs in Get Job List) that use Enterprise keys from the same Bing Maps Account have the following account limits. Jobs that use Basic keys that belong to an Enterprise account have the limits described above and are also included in the following limits.

      • You can have a total of 10 jobs in process at the same time. This limit also includes jobs run with Basic keys.

      • You can run a total of 50 jobs in a 24-hour period. This limit also includes jobs run with Basic keys.

    • Data that is geocoded or uploaded to a data source must use UTF-8 encoding, and can have up to 300 MB of uncompressed data and a maximum of 200,000 entities. Compressed data files are accepted, but the uncompressed limit applies.
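
    To stay inside the per-job limits listed above, the input has to be split into several files before upload. Here is a minimal Python sketch of one way to do that. The function name `split_into_jobs` is hypothetical, the sketch assumes one entity per line, it treats 300 MB as 300,000,000 bytes (an assumption), and it ignores any header or schema lines that the chosen input format may require at the top of each file.

```python
from typing import Iterable, Iterator, List

# Enterprise key limits quoted in the post above.
MAX_ENTITIES = 200_000
MAX_BYTES = 300 * 1000 * 1000  # assumption: "300 MB" read as decimal megabytes


def split_into_jobs(lines: Iterable[str],
                    max_entities: int = MAX_ENTITIES,
                    max_bytes: int = MAX_BYTES) -> Iterator[List[str]]:
    """Yield lists of lines that each fit the entity and UTF-8 size limits."""
    batch: List[str] = []
    batch_bytes = 0
    for line in lines:
        # Size of the line as UTF-8, plus one byte for its trailing newline.
        size = len(line.encode("utf-8")) + 1
        # Start a new batch when adding this line would break either limit.
        if batch and (len(batch) >= max_entities or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(line)
        batch_bytes += size
    if batch:
        yield batch


# Tiny demonstration with made-up rows and a limit of 2 entities per job.
sample_rows = [f"{i}|1 Main St|Seattle|WA" for i in range(5)]
for number, job in enumerate(split_into_jobs(sample_rows, max_entities=2), start=1):
    print(f"job {number}: {len(job)} rows")
```

    Each yielded list can then be written out as a separate UTF-8 file and submitted as its own job.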

    Putting this all together, in a single day you can geocode up to 2,500 addresses using a Basic key (50 jobs of 50 entities each), or 10 million using an Enterprise key (50 jobs of 200,000 entities each). Given the volume of addresses you need to geocode, you will need an Enterprise license: your data is well beyond the free limits, and this is the only way to process all of it in a reasonable amount of time.
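
    The same arithmetic can be applied to the 85 million records in the question. This is a small Python sketch with a hypothetical helper name, `estimate_batch_geocode`, and it assumes every job is filled to its entity limit and that the full daily job quota is used.

```python
from typing import Tuple

# Enterprise key limits quoted in the post above.
MAX_ENTITIES_PER_JOB = 200_000
MAX_JOBS_PER_DAY = 50


def estimate_batch_geocode(total_records: int,
                           entities_per_job: int = MAX_ENTITIES_PER_JOB,
                           jobs_per_day: int = MAX_JOBS_PER_DAY) -> Tuple[int, int]:
    """Return (jobs_needed, days_needed), rounding both up."""
    jobs_needed = -(-total_records // entities_per_job)  # integer ceiling division
    days_needed = -(-jobs_needed // jobs_per_day)
    return jobs_needed, days_needed


jobs, days = estimate_batch_geocode(85_000_000)
print(f"{jobs} jobs over at least {days} days")  # prints "425 jobs over at least 9 days"
```

    With the Basic key limit of 50 entities per job, the same call returns 1,700,000 jobs and 34,000 days, which is why a Basic key is not practical at this volume.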

    I have a library that makes it fairly easy to use the Bing Spatial Data Services from .NET code. I plan to make it open source when I get a chance. Send me an email at richbrun at microsoft.com and I'll send you over a copy.


    Tuesday, January 10, 2017 5:19 PM
  • Thank you for the reply.  Are there any links that have practical examples of how to go about using the Bing Spatial Data Services Batch Geocoder?  I'm hoping to accomplish this through an SSIS package.

    Thank you

    Tuesday, January 10, 2017 8:02 PM
  • There is some sample code at https://msdn.microsoft.com/en-us/library/ff701729.aspx, but the library I mentioned would be a lot less work.
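
    Whichever client code is used, the driver around it has to respect the limit on jobs in process quoted earlier in the thread (10 for an Enterprise key). The Python sketch below shows only that throttling loop and is not the Bing Spatial Data Services API: `submit` and `is_done` are hypothetical callables standing in for whatever code creates a job and checks its status, and the 50 jobs per 24-hour limit is not enforced here.

```python
import time
from typing import Callable, Iterable, List

MAX_IN_PROCESS = 10  # Enterprise limit quoted earlier in the thread


def run_jobs(batches: Iterable[object],
             submit: Callable[[object], str],
             is_done: Callable[[str], bool],
             max_in_process: int = MAX_IN_PROCESS,
             poll_seconds: float = 30.0) -> List[str]:
    """Submit every batch, never keeping more than max_in_process jobs pending."""
    pending: List[str] = []
    finished: List[str] = []

    def reap() -> None:
        # Move completed job ids from pending to finished (one status check each).
        still_running: List[str] = []
        for job_id in pending:
            (finished if is_done(job_id) else still_running).append(job_id)
        pending[:] = still_running

    for batch in batches:
        # Wait for a free slot before submitting the next batch.
        while len(pending) >= max_in_process:
            reap()
            if len(pending) >= max_in_process:
                time.sleep(poll_seconds)
        pending.append(submit(batch))

    # Drain the jobs that are still running after the last submission.
    while pending:
        reap()
        if pending:
            time.sleep(poll_seconds)
    return finished
```

    In practice `submit` would upload one input file and return its job id, and `is_done` would query that job's status; both are placeholders here.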

    Tuesday, January 10, 2017 8:56 PM