How to bulk insert documents into Azure Search Service?

  • Question

  • Hi,

    We have about 2.5 million documents, currently stored in a table in a SQL database on Azure.

    We would like to insert these documents into an Azure search service, hosted in the same datacentre.

    What is the best way to accomplish this?

    It would be painful to do this by reading each row and inserting using the REST API. Is there a better way?

    Thanks.

    Noel


    Monday, November 3, 2014 5:08 PM

Answers

  • Hi Noel,

    This is a top request on UserVoice; in fact, it is #2 by votes. It is something we are researching right now. Our thinking is that we would likely start with support for publicly accessible SQL Server instances (meaning that it would not initially index a SQL Server behind a corporate firewall). The good news is that this sounds like it would suit your needs.

    A question for you: this would be a scheduled sync, which means it would not provide near real-time updates of changes in Azure Search. Is this acceptable? How long a delay is acceptable for you? Realistically, if you need near real-time updates, the only practical way would be to have the code that updates SQL Server also post the updates to Azure Search. Another option is to use SQL Server Integrated Change Tracking along with Worker Roles. For the latter, I hope to have a blog post on this topic in the next week or so. I'm happy to provide an early version of this if you would like (email -> Liam [DOT] Cavanagh [AT] microsoft [DOT] com).

    Liam


    Sr. Program Manager, SQL Azure Strategy - Blog


    Tuesday, November 4, 2014 5:28 PM
    Moderator
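
As a rough illustration of the change-tracking option mentioned above, the sketch below turns rows returned by a SQL Server `CHANGETABLE(CHANGES ...)` query into Azure Search indexing actions. It is not the approach from the promised blog post. The table `dbo.Documents`, its columns, the key field name `id`, and the helper `changes_to_actions` are all hypothetical; running the query and scheduling the sync (for example, from a Worker Role) are left out.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical query: table and column names are assumptions for illustration.
# CHANGETABLE(CHANGES ...) returns one row per changed key since the given version.
CHANGES_QUERY = """
SELECT ct.SYS_CHANGE_OPERATION, ct.Id AS id, d.Title AS title
FROM CHANGETABLE(CHANGES dbo.Documents, @last_sync_version) AS ct
LEFT JOIN dbo.Documents AS d ON d.Id = ct.Id
"""


def changes_to_actions(
    changes: Iterable[Dict[str, Any]], key_field: str = "id"
) -> List[Dict[str, Any]]:
    """Map change-tracking rows to Azure Search index actions.

    Inserts and updates ('I', 'U') become 'upload' actions; deletes ('D')
    become 'delete' actions that carry only the document key.
    """
    actions: List[Dict[str, Any]] = []
    for change in changes:
        operation = change["SYS_CHANGE_OPERATION"]
        # Azure Search document keys are strings, so convert the SQL key.
        key = str(change[key_field])
        if operation == "D":
            actions.append({"@search.action": "delete", key_field: key})
        elif operation in ("I", "U"):
            # Copy the data columns, dropping change-tracking metadata.
            doc = {
                name: value
                for name, value in change.items()
                if not name.startswith("SYS_CHANGE_")
            }
            doc[key_field] = key
            actions.append({"@search.action": "upload", **doc})
        else:
            raise ValueError(f"Unexpected change operation: {operation!r}")
    return actions
```

The resulting list can be wrapped as `{"value": actions}` and posted to the index's `docs/index` endpoint. Because the sync is polled, the delay between a SQL change and its appearance in search is at least the polling interval.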

All replies

  • The REST API is the only API.  Have you seen the sample that shows how to load records from SQL Server to Azure Search?  Here is the link: http://azure.microsoft.com/en-us/documentation/articles/search-create-first-solution/   

    Heidi Steen (MSFT)

    Monday, November 3, 2014 6:35 PM
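
For readers weighing the REST route: documents do not have to be posted one at a time. The indexing endpoint accepts a batch of actions in a single request (the documented limit is 1000 documents per batch). The sketch below is a minimal illustration, not the linked sample's code. It assumes the rows have already been read from SQL into dictionaries; the service name, index name, API key, and API version are placeholders to be supplied by the caller, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable, Iterator

MAX_BATCH_SIZE = 1000  # documented per-request document limit


def build_batches(
    rows: Iterable[Dict[str, Any]], batch_size: int = MAX_BATCH_SIZE
) -> Iterator[Dict[str, Any]]:
    """Group rows into request bodies of at most batch_size 'upload' actions."""
    batch = []
    for row in rows:
        # Each document carries its own action; 'upload' inserts or replaces.
        batch.append({"@search.action": "upload", **row})
        if len(batch) == batch_size:
            yield {"value": batch}
            batch = []
    if batch:
        yield {"value": batch}


def build_request(
    service: str, index: str, api_key: str, api_version: str, body: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) the POST request for one batch."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/index?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
```

Each request would then be sent with `urllib.request.urlopen(request)`. At 1000 documents per batch, 2.5 million rows come to 2,500 requests, and the response to each batch reports per-document status, so failed documents can be retried.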
  • @Heidi, thanks for the link.

    It's useful for people learning about the service.

    I was hoping for an enterprise-level solution for this, as it's a problem that most people are likely to encounter.

    Monday, November 3, 2014 9:26 PM
  • I've opened a request on the user voice site:

    Permit bulk loading of documents from SQL server

    Tuesday, November 4, 2014 11:13 AM