Limitations and how to increase quota

  • Question

  • Hi guys,

    I'm working on some tests to define a solution for an organization, and I'm evaluating some Azure features.

    They want to move data from on-premises MS SQL Server (800 databases, 350 tables per database) hourly, or maybe every 4 hours (that depends on the cost).
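    To give a sense of the scale involved, here is a quick back-of-the-envelope calculation using only the figures above (databases, tables per database, and the two candidate schedules):

```python
# Rough scale of the load described above (figures from the post):
databases = 800
tables_per_db = 350
runs_per_day_hourly = 24   # hourly schedule
runs_per_day_4h = 6        # every-4-hours schedule

tables_per_run = databases * tables_per_db
print(tables_per_run)                        # 280000 table copies per run
print(tables_per_run * runs_per_day_hourly)  # 6720000 copies/day if hourly
print(tables_per_run * runs_per_day_4h)      # 1680000 copies/day if every 4 hours
```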

    I have some questions:

    1- What is the cost to increase the request limit? As you can see, they will have a significant number of activities, and the current limit of 200 MB per request restricts the number of activities I can include per pipeline.

    2- Is it feasible to define a Data Gateway and use it across several pipelines? In order to track, monitor, and manage the process, I want to split the processing into 400 data factories (each one with 1,400 datasets and 800 activities). The final objective is to store the historical data in a Data Lake (10 TB initial load, JSON format).
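    For what it's worth, within a single ADF v1 data factory, every pipeline can reuse the same Data Management Gateway simply by referencing it from an on-premises linked service. A minimal sketch (server, database, and gateway names are placeholders):

```json
{
  "name": "OnPremSqlLinkedService",
  "properties": {
    "type": "OnPremisesSqlServer",
    "typeProperties": {
      "connectionString": "Data Source=<server>;Initial Catalog=<database>;Integrated Security=True;",
      "gatewayName": "CorpDataGateway"
    }
  }
}
```

    Note, though, that as far as I recall a v1 gateway is registered to a single data factory, so splitting the workload into 400 factories would mean a separate gateway registration for each one.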

    3- What is the limit on dataset inputs per Data Factory activity? I need to coordinate 200 activities and have a control point to know when the activities are completed. Or is there another way to let an activity know that these 200 are done?
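    On the control-point idea: in ADF v1 an activity's slice is scheduled only after all of its input dataset slices are ready, so a single downstream activity that lists the 200 output datasets as inputs can act as the "all done" signal. A sketch with hypothetical dataset names, showing only three of the 200 inputs and eliding the activity's typeProperties:

```json
{
  "name": "ControlPoint",
  "type": "DotNetActivity",
  "inputs": [
    { "name": "Db001Table001Sink" },
    { "name": "Db001Table002Sink" },
    { "name": "Db200Table350Sink" }
  ],
  "outputs": [ { "name": "LoadCompleteMarker" } ],
  "linkedServiceName": "BatchLinkedService",
  "typeProperties": { }
}
```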

    Monday, March 6, 2017 4:58 PM

All replies

  • Hi there,

    Thank you for your questions - they are indeed some of the limitations we are working on removing, so that large-scale data movement patterns such as the one you described can be handled gracefully. That said, I do want to work with you on possible workarounds in the interim and to understand what further operations will be taken once the historical data has made it into ADLS - will you need to do incremental loads, process using U-SQL, compute aggregates and load into SQL DW, etc.?

    It would be great if you can reach out to me at shwang@microsoft.com and we can discuss offline.

    Tuesday, March 7, 2017 2:30 AM