Custom Activity Scale Approach

  • Question

  • I am deciding on the best approach for our IoT processing.  We would like to use Data Factory to create analytic relational data from our raw telemetry.  

    Although we have installations throughout the world, most of our data comes from North America, so we generally have 10 times as much load during the day as we have at night.

    I have implemented the sample from "Use custom activities in an Azure Data Factory pipeline" by Sreedhar Pelluru (sorry, I can't provide the link as I am not verified on this site). I'm not sure it would work well for our processing, as I would have to provision a pool sized for my maximum daytime load. About nine-tenths of that pool would therefore sit idle during the night.

    I am considering creating two pipelines. The first pipeline would be time based (e.g. run once per hour), similar to your example. Its job would be to retrieve the number of new records and then spawn multiple custom Data Factory pipelines to process the load. Each batch task would process x (e.g. 10,000) contiguous records. I believe that this would scale well.
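    To make the batching idea concrete, here is a minimal sketch of how the first pipeline might split the new records into contiguous ranges. It assumes the records carry sequential integer IDs; the function name `split_into_batches` and its parameters are hypothetical and not part of Data Factory or Batch.

```python
from typing import List, Tuple


def split_into_batches(
    first_id: int, new_record_count: int, batch_size: int = 10_000
) -> List[Tuple[int, int]]:
    """Return inclusive (start_id, end_id) ranges covering the new records.

    Assumption: record IDs are contiguous integers starting at first_id.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    ranges: List[Tuple[int, int]] = []
    start = first_id
    end_exclusive = first_id + new_record_count
    while start < end_exclusive:
        # Cap the last range so it does not run past the final new record.
        stop = min(start + batch_size, end_exclusive)
        ranges.append((start, stop - 1))
        start = stop
    return ranges


if __name__ == "__main__":
    # One range per batch task that the hourly pipeline would hand out.
    for start_id, end_id in split_into_batches(1, 25_000):
        print(f"process records {start_id}..{end_id}")
```

    Each range would then be passed to one batch task, so the number of tasks follows the hourly load rather than a fixed pool size.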

    Any thoughts you have on this approach, and any links you think would be useful, would be appreciated.

    I am also wondering: can we use a custom activity to launch a "Job Manager" batch task? https://azure.microsoft.com/en-us/documentation/articles/batch-api-basics/#job-manager-task
    Thursday, August 11, 2016 3:02 PM

All replies

  • Hi Terry,

    Sorry for the late response. You do not need to pay for the VMs in the pool all the time. You can set the pool to autoscale, and it can drop to 0 VMs when it is not being used, so you only pay for capacity while there is work to do. Read more on how to do so here: https://azure.microsoft.com/en-us/documentation/articles/batch-automatic-scaling/ .
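    As a rough illustration of the kind of rule an autoscale setting expresses, the sketch below computes a target VM count from the number of pending tasks. It is plain Python, not Batch autoscale formula syntax, and the names `target_node_count`, `tasks_per_node` and `max_nodes`, along with their default values, are assumptions chosen for the example.

```python
import math


def target_node_count(
    pending_tasks: int, tasks_per_node: int = 4, max_nodes: int = 20
) -> int:
    """Return how many VMs the pool should aim for.

    Illustrative only: scales to zero when idle and caps at max_nodes.
    """
    if pending_tasks <= 0:
        # Nothing queued, so the pool can shrink to zero VMs.
        return 0
    needed = math.ceil(pending_tasks / tasks_per_node)
    return min(needed, max_nodes)


if __name__ == "__main__":
    for queued in (0, 1, 10, 1000):
        print(queued, "pending tasks ->", target_node_count(queued), "VMs")
```

    With a rule along these lines, the night-time load would use only a fraction of the VMs needed during the day instead of leaving a fixed pool idle.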

    Keeping the above in mind, does your approach change?

    Yes, you can launch a "Job Manager" batch task through the custom activity in ADF. It's your code, so you can do whatever you decide to in it :)

    Thanks, Harish

    Thursday, September 1, 2016 11:22 PM