How to schedule azure batch jobs?

  • Question

  • Hi.

    I created an Azure Batch job app using the Azure Batch .NET client, following the instructions at http://azure.microsoft.com/en-us/documentation/articles/batch-dotnet-get-started/.

    Not all of the concepts are clear to me yet, and the documentation is incomplete (I understand the service is still new and under development).

    Where do I upload the azure batch program?

    How do I schedule it?

    Thursday, January 1, 2015 7:20 PM

Answers

  • Hi

    If I understand correctly, you want to use only Batch to run both the web crawling jobs and tasks and the client code that creates those jobs and tasks.

    I'm pretty sure you can do what you want using Batch.  I'll give you the high-level concepts here and some documentation pointers, then hopefully you can find the low-level info in the docs.

    • To host the client code, use a Job Manager, which is specified when creating a Work Item.  The Job Manager is a special task that runs first when a job is created.
    • To have a job created and run every two hours, specify a schedule with the Work Item. A recurrence interval can be set so that a job is created every 2 hours.
    • To create the VMs only when needed, specify an Auto-Pool when creating your Work Item.  You will need at least two VMs: one to run the Job Manager task and at least one to run the crawler tasks.  The pool lifetime can be configured so that it is tied to each job created every 2 hours: the pool is created when the job is created and deleted when the job completes.
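
    As a rough illustration of how those three pieces fit together, here is a minimal Python sketch that only builds a request body and sends nothing. The field names (`schedule.recurrenceInterval`, `jobSpecification.jobManagerTask`, `autoPoolSpecification.poolLifetimeOption`) are taken from the later "job schedule" form of the Batch REST API, which replaced Work Items, so treat them as an assumption and check them against the Add Workitem reference for your API version. The ids, command line and VM size are hypothetical placeholders, and a real request needs more fields than shown.

```python
import json
from datetime import timedelta


def iso8601_duration(delta: timedelta) -> str:
    """Format a whole-minute timedelta as an ISO 8601 duration, e.g. PT2H."""
    total_minutes = int(delta.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    text = "PT"
    if hours:
        text += f"{hours}H"
    if minutes or not hours:
        text += f"{minutes}M"
    return text


def build_schedule_request(interval: timedelta) -> dict:
    """Build an illustrative (incomplete) request body for a recurring job."""
    return {
        "id": "crawler-schedule",  # hypothetical id
        # Schedule: a new job is created once per interval.
        "schedule": {"recurrenceInterval": iso8601_duration(interval)},
        "jobSpecification": {
            # Job Manager: the task that runs first and creates the crawler tasks.
            "jobManagerTask": {
                "id": "job-manager",  # hypothetical id
                "commandLine": "CrawlerJobManager.exe",  # hypothetical program
            },
            # Auto-pool: the pool is created with each job and deleted afterwards.
            "poolInfo": {
                "autoPoolSpecification": {
                    "poolLifetimeOption": "job",
                    "keepAlive": False,
                    "pool": {
                        "vmSize": "small",  # hypothetical size
                        # One VM for the Job Manager, one for crawler tasks.
                        "targetDedicatedNodes": 2,
                    },
                }
            },
        },
    }


request = build_schedule_request(timedelta(hours=2))
print(json.dumps(request, indent=2))
```

    The point of the sketch is the shape: one object carries the schedule, the Job Manager and the pool lifetime, so nothing has to stay running between the two-hour runs.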

    You'll find these concepts described in the REST API reference for Add Workitem - http://msdn.microsoft.com/en-us/library/azure/dn820179.aspx

    Assuming you're using the C# client API, you'll need some client code to bootstrap and create the Work Item, and you will need to find the corresponding C# APIs in that documentation - http://msdn.microsoft.com/en-us/library/azure/dn865466.aspx

    Hope that helps.

    Regards, Mark

    Tuesday, January 6, 2015 3:05 AM

All replies

  • hi Tsayao,

    I can understand your confusion with Scheduler and its being in development. 

    Take a look at Mario's blog post here; it explains things in detail and is crisp and clear.

    I hope this helps you get started with Azure Batch service.

    -----------------------------------

    Please mark as answered if it helped.


    Vishal Narayan Saxena http://twitter.com/vishalishere http://www.ogleogle.com/vishal/

    Sunday, January 4, 2015 4:49 AM
  • Hi Vishal,

    Sorry, I am still very lost on all the concepts of Azure Batch. I will try to clarify what I want to do.

    I am building a web crawler that crawls online stores to find good deals and posts these deals on a specific website. The crawler is a .NET program that uses HtmlAgilityPack, the .NET HttpClient and SQL Server. It's currently running as a WebJob, but I think it's getting too big to be an Azure WebJob.

    So I stumbled upon Azure Batch, which seemed to be what I want (though I have very strong doubts now).

    I have a very limited budget, so I don't want to provision all the virtual machines and pay for idle time. I want to create the pool, copy the files I need from the storage account, run the crawler and then destroy the pool. The idea is to pay only for the processing time, which is why Azure Batch makes sense to me. Otherwise I would just get a VM.

    So I want to build a .NET program that uses the Azure Batch .NET client to create a VM, run my crawler and then destroy the VM. It should do this every 2 hours and run for about 20 minutes.

    I am still quite lost, but my main questions are:

    I have a .NET Batch client program. Where do I upload it? Or am I supposed to run it remotely from Azure? If that's the case, it loses all sense to me.

    After uploading it, where do I schedule it? I want to say: Azure, run this for me every 2 hours.

    Basically I need what Azure WebJobs does, but I need it to scale better.

    Thank you.


    Sunday, January 4, 2015 6:42 PM