Stdout Directory Not Found

  • Question

  • I have been using LINQToHPC and am using HPC Job Manager to monitor my jobs. I want to check the logs written by the vertices. In Job Manager, when I click "Open Stdout Directory", I get the error "Tasks StdOut directory is not set".

    How do we set the StdOut directory for a task? Is this specific to tasks submitted through LINQToHPC?

    I want to use the stdout logs to diagnose my job.

    Thanks

    Ankush

    Tuesday, November 29, 2011 10:40 AM


All replies

  • Hi Ankush,

    All L2H Task output is written to logs on each node allocated to the job. The logs are placed in the following directory:

    \\[node]\HpcTemp\[username]\[jobid]\[vertexid]

    Stdout, stderr, and infrastructure logs are written to this directory on each of the allocated nodes. Any console output from your lambdas will be written to the stdout.txt log.
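
    If it helps, here is a minimal C# sketch of how a client could build that path and read a vertex's stdout log; the node, user, job id, and vertex id values are placeholders you would substitute for your own job:

        using System;
        using System.IO;

        class VertexLogReader
        {
            // Builds the UNC path shown above: \\[node]\HpcTemp\[username]\[jobid]\[vertexid]\stdout.txt
            static string StdoutPath(string node, string user, int jobId, string vertexId)
            {
                return string.Format(@"\\{0}\HpcTemp\{1}\{2}\{3}\stdout.txt", node, user, jobId, vertexId);
            }

            static void Main()
            {
                // Placeholder values - substitute your own node, user, job id, and vertex id.
                string path = StdoutPath("COMPUTENODE01", "ankush", 1234, "5");
                if (File.Exists(path))
                    Console.WriteLine(File.ReadAllText(path));
            }
        }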

    Additionally, you can control the infrastructure log level through the HpcLinqConfiguration you use to submit your job.

    Jeremy

     

    Wednesday, November 30, 2011 7:38 AM
  • Thanks a lot Jeremy!

    I have a couple of follow-up questions.

    I am submitting jobs on HPC using LINQToHPC and I want to monitor them.

    Question 1: I want to check which vertices of the job ran successfully. Is there a way to read this information in my application? We can check it through Job Manager, but I want it at runtime, so that the application can go into the successful nodes' working directories and retrieve the necessary information that I have printed to the console.

    Question 2: Is there API support for the Job Manager so that I can browse the list of jobs submitted to an HPC cluster? I want to retrieve the list of all jobs submitted by a user within a day.

     

    Thanks and Regards

    Ankush

    • Marked as answer by Ankush Desai Wednesday, December 14, 2011 5:11 AM
    Wednesday, November 30, 2011 9:48 AM
  • Sure, happy to help.

    First, yes, there is an API to the job scheduler (it backs the Job Manager UI), which is documented here: http://msdn.microsoft.com/en-us/library/cc904930(v=VS.85).aspx. You can accomplish your goal from Question 2 (the list of jobs submitted by a user) by creating a filter for the job user and submission time, and then getting the list of jobs that meet the filter criteria (see IScheduler.GetJobList here: http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischeduler.getjoblist(v=VS.85).aspx).
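
    For illustration, here is a minimal sketch against that API; the head node and account names are placeholders, and the exact property ids and filter operators are worth double-checking against the documentation linked above:

        using System;
        using Microsoft.Hpc.Scheduler;
        using Microsoft.Hpc.Scheduler.Properties;

        class ListRecentJobs
        {
            static void Main()
            {
                IScheduler scheduler = new Scheduler();
                scheduler.Connect("MyHeadNode");   // placeholder head node name

                // Filter: jobs owned by a given user and submitted within the last day.
                IFilterCollection filters = scheduler.CreateFilterCollection();
                filters.Add(FilterOperator.Equal, JobPropertyIds.UserName, @"CONTOSO\ankush");
                filters.Add(FilterOperator.GreaterThanOrEqual, JobPropertyIds.SubmitTime, DateTime.Now.AddDays(-1));

                ISchedulerCollection jobs = scheduler.GetJobList(filters, null);
                foreach (ISchedulerJob job in jobs)
                {
                    Console.WriteLine("{0}\t{1}\t{2}", job.Id, job.State, job.SubmitTime);
                }
            }
        }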

    As for your first question, the job scheduler API will let you do high-level monitoring: the same level that you can see in the UI. For example, you can get up-to-date job progress (a rough approximation of the percentage of vertices completed), allocated nodes, etc. That said, individual vertex completion status is not available through this API. The only way to get that level of detail is to monitor a separate log produced by the graph manager task and perform the necessary node-to-vertex correlation, which is not at all easy. Alternatively, you could pull all the vertex stdout logs, as long as you filter them so that you throw out failures when looking for successes and vice versa, which may not be trivial.
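
    If you do go the stdout route, here is a rough sketch of what that sweep could look like, combining the scheduler API with the log path from my earlier reply; the head node, job id, and user directory are placeholders, and the property names are worth verifying against the docs:

        using System;
        using System.IO;
        using Microsoft.Hpc.Scheduler;

        class CollectVertexLogs
        {
            static void Main()
            {
                IScheduler scheduler = new Scheduler();
                scheduler.Connect("MyHeadNode");              // placeholder head node
                ISchedulerJob job = scheduler.OpenJob(1234);  // placeholder job id
                string user = "ankush";                       // placeholder user directory name

                // Walk every node allocated to the job (populated once the job has run)
                // and dump each vertex's stdout log.
                foreach (string node in job.AllocatedNodes)
                {
                    string jobDir = string.Format(@"\\{0}\HpcTemp\{1}\{2}", node, user, job.Id);
                    if (!Directory.Exists(jobDir)) continue;

                    foreach (string vertexDir in Directory.GetDirectories(jobDir))
                    {
                        string stdout = Path.Combine(vertexDir, "stdout.txt");
                        if (File.Exists(stdout))
                        {
                            Console.WriteLine("--- {0}", stdout);
                            Console.WriteLine(File.ReadAllText(stdout));
                        }
                    }
                }
            }
        }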

    Before investigating, however, I would suggest rethinking why exactly you need this output, and whether you could use another mechanism to communicate the required information back to the client. LINQ is a dataflow engine of sorts, so the most natural way to get information out of lambdas is to include it in the object produced by the operation (for example, if you're producing LineRecords, you could build a simple serializable class that holds a LineRecord and a string for whatever is currently console output, and produce instances of that class instead). Another possibility is leveraging DSC as a data store for output files: redirect your console output to a file and add that file to DSC - the trick here would be coordinating between the client and the vertices on what fileset names to use and how to manage/detect failures. The best solution probably depends on your exact scenario.
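
    As a very rough illustration of the first suggestion (the type and field names here are made up, and a plain string stands in for the LineRecord payload), the idea is simply to carry the would-be console text alongside the data and read it back from the output fileset on the client:

        using System;

        // Hypothetical wrapper record: carries the real payload plus whatever
        // you would otherwise have written to the console inside the lambda.
        [Serializable]
        public class RecordWithLog
        {
            public string Line;   // stand-in for the LineRecord payload
            public string Log;    // text that would have gone to Console.WriteLine
        }

        public static class VertexLogic
        {
            // Use this inside your Select instead of printing to the console;
            // the log text travels with the record through the dataflow.
            public static RecordWithLog Process(string line)
            {
                string message = string.Format("processed {0} chars on {1}",
                                               line.Length, Environment.MachineName);
                return new RecordWithLog { Line = line.ToUpperInvariant(), Log = message };
            }
        }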

    Hopefully that's helpful in guiding your thoughts here.

    Jeremy

    • Marked as answer by Ankush Desai Wednesday, December 14, 2011 5:11 AM
    Wednesday, November 30, 2011 10:22 AM