Using CycleCloud and Slurm - jobs stuck in pending state

  • Question

  • We have some software that uses Slurm to submit jobs to a queue and it works as expected on our on-site cluster as well as a variety of our clients' Slurm setups.

    The issue we are seeing is that when we submit a multi-node job on CycleCloud's Slurm, the correct number of resources spins up; however, the jobs never seem to transition into a "Running" state. They remain stuck in the "Pending(Resources)" state.

    I have run a test script that does the bare minimum to submit multi-node jobs. These properly spin up the appropriate number of resources and run the job. So, clearly, something in our configuration must be off.

    Can anyone share some pointers on where to look for the reasons jobs get stuck in a pending state?
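
    One place to start is the reason Slurm itself reports for each pending job. The following is only a minimal sketch, not a CycleCloud-specific tool: it assumes `squeue` is on the PATH of the scheduler node, and the "|"-separated output format and the function names are choices made here for illustration.

```python
import subprocess
from collections import defaultdict

# Assumed output format: job id, job state, pending reason, separated by "|".
SQUEUE_CMD = ["squeue", "--noheader", "--states=PENDING", "--format=%i|%T|%r"]


def group_pending_by_reason(squeue_output):
    """Group pending job IDs by the reason string squeue reports for them."""
    reasons = defaultdict(list)
    for line in squeue_output.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, state, reason = line.split("|", 2)
        # Ignore anything that is not pending, in case the filter was changed.
        if state == "PENDING":
            reasons[reason].append(job_id)
    return dict(reasons)


def pending_reasons_from_cluster():
    """Run squeue on this machine and summarize why jobs are pending."""
    result = subprocess.run(SQUEUE_CMD, capture_output=True, text=True, check=True)
    return group_pending_by_reason(result.stdout)
```

    For a single job, `scontrol show job <jobid>` prints the same reason along with the requested nodes and features, and `sinfo -R` lists nodes that are down or drained together with the recorded reason, which may help when the nodes start but the job still waits on resources.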

    Thanks,

    Eric

    Monday, May 20, 2019 2:29 PM

All replies

  • This forum is for Azure Stack, a hybrid cloud platform that lets you use Azure services from your company's or service provider's datacenter. 

    For Azure CycleCloud issues, please create a Support Request. If you do not have a support plan, please email me at AzCommunity@microsoft.com with your Subscription ID and a link to this post, and we can enable a one-time free support request for your subscription.

    Monday, May 20, 2019 9:38 PM
    Moderator