Error while downloading resource files to a node

  • Question

  • Hi,

    I'm creating jobs with hundreds of tasks (from C#). Each task has 60 resource files that have to be downloaded to the node from blob storage. In most cases this works fine, but sometimes tasks fail with the following error:

    BlobDownloadMiscError
    Miscellaneous error encountered while downloading one of the specified Azure Blob(s)
    Fix: Rerun task
    BlobSource: https://xxxxxxstorage.blob.core.windows.net/rawdata-xxxxxxxx-f5bf-462d-8a27-06c7f63b27b6/images/3_002/02140_63661040156808701_36_399.4jls?sv=2017-04-17&sig=AoYMB%2Bat4kqPH6rj6hYXR0DJlvKARR2XXXX%3D&spr=https&se=2018-05-19T00%3A59%3A48Z&srt=co&ss=b&sp=rl
    FilePath: /mnt/batch/tasks/workitems/job-xxxxxxxx-f5bf-462d-8a27-06c7f63b27b6-201805120054/job-1/imgproc-00171/wd/02140_63661040156808701_36_399.4jls
    Message: OSError(25, 'Inappropriate ioctl for device')

    When checking the blob, it seems to exist in blob storage. The SAS token is valid for 7 days.
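
One way to double-check the token side is to read the expiry (`se`) and permissions (`sp`) straight out of the BlobSource URL in the error. The sketch below is only illustrative: `sas_summary` is a hypothetical helper name, and it assumes the `se` value uses the full `YYYY-MM-DDTHH:MM:SSZ` form seen in this URL.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# BlobSource copied from the error above (already redacted in the post).
BLOB_SOURCE = (
    "https://xxxxxxstorage.blob.core.windows.net/"
    "rawdata-xxxxxxxx-f5bf-462d-8a27-06c7f63b27b6/images/3_002/"
    "02140_63661040156808701_36_399.4jls"
    "?sv=2017-04-17&sig=AoYMB%2Bat4kqPH6rj6hYXR0DJlvKARR2XXXX%3D"
    "&spr=https&se=2018-05-19T00%3A59%3A48Z&srt=co&ss=b&sp=rl"
)


def sas_summary(blob_url, now=None):
    """Parse expiry and permissions from a SAS URL (hypothetical helper)."""
    query = parse_qs(urlsplit(blob_url).query)  # also percent-decodes values
    # Assumption: expiry is given with seconds and a trailing 'Z' (UTC).
    expires = datetime.strptime(query["se"][0], "%Y-%m-%dT%H:%M:%SZ")
    expires = expires.replace(tzinfo=timezone.utc)
    permissions = query.get("sp", [""])[0]
    now = now or datetime.now(timezone.utc)
    return {
        "expires": expires,
        "expired": now >= expires,
        "permissions": permissions,
        "can_read": "r" in permissions,
    }


if __name__ == "__main__":
    # Evaluate the token as of the day the question was posted.
    posted = datetime(2018, 5, 14, 15, 36, tzinfo=timezone.utc)
    print(sas_summary(BLOB_SOURCE, now=posted))
```

If the token is unexpired and includes read permission at the time of the failure, the SAS itself is probably not the cause.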

    I have no idea what "Inappropriate ioctl for device" means. The only pattern I see is that it is usually the last tasks of a job that fail, so it might have something to do with scaling down too early.

    Any ideas how to fix this?

    Regards,

    Marcel

    Monday, May 14, 2018 3:36 PM

All replies

  • Can you tell if the job is actually failing or just showing as failed in the portal? 
    Monday, May 14, 2018 10:26 PM
  • Looks like it actually failed. There are no logs, no output and the job ran for just 1 second (supposed to be 30+ minutes).
    Monday, May 14, 2018 10:41 PM
  • Hmm, how often are you seeing these failures? Do you notice anything different about the tasks that fail?

    I looked up the error you mentioned to try and find out more: 

    https://msdn.microsoft.com/en-us/library/windows/desktop/aa363219(v=vs.85).aspx

    It appears to refer to device input/output control, so I am wondering whether it has something to do with the physical Batch nodes rather than the blob itself.
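
For what it is worth, the FilePath in the error is a Linux path, and on Linux error number 25 is `ENOTTY`, whose standard message is "Inappropriate ioctl for device". The error text also has the shape of a Python exception repr, which might mean the download component on the node is written in Python; that is a guess, not something confirmed in this thread. A minimal sketch for decoding the number (`describe_errno` is a hypothetical helper name):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, platform message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The message text is platform-specific, so run this on a Linux node
    # to compare it with the text in the task error.
    print(describe_errno(25))

    # A Python OSError built from the same values prints in the same
    # form as the Message field of the task error.
    print(repr(OSError(25, "Inappropriate ioctl for device")))
```

Since `ENOTTY` is a POSIX error raised when an ioctl is issued against something that does not support it, this points to the node's local environment rather than to the blob or its SAS token.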

    Tuesday, May 15, 2018 12:55 AM
  • If you give us the <region, account, job, task> IDs and a time window we can look at the logs if the VM is still around.

    Since you are using VirtualMachineConfiguration (Linux here), and if your pool is recent, there is a new API that can be called to egress the node agent logs: UploadComputeNodeBatchServiceLogsAsync

    You can also open a CSS ticket for the investigation, which is better in general. Either way, we will need a CSS ticket before you can give us any egressed logs. We can then examine those logs and try to determine what might have happened.
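
The method named above is from the .NET client. As a rough sketch of the same call from the `azure-batch` Python package, assuming a version recent enough to include the log upload operation: the helper names `log_window` and `upload_node_logs`, the window sizes, and every argument value are placeholders rather than anything from this thread. The container URL must carry a SAS with write permission.

```python
from datetime import datetime, timedelta, timezone


def log_window(failure_time, before=timedelta(minutes=30),
               after=timedelta(minutes=10)):
    """Pick a time window around the failure (sizes are arbitrary)."""
    return failure_time - before, failure_time + after


def upload_node_logs(batch_client, pool_id, node_id, container_sas_url,
                     failure_time):
    """Ask a node to upload its Batch service logs to a blob container.

    batch_client is assumed to be an authenticated
    azure.batch.BatchServiceClient; the node must still exist.
    """
    # Imported lazily so log_window works without azure-batch installed.
    from azure.batch import models as batch_models

    start, end = log_window(failure_time)
    config = batch_models.UploadBatchServiceLogsConfiguration(
        container_url=container_sas_url,
        start_time=start,
        end_time=end,
    )
    return batch_client.compute_node.upload_batch_service_logs(
        pool_id, node_id, config)


if __name__ == "__main__":
    # Placeholder failure time; substitute the time the task failed.
    failed_at = datetime(2018, 5, 12, 1, 0, tzinfo=timezone.utc)
    print(log_window(failed_at))
```

The call only works while the node is still in the pool, so it has to run before the pool scales the node away.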

    d



    • Edited by DarylMsft Friday, May 18, 2018 12:44 AM linux means VMConfig
    Friday, May 18, 2018 12:26 AM
  • Any update on this? 
    Monday, May 21, 2018 9:50 PM
  • I got the errors again. I created a support request in the Azure portal.

    Support request ID 118070918540938

    Monday, July 9, 2018 11:13 AM