I have not been able to upload large files for one day (since 2019-03-02)

    Question

  • Hello,

    I'm a user of Azure Data Lake Store and I send large files every day. We have used the same curl-based script for the past year.

    Since yesterday (2019-03-02) I have not been able to send large files. An error occurs after some API calls (https://t.co/DOfRzvo4tb). We have not changed our script recently.

    I can reproduce the issue from three servers to two Data Lake stores attached to two different subscriptions. I'm using ExpressRoute.

    Example: I have to send a 500 MB file. I send it in packets of 26214400 bytes (25 MiB). I retrieve a token and then make the following successive calls:

    >>>> https://xxx.azuredatalakestore.net/webhdfs/v1//file-products.csv?op=CREATE
    >>>> https://xxx.azuredatalakestore.net/webhdfs/v1/file-products.csv?op=CREATE&write=true

    ... start loop

    >>>> https://xxx.azuredatalakestore.net/webhdfs/v1//file-products.csv?op=APPEND
    >>>> https://xxx.azuredatalakestore.net/webhdfs/v1/file-products.csv?op=APPEND&append=true

    ... end loop
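The call sequence above can be sketched as a request plan. This is only an illustration in Python, not the poster's curl script, which is not shown in the thread. `plan_upload` is a hypothetical helper; the URLs reuse the placeholder host and the second URL form of each pair listed above, and the sketch assumes that the CREATE call carries no payload and that every byte is sent through APPEND calls.

```python
# Placeholder account name, copied from the post.
BASE_URL = "https://xxx.azuredatalakestore.net/webhdfs/v1"
# 25 MiB, the packet size quoted in the post.
CHUNK_SIZE = 26214400


def plan_upload(path, file_size, chunk_size=CHUNK_SIZE):
    """Return the ordered (url, offset, length) calls for one upload.

    Hypothetical helper: it only builds the plan and performs no I/O.
    """
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size > 0")
    target = f"{BASE_URL}/{path.lstrip('/')}"
    # Assumption: a single CREATE call with an empty body comes first.
    calls = [(f"{target}?op=CREATE&write=true", 0, 0)]
    # Integer ceiling division gives the number of APPEND calls.
    chunk_count = (file_size + chunk_size - 1) // chunk_size
    for index in range(chunk_count):
        offset = index * chunk_size
        length = min(chunk_size, file_size - offset)
        calls.append((f"{target}?op=APPEND&append=true", offset, length))
    return calls
```

For a file of 500 * 1024 * 1024 bytes, the plan contains one CREATE call followed by 20 APPEND calls of 26214400 bytes each.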

    An error occurs on the calls above after a random number of iterations (sometimes after 10 loops, sometimes after 25; counted in calls, after 10 or after 40). The last call before the failure seems to time out.

    Error generated by curl : "SSL read: error:00000000:lib(0):func(0):reason(0), errno 104"

    I tried changing the following options, without success:

    CURLOPT_SSL_VERIFYHOST => '0',

    CURLOPT_SSL_VERIFYPEER => '1',

    I have no problem deleting or listing files. Only appending to files fails.

    Unfortunately, I do not have the credentials to open a ticket.

    Best.

    Sunday, March 3, 2019 2:41 AM

Answers

  • Ticket number : 119030421000624

    > Are you using a Self-Hosted Integration Runtime?  We are currently fixing a bug found in a recent release which only affects the Self-Hosted IR (3.14.6980.2).

    Nope. It's not self-hosted.

    When I set CURLOPT_TIMEOUT (https://curl.haxx.se/libcurl/c/CURLOPT_TIMEOUT.html) to 30 s, the error was different. It was a timeout error:

    "Operation timed out after 30000 milliseconds with 0 bytes received"

    The file was not appended. 

    Since midnight, the errors no longer occur. Using packets of 4 MB instead of 25 MB is the better practice.

    We think that it was a network issue. 
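The two points in this answer (smaller packets and transient network failures) can be sketched as follows. This is a minimal illustration under stated assumptions, not the fix that was actually applied: `iter_chunks` and `append_with_retry` are hypothetical helpers, and `send` stands for whatever function performs one APPEND request and raises `OSError` (which covers `ConnectionResetError` and `TimeoutError`) on a network failure. Retrying is only safe if the failed call left the file unchanged, as reported above for the timeout case; the sketch does not verify that.

```python
import time


def iter_chunks(stream, chunk_size=4 * 1024 * 1024):
    """Yield successive chunks of at most chunk_size bytes (4 MiB by default)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def append_with_retry(send, chunk, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(chunk) until it succeeds or the attempts run out.

    send is a hypothetical callable that performs one APPEND request.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return send(chunk)
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting; in a real script the default `time.sleep` would be used.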


    • Marked as answer by bastgau Wednesday, March 6, 2019 8:54 AM
    Tuesday, March 5, 2019 4:52 PM

All replies

  • Additional information.

    It seems that the data lake we use is located in Dublin.

    10  ae103-0.icr02.dub08.ntwk.msn.net (104.44.11.59)  18.483 ms ae120-0.icr01.dub07.ntwk.msn.net (104.44.11.76)  18.591 ms ae102-0.icr02.dub07.ntwk.msn.net (104.44.11.56)  18.814 ms
    11  25.67.35.74 (25.67.35.74)  18.074 ms  18.034 ms 25.67.39.11 (25.67.39.11)  17.953 ms
    12  10.10.133.81 (10.10.133.81)  20.637 ms 10.10.148.169 (10.10.148.169)  19.817 ms 10.10.140.217 (10.10.140.217)  20.033 ms
    13  25.65.32.253 (25.65.32.253)  18.162 ms 25.65.32.221 (25.65.32.221)  18.190 ms 25.65.32.232 (25.65.32.232)  18.184 ms
    14  25.73.147.89 (25.73.147.89)  18.500 ms 25.73.128.253 (25.73.128.253)  18.432 ms 25.73.128.255 (25.73.128.255)  18.334 ms
    15  10.10.194.55 (10.10.194.55)  18.390 ms 10.10.194.51 (10.10.194.51)  18.208 ms 10.10.192.47 (10.10.192.47)  18.364 ms

    And we use Data Lake Gen1.

    Sunday, March 3, 2019 2:49 AM
  • Are you using a Self-Hosted Integration Runtime?  We are currently fixing a bug found in a recent release which only affects the Self-Hosted IR (3.14.6980.2).
    Tuesday, March 5, 2019 1:00 AM
    Moderator
  • Thank you for keeping us informed.  Please let us know if you still face an issue.
    Tuesday, March 5, 2019 11:35 PM
    Moderator