Azure Data Lake Gen2 - Validating uploaded file (MD5?)

  • Question

  • I need to validate the files that I upload to ADLS Gen2 and cannot find a good way to do it.

    There is a ContentMD5 property, but I don't see any way to set or update it.

    Even when I send a HEAD request for the file, the property is missing (even though it appears in Storage Explorer).

    Is there any way to auto-generate it on the Azure side, or some other way to validate the file? (I don't want to upload a file, then download it and compare the two.)

    • Edited by Eldar2 Monday, May 4, 2020 5:42 PM
    Monday, May 4, 2020 5:40 PM

All replies

  • Hi there,

    Here is a good article on how to calculate and check Blob MD5 checksums.

    You can write an Azure Logic App to do the validation, with a trigger that fires for every blob that gets uploaded (https://docs.microsoft.com/en-us/azure/connectors/connectors-create-api-azureblobstorage).

    A C# example for Azure Blobs computes the hash as follows:

    // Validate MD5 value (retrievedBuffer is the byte[] holding the content to hash)
    var md5Check = System.Security.Cryptography.MD5.Create();
    md5Check.TransformBlock(retrievedBuffer, 0, retrievedBuffer.Length, null, 0);
    md5Check.TransformFinalBlock(new byte[0], 0, 0);
    // Get the hash value, Base64-encoded
    byte[] hashBytes = md5Check.Hash;
    string hashVal = Convert.ToBase64String(hashBytes);

    and it works.

    The MD5 hash is stored as a Base64 string, so the locally computed value has to be Base64-encoded before comparing.
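
    As a rough sketch of the comparison step, the helper below hashes a stream locally and compares the Base64 result with a ContentMD5 value obtained elsewhere. The class and method names (Md5Check, ComputeBase64Md5, Matches) are hypothetical and not from the linked answer; how the reported value is fetched from storage is assumed and left out.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper for illustration; not part of any Azure SDK.
public static class Md5Check
{
    // Hashes the stream from its current position to the end and returns
    // the MD5 digest as a Base64 string.
    public static string ComputeBase64Md5(Stream content)
    {
        using (var md5 = MD5.Create())
        {
            // ComputeHash(Stream) reads the stream in chunks, so the whole
            // file does not need to be loaded into one buffer.
            byte[] hash = md5.ComputeHash(content);
            return Convert.ToBase64String(hash);
        }
    }

    // Returns true only when a reported value exists and equals the local hash.
    // reportedContentMd5 is assumed to be the Base64 string read from the
    // file's ContentMD5 property.
    public static bool Matches(Stream content, string reportedContentMd5)
    {
        if (string.IsNullOrEmpty(reportedContentMd5))
        {
            return false; // property missing, nothing to compare against
        }
        string local = ComputeBase64Md5(content);
        return string.Equals(local, reportedContentMd5, StringComparison.Ordinal);
    }
}
```

    For example, Md5Check.Matches(File.OpenRead(localPath), reportedMd5) would compare a local file against the stored value, where localPath and reportedMd5 are placeholders. When the property is absent, as described in the question, Matches returns false rather than throwing.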

    Ref - https://stackoverflow.com/a/31185048/10653466

    Hope this helps.

    Tuesday, May 5, 2020 12:41 PM
  • Hi there,

    Just wanted to check - was the above suggestion helpful to you? If yes, please consider upvoting and/or marking it as answer. This would help other community members reading this thread.
    Monday, June 1, 2020 1:01 PM