Dataflow: sha2 result toBinary anomaly

  • Question

  • Hi

    I am using the sha2 function in a derived column, and the result is a hexadecimal string such as

    "e4f1bd4b1acb7ee170152bb6625346341dfd03548c5fb17415aa67c8eab59a99"

    I created 2 derived columns. The first:

    iif(isNull($$), null()+'',concat('0x',sha2(256, $$)))

    and the second:

    toBinary(sha2(256, $$))

    and sent the output to both a CSV file and a Parquet file.

    In the CSV I see what is expected:

    0x5574a4b66d6489389a73ca906652a792ab856cc1af0252a3ea6babae3e29dd35,

    5574a4b66d6489389a73ca906652a792ab856cc1af0252a3ea6babae3e29dd35

    but in the Parquet file I have

    0x5574a4b66d6489389a73ca906652a792ab856cc1af0252a3ea6babae3e29dd35

    e4f1bd4b1acb7ee170152bb6625346341dfd03548c5fb17415aa67c8eab59a99

    How could that be?

    Thanks!

    Paul


    Friday, August 23, 2019 1:23 PM

All replies

  • Hello Paul and thank you for your inquiry.  This looks like quite a puzzle!

    Parquet is a file format that supports multiple encodings (e.g. integers stored as integers, not as characters representing integers), right? What does your output schema look like? Are there any transformations after the derived columns, or any sink-side rules?

    While you check that, I'm going to try setting up a reproduction.  I am assuming that all settings other than file format and location are identical.
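
    To make the string-versus-binary distinction concrete, here is a minimal Python sketch (plain Python with hashlib, not the Data Flow expression language). It shows that one SHA-256 digest can appear as a 64-character hex string, as 32 raw bytes, or as the 64 ASCII bytes of the hex text. Which of these toBinary actually produces, and how the Parquet sink stores it, is not confirmed here. The function name sha256_forms and the input "example" are made up for illustration.

```python
import hashlib


def sha256_forms(value: str):
    """Return three representations of the SHA-256 digest of `value`.

    Hypothetical helper for illustration only; it does not reproduce
    the behavior of sha2() or toBinary() in a mapping data flow.
    """
    digest = hashlib.sha256(value.encode("utf-8"))
    hex_text = digest.hexdigest()           # 64 hexadecimal characters
    raw_bytes = digest.digest()             # the 32 raw digest bytes
    ascii_bytes = hex_text.encode("ascii")  # the hex text stored as 64 bytes
    return hex_text, raw_bytes, ascii_bytes


hex_text, raw_bytes, ascii_bytes = sha256_forms("example")

# The lengths differ depending on the representation.
print(len(hex_text), len(raw_bytes), len(ascii_bytes))

# Decoding the hex text recovers the raw digest bytes.
print(bytes.fromhex(hex_text) == raw_bytes)

# The ASCII bytes of the hex text are not the same as the raw digest.
print(ascii_bytes == raw_bytes)
```

    If a reader or viewer renders a binary column back as text, these representations can look alike or different depending on the tool, so it may help to check the column's physical type in the Parquet schema.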

    Friday, August 23, 2019 10:15 PM
  • Paul, I attempted to reproduce your anomaly.  Everything checked out on my end.  Did you find anything?
    Monday, August 26, 2019 8:26 PM
  • Paul, are you still in need of assistance?
    Tuesday, August 27, 2019 6:18 PM