locked
Polybase, empty string becomes NULL #sqldatawarehouse #polybase #datafactory #usetypedefault RRS feed

  • Question

  • I have a simple table in a Azure SQL Database that I want to copy to SQL Data Warehouse with ADF.

    Problem: I can't make the exact representation/copy of a row if the row contains decimals which are NULL and/or varchar which are NULL.

    If I un-check "use type default", then copy a row with column which has empty string (''), result is: NULL. Expected result: Empty string.

    If I check  "use type default", empty string becomes empty string, but then numerics which are null in source fails when writing to sink with "Error converting data type VARCHAR to DECIMAL".

    Empty string is not the same as NULL value.

    Has anyone else been having the same problem?

    Wednesday, November 13, 2019 6:52 AM

All replies

  • Hi Henrik,

    Here's a similar issue that might help you as well - https://social.msdn.microsoft.com/Forums/en-US/6bbb8196-6709-4fdc-a1f8-ca6fec78632e/copy-activity-error-converting-datatype-from-varchar-to-decimal?forum=AzureDataFactory

    The right way to do is to unselect "Use type default" option (as false) in copy activity sink -> PolyBase setings. "USE_TYPE_DEFAULT" is a PolyBase native configuration which specifies how to handle missing values in delimited text files when PolyBase retrieves data from the text file. For more info, please refer this doc.

    To handle null values, copy activity provides sqlWriterStoredProcedure in SQL database sink, you can define how to apply data in TSV file into your target table in the store procedure. For example, you could replace the NULL value with 0 in the store procedure, which depends on your business logic.

    Hope this helps.


    Wednesday, November 13, 2019 7:18 AM
  • Chirag,


    Plague or cholera? I read your post prior to asking this question, and I believe that I posted explicitly that checking the "use type default" causes an error, and unchecking the flag causes NULL values for blank string. I suspect the documentation is not in sync with how the product works.

    Regarding the strategy of using sqlWriterStoredProcedureName: 1) This is not applicable for SQL Data Warehouse. 2) Replacing blanks in the source of the the pipeline with some awkward character, for then later doing a update/replace on the data in the sink is unworthy. Seriously.

    Wednesday, November 13, 2019 8:37 AM
  • Hi Henrik,

    Thanks for further clarifying the ask. I am working internally with the teams to get an answer and will update you as soon as I get one.

    Friday, November 15, 2019 11:15 AM
  • After some thinking, I would like to clearify:

    The documentation seems to be OK in regards to how Polybase works.The problem is that a blank string becomes NULL, and that you can't really mitigate this without replacing blanks with some other character during in-flight queries, and the re-replacing the rows in question when data has landed in the sink.

    This gives us ETL, not ELT. This also makes ADF not able to provide a true copy of the data.

    This really doesn't sound like a complicated problem for the product team to solve.

    Friday, November 15, 2019 12:07 PM
  • Use dataflow to solve ETL. It will write to DW the right way and you can handle nulls or empty strings
    • Proposed as answer by dataflowuser Friday, November 15, 2019 6:48 PM
    Friday, November 15, 2019 6:48 PM
  • Hi Henrik,

    The Data Flow feature of Azure Data Factory is the recommended way to achieve what you want to. It offers a lot of options to transform the data into your desired format. Please have a look at the following doc for details :

    https://docs.microsoft.com/en-us/azure/data-factory/transform-data#transform-natively-in-azure-data-factory-with-data-flows

    Hope this helps.

    Monday, November 18, 2019 8:09 AM
  • Hi Henrik,

    Just wanted to check - was the above suggestion helpful to you? If yes, please consider upvoting and/or marking it as answer. This would help other community members reading this thread.

    Thursday, November 21, 2019 10:24 AM
  • Hi Henrik,

    We haven't heard back from you in quite some time - was the above suggestion helpful to you? If yes, please consider upvoting and/or marking it as answer. This would help other community members reading this thread.

    Monday, November 25, 2019 9:45 AM