Mapping Data Flow - CSV File with Comma in Numeric Values

  • Question

  • Hi,

    I have a mapping data flow that loads data from a CSV file into an Azure SQL database. The CSV file is defined like this:

    [Image: CSV file settings]

    The source has the following field definitions:

    source(output(
        Value1 as string,
        Value2 as string,
        Date as date 'dd/MM/yyyy',
        Value3 as string,
        Weight as decimal(10,4)))

    For the decimal column, the file contains values such as "1,026", "912", "1,051", "1,082", etc.

    [Image: Data values]

    However, it seems that the commas in the numeric values mess up the mapping data flow, and all values containing a comma are lost, as can be seen in the source data preview below. This makes little sense to me, since the file settings clearly define that values are enclosed in double quotes (the quote character), so a comma inside a value shouldn't make any difference.

    [Image: Source data preview]

    Any ideas?

    Thanks

    Pedro




    Thursday, April 2, 2020 10:41 PM

Answers

  • Hello Pedro,

    Thanks for the query. I think the issue you are seeing is because the data flow is not able to convert "1,234" to 1234.

    I tried using a derived column to replace the ",". Please try the expression:

    toDecimal(replace(Column_3,",",""),9,4)

    The image below should help you implement the solution. Please do let me know how it goes.
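
    For reference, here is a rough data flow script sketch of the derived column step (assuming the value arrives as a string column named Column_3, as in the expression above; the stream names are illustrative, and I used the decimal(10,4) precision from your source definition):

    CsvSource derive(Weight = toDecimal(replace(Column_3, ',', ''), 10, 4)) ~> CleanWeight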


    Thanks, Himanshu

    Tuesday, April 7, 2020 12:31 AM

All replies

  • Hello Pedro,

    We have not received a response from you. Are you still facing the issue? If you found a solution, would you please share it here with the community? Otherwise, let us know and we will continue to engage with you on the issue.


    Thanks, Himanshu

    Wednesday, April 8, 2020 8:41 PM
  • Hi Himanshu,

    Apologies for the late reply. Yes, the solution you provided works fine, but I see it more as a workaround than functionality that data flows should provide natively.

    I did find out that if, instead of decimal, you use a different numeric type such as double, you can specify a format (mask), which makes it work. However, even when using those masks (the longest one for double being ###,###,###,###.###), you still lose some digits if you have a very big number or more than 3 decimal places. For example, a number like 345,026.2967 gets rounded to 345026.297.
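
    For example, the source projection line for that approach would look something like this (a sketch that follows the date-format pattern in my original projection, using the mask mentioned above):

    Weight as double '###,###,###,###.###'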

    With that said, I don't really understand why we don't have the capability of defining a format for the decimal data type, or why the masks don't handle very big numbers, considering that a decimal or a double (in a SQL Server database) can hold a lot of digits. From my perspective, and considering these limitations, your approach may actually be the best one: read the value as a string and convert it afterwards; that way we won't lose any digits.

    Pedro

    Sunday, April 12, 2020 8:09 PM
  • Have you tried specifying a format in the projection tab of the source? It has worked for me, but you have to do some trial and error to find what works.

    toDecimal also allows you to specify a format string. The documentation is a bit fuzzy; you have to try many combinations before things work.
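
    For example, something along these lines might work (the format string here is just a guess that you would need to tune):

    toDecimal(Column_3, 10, 4, '#,##0.####')
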
    Monday, April 13, 2020 4:21 AM
  • Hey dataflowusr,

    Yes, as I mentioned above, I did specify a format (mask) in the projection tab of the source, but you can't use the decimal data type for that; it has to be one of the other numeric types.

    Monday, April 13, 2020 8:42 PM