ADF-DF Multiline option with delimited files not working properly

  • Question

  • Hi ADF,

    I have made a test (csv) file (see figure 1) with three columns: A, B and C. The first data row has regular values, but the second data row has a line break in column B. That value is also quoted. The file is stored in the data lake store.

    Figure 1
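
    Since the figure is not reproduced here, the following is a minimal Python sketch of how an equivalent test file could be generated. It assumes LF line endings and the standard library's default minimal quoting; the function name `build_sample_csv` is hypothetical, and the file in the data lake store may differ in line endings or encoding.

```python
import csv
import io


def build_sample_csv() -> str:
    """Return the three-column test file as text (assumed LF line endings)."""
    rows = [
        ["A", "B", "C"],     # header row
        ["a", "b", "c"],     # regular data row
        ["a", "b\nb", "c"],  # column B contains a line break
    ]
    buffer = io.StringIO()
    # With minimal quoting (the default), only the field that contains
    # the line break is wrapped in double quotes.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


sample_text = build_sample_csv()
print(sample_text)
```

    The generated text has four physical lines but only three logical records, which is the case the multiline rows option is meant to handle.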

    Now going to ADF. I made a dataset of the file with "first row as header" selected and imported the schema. All other options were left at their defaults. After that, I made the data flow, in which I selected the dataset and enabled the multiline rows option (see figure 2).

    Figure 2

    Finally, when checking the data preview, I encounter a (possible) error. Column C contains NULL values, and a new, unnamed column is created that holds the contents of column C (see figure 3). On a side note, the multiline rows option did partly work, as column B does contain the multiline value.

    Figure 3

    Am I missing something, or is this ADF behaviour incorrect?

    Monday, December 2, 2019 2:59 PM

All replies

  • I think you have marked the "first row as header" option on the source dataset. Please remove that and it should work fine.

    Thanks Himanshu

    Tuesday, December 3, 2019 12:26 AM
    Moderator
  • I checked, and it works without marking "first row as header". Will a future release also work with the header option selected?
    Tuesday, December 3, 2019 7:23 AM
  • That should also work with "first row as header" selected. If you have a sample for which it does not work, please post it here.
    • Proposed as answer by dataflowuser Wednesday, December 4, 2019 6:20 PM
    Wednesday, December 4, 2019 6:20 PM
  • Figure 1 of the original post is the sample I used. It is just a Notepad file saved as CSV, containing the following lines
    (the first row, in capitals, is the header row):

    A,B,C
    a,b,c
    a,"b
    b",c
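
    For comparison, here is a minimal sketch of how a standard CSV parser reads this sample when the first row is treated as the header. It uses Python's `csv` module rather than ADF, so it only illustrates the result I would expect, not how the data flow works internally. The `SAMPLE` string assumes LF line endings, and `parse_with_header` is a hypothetical helper name.

```python
import csv
import io

# The sample from this post, assuming LF line endings.
SAMPLE = 'A,B,C\na,b,c\na,"b\nb",c\n'


def parse_with_header(text):
    """Parse CSV text, using the first row as the column names."""
    # newline="" leaves line breaks untouched so the csv module can
    # handle the one embedded in the quoted field itself.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


rows = parse_with_header(SAMPLE)
for row in rows:
    print(row)
```

    Here both data rows come back with exactly the columns A, B and C: the line break stays inside the value of B, and C holds "c" in both rows, with no extra unnamed column.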

    Thursday, December 5, 2019 7:45 AM