locked
how to handle multiple double quotes in CSV file single column RRS feed

  • Question

  • My source file has nvarchar and numeric columns. Numeric column has thousand separator, do identify the value comes with double quotes. When i use the quoteChar ("\"") in file format the numeric value works fine. same time the nvarchar column (Name) has multiple double quotes between the data, if i use the quoteChar the values are split into further more columns based on the number of double quotes. is there any fix/solution for this?
    Tuesday, December 18, 2018 5:28 AM

All replies

  • Please check the response provided on Stack Overflow for your query. You may continue your discussion there to avoid duplication.

    Tuesday, December 18, 2018 9:38 AM
  • Hi Jegan,

    In addition to what AshokPeddakotla-MSFT answered,

    You could either ask the creators of your input file to use a single delimiter (like most people do) or else you have to do some pre-processing.

    For pre-processing, you can do one of the following :-

    • You could this with a U-SQL step which corrects the file and then import that file into your SQL DB
    • If your file is on blob storage, then you can now BULK INSERT into Azure SQL DB, and having just given this a quick test, BULK INSERT can cope with multiple delimiters. Eg:
    DROP TABLE IF EXISTS #tmp
    
    CREATE TABLE #tmp (
        a VARCHAR(50),
        b VARCHAR(50),
        c VARCHAR(50),
    )
    GO
    
    BULK INSERT #tmp
    FROM 'D:\Data Lake\USQLDataRoot\input\input91.txt'
    WITH ( FIELDTERMINATOR  = '\"\"', DATA_SOURCE = '<yourDataSource>' );

    • Since ADF supports copy data from HTTP endpoint,so I suggest you following steps:
      1. Using Azure Functions to read your file via stream way line by line.
      2. Replace all the '\"\"' with '\"' and set it in the response.
      3. ADF set column delimiter to '|' and copy data from HTTP endpoint.

      Hope this helps.


    Tuesday, December 18, 2018 9:47 AM
  • Solution for this kind of problem is here: https://blog.wanchap.com/2019/06/handling-double-quoted-csvs-in-azure.html

    Ratomir Vukadin - Software Engineer

    Tuesday, March 10, 2020 5:56 PM