none
U-SQL extract column names from csv file

    Question

  • Hi,

    I would like to extract data from a csv file that contains the column names in the first row.

    Normally, if my data columns were 'name' and 'age', I would do:

    EXTRACT 
    name string, age int 
    FROM "/filename.csv" 
    USING Extractors.Csv(skipFirstNRows:1)

    However, what if I want to get the column names from the first row of the file, instead of hardcoding the columns 'name' and 'age'? Is there a way to do this in U-SQL?

    Monday, June 26, 2017 6:36 PM

All replies

  • Currently, such functionality does not exist.

    If you feel it should be added, please upvote the request for this capability at: https://feedback.azure.com/forums/327234-data-lake/suggestions/11684883-enable-code-u-sql-to-read-header-row

    • Proposed as answer by Jorg KleinMVP Wednesday, June 28, 2017 10:48 AM
    Tuesday, June 27, 2017 1:22 PM
  • I agree. It's inefficient to extract the header from a file just so you can include it out in your output with something like the following code. I guess another option is to have a separate file that contains just the first line of your input file and query it to get the header line.

    @select_input =
        SELECT p_ono,
               supplierno,
               contact
        FROM @input
        WHERE p_ono LIKE "BA%"
    ORDER BY p_ono
    FETCH 100 ROWS;
    
    @header_row =
        SELECT p_ono,
               supplierno,
               contact
        FROM @with_header
    WHERE p_ono == "P_ONO";
    
    @all_output =
        SELECT *
        FROM @header_row
        UNION ALL
        SELECT *
        FROM @select_input

    Thursday, August 31, 2017 12:05 PM
  • One other thing worth noting: if you preview files in Visual Studio such as through the Cloud Explorer or the Job Graph, you'll find a "Create EXTRACT Script" button. Clicking that creates a generic extract statement with all columns of your previewed file selected. There's a check box that says "File Has Header Row" and if the file does have a header row and you select it, then meaningful names are used in your select.

    This is my favorite way to create detailed EXTRACT statements.


    Thursday, August 31, 2017 1:18 PM