Vertex SV1_Extract[0][0] Failed

    Question

  • I have a very simple CSV file with a single column (Year) whose value, 2014, is repeated across 471k rows.

    My script looks as follows:

    @flightDelays =
        EXTRACT Year int
        FROM "/Zoiner/ExtentAligned/On_Time_On_Time_Performance_2014_1_OneColumn.csv"
        USING Extractors.Csv();

    @results =
        SELECT *
        FROM @flightDelays;

    OUTPUT @results
        TO "/Zoiner/Output/On_Time_On_Time_Performance_2014_1.csv"
        USING Outputters.Csv();

    I uploaded the CSV file using Visual Studio and selected the "As Row-Structured file" option, so extent misalignment should not be an issue.

    There are no null or missing values in the column.

    The file DOES contain a header (with the column name). 

    Whenever I run the script, I get the following error:

    ERROR

    VertexFailedFast. Vertex failure triggered quick job abort. Vertex failed: SV1_Extract[0][0] with error: Vertex user code error.

    DESCRIPTION
    Vertex failed with a fail-fast error

    RESOLUTION

    DETAILS

    Vertex SV1_Extract[0][0].v2 {43B20D9E-E63F-48AF-8E9A-FFFAE288FCB8} failed 
    Error:
    Vertex user code error
    exitcode=CsExitCode_StillActive Errorsnippet=An error occurred while processing adl://adlmvp.azuredatalakestore.net/Zoiner/ExtentAligned/On_Time_On_Time_Performance_2014_1_OneColumn.csv 



    Any insights greatly appreciated!


    Friday, April 8, 2016 5:55 PM

Answers

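  • You have probably run into the issue that the header row is a string and cannot be cast to the requested int. Until we provide the skipFirstNRow capabilities, you can handle the header row with one of the following options (or write a custom extractor). Both extract every column as a string first; option 1 filters out the known header value, and option 2 keeps only the rows whose id parses as an integer: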

    @r1 =
      EXTRACT id string,
              name string,
              street string,
              city string,
              zip string,
              age string
      FROM "/temp/CsvWithHdr.csv"
      USING Extractors.Csv();
    
    // Option 1: the header text is known, so drop the row whose id equals it.
    @option1_knownheader =
       SELECT Int32.Parse(id) AS id,
              name,
              street,
              city,
              zip,
              String.IsNullOrEmpty(age) ? (Int16?) null : (Int16?) Int16.Parse(age) AS age
       FROM @r1
       WHERE id != "Id";
    
    OUTPUT @option1_knownheader
    TO "/temp/opt1.csv"
    USING Outputters.Csv();
    
    // Option 2: the header text is not known, so keep only rows whose id parses as an integer.
    @option2_tryparse =
       SELECT Int32.Parse(id) AS id,
              name,
              street,
              city,
              zip,
              String.IsNullOrEmpty(age) ? (Int16?) null : (Int16?) Int16.Parse(age) AS age
       FROM @r1
       WHERE ((Func<string, bool>)(p => { Int32 dummy; return Int32.TryParse(p, out dummy); }))(id);
    
    OUTPUT @option2_tryparse
    TO "/temp/opt2.csv"
    USING Outputters.Csv();


    Michael Rys

    Monday, April 11, 2016 11:36 PM
    Moderator
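
    Applied to the one-column file from the question, the first option might look like the sketch below. It assumes the header text is exactly "Year" (the question does not quote the header verbatim), and the rowset name @raw is introduced here only for illustration.

    ```usql
    // Read the column as a string so the header row does not break the extraction.
    @raw =
        EXTRACT Year string
        FROM "/Zoiner/ExtentAligned/On_Time_On_Time_Performance_2014_1_OneColumn.csv"
        USING Extractors.Csv();

    // Assumption: the header row contains the literal text "Year".
    // Drop that row, then convert the remaining values to int.
    @flightDelays =
        SELECT Int32.Parse(Year) AS Year
        FROM @raw
        WHERE Year != "Year";

    OUTPUT @flightDelays
    TO "/Zoiner/Output/On_Time_On_Time_Performance_2014_1.csv"
    USING Outputters.Csv();
    ```

    If the header text is not known in advance, the TryParse filter from the second option can be used in the WHERE clause instead.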

All replies

  • I should have guessed that it was trying to convert the values in the header row. Your solution makes total sense and worked for me, thank you.

    As an aside, I would support any effort to surface a more user-friendly error message such as "Cannot cast value 'Year' to int32"; my psychic debugger doesn't always work ;-)

    Tuesday, April 19, 2016 12:54 PM
  • Agreed. Can you please file an item at http://aka.ms/adlfeedback?

    Michael Rys

    Wednesday, April 20, 2016 6:05 PM
    Moderator