none
Error while inserting to a table

    Question

  • Hi ,  im attemping to load a table based on the movie lens 1 million row data set, it works fine locally and ive been able to play around as a text file in ADL but im getting an error on this script ( error is below )

    Any ideas ?

    Dave



    DROP TABLE IF EXISTS Ratings;
    CREATE TABLE Ratings(
        MovieID int,
        UserID int,
        Rating int,
        Timestamp string,
           INDEX [idx Ratings] CLUSTERED(UserID ASC,MovieID ASC)
                                             PARTITIONED BY HASH(UserID)
    );

    DECLARE @INPUTDIR string =  "/movies/ml-1m/";
    DECLARE @RATINGSInp string = @INPUTDIR + "ratings.dat";

    @RatingsRaw =
        EXTRACT RatingLine string
        FROM @RATINGSInp
        USING Extractors.Tsv( rowDelimiter: "\n");

    @RatingsSplit =
        SELECT RatingLine,new SQL.ARRAY<string>(
               RatingLine.Split(new string[]{"::"},StringSplitOptions.None)) AS Ratings
        FROM @RatingsRaw;

    @Ratings =
        SELECT int.Parse(Ratings[0]) AS MovieID,
               int.Parse(Ratings[1]) AS UserID,
               int.Parse(Ratings[2]) AS Rating,
               Ratings[3] AS Timestamp
    FROM @RatingsSplit;

    INSERT INTO Ratings
    SELECT *
    FROM @Ratings;





    ------------------------------------------------------

    Vertex user code error
    exitcode=CsExitCode_StillActive Errorsnippet=

    INNERERROR:  

      "diagnosticCode": 195887111,
      "severity": "Error",
      "component": "RUNTIME",
      "source": "User",
      "errorId": "E_RUNTIME_USER_EXPRESSIONEVALUATION",
      "message": "Error while evaluating expression tmp_0[3]",
      "description": "Inner exception from user expression: Index was out of range. Must be non-negative and less than the size of the collection.\r\n\nParameter name: index\nCurrent row dump: \tRatingLine:\t\"58::4::961510553\"\r\n\n",
      "resolution": "",
      "helpLink": "",
      "details": "==== Caught exception System.ArgumentOutOfRangeException\n\n   at System.Collections.Generic.List`1.get_Item(Int32 index)\r\n\n   at ___Scope_Generated_Classes___.SqlFilterTransformer_3.Process(IRow row, IUpdatableRow output) in d:\\data\\CCS\\jobs\\3b2fad1e-7e65-4dc6-8acd-734c169551e8_v0\\__ScopeCodeGen__.dll.cs:line 97"

    Dave Ballantyne ---- http://sqlblogcasts.com/blogs/sqlandthelike/

    Sunday, February 7, 2016 9:04 PM

Answers

  • Hi Dave

    While looking at your code, I have the following questions:

    1. You create an array and then access the array with a fixed index. Are you sure that every row will produce at least the given amount of 4 entries into the array? Or may you get arrays that are shorter?
    2. How large in MB or GB is the original file? How did you upload it? 
      If the file is larger than about 250MB and you uploaded it as a binary and not as a row-oriented file, then you may have run into an issue that we are currently fixing that is commonly known as the row-extend boundary alignment problem.


    Michael Rys

    Thursday, February 11, 2016 10:27 PM
    Moderator

All replies

  • Curiously, Locally I converted Ratings to a CSV locally (using the above transformation and OUTPUT to Outputters.Csv())  , uploaded that to ADL and then Imported that into a table  and that worked fine ?!?!

    Dave Ballantyne ---- http://sqlblogcasts.com/blogs/sqlandthelike/

    Monday, February 8, 2016 8:12 AM
  • Hi Dave

    While looking at your code, I have the following questions:

    1. You create an array and then access the array with a fixed index. Are you sure that every row will produce at least the given amount of 4 entries into the array? Or may you get arrays that are shorter?
    2. How large in MB or GB is the original file? How did you upload it? 
      If the file is larger than about 250MB and you uploaded it as a binary and not as a row-oriented file, then you may have run into an issue that we are currently fixing that is commonly known as the row-extend boundary alignment problem.


    Michael Rys

    Thursday, February 11, 2016 10:27 PM
    Moderator