ADFv2 Redirect Incompatible Rows

  • Question

  • I am sending incompatible rows to a separate file instead of aborting the copy activity, but I have two comments/issues:

    1) This does not work when trying to store the file in a Data Lake. The activity succeeds, but the row is never written to a file; it only works with Blob storage. I should also mention that you can't store your U-SQL files in a Data Lake either; those fail as well, so you can only store them in Blob storage. I hope these issues are fixed, because it makes sense to store error files and U-SQL files in the Data Lake if you are already leveraging it for your project's data files.

    2) We can set a folder path, but a file name is not yet supported. This means every file name is randomly generated, and you can't specify the destination down to a file name. What I would like to see is the ability to store the error file in the same directory (in the Data Lake) as the data file that errored. I am partitioning the Data Lake by date for incremental files, and it makes sense to keep the error file in that same directory. Or at least allow me to specify a file name that matches the successful file name, for instance RAW>Date>MyInputFile.csv and RAW>Date>ErrorLog>MyInputFile.csv.
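    For reference, this redirect behavior is configured in the copy activity's typeProperties. A minimal sketch is below; the source/sink types, linked service name, and path are placeholders, not taken from my actual pipeline:

    ```json
    "typeProperties": {
        "source": { "type": "AzureDataLakeStoreSource" },
        "sink": { "type": "AzureDataLakeStoreSink" },
        "enableSkipIncompatibleRow": true,
        "redirectIncompatibleRowSettings": {
            "linkedServiceName": {
                "referenceName": "ErrorLogStorageLinkedService",
                "type": "LinkedServiceReference"
            },
            "path": "errorlogs/myfolder"
        }
    }
    ```

    Note that only a folder path can be given here, which is exactly the limitation described above.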

    Monday, March 12, 2018 4:58 PM

All replies

  • When you send incompatible rows to a separate file, it would also be great if the activity returned a message that made the failed rows visible. Right now the activity status returns Succeeded, which is fine, but it would be nice if a warning could be captured in the error.message property or a new property like output.message. We can tell that a row has been skipped because the rowsSkipped property is set, but we have to manually check the log file to see why. The error is contained only in the log file and is not captured in the error property.

    A better option might be to put the error message in the activity's error property, set the activity status to "Warning" instead of "Succeeded", and increment rowsSkipped as before.
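    Until something like a Warning status exists, one workaround is to check rowsSkipped yourself after the copy, for example with an If Condition whose true branch runs an alerting or logging step. A sketch, where the copy activity name "CopyToStaging" is hypothetical:

    ```json
    {
        "name": "CheckForSkippedRows",
        "type": "IfCondition",
        "dependsOn": [
            { "activity": "CopyToStaging", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
            "expression": {
                "value": "@greater(activity('CopyToStaging').output.rowsSkipped, 0)",
                "type": "Expression"
            },
            "ifTrueActivities": []
        }
    }
    ```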

    • Edited by AzureFrank Monday, March 12, 2018 7:34 PM
    Monday, March 12, 2018 7:28 PM
  • 1. Can you share your run ID? We do support storing the error files in Data Lake; there might be a permissions issue.

    2. Thanks for the feedback. There may be multiple error files rather than a single one, so we store all the error files under a folder named [copy-activity-run-id], with [auto-generated-GUID].csv as the file name for each error file.
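    Put concretely, the naming scheme described above would produce a layout along these lines under the configured redirect path (folder and GUID names illustrative):

    ```
    <redirect-path>/
        <copy-activity-run-id>/
            <auto-generated-GUID-1>.csv
            <auto-generated-GUID-2>.csv
    ```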

    Tuesday, March 13, 2018 2:17 AM
  • How did you monitor the activity result? The skipped-row information can be found in the activity output; below is one sample:

        "output": {
            "dataRead": 95,
            "dataWritten": 186,
            "rowsCopied": 9,
            "rowsSkipped": 2,
            "copyDuration": 16,
            "throughput": 0.01,
            "redirectRowPath": "https://myblobstorage.blob.core.windows.net//myfolder/a84bf8d4-233f-4216-8cb5-45962831cd1b/",
            "errors": []
        },

    Tuesday, March 13, 2018 2:22 AM
  • > 1. Can you share your run id? We support storing the error files in data lake and there might be some permission issues.
    >
    > 2. Thanks for the feedback. The error files might be multiple files rather than single one, so we store all the error files under the path as [copy-activity-run-id] with [auto-generated-GUID].csv as file name for each error file.

    Thank you for the reply. I should also mention that I am using the UI, not PowerShell.

    I created a test dataset and pipeline to show the problem. The pipeline run ID is 0abb3aa8-f7b7-4719-9cd2-6b308a639ca1.

    The activity does not fail. It moves my file from a RAW folder in the Data Lake into a STAGING folder, and the "bad" row of data is removed from the file in the STAGING folder, so it is partially working. But the bad row is not showing up in an error file in the Data Lake: no file is created in the folder I configured, and there is no file anywhere named with the auto-generated GUID.

    > How did you monitor the activity result? The skipped information can be found in the output, below is one sample:
    >
    >     "output": {
    >         "dataRead": 95,
    >         "dataWritten": 186,
    >         "rowsCopied": 9,
    >         "rowsSkipped": 2,
    >         "copyDuration": 16,
    >         "throughput": 0.01,
    >         "redirectRowPath": "https://myblobstorage.blob.core.windows.net//myfolder/a84bf8d4-233f-4216-8cb5-45962831cd1b/",
    >         "errors": []
    >     },

    These are the properties I am logging.
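    For anyone reading along, output properties like these can be pulled into a downstream logging step with Data Factory expressions; the copy activity name "CopyToStaging" below is a placeholder:

    ```
    @activity('CopyToStaging').output.rowsSkipped
    @activity('CopyToStaging').output.redirectRowPath
    @activity('CopyToStaging').output.errors
    ```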

    • Edited by AzureFrank Tuesday, March 13, 2018 2:42 PM
    Tuesday, March 13, 2018 2:37 PM
  • Yes. At the very least, in the file that combines the error rows from all source files, please include a file-path field that identifies the exact file each error row belongs to.
    Saturday, June 22, 2019 7:29 AM
  • Hi rahulbiswas500,

    Thank you for your feedback.

    Please feel free to post your idea/suggestion in the UserVoice forum: Azure Data Factory feedback forum

    Posting your idea in the forum gives other members of the community an opportunity to 'up-vote' your post, which increases its priority. Microsoft product engineers closely monitor the features requested by users and will review them and take appropriate action.

    Hope this helps...


    Monday, June 24, 2019 7:12 PM