Error when pulling data from Salesforce to ADLS using ADF

  • Question

  • I am having an issue in ADF where I get the following error when pulling data from Salesforce into ADLS:

    { "errorCode": "2200", "message": "Failure happened on 'Source' side. ErrorCode=UserErrorUnclassifiedError,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Unknown error.,Source=Microsoft.DataTransfer.ClientLibrary.Odbc.OdbcConnector,''Type=System.Data.Odbc.OdbcException,Message=ERROR [22007] [Microsoft][Support] (40481) Invalid date format for 'false'. The format should be [-]YYYY-MM-DD.,Source=Microsoft Salesforce ODBC Driver,'", "failureType": "UserError", "target": "CopyToADLS" }

    I am able to run the query successfully in the Salesforce Workbench.  Here is my where clause on my SOQL statement:

    where SystemModStamp > 2018-06-30T06:00:36Z

I have tried setting the copy activity to skip incompatible rows and I still get the error.  This only started happening Wednesday, after ADF V2 became Generally Available, so I'm not sure if something changed then.  The odd thing is that I only get this error for certain objects in Salesforce (about half of what I am pulling in).
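Given that the error complains about an invalid date format for 'false', it may be worth verifying whatever value gets interpolated into that where clause. A small sketch (a hypothetical helper, not part of ADF) that checks a literal has the dateTime shape Salesforce expects:

```python
import re
from datetime import datetime

# Hypothetical helper, not part of ADF: confirm a SOQL dateTime literal
# has the ISO-8601 UTC shape Salesforce expects, e.g. 2018-06-30T06:00:36Z.
DATETIME_LITERAL = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")

def is_valid_soql_datetime(literal: str) -> bool:
    if not DATETIME_LITERAL.match(literal):
        return False
    try:
        # Reject values like 2018-13-99T99:00:00Z that match the regex shape.
        datetime.strptime(literal, "%Y-%m-%dT%H:%M:%SZ")
        return True
    except ValueError:
        return False

print(is_valid_soql_datetime("2018-06-30T06:00:36Z"))  # True
print(is_valid_soql_datetime("false"))                 # False
```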

    Monday, July 2, 2018 2:00 PM

Answers

  • So, I just looked back at my original support request with Microsoft from over a year ago.  This was a defect that Microsoft fixed in the Azure Integration Runtime, and it was resolved for me over a year ago.  We're not using ADLS Gen 1 anymore, so I'm not sure whether this specific issue has resurfaced.
    • Marked as answer by haveyoumetcp Monday, September 16, 2019 10:52 PM
    Monday, September 16, 2019 10:52 PM

All replies

  • Hi, ADF GA shouldn't be a problem. The error message indicates there's an invalid format in the query. Could you provide the format of 'SystemModStamp'? It seems to be 'YYYY-MM-DD'. If so, please reference https://docs.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions#expressions and use the 'formatDateTime' function to align the format. Feel free to post other questions.
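    For instance, a dynamic-content expression along these lines (the object, fields, and `lastRunTime` variable here are placeholders, not taken from the pipeline in question) would yield the 2018-06-30T06:00:36Z shape shown in the question:

    ```
    @concat('SELECT Id FROM Account WHERE SystemModStamp > ',
        formatDateTime(variables('lastRunTime'), 'yyyy-MM-ddTHH:mm:ssZ'))
    ```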
    Tuesday, July 3, 2018 12:29 AM
  • Hi,

    1. From the error message, it seems you are using the ODBC connector to access Salesforce. There is actually a Salesforce connector in ADF; you could give it a try. It is also supported in the UI.

    2. For the failed activity, could you provide the activity run id?

    Thanks


    Tuesday, July 3, 2018 4:33 AM
  • I am using the Salesforce connector in ADF.

    Here is one of the activity IDs that failed:

    135cf844-fe4c-4bd9-8002-b9bf92a75be1

    Tuesday, July 3, 2018 2:25 PM
  • No, the problem is not with the query, as I am able to run the exact same query within the Salesforce Workbench.  I am using formatDateTime to set the correct format.  When I look at the logs for that run, it actually says that rows read was 1,001 and rows written was 1,000.  So, in that case, it wouldn't be an issue with the query.

    This is the output from a failed activity:

    { "dataRead": 2577632, "dataWritten": 0, "filesWritten": 0, "rowsRead": 1001, "rowsCopied": 1000, "copyDuration": 12, "throughput": 209.768, "errors": [ { "Code": 9602, "Message": "Failure happened on 'Source' side. ErrorCode=UserErrorOdbcOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ODBC error 'HY000'.,Source=Microsoft.DataTransfer.ClientLibrary.Odbc.OdbcConnector,''Type=System.Data.Odbc.OdbcException,Message=ERROR [HY000] [Microsoft][Support] (40321) String value '<a href=\"https://c.na88.content.force.com/servlet/servlet.FileDownload?file=0151Y00000FQNdc\" target=\"_blank\">Click here for script</a>' resulted in an invalid numeric.,Source=Microsoft Salesforce ODBC Driver,'", "EventType": 0, "Category": 5, "Data": { "FailureInitiator": "Source" }, "MsgId": null, "ExceptionType": null, "Source": null, "StackTrace": null, "InnerEventInfos": [] } ], "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US 2)", "usedDataIntegrationUnits": 4, "usedParallelCopies": 1, "executionDetails": [ { "source": { "type": "Salesforce" }, "sink": { "type": "AzureDataLakeStore" }, "status": "Failed", "start": "2018-07-03T06:06:40.7462359Z", "duration": 12, "usedDataIntegrationUnits": 4, "usedParallelCopies": 1, "detailedDurations": { "queuingDuration": 3, "timeToFirstByte": 1, "transferDuration": 8 } } ] }

    Tuesday, July 3, 2018 2:28 PM
  • From the runId as well as the error message you saw, this activity failed due to: String value '<a href=\"https://c.na88.content.force.com/servlet/servlet.FileDownload?file=0151Y00000FQNdc\" target=\"_blank\">Click here for script</a>' resulted in an invalid numeric.

    It seems like you are converting string values to numeric, but some data in the source can't be converted to numeric.
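    This failure mode can be reproduced outside ADF; a quick sketch (illustrative only, not the driver's actual code) of what happens when that HTML string is coerced to a number:

    ```python
    # Illustrative only: the value the driver choked on is an HTML anchor
    # stored in a Salesforce text field, not a number.
    value = ('<a href="https://c.na88.content.force.com/servlet/servlet.FileDownload'
             '?file=0151Y00000FQNdc" target="_blank">Click here for script</a>')

    try:
        float(value)  # mimics a string-to-numeric coercion of that column
    except ValueError:
        print("invalid numeric, as the ODBC driver reported")
    ```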


    Tuesday, July 3, 2018 3:33 PM
  • From the runId as well as the error message you saw, this activity failed due to: String value '<a href=\"https://c.na88.content.force.com/servlet/servlet.FileDownload?file=0151Y00000FQNdc\" target=\"_blank\">Click here for script</a>' resulted in an invalid numeric.

    It seems like you are converting string values to numeric, but some data in the source can't be converted to numeric.


    I'm not doing any conversions.  I am pulling a query from Salesforce into a CSV file in Azure Data Lake Store.  There is no schema defined anywhere, not even in the ADF pipeline.  And this was working fine until Wednesday.

    It looks like there are multiple errors depending on the object being queried in Salesforce:

    • ....Click here for script</a>' resulted in an invalid numeric.....
    • ..... Invalid date format for 'false'. The format should be [-]YYYY-MM-DD.,Source=Microsoft Salesforce ODBC Driver....

    So, here's a little more information:

    What this process does is pull delta loads from Salesforce into Azure Data Lake Store.  If I run a full historical load, that works fine, both without a where clause and with a where clause of SystemModStamp > 1900-01-01T00:00:00Z.

    Then, I am able to run delta loads after that with no problem.  However, when I set the date back to where it was originally getting the error, I get the error again.  So, I have a workaround, but I just wonder if this is going to happen again.  It seems like there is a bug somewhere in ADF with this.

    Tuesday, July 3, 2018 5:47 PM
  • Update:

    This is still happening.

    RunIds: 

    • b53aa4db-65e3-4a12-9402-beda24807179
    • 582596a0-d107-450c-9d9d-7f9cdfd6cacd
    Wednesday, July 4, 2018 5:41 PM
  • No, the problem is not with the query as I am able to run the exact same query within the Salesforce Workbench. [...] Rows read was 1,001 and rows written was 1,000. [...]


    As there were 1,001 rows read from Salesforce and only 1,000 written into ADLS, the issue should be in the 1,001st row in Salesforce.  Why not take a look at that row in the Salesforce Developer Console?
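    The read/write gap is easy to flag programmatically. A minimal sketch (a hypothetical helper, assuming the copy-activity output JSON shape shown earlier) that computes how many rows were read but never written:

    ```python
    import json

    # Hypothetical helper: given a copy-activity output payload, report how many
    # rows the source read that the sink never wrote (0 means no mid-copy failure).
    def rows_lost(activity_output: str) -> int:
        out = json.loads(activity_output)
        return out["rowsRead"] - out["rowsCopied"]

    sample = '{"rowsRead": 1001, "rowsCopied": 1000}'
    print(rows_lost(sample))  # 1 -> the 1001st row was read but never written
    ```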
    Thursday, July 5, 2018 3:45 PM
  • As there were 1001 rows read from Salesforce and 1000 written into ADLS, the issue should be in the 1001st row in Salesforce.  Why not take a look at that row in the Salesforce Developer Console?

    There is nothing wrong with row 1001.  I think it just has something to do with misaligned columns or something.  This was working without any issue before last Wednesday, when ADF V2 became Generally Available and Data Lake Store was set to V1.  Nothing in my code changed.

    I can set the date back for an incremental pull to any previous date and I still get this error, so row 1001 is not always the same row.


    • Edited by haveyoumetcp Thursday, July 5, 2018 4:57 PM typo
    Thursday, July 5, 2018 4:04 PM
  • Could you please open an Azure support ticket for this case?  The assigned engineer will look deeper into your case and figure out the reason.
    Friday, July 6, 2018 8:50 AM
  • Did you ever get a resolution?  I am having the exact same issue: reads 2001, wrote 2000.

    thanks!

    Wednesday, August 28, 2019 1:35 PM
  • Same error at 2001 rows read, 1992 written:
    "dataRead": 433623820, "dataWritten": 428833001, "sourcePeakConnections": 1, "sinkPeakConnections": 2, "rowsRead": 2001, "rowsCopied": 1992, "copyDuration": 731, "throughput": 579.29,

    Monday, September 16, 2019 6:29 PM