none
Issue with copying Rest API data to Azure SQL DB RRS feed

  • Question

  • When I copy Rest API data to Azure SQL, there is weird issue. If I use ADF V2 Copy Wizard to generate the pipeline, it can insert 3 following records into my Azure SQL DB table.

    taskId taskName error error_description
    1 abc success success
    2 cde success success
    3 efg sucess success

    If I do any tiny modification like modifying the pipeline description on this pipeline, after that, if I run the updated pipeline, it can only copies following 1 record into the azure sql table.

    taskId taskName error error_description
    null null success success

    note:
    Rest API returned data format is as followings:

    {
        "data": [       
            {
                "taskId": 1,
                "taskName": "abc"
            },
            {
                "taskId": 2,
                "taskName": "efg"
            },
            {
                "taskId": 3,
                "taskName": "ggg"
            }
        ],
        "error": "success",
        "error_Description": "successful"
    }
    

    The pipeline json code is as followings:

    {
        "name": "pipeline3",
        "properties": {
            "activities": [
                {
                    "name": "Copy_v9u",
                    "type": "Copy",
                    "policy": {
                        "timeout": "7.00:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [
                        {
                            "name": "Source",
                            "value": "api/getTask"
                        },
                        {
                            "name": "Destination",
                            "value": "[abc].[table1]"
                        }
                    ],
                    "typeProperties": {
                        "source": {
                            "type": "RestSource",
                            "httpRequestTimeout": "00:01:40",
                            "requestInterval": "00.00:00:00.010"
                        },
                        "sink": {
                            "type": "AzureSqlSink",
                            "preCopyScript": "truncate table abc.table1"
                        },
                        "enableStaging": false,
                        "translator": {
                            "type": "TabularTranslator",
                            "mappings": [
                                {
                                    "source": {
                                        "path": "taskId"
                                    },
                                    "sink": {
                                        "name": "taskId",
                                        "type": "Int32"
                                    }
                                },
                                {
                                    "source": {
                                        "path": "taskName"
                                    },
                                    "sink": {
                                        "name": "taskName",
                                        "type": "String"
                                    }
                                },
                                {
                                    "source": {
                                        "path": "$.error"
                                    },
                                    "sink": {
                                        "name": "error",
                                        "type": "String"
                                    }
                                },
                                {
                                    "source": {
                                        "path": "$.error_Description"
                                    },
                                    "sink": {
                                        "name": "error_Description",
                                        "type": "String"
                                    }
                                }
                            ]
                        }
                    },
                    "inputs": [
                        {
                            "referenceName": "SourceDataset_v9u",
                            "type": "DatasetReference"
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "DestinationDataset_v9u",
                            "type": "DatasetReference"
                        }
                    ]
                }
            ]
        },
        "type": "Microsoft.DataFactory/factories/pipelines"
    }
    

    On the mapping of the copy activity, I checked the "Collection Reference" since the "data"on json result contains the array data.

    I think it is caused by current ADF doesn't save collectionReference on the json file and once I republish it, the collectionReference information is lost somehow. My previous project is working file, I compared the pipeline json file, I found previous one uses "schemaMapping" instead of "mappings" and previous one contains "collectionReference", however current one doesn't contains "collectionReference". Please see following code for 3 months' ago pipeline copying Rest API to sql db:

    "translator": {
                            "type": "TabularTranslator",
                            "schemaMapping": {
                                "$.count": "count",
                                "$.next": "next",
                                "$.previous": "previous",
                                "product_id": "product_id",
                                "product_name": "product_name",
                                "city": "city",
                                "country": "country"
                            },
                            "collectionReference": "$.results"
                      }
    

    Can you please advise the solution to resolve this issue?

    Sunday, May 19, 2019 11:15 AM

All replies

  • Can you please point me to the wizard which you are using ? 

    Well I think that once you update the ADF the schemamapping setting is getting lost . As mentioned by you the JSON payload is 

    {
        "data": [       
            {
                "taskId": 1,
                "taskName": "abc"
            },
            {
                "taskId": 2,
                "taskName": "efg"
            },
            {
                "taskId": 3,
                "taskName": "ggg"
            }
        ],
        "error": "success",
        "error_Description": "successful"
    }

    and once you update the adf this is the output .

    null null success success ,

    please note that both of the members ( taskId and TaskName )of the object is lost . 

    You can read more on the schema mapping here .

    https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#alternative-schema-mapping


    Thanks Himanshu

    Tuesday, May 21, 2019 10:37 PM
    Moderator
  • Thanks for reporting this issue. The 'collectionReference' property indeed got lost after republishing the pipeline. The fix for this issue will be rolled out to product at the end of this week. Please feel free to create pipeline from Copy wizard as a workaround. Thanks.

    Also, as you can see, we previously used "schemaMapping" to describe mapping relationship between source and sink. Now it's replaced by "mappings". Both of them are working and it is not the cause of the issue you met.

    The useful doc link for you: https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping

    Wednesday, May 22, 2019 2:30 AM