Error in handling multiple table columns from Azure SQL DB in Data Flows

  • Question

  • Hi Everyone,

     

    I am new to the Data Factory / Data Flow services and have been trying to do an incremental copy from a list of SQL tables to another set of SQL tables in Azure. I am trying to achieve this by doing the transformation in Data Flows. I got the general idea from following this article.

     

    However, that article takes only one table as a source, while in my case I have to take a list of SQL tables. To do that I have used Lookup and ForEach activities as described below, and I would like to think that I have used the ADF parameters in the correct way.

    But I am still unable to get my source tables copied into the sink tables. Below are the steps I have done so far:

     

    1. My source tables in the SQL DB end with "_Staging" (TableA_Staging, TableB_Staging, etc.) and the sink tables are TableA, TableB, and so on.
    2. All tables have different columns, with a Load Date column and a Table Name column in common. Load Date is the date column that will be used to do the incremental update.
    3. I have a watermark table with 3 columns: TableName, which contains all the source table names; Watermark, which contains the value "Load_Date"; and WatermarkValue, which contains the Load Date value for each table, the same as highlighted in the article above (a sketch of this schema follows this list).
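
    For reference, here is a minimal sketch of the watermark table described in step 3, assuming SQL Server data types; the table and column names below follow the description above, not a confirmed schema:

        CREATE TABLE dbo.WatermarkTable (
            TableName      NVARCHAR(255) NOT NULL,  -- source table name, e.g. 'TableA_Staging'
            Watermark      NVARCHAR(128) NOT NULL,  -- name of the watermark column, here 'Load_Date'
            WatermarkValue DATETIME      NOT NULL   -- last Load_Date value copied for this table
        );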

    In the Data Factory pipeline:

    1. Added a Lookup activity with a query to get all tables ending with _Staging: SELECT table_name FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_name LIKE '%Staging%' (a tightened version of this query is sketched after this list).
    2. Added a ForEach activity with its items set to the output of the Lookup activity, @activity('LookupAllStagingTable').output.value, and then added the Data Flow activity inside the ForEach.
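
    Note that LIKE '%Staging%' also matches any table that merely contains "Staging" somewhere in its name. A stricter pattern that matches only names ending in _Staging would be the following sketch (the underscore must be escaped because it is a single-character wildcard in LIKE):

        SELECT table_name
        FROM   information_schema.tables
        WHERE  table_type = 'BASE TABLE'
          AND  table_name LIKE '%\_Staging' ESCAPE '\';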

     

    In Data Flow:

     

    1. Defined a source with a dataset parameter: dbo.dataset@tablename

    2. Defined a sink with a dataset parameter: dbo.dataset@mytable

    3. In the Data Flow, defined 3 parameters: mytablename, set to item().table_name from the Lookup; Load_Date, a string parameter set to concat("LOAD_DATE"); and TableName, a string parameter set to concat("TableName").

    4. In the Source block in the Data Flow, set the source to a query: concat('Select * FROM ', $mytablename) (note the space after FROM; without it the keyword and the table name run together).

    5. Added a Derived Column block that sets Load_Date to toString(byName($Load_Date)) and the table name column to toString(byName($TableName)).

    6. Added a join to my watermark table on the table name from both the source and the watermark table (same as in the article above; see the sketch after this list).

    7. Then added a filter where Load_Date > WatermarkValue.

    8. Finally, added a sink with the sink dataset parameter defined earlier.
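
    Putting steps 4 through 7 together, here is a minimal sketch of the expressions involved; the parameter and column names come from the steps above, and everything else is an assumption about the setup, not a confirmed configuration:

        Source query (step 4):     concat('Select * FROM ', $mytablename)
        Derived columns (step 5):  Load_Date -> toString(byName($Load_Date))
                                   NameTable -> toString(byName($TableName))
        Join condition (step 6):   NameTable == TableName
        Filter condition (step 7): Load_Date > WatermarkValue

    The join condition is the critical line: it must be a boolean expression comparing a column from the left stream with a column from the right stream. Anything that resolves to a plain value rather than true/false (for example, byName($TableName) on its own, or the parameter $TableName, which expands to the string literal 'TableName') is rejected with the DF-JOIN-003 error shown below. Also, since step 5 casts Load_Date to a string, the filter compares strings lexically; converting both sides with toTimestamp() would make the comparison chronological.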

     

    Validation is successful, but now I get an error in my join when debugging:

     

    " Only conditional expressions are allowed\",\"Details\":\"Job '4312f1db-f640-4e03-91ab-8ccc1159ed86 failed due to reason: DF-JOIN-003 at Join 'Join1'(Line 23/Col 31): Only conditional expressions are allowed\"}",

     

    I am unable to proceed further from here. I tried multiple ways of concatenating strings but am still unsure about this. Here is the screenshot:

     

    And below is a screenshot of the derived column, join, and filter sections.


    zzzSharePoint

    Thursday, January 2, 2020 3:30 PM

All replies

  • What are the column names that you are looking to join on? The error being returned seems to indicate that the left and right sides are not resolving to columns. For the join to work successfully, each side must resolve to a column in its respective stream.
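
    As an illustration, in the data flow script a join with a valid condition looks roughly like the following sketch (the stream names DerivedColumn1 and WatermarkSource are placeholders for this example, not taken from your flow):

        DerivedColumn1, WatermarkSource join(DerivedColumn1@NameTable == WatermarkSource@TableName,
            joinType:'inner',
            matchType:'exact',
            ignoreSpaces: false,
            broadcast: 'auto')~> Join1

    The first argument must evaluate to true/false. A bare column reference, a comma-separated list of columns, or a string parameter such as $NameTable in that position will produce exactly the DF-JOIN-003 error.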
    Friday, January 3, 2020 7:39 PM
  • Hi,

    The column I am looking to join on is "NameTable". It contains the name of the source table itself, and I am joining it with the watermark table, which has a column named TableName that contains all the source table names. If you look at the link I shared in the question, they are joining by table name there.

    I would like to do it the same way; the only difference is that my column is already defined in my data.


    zzzSharePoint

    Friday, January 3, 2020 10:04 PM
  • Post the data flow DSL script by clicking on the Script button.
    Tuesday, January 7, 2020 6:44 AM
  • Hi,

    Here you go:

    {
        "name": "dataflow1",
        "type": "ExecuteDataFlow",
        "dependsOn": [],
        "policy": {
            "timeout": "7.00:00:00",
            "retry": 0,
            "retryIntervalInSeconds": 30,
            "secureOutput": false,
            "secureInput": false
        },
        "userProperties": [],
        "typeProperties": {
            "dataflow": {
                "referenceName": "dataflow1",
                "type": "DataFlowReference",
                "parameters": {
                    "myTableName": {
                        "value": "'@{item().table_name}'",
                        "type": "Expression"
                    },
                    "LOAD_DATE": {
                        "value": "'@{concat('LOAD_DATE')}'",
                        "type": "Expression"
                    },
                    "NameTable": {
                        "value": "'@{concat('NameTable')}'",
                        "type": "Expression"
                    }
                },
                "datasetParameters": {
                    "SqlStagingData": {
                        "tablename": {
                            "value": "@item().table_name",
                            "type": "Expression"
                        }
                    },
                    "Sinkorigdata": {
                        "mytable": {
                            "value": "@split(item().table_name,'_Staging')[0]",
                            "type": "Expression"
                        }
                    }
                }
            },
            "compute": {
                "coreCount": 8,
                "computeType": "General"
            }
        }
    }
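
    Two things are worth noting about this JSON. First, it is the Execute Data Flow pipeline activity, not the data flow script that was requested; the script, including the actual join condition, is behind the Script button inside the data flow designer. Second, wrapping string values in single quotes is the expected pattern for passing string literals to data flow parameters, but since LOAD_DATE and NameTable are constants, the concat() wrappers are redundant. A simplified sketch of the parameters block might look like:

        "parameters": {
            "myTableName": {
                "value": "'@{item().table_name}'",
                "type": "Expression"
            },
            "LOAD_DATE": {
                "value": "'LOAD_DATE'",
                "type": "Expression"
            },
            "NameTable": {
                "value": "'NameTable'",
                "type": "Expression"
            }
        }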

    zzzSharePoint

    Tuesday, January 7, 2020 9:27 AM
  • Hello,

    We have not received a response from you. Are you still facing the issue? If you found a solution, would you please share it here with the community? Otherwise, let us know and we will continue to engage with you on the issue.


    Thanks, Himanshu

    Wednesday, January 15, 2020 8:45 PM