modifiedDatetimeStart - can't be added programmatically?

  • Question

  • Hello.

    I am trying to create a dataset with the properties modifiedDatetimeStart and modifiedDatetimeEnd programmatically. It works when I make the change in the UI, but unless we can make it work programmatically we can't integrate it into our release cycle.

    The issue I am seeing is that when I publish the dataset and pipeline programmatically, modifiedDatetimeStart and modifiedDatetimeEnd disappear.

    This is the whole dataset definition:

    {
      "name": "sourcedataset",
      "properties": {
        "linkedServiceName": {
          "referenceName": "sourceADL",
          "type": "LinkedServiceReference"
        },
        "type": "AzureDataLakeStoreFile",
        "typeProperties": {
          "fileName": {
            "value": "@dataset().fileName",
            "type": "Expression"
          },
          "folderPath": {
            "value": "@dataset().folderPath",
            "type": "Expression"
          },
          "modifiedDatetimeStart": {
            "value": "@dataset().startTime",
            "type": "Expression"
          },
          "modifiedDatetimeEnd": {
            "value": "@dataset().endTime",
            "type": "Expression"
          }
        },
        "parameters": {
          "fileName": {
            "type": "String"
          },
          "folderPath": {
            "type": "String"
          },
          "startTime": {
            "type": "String"
          },
          "endTime": {
            "type": "String"
          }
        }
      }
    }

    with parameters provided from pipeline:

                "parameters": {
                  "folderPath": {
                    "value": "/folderpath/",
                    "type": "Expression"
                  },
                  "fileName": {
                    "value": "*",
                    "type": "Expression"
                  },
                  "startTime": {
                    "value": "@pipeline().parameters.windowStart",
                    "type": "Expression"
                  },
                  "endTime": {
                    "value": "@pipeline().parameters.windowEnd",
                    "type": "Expression"
                  }
                }

    and this is what I see after publishing:

    Any help will be appreciated.


    byungjoon yoon, software engineer at servicelink

    Wednesday, June 12, 2019 8:36 PM

Answers

  • Hello again, byungjoon yoon.  I have good news: I have a workaround.  I have tried this myself, and it works.

    First, deploy a dataset that was created correctly through the UI.

    Next, download the resource with: (powershell)

    $dataset = Get-AzureRmResource -ResourceType "Microsoft.DataFactory/factories/datasets" -ResourceGroupName my-factory-resource-group -Name "my-factory-name/my_dataset_name" -ApiVersion "2018-06-01"
    

    Then, extract the definition with: (powershell)

    $newsonPropertiesStringFromFile = ConvertTo-Json $dataset.Properties

    Lastly, create a new resource (dataset): (powershell)

    New-AzureRmResource -ResourceType "Microsoft.DataFactory/factories/datasets" -ResourceGroupName my-factory-resource-group -Name "my-factory-name/new_dataset_name" -ApiVersion "2018-06-01" -Properties (ConvertFrom-Json $newsonPropertiesStringFromFile)
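
The same three steps can be sketched with the Az module, which replaces AzureRM. As far as I know, Get-AzResource and New-AzResource accept the parameters used above. The function name Copy-AdfDataset and its parameter names are hypothetical, and the sketch assumes you have already signed in with Connect-AzAccount. -Depth is raised on ConvertTo-Json because its default depth of 2 can turn more deeply nested definitions into strings.

```powershell
# Sketch: clone a dataset that was created correctly in the UI.
# Copy-AdfDataset is a hypothetical helper, not part of the Az module.
function Copy-AdfDataset {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $FactoryName,
        [Parameter(Mandatory)] [string] $SourceName,
        [Parameter(Mandatory)] [string] $TargetName
    )

    $resourceType = "Microsoft.DataFactory/factories/datasets"
    $apiVersion   = "2018-06-01"

    # 1. Download the dataset that the UI published.
    $dataset = Get-AzResource -ResourceType $resourceType `
        -ResourceGroupName $ResourceGroupName `
        -Name "$FactoryName/$SourceName" -ApiVersion $apiVersion `
        -ExpandProperties

    # 2. Extract the definition as JSON text (save or edit it here if needed).
    $json = ConvertTo-Json $dataset.Properties -Depth 20

    # 3. Create the new dataset from that definition.
    New-AzResource -ResourceType $resourceType `
        -ResourceGroupName $ResourceGroupName `
        -Name "$FactoryName/$TargetName" -ApiVersion $apiVersion `
        -Properties (ConvertFrom-Json $json) -Force
}
```

A call would look like `Copy-AdfDataset -ResourceGroupName my-factory-resource-group -FactoryName my-factory-name -SourceName my_dataset_name -TargetName new_dataset_name`, reusing the placeholder names from the commands above.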

    Monday, June 17, 2019 8:56 PM
    Moderator

All replies

  • Hello byungjoon yoon, and thank you for bringing this to our attention.  I have reproduced your scenario.  Thank you for all the details.  I will reach out internally to find out more information for you.
    Friday, June 14, 2019 12:15 AM
    Moderator
  • OK, well, setting this up in the UI is also an issue, since several things work if I publish but the UI doesn't accept them. For example, if I set the output file name to empty, it works fine if I do it from set pipeline, but it throws an error when I try to change it in the UI. It works from time to time when I try multiple times, so I will try the route you suggested when it works. As long as we can create this on one ADF, change some text, and move it to another, it should work, I guess.


    byungjoon yoon, software engineer at servicelink

    Monday, June 17, 2019 10:18 PM
  • You could also manipulate the json directly while it is on your end.  The first step was to get an example to work from.  If you are confident, you can take the result of one and manipulate it without using the UI.  This means the last command is all you really need.
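
A rough sketch of editing the JSON locally: the snippet parses a trimmed-down definition, adds the two modifiedDatetime expressions together with their dataset parameters, and re-serializes the result. The inline JSON, the helper name Add-ModifiedWindow, and the variable names are illustrative assumptions. The final publish call from the workaround is left as a comment because it needs a live factory.

```powershell
# Hypothetical helper: add modifiedDatetimeStart/End expressions (and the
# dataset parameters they reference) to a dataset "properties" object.
function Add-ModifiedWindow {
    param([Parameter(Mandatory)] $Properties)

    foreach ($pair in @(
            @{ Name = "modifiedDatetimeStart"; Param = "startTime" },
            @{ Name = "modifiedDatetimeEnd";   Param = "endTime" })) {
        $expression = [pscustomobject]@{
            value = "@dataset().$($pair.Param)"
            type  = "Expression"
        }
        # -Force overwrites the property if it is already present.
        $Properties.typeProperties | Add-Member -NotePropertyName $pair.Name `
            -NotePropertyValue $expression -Force
        $Properties.parameters | Add-Member -NotePropertyName $pair.Param `
            -NotePropertyValue ([pscustomobject]@{ type = "String" }) -Force
    }
    return $Properties
}

# Trimmed-down definition, inlined so the sketch runs without Azure.
$definition = @'
{
  "linkedServiceName": { "referenceName": "sourceADL", "type": "LinkedServiceReference" },
  "type": "AzureDataLakeStoreFile",
  "typeProperties": {
    "folderPath": { "value": "@dataset().folderPath", "type": "Expression" }
  },
  "parameters": { "folderPath": { "type": "String" } }
}
'@

$properties = Add-ModifiedWindow -Properties (ConvertFrom-Json $definition)
$properties | ConvertTo-Json -Depth 20

# Then publish with the last command of the workaround, for example:
# New-AzureRmResource ... -Properties $properties
```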
    Monday, June 17, 2019 10:30 PM
    Moderator
  • I'm not sure I follow what you are saying.  Is this something you would like to discuss further?
    Tuesday, June 18, 2019 12:42 AM
    Moderator
  • sure, something like this:

    These all work when I upload them using AzureRM commands, but show up as ineligible when I publish. All of this is because I can't set * as the file name in the output dataset, and since this is a binary copy I just put an empty string as the file name.

    


    byungjoon yoon, software engineer at servicelink

    Tuesday, June 18, 2019 2:27 PM
  • Also in some other pipeline this is what I see:

    But this works fine when I publish using azureRM commands, so yeah... validation logic's outdated?


    byungjoon yoon, software engineer at servicelink


    Tuesday, June 18, 2019 5:36 PM
  • OK, I tried this.

    Now I can see those things appearing, but the job is failing with this:

    { "errorCode": "2200", "message": "Failed to convert the value in 'modifiedDatetimeStart' property to 'System.Nullable`1[[System.DateTime, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]' type. Please make sure the payload structure and value are correct. ", "failureType": "UserError", "target": "activityName" }

    This is what I uploaded:

          "modifiedDatetimeStart": {
            "value": "@dataset().startTime",
            "type": "String"
          },
          "modifiedDatetimeEnd": {
            "value": "@dataset().endTime",
            "type": "Expression"
          }
      },

                  "startTime": {

                    "value": "@pipeline().parameters.windowStart",
                    "type": "Expression"
                  },
                  "endTime": {
                    "value": "@pipeline().parameters.windowEnd",
                    "type": "Expression"
                  }

    but this shows up:

    

    

    Any idea?
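
One detail in the uploaded fragment above: modifiedDatetimeStart carries "type": "String", while modifiedDatetimeEnd carries "type": "Expression". If a String-typed value is passed through literally instead of being evaluated (an assumption, not something confirmed in this thread), the service would be trying to parse the text @dataset().startTime as a DateTime, which would be consistent with the conversion error. A minimal sketch of a pre-publish check follows; Find-UnevaluatedExpression is a hypothetical helper, and the inline JSON copies the two entries from the fragment.

```powershell
# Hypothetical helper: list typeProperties entries whose value looks like an
# expression (starts with "@") but whose type is not "Expression".
function Find-UnevaluatedExpression {
    param([Parameter(Mandatory)] $TypeProperties)

    foreach ($property in $TypeProperties.PSObject.Properties) {
        $entry = $property.Value
        if ($entry.value -is [string] -and
            $entry.value.StartsWith('@') -and
            $entry.type -ne "Expression") {
            $property.Name
        }
    }
}

# The two entries as they were uploaded.
$fragment = @'
{
  "modifiedDatetimeStart": { "value": "@dataset().startTime", "type": "String" },
  "modifiedDatetimeEnd":   { "value": "@dataset().endTime",   "type": "Expression" }
}
'@

$flagged = @(Find-UnevaluatedExpression -TypeProperties (ConvertFrom-Json $fragment))
$flagged
```

For this fragment the check should report modifiedDatetimeStart only.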


    byungjoon yoon, software engineer at servicelink

    Tuesday, June 18, 2019 7:35 PM
  • Okay.  Here I think is the misunderstanding:

    You are not permitted to put the @pipeline reference directly inside the dataset definition as you have done.  Instead, parameterize the dataset as shown in the UI.  Then, when the dataset is used in the pipeline's activity, you can pass in the pipeline parameter for the dataset to use.  Illustrated below.
    ACTIVITY SETTINGS:

    DATASET SETTINGS

    DATASET PARAMETERS:


    Tuesday, June 18, 2019 10:48 PM
    Moderator
  • On further consideration, I don't think I read through your reply well.  Please let me try again.

    Is that error message from when you run the pipeline, or from when you preview or debug?

    About using * in output dataset, the empty string should work.  Omitting the fileName property entirely may also work.

    When a dataset is parameterized, it is required that in the activity using it, you provide values to all of the parameters.  This would mean, if you programmatically alter a dataset already in use by activities, you will need to update those activities to match the new parameters.
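
That requirement can be checked before publishing. A minimal sketch, assuming both JSON fragments are available locally: Get-MissingDatasetParameter is a hypothetical helper, and the inline JSON is abbreviated from the definitions at the top of this thread, with endTime deliberately left out of the activity side.

```powershell
# Hypothetical helper: names of dataset parameters that the activity's
# dataset reference does not supply.
function Get-MissingDatasetParameter {
    param(
        # The "parameters" object of the dataset definition.
        [Parameter(Mandatory)] $DatasetParameters,
        # The "parameters" object of the activity's dataset reference.
        [Parameter(Mandatory)] $SuppliedParameters
    )

    $declared = @($DatasetParameters.PSObject.Properties.Name)
    $supplied = @($SuppliedParameters.PSObject.Properties.Name)
    $declared | Where-Object { $_ -notin $supplied }
}

$datasetParameters = ConvertFrom-Json @'
{ "fileName":  { "type": "String" }, "folderPath": { "type": "String" },
  "startTime": { "type": "String" }, "endTime":    { "type": "String" } }
'@

$suppliedParameters = ConvertFrom-Json @'
{ "folderPath": { "value": "/folderpath/", "type": "Expression" },
  "fileName":   { "value": "*", "type": "Expression" },
  "startTime":  { "value": "@pipeline().parameters.windowStart", "type": "Expression" } }
'@

$missing = @(Get-MissingDatasetParameter -DatasetParameters $datasetParameters `
        -SuppliedParameters $suppliedParameters)
$missing
```

An empty result would mean the activity supplies every parameter the dataset declares.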

    Also AzureRM is being deprecated.  The new Az module is generally preferred.

    Tuesday, June 18, 2019 11:20 PM
    Moderator
  • The error message appears when I try to publish any changes to the pipeline. That expression is an argument to a Spark job activity.

    byungjoon yoon, software engineer at servicelink

    Wednesday, June 19, 2019 1:08 AM
  • Also:

                "startTime": {

                    "value": "@pipeline().parameters.windowStart",
                    "type": "Expression"
                  },
                  "endTime": {
                    "value": "@pipeline().parameters.windowEnd",
                    "type": "Expression"
                  }

    This part is the input from the pipeline. As you can see, the modified end date comes through correctly, but not the start date. When I modified just that and republished, it worked (I had to delete the pipeline, republish the change, and publish the pipeline from AzureRM to make it work).

    I can't find how to use some Az commands, so I am using the AzureRM aliases.

    For example, I can't find how to do this in the Az module: Select-AzureRmSubscription
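
For reference, the Az counterpart of Select-AzureRmSubscription appears to be Set-AzContext from the Az.Accounts module. A minimal sketch, assuming that module is installed and Connect-AzAccount has already been run; the wrapper name and the subscription name are placeholders.

```powershell
# Hypothetical wrapper around the Az cmdlet that selects the active subscription.
function Select-ReleaseSubscription {
    param(
        # Subscription name or ID.
        [Parameter(Mandatory)] [string] $Subscription
    )

    # Set-AzContext takes over the role of Select-AzureRmSubscription.
    Set-AzContext -Subscription $Subscription
}

# Example call (placeholder subscription name):
# Select-ReleaseSubscription -Subscription "my-subscription-name"
```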


    byungjoon yoon, software engineer at servicelink

    Wednesday, June 19, 2019 1:28 PM
  • Removing the parameter for the output worked. Since all of these are backups, I could remove fileName from the parameters, so now at least I can publish things in the backup pipeline :) The issue of startTime coming through incorrectly is still not resolved.

    This is my input:

          "modifiedDatetimeStart": {
            "value": "@dataset().startTime",
            "type": "Expression"
          },
          "modifiedDatetimeEnd": {
            "value": "@dataset().endTime",
            "type": "Expression"
          }

    After creating AzureRMResource, this is the file:


    byungjoon yoon, software engineer at servicelink

    Thursday, June 20, 2019 1:38 PM
  • Any update on this?

    byungjoon yoon, software engineer at servicelink

    Tuesday, June 25, 2019 1:49 PM
  • I heard from the product group that they plan to have an SDK update very soon.  I honestly have no idea why one date would behave differently from the other.

    If it persists I can offer you a 1-time free support ticket.

    Tuesday, June 25, 2019 4:55 PM
    Moderator
  • As long as that gets fixed soon I will be fine. But you are saying you can't reproduce the second part, where I use New-AzureRmResource and it doesn't bring the start date through correctly, right? I am using the Az module with the AzureRM aliases, if that makes any difference. If you can't reproduce that, I will open a support ticket.

    byungjoon yoon, software engineer at servicelink

    Tuesday, June 25, 2019 6:27 PM
  • I tried using AzureRm and was unable to reproduce that publishing error using your file from the top of this thread.  (My only modification was to use a Linked Service name that exists in my factory.  One odd thing I found: the LinkedServiceName was cleared when I checked in the UI.  I must have chosen an inappropriate type, yet it still uploaded as long as the name existed.)

    Do you have a support plan, or do you need me to enable a one-time free support ticket?

    Thursday, June 27, 2019 12:56 AM
    Moderator
  • I created the ticket and referenced this post. Thanks for helping.


    byungjoon yoon, software engineer at servicelink

    Thursday, June 27, 2019 5:32 PM
  • Once the issue is resolved, it would be very helpful if you could share the learnings here.
    Thursday, June 27, 2019 8:33 PM
    Moderator