Data Factory - On-demand Spark cluster not being deleted

    Question

  • I have a pipeline in Data Factory that creates an HDInsight cluster for processing, with time-to-live set to 30 minutes. 99% of the time this pipeline has run fine, but when I logged in this morning I saw that a cluster had been up for 2 days instead of the usual 1 hour.

    In the cluster's activity log in the portal I see 2 entries for "Create or Update Cluster": the first failed and the second succeeded. The error on the first was a Gateway Timeout with the description "The gateway did not receive a response from 'Microsoft.HDInsight' within the specified time period". The success status appears in the log about 2 hours later.

    I checked the Data Factory log, and no pipelines were invoked after the original run (which failed with "Cluster reach an unexpected state when creating. Cluster information is 'Cannot get creation status'"). I also checked the cluster's Tez view to see whether anything had run or was still running, and there are no entries for the entire lifetime of the cluster.

    My question: is this a known behavior? If Data Factory doesn't delete a cluster in this situation, we may not find out about cases like this until the bill arrives.

    Tuesday, August 28, 2018 11:27 AM

All replies

  • I think this issue needs to be looked at by one of our support engineers. If you have a support plan, please open a ticket. Otherwise, please send an email to azcommunity@microsoft.com with the following details so we can set you up with free support:
    - Subscription ID :
    - URL to this thread :
    Thursday, August 30, 2018 10:35 PM
    Moderator