No data slices in this table

  • Question

  • Hi,

    I'm trying to understand what causes a Blob Storage dataset slice to be defined and to change to ready. Does it depend on the created/modified date of the files in the target location?

    I have a CopyActivity pipeline whose source DataSet uses the slice StartTime to identify a particular folder. When I set the pipeline start and end times to cover the period matching the folder names, no slices are generated and the consuming table just reports "No data slices in this table." The pipeline is therefore stuck in pending execution.

    In another example, I use a CustomActivity to output data to exactly the same DataSet as above. If I then use this DataSet as the input to CopyActivity, slices are created on the consuming table and the pipeline executes.

    In both cases CopyActivity points to a DataSet that defines a list of files using SliceStart as the folder name. Why does one create slices and enter the ready state while the other does not?

    Thanks,

    Mike


    • Edited by TheCodeKing Thursday, April 16, 2015 3:37 PM
    Thursday, April 16, 2015 3:25 PM

Answers

  • You are correct: any dataset not produced by a pipeline activity must be marked with the "waitOnExternal" property. If you have modified the pipeline and slices already existed in the past, you need to re-run them manually from the portal UI or using PowerShell. If the slices don't show up in the list, check that your pipeline's active period covers that past period (the active period is the time range of the data to be processed, not the "real" time during which the pipeline is operational).
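
The point about the active period can be illustrated with a small Python sketch. It is not Data Factory code: `covered_by_active_period` is a hypothetical helper, and the rule that a slice must lie entirely inside the active period is an assumption made for illustration. The dates are the ones quoted in this thread.

```python
from datetime import datetime, timedelta, timezone

def covered_by_active_period(slice_start, slice_end, active_start, active_end):
    """Return True if the whole slice lies inside the active period.

    Hypothetical helper; note that the current wall-clock time plays no role.
    """
    return active_start <= slice_start and slice_end <= active_end

# Active period taken from the pipeline described in the thread.
active_start = datetime(2015, 4, 8, 14, 0, tzinfo=timezone.utc)
active_end = datetime(2015, 4, 16, 20, 0, tzinfo=timezone.utc)
one_hour = timedelta(hours=1)  # assumed slice length

# A slice in the past that falls inside the active period.
past_slice = datetime(2015, 4, 8, 15, 0, tzinfo=timezone.utc)
inside = covered_by_active_period(
    past_slice, past_slice + one_hour, active_start, active_end)

# A slice after the active period has ended.
late_slice = datetime(2015, 4, 17, 8, 0, tzinfo=timezone.utc)
outside = covered_by_active_period(
    late_slice, late_slice + one_hour, active_start, active_end)

print(inside, outside)  # True False
```

Under these assumptions, a slice for 2015-04-08 15:00 counts as covered no matter when the check is run, because only the data time range is compared.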

      
    Tuesday, June 2, 2015 4:04 PM

All replies

  • A bit more info. My pipeline is set up to run between 2015-04-08T14:00:00Z and 2015-04-16T20:00:00Z. My Blob Storage structure looks like this:

    mydata/20150408/1500.csv
    mydata/20150408/1600.csv
    mydata/20150408/1700.csv

    The modified date on the files is 16/04/2015 16:21:44. When I look at the consume table DataSet of the pipeline, which references these locations based on slice StartTime, it says:

    No data slices between 04/07/2015, 17:00 PM UTC and 04/16/2015, 17:00 PM UTC matching the statu...
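
For reference, here is a minimal Python sketch of how slice start times would line up with the folder layout above. It assumes hourly slices and a `mydata/yyyyMMdd/HHmm.csv` naming pattern inferred from the listing; `slice_starts` and `blob_path` are hypothetical helpers, not part of Data Factory.

```python
from datetime import datetime, timedelta, timezone

def slice_starts(start, end, interval=timedelta(hours=1)):
    """Yield the start time of every slice in the half-open window [start, end)."""
    current = start
    while current < end:
        yield current
        current += interval

def blob_path(slice_start):
    """Format a slice start as a blob path (assumed naming pattern)."""
    return slice_start.strftime("mydata/%Y%m%d/%H%M.csv")

# Pipeline start and end times quoted in the post above.
start = datetime(2015, 4, 8, 14, 0, tzinfo=timezone.utc)
end = datetime(2015, 4, 16, 20, 0, tzinfo=timezone.utc)

paths = [blob_path(s) for s in slice_starts(start, end)]
print(len(paths))   # 198
print(paths[1:4])   # the three files listed above
```

Under these assumptions the three listed files correspond to the second, third and fourth slices of the run window, so the window itself does cover them.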


    • Edited by TheCodeKing Thursday, April 16, 2015 5:36 PM
    Thursday, April 16, 2015 5:25 PM
  • OK, so it looks like an input table only becomes ready if it's generated by another pipeline.

    If it's the first data source, you have to use waitOnExternal to poll for changes. I'm trying this, but my slices are in the past and it doesn't seem to be picking them up. Can/does waitOnExternal fire for previous dates?


    Friday, April 17, 2015 8:34 AM