locked
How to create a periodic copy of data from OData to Azure CosmosDB RRS feed

  • Question

  • Hello all,

    I hope I am just missing something and this is easy.

    I am trying to make a periodic copy of all the data returning from an OData query into a CosmosDB collection, on a daily basis.

    The copy works fine using the copy wizard, which is A REALLY GREAT option for simple tasks.  Thanks for that.

    What isn't working for me though:  The copy just adds data each time, and I have NO WAY that I can SEE with a CosmosDB sink to "pre-delete" the data in the collection (compare to the SQL sink which has sqlWriterCleanupScript, which I could set to something like Delete * from 'table').

    I know I can create an Azure Batch and do what I need, but at this point, I'm not sure that it isn't better to do a function and forego the Azure Data Factory for this move.  I'm using ADF for replicating on-prem SQL stuff just fine, because it has the writer cleanup script.

    At this point, i'd like to just use CosmosDB but I don't see a way to do it given the way my data works.

    Here's a look at my pipeline:

    { "name": "R-------ProjectToDocDB", "properties": { "activities": [ { "type": "Copy", "typeProperties": { "source": { "type": "RelationalSource", "query": " " }, "sink": { "type": "DocumentDbCollectionSink", "nestingSeparator": ".", "writeBatchSize": 0, "writeBatchTimeout": "00:00:00"
    /// this is where a cleanup script would be great.

    }, "translator": { "type": "TabularTranslator", "columnMappings": "ProjectId:ProjectId,EnterpriseProjectTypeDescription:EnterpriseProjectTypeDescription,EnterpriseProjectTypeId:EnterpriseProjectTypeId,EnterpriseProjectTypeIsDefault:EnterpriseProjectTypeIsDefault,EnterpriseProjectTypeName:EnterpriseProjectTypeName,OptimizerCommitDate:OptimizerCommitDate,OptimizerDecisionAliasLookupTableId:OptimizerDecisionAliasLookupTableId,OptimizerDecisionAliasLookupTableValueId:OptimizerDecisionAliasLookupTableValueId,OptimizerDecisionID:OptimizerDecisionID,OptimizerDecisionName:OptimizerDecisionName,OptimizerSolutionName:OptimizerSolutionName,ParentProjectId:ParentProjectId,PlannerCommitDate:PlannerCommitDate,PlannerDecisionAliasLookupTableId:PlannerDecisionAliasLookupTableId,PlannerDecisionAliasLookupTableValueId:PlannerDecisionAliasLookupTableValueId,PlannerDecisionID:PlannerDecisionID,PlannerDecisionName:PlannerDecisionName,PlannerEndDate:PlannerEndDate,PlannerSolutionName:PlannerSolutionName,PlannerStartDate:PlannerStartDate,ProjectActualCost:ProjectActualCost,ProjectActualDuration:ProjectActualDuration,ProjectActualFinishDate:ProjectActualFinishDate,ProjectActualOvertimeCost:ProjectActualOvertimeCost,ProjectActualOvertimeWork:ProjectActualOvertimeWork,ProjectActualRegularCost:ProjectActualRegularCost,ProjectActualRegularWork:ProjectActualRegularWork,ProjectActualStartDate:ProjectActualStartDate,ProjectActualWork:ProjectActualWork,ProjectACWP:ProjectACWP,ProjectAuthorName:ProjectAuthorName,ProjectBCWP:ProjectBCWP,ProjectBCWS:ProjectBCWS,ProjectBudgetCost:ProjectBudgetCost,ProjectBudgetWork:ProjectBudgetWork,ProjectCalculationsAreStale:ProjectCalculationsAreStale,ProjectCalendarDuration:ProjectCalendarDuration,ProjectCategoryName:ProjectCategoryName,ProjectCompanyName:ProjectCompanyName,ProjectCost:ProjectCost,ProjectCostVariance:ProjectCostVariance,ProjectCPI:ProjectCPI,ProjectCreatedDate:ProjectCreatedDate,ProjectCurrency:ProjectCurrency,ProjectCV:ProjectCV,ProjectCVP:ProjectCVP,ProjectDescription:ProjectDescription,ProjectDuration:ProjectDuration,ProjectDurationVariance:ProjectDurationVariance,ProjectEAC:ProjectEAC,ProjectEarlyFinish:ProjectEarlyFinish,ProjectEarlyStart:ProjectEarlyStart,ProjectEarnedValueIsStale:ProjectEarnedValueIsStale,ProjectEnterpriseFeatures:ProjectEnterpriseFeatures,ProjectFinishDate:ProjectFinishDate,ProjectFinishVariance:ProjectFinishVariance,ProjectFixedCost:ProjectFixedCost,ProjectIdentifier:ProjectIdentifier,ProjectKeywords:ProjectKeywords,ProjectLateFinish:ProjectLateFinish,ProjectLateStart:ProjectLateStart,ProjectLastPublishedDate:ProjectLastPublishedDate,ProjectManagerName:ProjectManagerName,ProjectModifiedDate:ProjectModifiedDate,ProjectName:ProjectName,ProjectOvertimeCost:ProjectOvertimeCost,ProjectOvertimeWork:ProjectOvertimeWork,ProjectOwnerId:ProjectOwnerId,ProjectOwnerName:ProjectOwnerName,ProjectPercentCompleted:ProjectPercentCompleted,ProjectPercentWorkCompleted:ProjectPercentWorkCompleted,ProjectRegularCost:ProjectRegularCost,ProjectRegularWork:ProjectRegularWork,ProjectRemainingCost:ProjectRemainingCost,ProjectRemainingDuration:ProjectRemainingDuration,ProjectRemainingOvertimeCost:ProjectRemainingOvertimeCost,ProjectRemainingOvertimeWork:ProjectRemainingOvertimeWork,ProjectRemainingRegularCost:ProjectRemainingRegularCost,ProjectRemainingRegularWork:ProjectRemainingRegularWork,ProjectRemainingWork:ProjectRemainingWork,ProjectResourcePlanWork:ProjectResourcePlanWork,ProjectSPI:ProjectSPI,ProjectStartDate:ProjectStartDate,ProjectStartVariance:ProjectStartVariance,ProjectStatusDate:ProjectStatusDate,ProjectSubject:ProjectSubject,ProjectSV:ProjectSV,ProjectSVP:ProjectSVP,ProjectTCPI:ProjectTCPI,ProjectTitle:ProjectTitle,ProjectType:ProjectType,ProjectVAC:ProjectVAC,ProjectWork:ProjectWork,ProjectWorkspaceInternalUrl:ProjectWorkspaceInternalUrl,ProjectWorkVariance:ProjectWorkVariance,ResourcePlanUtilizationDate:ResourcePlanUtilizationDate,ResourcePlanUtilizationType:ResourcePlanUtilizationType,ProjectDepartments:ProjectDepartments,ProjectStatusDescription:ProjectStatusDescription,Location:Location,Program:Program,ScheduleKPI:ScheduleKPI,ProposedStart:ProposedStart,CostKPI:CostKPI,ProjectPhase:ProjectPhase,Sponsor:Sponsor,ProposalCosts:ProposalCosts,ProposedFinish:ProposedFinish,InvestmentCategory:InvestmentCategory,WorkKPI:WorkKPI,ProposedGoLiveDate:ProposedGoLiveDate,ExpectedBenefit:ExpectedBenefit,ProjectKPI:ProjectKPI,RaydonProjectCategory:RaydonProjectCategory,ERPProjectID:ERPProjectID,BusinessCaseApprovalStatus:BusinessCaseApprovalStatus,BudgetBusinessCaseLabor:BudgetBusinessCaseLabor,BudgetBusinessCaseODC:BudgetBusinessCaseODC,BudgetBusinessCaseMatlEquip:BudgetBusinessCaseMatlEquip,NeedProblemStatement:NeedProblemStatement,OptionsConsidered:OptionsConsidered,ProjectDescription1:ProjectDescription1,ProjectIDExecution:ProjectIDExecution,RequestType:RequestType,SelectionCommitteeBusinessCaseComments:SelectionCommitteeBusinessCaseComments,SelectionCommitteeDetailedProposalComments:SelectionCommitteeDetailedProposalComments,StrategicAlignment:StrategicAlignment,Risks1:Risks1,MasterProject:MasterProject,CostClassification:CostClassification" } }, "inputs": [ { "name": "InputDataset-shc" } ], "outputs": [ { "name": "OutputDataset-shc" } ], "policy": { "timeout": "1.00:00:00", "concurrency": 1, "executionPriorityOrder": "NewestFirst", "style": "StartOfInterval", "retry": 3, "longRetry": 0, "longRetryInterval": "00:00:00" }, "scheduler": { "frequency": "Day", "interval": 1 }, "name": "Activity-0-_Custom query_->---Project" } ], "start": "2017-04-26T20:13:27.683Z", "end": "2099-12-31T05:00:00Z", "isPaused": false, "hubName": "r-----datafactory01_hub", "pipelineMode": "Scheduled" } }

    Perhaps there's an update in the pipeline that creates parity between SQL output and CosmosDB.

    -thanks for your continued hard work
    -e


    -- me --


    • Edited by Chief Scientist Monday, June 12, 2017 1:59 PM Changed docDB to CosmosDB
    Friday, April 28, 2017 3:36 PM

All replies

  • Just to ping this one another time.

    No one answered this one.

    Is something in the plan to create the parity I described above, or do I indeed have to move this to a function?

    Seems strange to NOT have the same interface available for SQL Server output and CosmosDB.

    -thanks
    -e


    -- me --

    Monday, June 12, 2017 1:57 PM