locked
TFs 2013.4 - Large Tfs_Warehouse database - why ? RRS feed

  • Question

  • Hi,

    This weekend i had to rebuild the Warehouse and Analysis Cube. While typing this is still going on. Im suspecting one of the reasons could be that the database is very large. Currently the database is 107GB, which i find alot for reporting.

    Our primary Collection is 350GB and the 2nd largest only 25GB.

    What data resides in the warehouse database that can be this large, is there a query i can run to find out the large data in there ?
    We clean up our build logs quite frequently, because these are large too, say 50MB per build, even the CI build, so that adds up quite fast in the database. but this is a different topic.

    When monitoring the rebuild, i see its taking a very look time or even looping? in Test Management Warehouse Sync.

    <Collections>
    <Collection JobsQueued="0" JobsRunning="1" JobProcessingStatus="DataChange" Name="ProductName">
    <Jobs>
    <Job JobProcessingStatus="Idle" Name="Build Warehouse Sync">
    <LastRun Result="Blocked" EndTimeUtc="2016-08-07T14:09:13.873Z" ExecutionStartTimeUtc="2016-08-07T14:08:43.833Z" QueueTimeUtc="2016-08-07T14:08:43.073Z"><ResultMessage>[Build Warehouse Sync]: ---> TF221033: Job failed to acquire a lock using lock mode Shared, resource DataSync: [dataServer].[Tfs_Warehouse] and timeout 30.</ResultMessage></LastRun><NextRun QueueTimeUtc="2016-08-07T14:11:13.873Z" JobState="QueuedScheduled"/></Job>
    <Job JobProcessingStatus="Idle" Name="Common Structures Warehouse Sync">
    <LastRun Result="Blocked" EndTimeUtc="2016-08-07T14:09:13.867Z" ExecutionStartTimeUtc="2016-08-07T14:08:43.833Z" QueueTimeUtc="2016-08-07T14:08:43.073Z"><ResultMessage>[Common Structures Warehouse Sync]: ---> TF221033: Job failed to acquire a lock using lock mode Shared, resource DataSync: [dataServer].[Tfs_Warehouse] and timeout 30.</ResultMessage></LastRun><NextRun QueueTimeUtc="2016-08-07T14:11:13.867Z" JobState="QueuedScheduled"/></Job>
    <Job JobProcessingStatus="DataChange" Name="Test Management Warehouse Sync"><LastRun Result="Inactive" EndTimeUtc="2016-08-07T07:35:34.5Z" ExecutionStartTimeUtc="2016-08-07T07:24:54.03Z" QueueTimeUtc="2016-08-07T07:24:53.73Z"/><CurrentRun ExecutionStartTimeUtc="2016-08-07T07:40:35.017Z" QueueTimeUtc="2016-08-07T07:40:34.5Z" JobState="Running"/></Job>
    <Job JobProcessingStatus="Idle" Name="Version Control Warehouse Sync">
    <LastRun Result="Blocked" EndTimeUtc="2016-08-07T14:09:13.873Z" ExecutionStartTimeUtc="2016-08-07T14:08:43.833Z" QueueTimeUtc="2016-08-07T14:08:43.067Z"><ResultMessage>[Version Control Warehouse Sync]: ---> TF221033: Job failed to acquire a lock using lock mode Shared, resource DataSync: [dataServer].[Tfs_Warehouse] and timeout 30.</ResultMessage></LastRun><NextRun QueueTimeUtc="2016-08-07T14:11:13.873Z" JobState="QueuedScheduled"/></Job>
    <Job JobProcessingStatus="Idle" Name="Work Item Tracking Warehouse Sync">
    <LastRun Result="Blocked" EndTimeUtc="2016-08-07T14:09:13.88Z" ExecutionStartTimeUtc="2016-08-07T14:08:43.833Z" QueueTimeUtc="2016-08-07T14:08:43.073Z"><ResultMessage>[Work Item Tracking Warehouse Sync]: ---> TF221033: Job failed to acquire a lock using lock mode Shared, resource DataSync: [dataServer].[Tfs_Warehouse] and timeout 30.</ResultMessage></LastRun><NextRun QueueTimeUtc="2016-08-07T14:11:13.88Z" JobState="QueuedScheduled"/></Job></Jobs></Collection>


    Also, the event log is flooded with warnings containing exceptions.

    TF53010: The following error has occurred in a Team Foundation component or extension:
    Date (UTC): 07/08/2016 14:21:17
    Machine: appServer
    Application Domain: TfsJobAgent.exe
    Assembly: Microsoft.TeamFoundation.Framework.Server, Version=12.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a; v4.0.30319
    Service Host: 
    Process Details:
      Process Name: TFSJobAgent
      Process Id: 2148
      Thread Id: 6764
      Account name: DOMAIN\TFSSERVICE
    
    Detailed Message: TF30065: An unhandled exception occurred.
    Exception Message: Operation is not valid due to the current state of the object. (type InvalidOperationException)
    Exception Stack Trace:    at Microsoft.TeamFoundation.Framework.Server.TeamFoundationRequestContext.ValidateRequestContext(Boolean throwOnError)
       at Microsoft.TeamFoundation.Framework.Server.TeamFoundationRequestContext.To(TeamFoundationServiceHost targetHost)
       at Microsoft.TeamFoundation.Framework.Server.TeamFoundationRequestContext.To(TeamFoundationHostType hostType)
       at Microsoft.TeamFoundation.Warehouse.LockResources..ctor(TeamFoundationRequestContext requestContext)
       at Microsoft.TeamFoundation.Warehouse.WarehouseExecutionContext..ctor(TeamFoundationRequestContext requestContext)
       at Microsoft.TeamFoundation.Warehouse.WarehouseSyncJobExtension`1.CheckSchemaLockQueue(Object source)
    
    

    Any clue on how i can or should analyse this ?
    Is there a way to reduce the size of the warehouse database contents by cleaning up the collection database ? if so, which data is important to clean ?

    Looking forward to your replies.

    Nico

    Sunday, August 7, 2016 2:23 PM

All replies

  • Hi Nico,

    Please try to reconfigure warehouse database to use a new database (e.g. Tfs_Warehouse2) through TFS admin console, then process warehouse and analysis services cube, after that check the database size.

    If it doesn’t work, please clean up TFS database, for example, delete unused content (Exec prc_DeleteUnUsedContent 1), delete unused files (Exec prc_DeleteUnusedFiles 1,0,1000).

    # Cleaning up tfs Tfs_DefaultCollection database

    https://social.msdn.microsoft.com/Forums/vstudio/en-US/661c8f0f-61aa-4ddb-a044-cba530278aaf/cleaning-up-tfs-tfsdefaultcollection-database?forum=tfsgeneral

    https://www.visualstudio.com/en-us/docs/setup-admin/clean-up-data

    Regards

    Starain


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.

    Monday, August 8, 2016 8:52 AM
    Moderator
  • Thank you Starain,

    I believe the rebuild has completed overnight, all 5 jobs now have LastRun result = Succeeded.

    i will see what i can do to clean the database, will removing Test attachments help ?

    Can you comment on the warnings that i receive in the event log ? Even after the rebuild is done, im getting these warnings, though on a lower frequency, but still about once or twice a minute.

    Please advise.

    Nico

    Monday, August 8, 2016 12:04 PM
  • Hi Nico,

    For TF221033 issue, The Cube processing job requires exclusive access to some of the warehouse resources that data synchronization jobs use. The Cube processing job will wait for the release of the resources (normally for an hour) before it gives up. If a data synchronization job does not release the resource in time, the Cube processing job will fail with that error. You may refer to this article to deal with it. (Why might most Cube processing jobs fail?)

    For warehouse database size, you could check which tables size is large.

    Regards

    Starain


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.

    Tuesday, August 9, 2016 2:02 AM
    Moderator
  • Hi Starain,

    These are my 5 largest tables in the warehouse

    TableName - Rows - ReservedMB - IndexMB

    FactTestResult 42476410    34361.09 10461.8 23825.23
    DimTestResult 20741765    18458.39 15031.36 3416.12
    FactBuildCoverage 26883933    9563.55 4902.3 4622.03
    DimFile 7472998     5810.23 2810.63 2932.86
    FactCodeChurn 13428038    4519.02 2238.29 2272.2

    So yes, by far most of the data is related to test results.
    We do generate a lot of test data daily, but this is also cleaned daily. Will this be reflected in the warehouse ? I have also noticed a lot of test data in the database that is "tied" to destroyed builds, should i manually delete this and would this then be removed from the warehouse too ?

    Tuesday, August 9, 2016 6:27 AM
  • Hi nsijtsma,

    >> We do generate a lot of test data daily, but this is also cleaned daily

    How do you clean test data daily? How do you specify retention policy of you build definition? There is dbo.prc_DeleteTestRun store procedure in collection database that you may try to delete test runs.

    Regards

    Starain


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.

    Wednesday, August 10, 2016 2:15 AM
    Moderator
  • We are on TFS 2015.3 by now and have rebuild the Warehouse about a year ago.

    Note above sized.

    The TFS_Warehouse database is now 500gb... how did that happen ?

    The top 5 databases are now:

    Table -- Rows -- Reserved MB -- Data MB -- Index MB -- Unused MB
    FactTestResult	245793619  	194781.84	57009.94	137755.93	15.98
    DimTestResult	169954262  	150243.2	117615.96	32626.8	0.44
    FactBuildCoverage	27118639   	8665.23	4575.8	4089.3	0.13
    DimFile	9096946    	7590.14	3670.55	3919.52	0.07
    FactCodeChurn	16791568   	5074.2	2563.7	2510.16	0.34

    How can we control the TestResults ?

    We cleanup the test results with the automated Test cleaner in TFS 2015, set to remove automated test runs after 90 days.

    Are we able to see what data is in above tables and analyse it further ? it seems like alot. especially knowing it was about 10x smaller last year.

    Monday, September 11, 2017 2:54 PM