none
Reasons to avoid very large content databases - simple recovery model, file groups RRS feed

  • Question

  • We have a very large content database (650GB) with a single site collection.

    Recommendations are to keep content databases under 200GB in size, for backup and recovery purposes.

    We use the SIMPLE recovery model and recovery would mean restoring the physical mdf and ldf files from last night's backup, and re-attaching the databases - with no roll forward. (This is acceptable in our case).

    If I use SQL 'file groups' so that I have several physical files for the ONE content database I believe we can perform the restore of the files from Azure recovery vault without timing out.

    Are there other reasons for keeping content databases small? Splitting the content database (single site collection) into two, after it has grown this large is difficult.

    Am I safe to allow it to grow as long as I split the data into multiple physical files?

    Thank you


    Marcel

    Tuesday, October 15, 2019 6:04 PM

Answers

  • There is no reason you cannot have databases >1TB. That 200GB max guidance was created many years ago prior to SSD/NVMe storage. With modern storage solutions, the backup/restore argument doesn't make much sense.

    Multiple data files versus one makes no difference. Multiple data files allows you to spread the disk I/O across multiple volumes.


    Trevor Seward

    Office Apps and Services MVP



    Author, Deploying SharePoint 2019

    Author, Deploying SharePoint 2016

    This post is my own opinion and does not necessarily reflect the opinion or view of Microsoft, its employees, or other MVPs.

    Tuesday, October 15, 2019 6:37 PM
    Moderator

All replies

  • There is no reason you cannot have databases >1TB. That 200GB max guidance was created many years ago prior to SSD/NVMe storage. With modern storage solutions, the backup/restore argument doesn't make much sense.

    Multiple data files versus one makes no difference. Multiple data files allows you to spread the disk I/O across multiple volumes.


    Trevor Seward

    Office Apps and Services MVP



    Author, Deploying SharePoint 2019

    Author, Deploying SharePoint 2016

    This post is my own opinion and does not necessarily reflect the opinion or view of Microsoft, its employees, or other MVPs.

    Tuesday, October 15, 2019 6:37 PM
    Moderator
  • Thanks for your response Trevor. The reason I thought I needed multiple physical files was because

    1) Azure places a 1TB limit on drive size 

    UPDATE: apparently the drive size limit is now 4096GB (4TB). ref:https://azure.microsoft.com/en-us/blog/azure-introduces-new-disks-sizes-up-to-4tb/ 

    2) secondly because recovery of a file from the Azure Recovery vault has a 12 hour limit after which the mapped drives are released and the recovery fails. A file of 650G seems to take about 10 hours to restore. (ref: https://docs.microsoft.com/en-us/azure/backup/backup-azure-restore-files-from-vm)

    UPDATE: there are strategies to overcome this 12 hour limit - I am testing these now.

    Related: In order to stay application consistent, wouldn't I need to keep all physical files related to the content database on the same physical drive? We are using Azure VM backups via Azure Recovery Vault.

    Based on the information above, I agree, this issue is resolved.


    Marcel







    Wednesday, October 16, 2019 2:08 PM
  • You shouldn't use VM backups. You can't *really* make an app consistent snapshot of a SharePoint farm (Azure Site Recovery is the only "supported" way of doing this and it has post-restore issues that must be resolved, but are documented).

    Use standard SQL backups instead.


    Trevor Seward

    Office Apps and Services MVP



    Author, Deploying SharePoint 2019

    Author, Deploying SharePoint 2016

    This post is my own opinion and does not necessarily reflect the opinion or view of Microsoft, its employees, or other MVPs.

    Wednesday, October 16, 2019 2:55 PM
    Moderator
  • Just to clarify and verify: in the case of a single content database, that resides on a single drive, the VM backup does provide an application consistent copy of the database files that we can restore from. Please correct me if I am wrong.

    As for using SQL backups - these take a very long time to perform and then a very long time to restore, for these very large databases, making this approach impractical (what am I missing).  I can restore a single (huge) content database using 'VM Restore' and  'Restore Disks' in under an hour as opposed to 12+. Note: we use simple recovery model, so no roll forward, no large log files which normally need a sql backup to allow shrinking.


    Marcel

    Thursday, October 17, 2019 2:19 PM