Site collection of size > 100GB

    Question

  • I have some questions about designing a new SharePoint 2010 architecture for one of my projects. The scenario is below; please let me know what the options are.

    1. A site collection has many sites, at least 200-300, and the number may grow in the future.
    2. The site collection will be around 150-200GB once I move all the data from 2007, and it will keep growing as sites are created.

    I believe I have two solutions for this issue.

    1. Split the site collection into several site collections. This is not a good solution for me, because the data is used across sites. If I split it, I can no longer access the data wherever I need it, as it would live in different site collections.
    2. Using RBS.

    If I use RBS, what are the issues? I have read many articles online, and most of them say to use RBS for document archive purposes and the like. But none of them says what to do when a site collection grows beyond 100GB or 150GB.

    How does search behave with RBS, and are there any delays?

    thanks

    ASP.NET and SharePoint developer
    Blog: http://praveenbattula.blogspot.com
    Please click "Propose As Answer" if a post solves your problem or "Vote As Helpful" if a post has been useful to you.
    Monday, January 09, 2012 9:38 AM

All replies

  • Hi,

    RBS is mainly used to offload SQL Server: the BLOB data is not kept in the mdf database file but stored on the file system. The RBS provider has a threshold setting (MinimumBlobStorageSize) that decides whether a BLOB is saved in the content db or in RBS.
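
    The threshold decision can be pictured with a minimal sketch. This is only an illustration, not the RBS provider's actual code: the function name is hypothetical, and treating a BLOB exactly at the threshold as externalized is an assumption.

```python
# Hypothetical illustration of a size threshold such as MinimumBlobStorageSize.
# Assumption: BLOBs at or above the threshold are externalized to RBS,
# smaller ones stay inline in the content database.

def blob_destination(blob_size_bytes: int, minimum_blob_storage_size: int) -> str:
    """Return 'rbs' or 'content_db' for a BLOB of the given size."""
    if blob_size_bytes < 0 or minimum_blob_storage_size < 0:
        raise ValueError("sizes must be non-negative")
    if blob_size_bytes >= minimum_blob_storage_size:
        return "rbs"
    return "content_db"


if __name__ == "__main__":
    one_mb = 1024 * 1024
    print(blob_destination(5 * one_mb, one_mb))   # large file
    print(blob_destination(10 * 1024, one_mb))    # small file
```

    With a threshold like this, many small documents would stay in the content database, so the database itself may shrink less than expected.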

     

    But you can use much larger databases for your content dbs. The limit today is 4TB for a content db.

    http://blogs.msdn.com/b/pandrew/archive/2011/07/08/announcing-new-larger-content-database-size-limits-and-rbs-clarifications.aspx
    http://blogs.msdn.com/b/pandrew/archive/2011/07/08/articles-about-scaling-sharepoint-to-large-content-database-capacity.aspx
    http://technet.microsoft.com/en-us/library/cc678868.aspx

    You can perform some scaling on the SQL level - separate data and logs, create multiple files etc.


    Marek Chmel, WBI Systems (MCTS, MCITP, MCT, CCNA)
    Monday, January 09, 2012 9:55 AM
  • I typically limit content dbs to 50GB, and the SharePoint Health Analyzer starts complaining once you surpass 100GB, but in this case I'd migrate the data and keep it in a content database that's somewhat larger than usual. MS guidelines typically say you can go up to 200GB (see http://sharepointdragons.com/2011/12/05/sharepoint-capacity-planning/ for more info) and, as Marek points out, you can go well beyond that if you feel the need to.

    As far as RBS goes... I think http://www.loisandclark.eu/Pages/blob.aspx is a good resource (although I'm not objective at all, I've written it myself). MS has the following guidelines: http://technet.microsoft.com/en-us/library/ff628583.aspx

    I wouldn't say that RBS is mainly for offloading SQL Server. I feel that you should use it for high-end document storage scenarios that are hard or impossible to implement with a SharePoint content database, which is far different from what you are trying to accomplish here. I'm thinking of scenarios like:

    • Storing very large documents
    • Storing huge amounts of small documents
    • BLOB immutability
    • Expunging capabilities
    • Obliterating capabilities (sounds cool, huh?)
    • Data de-duplication
    • Guaranteed retention and deletion policies

    Since the default RBS provider is limited, if you do need RBS, it's likely that you need a 3rd party vendor such as EMC², Open Text, NetApp, AvePoint, or CommVault.


    Kind regards,
    Margriet Bruggeman

    Lois & Clark IT Services
    web site: http://www.loisandclark.eu
    blog: http://www.sharepointdragons.com


    Monday, January 09, 2012 11:03 AM
  • Thanks Marek for your prompt response.

    Yes, I have read up on it and I understand when to use RBS. But Microsoft states a limit of 200GB per content database and a limit of 100GB per site collection.

    This is where I am confused. If a site collection starts warning me that it is crossing the 100GB size limit, that is a problem. I also cannot predict how the data will grow from day to day: about a year ago the site collection was 30GB and today it is close to 180GB, so it is increasing rapidly. This is why I am stuck.
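
    To put that growth in perspective, here is a rough sketch that extrapolates from the two sizes mentioned (30GB a year ago, 180GB today). It assumes linear growth, which is only a guess, and the function name is hypothetical.

```python
# Rough extrapolation, assuming growth stays linear (an assumption, not a measurement).

def months_until_limit(limit_gb, current_gb, previous_gb, months_between=12):
    """Estimate the months left before current_gb reaches limit_gb.

    Returns None when the data is not growing.
    """
    rate_gb_per_month = (current_gb - previous_gb) / months_between
    if rate_gb_per_month <= 0:
        return None
    remaining_gb = max(limit_gb - current_gb, 0)
    return remaining_gb / rate_gb_per_month


if __name__ == "__main__":
    # 30GB a year ago, 180GB today, 200GB content database guideline
    print(months_until_limit(200, 180, 30))
```

    At that pace the 200GB figure is less than two months away, which is why the decision cannot wait.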

    Conclusion:

    I have a single site collection, and the back end is my problem [where to store this huge amount of data]. So can you please reconsider and suggest an approach? The limits are what I keep coming back to.

    thanks

    -Praveen.

     


    ASP.NET and SharePoint developer
    Blog: http://praveenbattula.blogspot.com
    Monday, January 09, 2012 11:34 AM
  • Thanks Margriet. That information helps.

    Yes, I have huge amounts of small documents: around 150-200 per document library, with around 20-30 document libraries per site, and 200 such sites in my site collection.

    The size is what really makes me consider RBS. But I am not sure how RBS works. I am evaluating it by installing a copy on my server, but I don't have a large data set to test with.

    I really wonder how people in big organizations architect data that runs into TBs.

    thanks again for the suggestions.


    ASP.NET and SharePoint developer
    Blog: http://praveenbattula.blogspot.com
    Monday, January 09, 2012 11:38 AM
  • Glad that it helped.

    The numbers you're mentioning (1.2 million items) are still a long way from the limits mentioned by MS: http://technet.microsoft.com/en-us/library/cc262787.aspx . You can have 30 million items per list and a total of 60 million items per content database. As far as I'm concerned we're not in the realm of huge amounts of small documents by far!
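
    For reference, the 1.2 million figure comes from taking the upper end of the ranges you gave (200 sites, 30 libraries per site, 200 documents per library). A quick sketch of the arithmetic, with hypothetical function names and the two limits quoted above:

```python
# Upper-bound item estimate compared with the limits quoted in this reply.

ITEMS_PER_LIST_LIMIT = 30_000_000        # per list or library
ITEMS_PER_CONTENT_DB_LIMIT = 60_000_000  # per content database


def total_items(sites: int, libraries_per_site: int, documents_per_library: int) -> int:
    """Total documents if every site and library is at the given size."""
    return sites * libraries_per_site * documents_per_library


def share_of_content_db_limit(item_count: int) -> float:
    """Fraction of the per-content-database item limit that is used."""
    return item_count / ITEMS_PER_CONTENT_DB_LIMIT


if __name__ == "__main__":
    items = total_items(200, 30, 200)
    print(items)                             # 1200000
    print(share_of_content_db_limit(items))  # about 0.02, i.e. 2% of the limit
```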

     


    Kind regards,
    Margriet Bruggeman

    Lois & Clark IT Services
    web site: http://www.loisandclark.eu
    blog: http://www.sharepointdragons.com

    Monday, January 09, 2012 12:23 PM
  • Also, before you move the data from 2007 to SP2010, look for ways to reduce the space requirements:

    -- Look for older versions of documents; you may be able to drop them and free up a good amount of space.

    -- Shrink the DB to get the real DB size, as a DB usually has a lot of unused space.

    -- Identify critical and non-critical (less used) sites, and move them to different site collections.

     

    Thanks

    Sandeep Nahta

    http://snahta.blogspot.com

     


    Sandeep Nahta AppliedIS Hyderabad http://snahta.blogspot.com
    Monday, January 09, 2012 12:51 PM
  • I am not worried about the numbers either, Margriet. :0) My concern is about the Content Database size and how to make this scalable for future needs.

    Microsoft recommends the limits below:

    Site collection size limit from Microsoft = 100GB,

    Content Database size limit from Microsoft = 200GB.

    What I am trying to understand is:

    1) What are enterprises doing to handle high volumes of data that run into TBs in their SharePoint sites?

    2) Is RBS even a solution for this kind of problem?

    3) If the limits from Msft can be ignored, what would be the impact of having, say, a TB of data in a content database on things like SharePoint search, backups/restores and retrieval performance?

    All my questions are related to "the web application has a single site collection."

    Thanks.


    ASP.NET and SharePoint developer
    Blog: http://praveenbattula.blogspot.com
    Monday, January 09, 2012 1:13 PM
  • Hi Praveen,

    You're asking some good questions and, from reading the replies, there has been some very good discussion. Something to keep in mind is that even when leveraging RBS, the recommendation is to keep SP content DBs at or below 200GB, including the data that has been externalized.

    In addition, another important question to ask is: what are your backup/restore SLAs? I know of enterprises with very stringent SLAs that have been forced to break up their data into many site collections to keep their content DBs around 50-100GB.


    Please Mark Answered if my reply solves your problem. Thanks!
    Monday, January 09, 2012 2:44 PM
  • Thanks all. But I still have no clarity.

    My site collection has almost reached 260GB. I now have the opportunity to change the architecture of the whole SharePoint Server 2010 implementation, but which approach to take is still an open question.

    If I split the site collection into several site collections so they can use different databases, then all my existing code, search, and permissions [security] will break. If I split the content database into files of 200GB each, it won't give me any big advantage. What other approaches are there? Only RBS?

     


    ASP.NET and SharePoint developer
    Blog: http://praveenbattula.blogspot.com
    Wednesday, January 18, 2012 11:48 AM
  • RBS is not the solution for your issue. RBS is a good solution for archiving data, not for collaboration scenarios. Microsoft's recommendations to limit database size are based on how long it takes to back up and restore the content database and on content access time. Yes, you can have a content database of a TB or more; what you have to consider is how long it will take to back up and restore it. The best approach is to split the site collection into two, each with its own content database, so that you don't run into issues later on.
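
    To get a feel for the backup and restore time, a back-of-the-envelope sketch like the one below may help. The throughput value is purely a placeholder to be replaced with a figure measured on your own hardware, and the function name is hypothetical.

```python
# Back-of-the-envelope transfer time: size divided by sustained throughput.
# The throughput used in the example is a placeholder, not a measured value.

def transfer_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move size_gb at a sustained rate of throughput_mb_per_s."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = size_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for size_gb in (100, 200, 260, 600):
        print(size_gb, "GB:", round(transfer_hours(size_gb, 100.0), 2), "hours")
```

    Comparing the result with your restore SLA shows quickly whether a single large content database is acceptable.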
    Wednesday, January 18, 2012 1:55 PM
  • Hi Praveen,

    The previous posts reference recommendations meant to avoid issues others have seen. They're not hard limits (as you've seen), and you are well within Microsoft's supported numbers. BUT if you do decide to go over the recommended 200GB, certain disk performance metrics are recommended. The MS SharePoint blog lays this out: http://sharepoint.microsoft.com/blog/Pages/BlogPost.aspx?pID=988. I suggest reading it.

    And no, RBS should not be seen as one of the approaches. From the blog post above:

    The content database size includes both metadata and BLOBs regardless of where the BLOBs are located and use of RBS does not bypass or increase these limits.
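
    In other words, the size that counts is the sum of what sits in the database and what has been externalized. A small sketch of that rule, with hypothetical function names and a made-up split of the 260GB mentioned earlier:

```python
# The counted size includes externalized BLOBs, so RBS does not lower it.
# The 40GB / 220GB split below is invented for illustration only.

RECOMMENDED_CONTENT_DB_GB = 200.0


def counted_size_gb(in_database_gb: float, externalized_blob_gb: float) -> float:
    """Size that counts toward the content database recommendation."""
    return in_database_gb + externalized_blob_gb


def exceeds_recommendation(in_database_gb: float, externalized_blob_gb: float) -> bool:
    """True when the counted size is above the recommended figure."""
    return counted_size_gb(in_database_gb, externalized_blob_gb) > RECOMMENDED_CONTENT_DB_GB


if __name__ == "__main__":
    print(counted_size_gb(40, 220))         # 260
    print(exceeds_recommendation(40, 220))  # True
```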

     

    Wednesday, January 18, 2012 2:05 PM
  • Thanks all. I am very sorry if I am asking too many questions on the same topic. I am out of ideas, and many articles online suggest different approaches. Is RBS good or bad? On this question, I found that RBS is not only for BLOB storage but is mainly used for archival purposes [document archive]. If that is the case, RBS is not an approach for me, as you said. Breaking the site collection into several smaller site collections causes huge issues with the current implementation.

    Below is the complete picture of what I actually have:

    I hope the site collections and the sites are clear from the picture above. [Each site collection uses its own content database.]

    SC6 is the main, large site collection in the diagram. It has 5 regions named Europe, Asia, East, West and Central, and there are "Department" sites at the same level as the regions under the site collection.

    The Department sites contain high-level information that is accessible to owners, managers, etc. Each region also has department data, which is uploaded by the department users. The requirement is that all department owners can see the department-related data from a region directly in their department sites instead of visiting each region. We have to write some custom logic to show the data from the region sites in the department sites.

    Each region itself is very big and contains many sites in it.

    And the questions are:

    1. If we use the above architecture, the site collection (SC6) will grow to around 500-600GB within a year.
    2. But there are limits of 100GB on site collection size and 200GB on content database size.
    3. If we use RBS, there may be other issues [but I am not sure].
    4. We cannot create the region sites as site collections, as the department users should be able to view the regions [implementing permissions and security would be difficult].

    So, how can we resolve these kinds of problems? Any suggestions will be greatly appreciated.


    ASP.NET and SharePoint developer
    Blog: http://praveenbattula.blogspot.com
    Monday, January 23, 2012 11:56 AM