Performance for more than 100 million rows in a table

  • Question

  • Hello everyone,
    I have a database with a table that has more than 100 million rows. It is read/write intensive.

    I create charts that aggregate values from the entire table, something like:

    SELECT SUM(Value1) FROM MyTable

    I want to know what's best in terms of database layout on disk.

    Do I use table partitioning across several disks, with the transaction log on another disk and tempdb on a third disk?

    Do I get much benefit if I keep the indexes on a separate disk?

    Or should I use RAID 5 for the entire system?


    Thanks

    www.DownLoadComponent.com
    Friday, February 20, 2009 8:07 PM

Answers

  • The best would be to use striping and mirroring (RAID 10), with the logs on a separate disk, tempdb on separate disks (one file per core at best), and the data files on separate disks. If you use partitioning (which would make sense in your case), try to spread the partitions of the tables across different disks as well; this will further maximize your throughput. A sketch of mapping partitions to separate disks appears after the answers below.

    Jens K. Suessmeyer
    Saturday, February 21, 2009 4:53 PM
  • I should point out that when Jens says separate disks, he is actually referring to separate RAID 10 disk sets, not just separate single disks. RAID 10 is the best for performance when you have the correct number of disks, and it is the most fault-tolerant configuration for SQL Server. RAID 5, on the other hand, can yield better performance when you can't create appropriate RAID 10 disk sets. Case in point: if you can only have 4 disks in your RAID array, you will get better performance out of RAID 5, but you can only lose 1 disk, whereas with RAID 10 you can lose 2 disks as long as they don't belong to the same mirror group in the stripe.

    Unless you have bitmap contention in tempdb, you shouldn't create one file per core, and per Paul Randal's blog on the topic, one file per core isn't necessarily the best even with bitmap contention problems (see the tempdb sketch after these answers).

    In Recovery... | Search Engine Q&A #12: Should you create multiple ...

    If you are creating multiple files on the same spindle sets, you need to monitor your I/O subsystem to make sure that you aren't causing a bottleneck there due to the number of files. My personal experience in this area is that less can be more. Multiple files across different spindle sets is the ideal configuration for performance.
    Jonathan Kehayias
    http://sqlblog.com/blogs/jonathan_kehayias/
    http://www.sqlclr.net/
    Please click the Mark as Answer button if a post solves your problem!
    Sunday, February 22, 2009 6:51 PM
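
To illustrate the suggestion of spreading table partitions across different disks, here is a minimal T-SQL sketch. The database name, filegroup names, file paths, and the LogDate partitioning column are assumptions for illustration only; they do not come from the original posts.

    -- Hypothetical filegroups, each placed on a different physical disk or RAID set
    ALTER DATABASE MyDatabase ADD FILEGROUP FG2008;
    ALTER DATABASE MyDatabase ADD FILEGROUP FG2009;

    ALTER DATABASE MyDatabase ADD FILE
        (NAME = MyTable_2008, FILENAME = 'E:\Data\MyTable_2008.ndf') TO FILEGROUP FG2008;
    ALTER DATABASE MyDatabase ADD FILE
        (NAME = MyTable_2009, FILENAME = 'F:\Data\MyTable_2009.ndf') TO FILEGROUP FG2009;

    -- Partition function and scheme map ranges of LogDate to the filegroups
    CREATE PARTITION FUNCTION pfByYear (datetime)
        AS RANGE RIGHT FOR VALUES ('2009-01-01');

    CREATE PARTITION SCHEME psByYear
        AS PARTITION pfByYear TO (FG2008, FG2009);

    -- The table (and its clustered index) is created on the partition scheme,
    -- so each partition's pages live on its own disk
    CREATE TABLE MyTable
    (
        LogDate datetime NOT NULL,
        Value1  decimal(18, 2) NOT NULL
    ) ON psByYear (LogDate);

With a layout like this, SELECT SUM(Value1) FROM MyTable can read partitions from several disks at once, and queries that filter on LogDate can be limited to a single partition.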
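
For the tempdb discussion, a minimal sketch of adding data files is below. The file count, sizes, and paths are assumptions; as noted above, extra files are only worthwhile if you actually see allocation bitmap contention, and they should sit on separate spindle sets where possible.

    -- Hypothetical example: adding tempdb data files on separate disks
    ALTER DATABASE tempdb ADD FILE
        (NAME = tempdev2, FILENAME = 'G:\TempDB\tempdev2.ndf', SIZE = 4GB, FILEGROWTH = 512MB);
    ALTER DATABASE tempdb ADD FILE
        (NAME = tempdev3, FILENAME = 'H:\TempDB\tempdev3.ndf', SIZE = 4GB, FILEGROWTH = 512MB);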
