none
disk setup, ntfs cluster size and sql server performance

    Question

  • Hello,

    Short time ago I've took over sql server after somebody; machine behaves strange - its performance is poor in term of disk access. Server is on vmware, in the meantime was moved from slower to faster storage array - but problem still persists.

    It looks like:

    - big file transfers, backup restore etc works fine (up to 300mb/s), while

    - doing multiple small oeprations (ie. deleting ~20millions rows from one table) takes ~1,3 hour

    Disk queue requests (shown on vm, not array side) are up to 20 in queue then, what suggests io subsystem has issues; on the other hand total io is ~20-25 mbs in same time; this storage may do better.

    I believe I've checked all areas where I could expect problems. At the momemt I would say I do not have doubts problem lays on storage / io side. Since few days I suspect problem may be in ntfs setup - seems that disks where databases are located were formatted with standard (4kb) cluster size, while ms recommend to have 64kb cluster size; my question is:

    - did anyone notices real benefits after changing cluster size on ntfs? or

    - this difference (4 vs 64) may not have serious impact on sql performance?

    If only I would have hope it may help, I'll take some efforts to change 4->64, but I am not convinced it may have so huge impact on sql performance - can anyone agree/disagree, based not on books but own experience? I would appreciate a lot.

    regards

    Tomasz

    Saturday, November 26, 2016 8:59 AM

All replies

  • Hi,

      the 64k alignment does helps, in the old days, the alignment was 63k, which cause some big reads to perform 2 IO instead of 1 if the reads is 64k (SQL works with 8k page).

      As if that really helps with performance? It depends, as you already seen the issue with IO sub-system, changing the alignment and see if that helps might worth a try, however, you should see what actually is causing the reads from IO to memory. There are other things that you can look at, eg, PLE, and buffer hit ratio. If PLE is really low, and huge IO, it might means that you need more memory for the work load, so adding more memory might help with the issue.

      If changing the disk is something simple and easy to do, it might worth to do that given its VM.

    Hope this helps 

    Monday, November 28, 2016 7:06 AM
  • Hi,

    Thank you for the answer; regarding ple - it is ~30000 seconds (!), and server is used only as ssas/ssrs server (no oltp activities) having 64gb of ram now, so I believe it is not an issue with low ram; getting closer to try to change 4k->64k, but.. anyone else had similar problem, and changing of cluster size helped?

    regards

    Tomasz

    Monday, November 28, 2016 9:49 AM
  • Hi tserwanski,

    I think changing cluster size on NTFS can improve the performance of SQL Server a lot. It will improve the proference of delete and backup obviously.

    On the hard drive, the minimum amount of data is sector, a group of sectors is a cluster, a drive with a 64KB cluster size will have 128 sectors, SQL Server performs disk an extent at a time (eight 8KB pages), setting cluster to 64kb will not waste IO operation.

    However, it will not improve the performance of writing data a lot, because data changing will be written to Transaction Log first and then write to the disk, this step will base on the performance of memory.

    Besides the method changing cluster size to 64kb, there are many other methods we can use to improve the performance of SQL Server on disk, you can refer to this article.

    Best Regards,
    Teige

    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Tuesday, November 29, 2016 12:44 PM
    Moderator
  • Hello,

    I've changed 4k into 64k, and can now say that.. it does not improve sql server efficiency at all :( This is as for cluster size.

    Regarding other part of problems, I've not mentioned before - same server is a bit orphaned - many teams put some activities there, but nobody really cares; it leads to problem like simultanous backups done against same database etc., while nobody noticed such a thing happens.

    Since several weeks I'm taking a look af resource monitor checking resource waits, and there are high values shown during this time. Printscreen below shows them as for now, but these are quite same during long time period. There is a lot of articles describint resource waits, but some ot them are telling 'it is not good tohave them', and others 'it depends'. Considering my case, could I as to help/elaborate if my guess is ok, based on printscreen attached, and assuming that server os vmware machine, having all resources (also disks) virtualized:

    1) five more 'waiting' operations are these related to backup, disk io, network throughput and these caused by application written in .net - is it true?

    2) values shown in last column are extremely high - is it correct?

    3) I have problems understanding both: high disk io waits and low database i/o; how it is possible that having 15mbs database io I see also high io waits?

    4) to have little background: server is on vmware, where io can perform up to 100mbs (during, for example, backup restore it is possible to achieve such a speed); cpu us usually close to 0% usage; problem I have with this server is that it is really slow,, so I've blamed ntfs cluster size first, while now I believe problem may be related to .net application writeen by unknown person, which (application) is taking data from external server, and processes it on affected one; problem is that this process/operation takes 24h/day (checked in master.dbo.sysprocesses):

    I know that above details are probably not sufficient, but can anyone advice:

    a) are those resprce waits which caused server slowness?

    b) if so, how to found out guilty one of causing this, if processes listed above are not right ones?

    regards

    Tomasz

    Tuesday, January 10, 2017 11:04 AM