Respondida SQL Server Slow After Adding New Storage To Cluster

  • Wednesday, January 30, 2013 4:22 AM
     
     

    Hello,

      We have SQL Server 2008 R2 setup in a failover cluster configuration.  Initially the quorum, along with the drives for the MDF files, and the drives for the LDF files were all located on our first iSCSI array.  We basically ran out of space on that array, and purchased a new iSCSI array.  We attached that array to the server, and cluster, and added another set of drives for the MDF and LDF files which are now pointing to array #2.  So we have basically 1/2 of the databases on array #1 and 1/2 of the databases on array #2.

      Everything works fine, except I'm having the following strange issues only after I added the second set to disks to the cluster.

    1) When I add a new database, and click refresh in SSMS I get a "Lock request time out period exceeded" error.  Refreshing again after a minute the databases list loads fine.

    2) When attempting to open a database backup to restore it, in SSMS it will take a really long time to browse for the .bak file to restore.

    3) When restoring a database, it takes a really long time to restore.  For example, it sits at 0% for about 1 minute, then it will start running fast.

    I've checked the failover cluster manager cluster events, and the event viewer and no errors are showing up.  I ran the validation for the cluster and nothing came up either.  Does anyone know where next to troubleshoot?

    Again - everything works - and works fast, so I'm not sure really if this is an issue or is just what happens when we have 2 separate arrays on the same cluster?

    Thanks

All Replies

  • Wednesday, January 30, 2013 6:26 AM
     
     

    I have my doubts that it relates to the storage... 

    1) Does it happen all the time or  you have seen this only once--- "Lock request time out period exceeded"?

    3) I rarely using SSMS  for restore operation, if you open a new query Window and issue RESTORE DATABASE .. command does it work fast?


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Blog: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance

  • Wednesday, January 30, 2013 3:30 PM
     
     

    Thanks Uri, I did try to just do it from a query, the actions through SSMS and just submitting via the query are the same.  What will happen is on a restore it will take a really long time to start on 1% - maybe like 1 minute, then it will go quickly.

    Since it's about a minute, I'm not sure if something is timing out or something?

    Can a cluster be setup on 2 different iSCSI arrays with the quorum being on 1 and no quorum for the second?  I'm not sure if that matters?  It all ran fine until we added the second set of disks.

    Thanks.

  • Thursday, January 31, 2013 5:56 AM
     
     
    How big is the database?... Also http://support.microsoft.com/kb/2653893

    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Blog: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance

  • Thursday, January 31, 2013 9:57 PM
     
     

    These are small databases, like 100 MB.  So it will sit at 0% for about 1 minute, then fly through and go to 100% in like a second or two.  But it just waits forever at 0%.  The .bak files that are being restored are on the same drive as the MDF (data file)s drive on the cluster.

    Should we be able to have the failover cluster be connected to 2 different arrays in the same cluster?  If not, is there a way to change the cluster so that we can remove iSCSI array #1 (the original that we ran out of space on) - after we migrate the databases over to iSCSI array #2?

    Thanks

  • Friday, February 01, 2013 4:09 PM
     
     

    Hi Ryan,

    I think the reason its sitting for at 0% is zeroing the data file, Can you please check if file initialization is set for sql service account. 

    http://msdn.microsoft.com/en-us/library/ms175935(v=sql.105).aspx

    Thanks

    Mushtaq.


    Thank you -Mushtaq

  • Friday, February 01, 2013 9:03 PM
     
     

    Thanks Mushtaq - I read the article - we do have TDE enabled so instant file initialization has to be disabled.  I just tested restoring to the iSCSI array #1 and it goes quickly.  I think there is something wrong with the cluster or the config.  Just not sure...

    Thanks.

  • Friday, February 01, 2013 9:39 PM
     
     

    May there is new disks are slower compared to the older ones. Check the disk IO latency of old and new disks

  • Friday, February 01, 2013 9:43 PM
     
     

    Thanks, yeah so if I go into Windows Explorer for example, it's totally fine.  The disks actually have more I/O's than the previous.  SQL Server is the only thing that seems to be working kind of slow for some reason when browsing to the disks or adding a new database.

    Does anyone know if you can have 2 different arrays connected to the same cluster if that causes issues?

    Thanks

  • Saturday, February 02, 2013 3:40 PM
     
     

    Browsing explorer may not be a right test to check for IO latency. You may refer below IO bottleneck section and monitor some perfmon counters while issue is reproduced

    http://msdn.microsoft.com/en-us/library/cc966540.aspx#EFAA

  • Saturday, February 02, 2013 3:56 PM
     
     

    Thanks V Keerthi Deep - I just checked the IO latency and everything is under 3 ms - so it appears as if things are running pretty quickly. 

    I think at this point I'm going to try to migrate all the data over to the new array, then disconnect the old array from the cluster to see if that helps at all.

    Everything runs, and it runs fast, it's just these strange slownesses happened only when adding a database, browsing for a backup, etc. when I attached the new array to the cluster.  Even before I added any databases to it.  So I think it's more of a cluster issues than a SQL Server or database issue.

    Thanks again.

  • Sunday, February 03, 2013 9:58 AM
     
     
    I assume you ran Cluster Validation right after the storage was added, right? It is a requirement to be run after any major configuration change like adding disks.

    Allan Hirt Blog: http://www.sqlha.com/blog Coming in 2013: Mission Critical SQL Server 2012 - the followup to Pro SQL Server 2008 Failover Clustering

  • Monday, February 04, 2013 4:41 PM
     
     Answered

    Thanks Allan, yeah I tried that - which actually helped me discover the issue, so thank you.  What it turned out was that the CPU usage was high on the SQL Server for a few weeks which just happened to be the same time as when we added the new array.  There was a new app that we were running causing the CPU to run a lot higher than normal on the SQL Server.  To run the validation with the disk checking, I had to basically wait until no users were on the server.  Since there were no users on and the CPU usage was still sky high, I used the activity monitor to determine it was the new app, shut it down, then everything was back to normal.

    So we need to take the new app offline and diagnosis why it was doing that.

    Thanks everyone for your suggestions.

    • Marked As Answer by Ryan_Ha Monday, February 04, 2013 4:41 PM
    •