How to reduce a SQL Server bottleneck?

    Question

  • Hello everyone,

    We have a situation here: I am using a SQL Server 2008 database for OLTP, and some delete queries take hours to finish.

    So I checked the Performance Monitor counters for % Disk Time and Avg. Disk Queue Length, and I found there is a bottleneck on my server.

    How can I reduce the bottleneck? Please give me some suggestions.

    Average values of the counters, recorded over several hours:

    % Disk Time = 740.104, max: 4315.65

    Avg. Disk Queue Length = 14.802, max: 86.320

    We have 10 tables in a single database, with one data file and one log file. Would increasing the number of data files and placing them on different physical disks solve this?

    Transactions are happening continuously on all tables: bulk inserts, deletes of thousands of rows, fetches...

    13 March 2012 12:29


All messages

  • Yep, separating the data and log files is a good idea. Also, please take a look at the indexes of the tables that participate in the blocking.

    What does the below return?

    WITH Waits AS
    (
        SELECT
            wait_type,
            wait_time_ms / 1000. AS wait_time_s,
            100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
            ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn
        FROM sys.dm_os_wait_stats
        WHERE wait_type NOT IN ('CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK',
            'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE', 'CHECKPOINT_QUEUE',
            'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BROKER_TO_FLUSH', 'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT',
            'CLR_AUTO_EVENT', 'DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT',
            'XE_DISPATCHER_WAIT', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', 'XE_DISPATCHER_JOIN')
            -- filter out additional irrelevant waits here
    )
    SELECT
        W1.wait_type,
        CAST(W1.wait_time_s AS DECIMAL(12,2)) AS wait_time_s,
        CAST(W1.pct AS DECIMAL(12,2)) AS pct,
        CAST(SUM(W2.pct) AS DECIMAL(12,2)) AS running_pct
    FROM Waits AS W1
        JOIN Waits AS W2
            ON W2.rn <= W1.rn
    GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct
    HAVING SUM(W2.pct) - W1.pct < 90  -- percentage threshold
    ORDER BY W1.rn;


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/

    13 March 2012 12:35
  • The result I got for the above query:

    wait_type            wait_time_s   pct     running_pct
    BROKER_EVENTHANDLER  266733.36     62.11   62.11
    SLEEP_BPOOL_FLUSH    50517.38      11.76   73.87
    WRITELOG             32399.45      7.54    81.42
    PAGEIOLATCH_EX       30639.07      7.13    88.55
    PAGEIOLATCH_SH       23848.10      5.55    94.11

    13 March 2012 13:25
  • Your disk IO looks OK... What problems are you observing? Are your queries running slowly? Can you show us an execution plan of the query?

    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/

    13 March 2012 13:38
  • I'm not able to get the execution plan right now. Sorry for the inconvenience.

    All my fetch queries, which use indexes, are fine and take a few seconds to complete; no problem there.

    Server conditions:

    1) Every 3 minutes, a bulk insert of 10,000 rows happens in each table.

    2) Every ten minutes, a delete job runs on one table. Fetching also happens continuously on this table.

    3) Every day, 70 million rows are deleted from one table using a job. Fetching also happens continuously.

    Problems I am facing:

    1) When I try to delete the 70 million rows batch-wise (5,000 rows per batch), it takes 7 hours.

    2) The performance counters are also showing high values.

    3) SQL Server uses too much of the system's resources, which makes the server too slow and sometimes makes it hang.


    13 March 2012 13:51
  • What is the current IO subsystem architecture?
    Do you have physical disks, DAS or SAN?
    Are the disks in a RAID configuration? If so, how many disks are in each LUN, and at which RAID level?
    How many disks are presented to Windows?


    From your initial comment, it sounds like your data and log files are on the same drive. Best practice is to split the random IO of an OLTP workload and the sequential IO of a log file onto separate physical drives. I would also recommend you have a separate drive for TempDB and another for your backups.
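
    For example, a minimal sketch of relocating the log file to its own drive (the database name MyDb, the logical file name MyDb_log, and drive L: are hypothetical):

    USE master;
    -- Point SQL Server at the new location; it takes effect the next time the database starts up.
    ALTER DATABASE MyDb
        MODIFY FILE (NAME = MyDb_log, FILENAME = N'L:\SQLLogs\MyDb_log.ldf');
    -- Take the database offline...
    ALTER DATABASE MyDb SET OFFLINE WITH ROLLBACK IMMEDIATE;
    -- ...move MyDb_log.ldf to L:\SQLLogs\ at the operating-system level, then bring it back online.
    ALTER DATABASE MyDb SET ONLINE;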

    If you find this helpful, please mark the post as helpful.
    If you think this solves the problem, please propose or mark it as an answer.

    Please provide details on your SQL Server environment such as version and edition, also DDL statements for tables when posting T-SQL issues

    Richard Douglas
    My Blog: Http://SQL.RichardDouglas.co.uk
    Twitter: @SQLRich


    13 March 2012 14:03
  • I don't have much knowledge about Windows hard drive architectures...

    Could you please suggest how to find the hardware specification? I can't use a third-party tool, since this is a live server and I can't take any risks.

    Yes, my data file and log file are on the same drive.

    I am planning to move them to different drives. Will that really be enough to solve my problem (% Disk Time and Avg. Disk Queue Length)?

    Thanks for your effort.

    13 March 2012 14:31
  • What you are describing is almost certainly a blocking issue, not a performance issue.  You need to look at the reason the delete query is taking so long.

    13 March 2012 15:10
  • Tom, he's deleting 70,000,000 rows!  Even with no blocking, that is going to be very slow.  OP, that is not something SQL Server is good at.  How many rows are left after the delete?

    If you can afford to delete ALL, then you can truncate the table and that takes about one second.

    If you can't afford to delete all, then you might want to look into partitioned tables, that lets you drop an entire partition in about one second.
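
    A rough sketch of that kind of partition-level purge on SQL Server 2008 (Enterprise Edition only; the table dbo.BigTable, its partition numbering, and the staging table are hypothetical):

    -- dbo.BigTable is assumed to be partitioned by day; dbo.BigTable_Staging has an
    -- identical structure and sits on the same filegroup as the partition being purged.
    -- Switching a partition out is a metadata-only operation, so it is nearly instant.
    ALTER TABLE dbo.BigTable
        SWITCH PARTITION 1 TO dbo.BigTable_Staging;

    -- The old day's rows now live only in the staging table and can be dropped cheaply.
    TRUNCATE TABLE dbo.BigTable_Staging;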

    Or going old school, renaming the 70m row table and using a new one for the next day, with or without a partitioned view on top of it.
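
    And a minimal sketch of the "old school" alternative with one table per day behind a partitioned view, which works on any edition (all object names and columns are hypothetical, and keys/constraints are trimmed down):

    -- One table per day, each constrained to a single day's data.
    CREATE TABLE dbo.Events_20120313 (
        EventDate date         NOT NULL CHECK (EventDate = '20120313'),
        Payload   varchar(100) NOT NULL
    );
    CREATE TABLE dbo.Events_20120314 (
        EventDate date         NOT NULL CHECK (EventDate = '20120314'),
        Payload   varchar(100) NOT NULL
    );
    GO
    -- Readers query the view; purging a day is just DROP TABLE plus ALTER VIEW.
    CREATE VIEW dbo.Events AS
        SELECT EventDate, Payload FROM dbo.Events_20120313
        UNION ALL
        SELECT EventDate, Payload FROM dbo.Events_20120314;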

    Hope that helps,

    Josh

    13 March 2012 18:25
  • Thanks tom hrstern,

    Here is my method for deleting the 70 million rows (of the table's 7 columns, 2 form a composite non-clustered index, and I use both of those columns in the delete query):

    WHILE 1 = 1
    BEGIN
        DELETE TOP (5000) FROM table WITH (READPAST)
        WHERE nonclustered_index_column = value;

        IF @@ROWCOUNT < 5000
            BREAK;
    END

    I can't use TRUNCATE here, because the table contains 100 million rows in total. I am thinking of doing partitioning.

    Meanwhile, I just checked the perf counters Physical Disk: % Disk Read Time and Physical Disk: % Disk Write Time.

    Most of the time it shows:

    % Disk Read Time spiking --> 647

    % Disk Write Time --> 240

    which means the disk reads and writes are not balanced.

    What can I do about this?

    14 March 2012 04:49
  • Take a look at the way the SQLCAT team recommended doing large deletes to Myspace, it's an interesting read - http://blogs.msdn.com/b/sqlcat/archive/2009/05/21/fast-ordered-delete.aspx

    It would be worth checking the amount of IO you are performing now against the method that they suggest. Although I do like the elegant partitioning idea, assuming you are using enterprise edition of course.
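
    Roughly, the pattern that post describes is deleting through a view (or derived table) that uses TOP with an ORDER BY on the clustered key, since DELETE TOP (n) by itself does not allow an ORDER BY. A minimal sketch with hypothetical names (dbo.BigTable with clustered key column Id):

    -- The ordering is pushed into the view; DELETE then targets the n lowest-keyed rows.
    CREATE VIEW dbo.BigTable_OldestRows
    AS
        SELECT TOP (5000) *
        FROM dbo.BigTable
        ORDER BY Id;
    GO
    -- Run in a loop until no rows remain to delete.
    DELETE FROM dbo.BigTable_OldestRows;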


    If you find this helpful, please mark the post as helpful.
    If you think this solves the problem, please propose or mark it as an answer.

    Please provide details on your SQL Server environment such as version and edition, also DDL statements for tables when posting T-SQL issues

    Richard Douglas
    My Blog: Http://SQL.RichardDouglas.co.uk
    Twitter: @SQLRich

    14 March 2012 09:09
  • Take a look at the way the SQLCAT team recommended doing large deletes to Myspace, it's an interesting read - http://blogs.msdn.com/b/sqlcat/archive/2009/05/21/fast-ordered-delete.aspx

    It would be worth checking the amount of IO you are performing now against the method that they suggest.

    Always check your IO!!!!  Much more often than not, IO numbers reflect efficiency.

    But I disagree with that SQLCAT post, and it doesn't recommend anything different from what the OP is already doing here.

    OP, the real solution is architectural: either partitioned tables or, if you don't have Enterprise Edition, grow your own with multiple tables and views.

    But here's a thing that might help a lot - just change your rowcount from 5000 to 50000!  See if that doesn't cut an hour or two off your time.  For good luck, I usually put a little WAITFOR in those kinds of loops, just wait a second or two, in case you've been hogging resources for too long.
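
    Applied to the loop posted above, that suggestion looks roughly like this (table, column, and value are still just the placeholders from the earlier post):

    WHILE 1 = 1
    BEGIN
        DELETE TOP (50000) FROM table WITH (READPAST)
        WHERE nonclustered_index_column = value;

        IF @@ROWCOUNT < 50000
            BREAK;

        -- Breathe for a couple of seconds so other work gets a turn at the disks.
        WAITFOR DELAY '00:00:02';
    END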

    The SQLCAT *idea* is good, using the clustered key and key-based scans instead of scanning the entire table from top to bottom each time, but their solution does not accomplish it!  The comment I posted there is that even if the execution plan looks better on a first, small run, when you put that into a loop as you must it does not scale, and even the initial method may actually run faster, especially if the keys used are the CK.

    Josh

    • Marked as answer by KJian_ 20 March 2012 03:34
    14 March 2012 19:32