Since last Friday (11/18) I've been seeing poor performance for insert, update, and delete operations in my SQL Azure instance. A query on sys.dm_exec_requests shows lots of blocked requests. I just saw 194 blocked requests, with only 243 total sessions on the database. Many of the blocked requests had a TOTAL_ELAPSED_TIME of over 22000 (milliseconds, I think). This certainly seems like it might be related.
The WAIT_TYPE for all of the blocked sessions is SE_REPL_COMMIT_ACK. I don't know exactly what that means - does anyone here know for sure? It looks like it might be related to replication, but that's just a guess. In any case, what should I do about this?
I've already opened a support ticket, but beyond asking me for the name of the database they haven't done anything at all in over 30 hours. I get the feeling they're at a loss too.
The queries I'm using:
SELECT session_id, blocking_session_id, wait_type, last_wait_type, wait_time, total_elapsed_time, cpu_time, logical_reads, reads, writes
FROM sys.dm_exec_requests AS r
ORDER BY total_elapsed_time DESC
FROM sys.dm_exec_sessions AS s
Hi Brian Reischl,
You can use the statement below to look into the queries which are blocking other queries:
SELECT r.session_id, r.plan_handle, r.sql_handle, r.request_id, r.start_time, r.status, r.command, r.database_id, r.user_id, r.wait_type, r.wait_time, r.last_wait_type, r.wait_resource, r.total_elapsed_time, r.cpu_time, r.transaction_isolation_level, r.row_count, st.text FROM sys.dm_exec_requests r CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) as st WHERE r.session_id in (SELECT distinct(blocking_session_id) FROM sys.dm_exec_requests) GROUP BY r.session_id, r.plan_handle, r.sql_handle, r.request_id, r.start_time, r.status, r.command, r.database_id, r.user_id, r.wait_type, r.wait_time, r.last_wait_type, r.wait_resource, r.total_elapsed_time, r.cpu_time, r.transaction_isolation_level, r.row_count, st.text ORDER BY r.total_elapsed_time desc
The blocking can be caused by inappropriate indexes on tables, poor design of applications and so on. Now, you can see the specific queries related to blocking, digging into these queries might be an effective method to this issue.
For more information: Finding Blocking Queries in SQL Azure.
Please remember to mark the replies as answers if they help and unmark them if they provide no help.
Every one of the sessions had a block_session_id of zero, so this query returns nothing. But it seems like a good one to add to my bag of tricks, so thanks for that.
It appears that Support has done something to fix the problem, but they haven't told me what yet. I'll come back here and write it in if/when they tell me.
The response from support was that this was caused by "Size of system database TEMPDB unexpectedly grown to high Percentage. This was later fixed by the self-healing mechanism of the system."
I'm not enough of a SQL Server guru to know for sure if that's plausible, but it smells funny to me. Would a TempDB problem cause the SE_REPL_COMMIT_ACK waits I was seeing? And what kind of self-healing mechanism only kicks in after 6 days, a support case, and two support engineers get involved?
Anyone with more SQL knowledge care to chime in?
That wait type is related to the SQL Azure-specific replication we do to ensure that you always have a primary and two secondaries. It is a bit surprising to me that it was causing large enough waits to significantly impact the query.
If you are still having trouble, please follow up with me privately at evanba at microsoft dot com with your case number.