Why do some XML functions have such high query plan cost?

Unanswered Why do some XML functions have such high query plan cost?

  • Tuesday, June 02, 2009 2:22 PM
     
     
    Hi,

    Does anyone know why this generates an expensive query plan:

    DECLARE @XML XML
    
    SELECT @XML = '<XML Id="Hello"></XML>'
    
    
    
    SELECT c.value('.', 'VARCHAR(MAX)') FROM @XML.nodes('//XML') T(c);
    
    


    The queryplan includes two "Table Valued Function [XML Reader]"'s to occur with a cost of 61.08 each on my 2005 sp3 developer server.  If I select out an attribute then it is considerably lower:


    SELECT
    c.value('.', 'VARCHAR(MAX)') FROM @XML.nodes('//XML/@Id') T(c);


    This results in a [XML Reader] and a [XML Reader wuth XPath filter] and the cost is 1.004 for each operator.

    Running tests show that they are both quick, but I am writing a stored proc for a high transaction system and always try to keep the query plan as low as possible and would not normally consider putting anything in production that is expensive as this.

    So - 

     

    1. Am I doing something wrong?
    2. Is it actually as expensive as it says it is and will this cause queries to run slowly later on?

    Thanks for any advice on this,




    Ed

    • Edited by Ed_E_1 Tuesday, June 02, 2009 2:22 PM tidying up formatting
    •  

All Replies

  • Monday, April 19, 2010 12:12 PM
     
      Has Code

    Hi,

     

    First of all, the two SELECT statements you have provided are not equivalent. The first one should be 

    SELECT c.value('./@Id', 'VARCHAR(MAX)') FROM @XML.nodes('//XML') T(c);
    

    When I run this and the second of your queries I get zero ms as the execution time from SET STATISTICS TIME ON so I think you'll be ok with either.

     

    If your XML is stored in a table you may want to think about adding XML indexes to the column. This will help if your XML is large.

     

    Hope that helps,

    Ben

  • Tuesday, November 23, 2010 1:42 AM
     
      Has Code

    Hi Ed,

    Did you ever get to the bottom of this? I have similar problems. A sproc receives two xml parameters (or small or moderate size, up to perhaps 1000 characters) and takes 5 minutes to run. The actual execution plan says that 99% of the cost is linked to inserting the two xml files into temp tables. This seems crazy to me. If I remove the xml processing from the sproc and just run it in a batch script as shown below, it runs almost instantaneously. But when part of a sproc that runs for 5 minutes, the insert into the #temp table takes 79% of the cost. I don't trust the query plan stats.

    FWIW, the procedure cache on these servers is very tight. There are 15 copies of the same DB, one for each customer, on each server and each DB caches the same sprocs, about 10-15 or so. (I know it's a bad design; I inherited it.) I am wild guessing that the procedure cache manager may be want to recompile the sproc, depending on the xml parameters, and that it has to do garbage cleaning to find space to do so, then actually compile the sproc. Just a wild guess, but we run into this a lot.

    declare @sortInfo xml 
    set @sortInfo=convert(xml,N'<ArrayOfWfSortInfo/>') 
    
    if object_id('tempdb..#sortInfo') is not null drop table #sortInfo
    	create table #sortInfo(
    		orderByType nvarchar(10)
    		, orderByExpression nvarchar(3000)
    		, orderByDirection varchar(4)
    		, dimensionId bigint
    		, dimensionName nvarchar(200)
    		, attributeName nvarchar(200)
    		, attributeNumber int
    		, levelnumber int)
    
    insert into #sortInfo
    		SELECT  fplan.query('./orderByType').value('.', 'nvarchar(10)') orderByType,
    			 fplan.query('./orderByExpression').value('.', 'nvarchar(3000)') orderByExpression,
    			 fplan.query('./orderByDirection').value('.', 'varchar(4)') orderByDirection,
    			convert(bigint, null) dimensionId,
    			convert(nvarchar(200), null) dimensionName,	
    			convert(nvarchar(200), null) attributeName,
    			convert(int, null) attributeNumber,
    			convert(int, null) levelNumber			
        FROM  @sortInfo.nodes ( '/ArrayOfWfSortInfo/WfSortInfo' ) T ( fplan );
    
    
  • Tuesday, November 23, 2010 4:26 PM
    Answerer
     
     

    Can you provide a small piece of sample XML?

    Have you tried typing the XML ie using an XML SCHEMA COLLECTION.  I've seen impressive results with that.

    Also have a look through these two great articles:

    Performance Optimizations for the XML Data Type in SQL Server 2005
    http://msdn.microsoft.com/en-us/library/ms345118.aspx

    XML Indexes in SQL Server 2005
    http://msdn.microsoft.com/en-us/library/ms345121(SQL.90).aspx

  • Tuesday, November 23, 2010 5:18 PM
     
     

    wBob, thanks for responding.

    The xml is shown at the top as @sortInfo. It is trivial. The insert inserts no rows. It runs instantaneously by itself as shown above.

    However, the actual query plan for the sproc in which it is contained says this is 78% of the total cost of the batch (a 5 minute run), which seems ridiculous. The query plan for this trivial insert when part of the sproc says it sets up 5 xml readers table-valued functions and 3 xml readers with xpath filters.

    The query plan for the same insert when run in isolation and not part of the sproc as above says it sets up 4 xml readers and 2 xml readers with xpath filters.

    It seems that the optimizer is setting up a lot of infrastructure in both cases. I wish it could be smart enough to avoid it when the activity is trivial. But regardless of the infrastructure, in the isolated case it runs fast, while when part of the sproc, it runs slow (if I can believe the query plan).

    As a workaround, I suppose I can check for the trivial xml and not do the insert in that case.

    Also, I will look at the articles you cited.

    Thanks, Jeff Roughgarden

  • Tuesday, November 23, 2010 6:31 PM
    Answerer
     
      Has Code

    I did have a case recently where using the OPTIMIZE FOR query option helped, something like this.  Give it a try and let me know how you get on:

    insert into #sortInfo
    		SELECT fplan.query('./orderByType').value('.', 'nvarchar(10)') orderByType,
    			 fplan.query('./orderByExpression').value('.', 'nvarchar(3000)') orderByExpression,
    			 fplan.query('./orderByDirection').value('.', 'varchar(4)') orderByDirection,
    			convert(bigint, null) dimensionId,
    			convert(nvarchar(200), null) dimensionName,	
    			convert(nvarchar(200), null) attributeName,
    			convert(int, null) attributeNumber,
    			convert(int, null) levelNumber			
      FROM @sortInfo.nodes ( '/ArrayOfWfSortInfo/WfSortInfo' ) T ( fplan )
    OPTION ( OPTIMIZE FOR ( @sortInfo = NULL ) )
    
    
  • Tuesday, August 07, 2012 6:01 PM
     
      Has Code
    It showing cost of 99% in Table Valued Function [ XML Reader with XPath filter] ; I am trying to know whether 

    OPTION ( OPTIMIZE FOR ( @sortInfo = NULL ) ) did help to reduce the query execution cost; appreciate any help.

    Thanks