none
Full Text Indexing XML Column - Attribute values are not returned by Contains

    Question

  • I've created a typed XML column in a table and created a full-text index on it.

    I can execute queries using the "Contains" function and get results if the value being searched is contained in the value of an element.  But if its the value of an attribute it does not come back.

    The following article states "Attribute values are full-text indexed unless they are numeric values."  http://msdn.microsoft.com/en-us/library/bb522491.aspx

    But it doesn't seem to work for me, any help would be appreciated.  Here is an example xml:

    <LogXML> 
    <Props> 
    <Person surname="Jarrett" g1="Ryan" g2="A" g3="" g4="" g5="" dob="" gender="M" mtp="" fps="" role="" /> 
    </Props> 
    <Attributes> 
    <Attribute name="Command"> 
    <Value>Execute Test command on user Smith.</Value> 
    </Attribute> 
    </Attributes> 
    </LogXML>
    
    
    

     and my query:

    SELECT

     

    * FROM [dbo].[TestLogXML2] Where Contains (*, 'Jarrett' )

    -- No results come back with the above query

    SELECT

     

    * FROM [dbo].[TestLogXML2] Where Contains (*, 'Smith' )

    -- The proper results come back with this one.

     

     

     


    Ryan J
    Monday, March 22, 2010 7:38 PM

Answers

  • Hi Hilary,

    We got the answer from MSFT: "Only Level 0 and Level 1" attribute values get indexed."  They are going to review to see if this is a bug or by design and update their documentation.

    So if I altered my original XML:

    <LogXML> 
    <Props> 
    <Person surname="Jarrett" g1="Ryan" g2="A" g3="" g4="" g5="" dob="" gender="M" mtp="" fps="" role="" /> 
    </Props> 
    <Attributes> 
    <Attribute name="Command"> 
    <Value>Execute Test command on user Smith.</Value> 
    </Attribute> 
    </Attributes> 
    </LogXML>
    

    to:

    <LogXML> 
    <Person surname="Jarrett" g1="Ryan" g2="A" g3="" g4="" g5="" dob="" gender="M" mtp="" fps="" role="" /> 
    <Attributes> 
    <Attribute name="Command"> 
    <Value>Execute Test command on user Smith.</Value> 
    </Attribute> 
    </Attributes> 
    </LogXML>
    

    then the attribute values would get indexed.


    Ryan J
    • Edited by Ryan Jarrett Wednesday, April 14, 2010 4:49 PM more info
    • Marked as answer by Ryan Jarrett Wednesday, April 14, 2010 4:50 PM
    Wednesday, April 14, 2010 4:47 PM

All replies

  • What version of SQL Server are you using, and are your documents stored using the XML data type, or varbinary data type?

    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Monday, March 22, 2010 7:59 PM
  • Hi Hilary,

    We are using SQL 2008 Enterprise Edition.  They are stored in a typed XML data type with a Schema Collection (although I've tried it with and with out this).

    Here is an example of how I created the table:

    CREATE TABLE [dbo].[TestLogXML2](
    	[LogXML] [xml](DOCUMENT [dbo].[LogXML_SchemaCollection]) NULL,
    	[ID] [bigint] IDENTITY(1,1) NOT NULL,
     CONSTRAINT [PK_TestLogXML2] PRIMARY KEY CLUSTERED 
    (
    	[ID] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    

     


    Ryan J
    Tuesday, March 23, 2010 11:57 AM
  • It looks like it is only getting the contents of the Value tag. 
    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Tuesday, March 23, 2010 1:52 PM
  • Right ... but the msdn article says that the attribute values should be included.

    I found an online preview of a 2008 book that says attributes are included as part of the XML markup, but it was unclear (to me) if they meant just the attribute name or both the name and the value.


    Ryan J
    Tuesday, March 23, 2010 2:41 PM
  • Ryan - that might be the book I worked on. Once upon a time it was, however this appears to be no longer the case. I am following up with a Microsoft contact of mine as this appears to be a doc bug.

    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Tuesday, March 23, 2010 4:37 PM
  • Thank you Hilary.  Please keep me posted. :)
    Ryan J
    Tuesday, March 23, 2010 6:47 PM
  • Hi Hilary,

    We got the answer from MSFT: "Only Level 0 and Level 1" attribute values get indexed."  They are going to review to see if this is a bug or by design and update their documentation.

    So if I altered my original XML:

    <LogXML> 
    <Props> 
    <Person surname="Jarrett" g1="Ryan" g2="A" g3="" g4="" g5="" dob="" gender="M" mtp="" fps="" role="" /> 
    </Props> 
    <Attributes> 
    <Attribute name="Command"> 
    <Value>Execute Test command on user Smith.</Value> 
    </Attribute> 
    </Attributes> 
    </LogXML>
    

    to:

    <LogXML> 
    <Person surname="Jarrett" g1="Ryan" g2="A" g3="" g4="" g5="" dob="" gender="M" mtp="" fps="" role="" /> 
    <Attributes> 
    <Attribute name="Command"> 
    <Value>Execute Test command on user Smith.</Value> 
    </Attribute> 
    </Attributes> 
    </LogXML>
    

    then the attribute values would get indexed.


    Ryan J
    • Edited by Ryan Jarrett Wednesday, April 14, 2010 4:49 PM more info
    • Marked as answer by Ryan Jarrett Wednesday, April 14, 2010 4:50 PM
    Wednesday, April 14, 2010 4:47 PM
  • Hi,

    In this article for SQL Server 2005       http://msdn.microsoft.com/en-us/library/ms345117%28SQL.90%29.aspx

    is written that:

    " You can create a full-text index on XML columns; this indexes the content of the XML values while ignoring the XML markup. Attribute values are not full-text indexed (since they are considered part of the markup) and element tags are used as token boundaries"

    I couldn't find an article like that for SQL 2008.

     

    The article below is for SQL 2008

    http://msdn.microsoft.com/en-us/library/bb522491.aspx

    and it says

    "You can create a full-text index on XML columns that indexes the content of the XML values, but ignores the XML markup ."

    So XML markup is definitely not indexed, according to documentation as well as Attributes because they are part of markup (unless they are not lv1,2 :) ).

    It would be nice to have mentioned somewhere in the documentation for SQL 2008 that attributes are concidered as part of XML markup and not indexed.

    It would be even better to have them indexed ;)

     

    Cheers,

    Helen

    Tuesday, May 4, 2010 7:30 AM