SQL full text searching / location of hit within document

Respondida SQL full text searching / location of hit within document

  • Tuesday, January 24, 2012 10:38 PM
     
     

    I have a table with an NVARCHAR(MAX) column (FileText) which I use to store text associated with a given file.  I have a full text index on it that allows me to search this table for given term / phrase / etc:

    SELECT  DocID, FileName, FileText FROM MyTable WHERE FREETEXT(*, N'Test', LANGUAGE 1033)

    This tells me which document a given phrase occurs in.  What it doesn't tell me is where in the document a given term occurs (offset).

    Does SQL2008 Full Text or SQL Denali Full Text offer such functionality (hit location / offset within document)?  This functionality would be helpful for showing the hit context to a user (i.e. the immediate words to the left and right of the hit).  I hope to not have to write such functionality myself if possible.

     

     

     

All Replies

  • Wednesday, January 25, 2012 7:38 AM
    Moderator
     
     

    Hi scott_m,

    SQL Server 2008 offers new dynamic management views that offer this metadata for full text. Please take a look at sys.dm_fts_parser and sys.dm_fts_index_keywords.

    The sys.dm_fts_parser view takes in a phrase, along with a couple of other parameters and outputs a table showing a row set, showing stemmed versions of the individual words after the word breaker has deemed them as separate words.

    MSDN gives the example of this query against the view:
    SELECT * FROM sys.dm_fts_parser (' "The Microsoft business analysis" ', 1033, 0, 0)

    To get the keywords, you can use sys.dm_fts_index_keywords.


    Best Regards,
    Peja

    Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
  • Wednesday, January 25, 2012 3:34 PM
     
     
    User wants ability to retrieve text from column offset X to offset Y within DocText column without loading entire DocText column as it could be very large.  Apparently, the sql full text search provides us with the record the hit took place in but not the location / offset within the full text column for that record.
    Example:
    My Table:
    DocID int
    DocName nvarchar(50)
    DocText nvarchar(MAX) 
     
    1, "MyDocument", "Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.................................................................."
    Retrieve text at offset 91-111 ("conceived in Liberty").  The seems like a stream type operation but isn't SQL FileStream limited to varbinary?  I need nvarchar(MAX) for SQL full text searching.
  • Wednesday, January 25, 2012 7:33 PM
     
     Answered

    No SQL Server 2012 (Denali) full text indexing and search still does not support returning the offset of a word or phrase in a search.

    (But SQL Server 2012 does support a new NEAR operator which is looser that a "phrase of words" since it gives you the ability to query for words being close to each other with a particular position.)

    RLF

    • Marked As Answer by scott_m Wednesday, January 25, 2012 7:46 PM
    •