Tuesday, January 24, 2012 10:38 PM
I have a table with an NVARCHAR(MAX) column (FileText) which I use to store text associated with a given file. I have a full text index on it that allows me to search this table for given term / phrase / etc:
SELECT DocID, FileName, FileText FROM MyTable WHERE FREETEXT(*, N'Test', LANGUAGE 1033)
This tells me which document a given phrase occurs in. What it doesn't tell me is where in the document a given term occurs (offset).
Does SQL2008 Full Text or SQL Denali Full Text offer such functionality (hit location / offset within document)? This functionality would be helpful for showing the hit context to a user (i.e. the immediate words to the left and right of the hit). I hope to not have to write such functionality myself if possible.
Wednesday, January 25, 2012 7:38 AMModerator
SQL Server 2008 offers new dynamic management views that offer this metadata for full text. Please take a look at sys.dm_fts_parser and sys.dm_fts_index_keywords.
The sys.dm_fts_parser view takes in a phrase, along with a couple of other parameters and outputs a table showing a row set, showing stemmed versions of the individual words after the word breaker has deemed them as separate words.
MSDN gives the example of this query against the view:
SELECT * FROM sys.dm_fts_parser (' "The Microsoft business analysis" ', 1033, 0, 0)
To get the keywords, you can use sys.dm_fts_index_keywords.
Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
Wednesday, January 25, 2012 3:34 PMUser wants ability to retrieve text from column offset X to offset Y within DocText column without loading entire DocText column as it could be very large. Apparently, the sql full text search provides us with the record the hit took place in but not the location / offset within the full text column for that record.Example:My Table:DocID intDocName nvarchar(50)DocText nvarchar(MAX)1, "MyDocument", "Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.................................................................."Retrieve text at offset 91-111 ("conceived in Liberty"). The seems like a stream type operation but isn't SQL FileStream limited to varbinary? I need nvarchar(MAX) for SQL full text searching.
Wednesday, January 25, 2012 7:33 PM
No SQL Server 2012 (Denali) full text indexing and search still does not support returning the offset of a word or phrase in a search.
(But SQL Server 2012 does support a new NEAR operator which is looser that a "phrase of words" since it gives you the ability to query for words being close to each other with a particular position.)
- Marked As Answer by scott_m Wednesday, January 25, 2012 7:46 PM