Full-text search, forms of words

    Question

  • Hi all,

    Can I get the "main" (base) form of a word using FTS (SQL Server 2008)?

    for example:

    feet => foot

    ran => run

    running => run

    When I try to use

    SELECT *
    FROM sys.dm_fts_parser(' FORMSOF ( INFLECTIONAL, "feet") ',1033,0,0)

    I get several forms of the word "foot", but how can I tell which of these forms is the "main" one (in my example, foot)?

    thanks

    Monday, July 05, 2010 3:54 PM

Answers

  • No - SQL FTS and the full-text DMVs do not expose this in SQL Server 2008. What you are looking for is the root (stem) of the word. You might want to look at the Porter stemming algorithm to see how this is done.

    For irregular word endings, an exception list is consulted and the irregular stems are looked up - like goose and geese, deer and deer, and spouse and spice.
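
    To illustrate the idea outside SQL Server, here is a minimal Python sketch of "exception list first, then suffix stripping". It is not the Porter algorithm itself and not what FTS does internally; the IRREGULAR table, the base_form name, and the suffix rules are all assumptions made up for this example.

```python
# Hypothetical exception list for irregular forms (illustration only);
# a real stemmer or lemmatizer ships a much larger table.
IRREGULAR = {
    "feet": "foot",
    "geese": "goose",
    "ran": "run",
    "deer": "deer",
}


def base_form(word):
    """Guess a base form: consult the exception list, then strip a common suffix."""
    w = word.lower()

    # 1. Irregular forms are resolved by lookup only.
    if w in IRREGULAR:
        return IRREGULAR[w]

    # 2. Naive suffix stripping for regular forms.
    for suffix in ("ing", "ed", "s"):
        if not w.endswith(suffix) or len(w) - len(suffix) < 3:
            continue
        if suffix == "s" and w.endswith("ss"):
            continue  # leave words like "class" alone
        stem = w[: -len(suffix)]
        # Undo consonant doubling ("runn" -> "run"), but keep ll/ss/zz ("fall").
        if suffix != "s" and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
            stem = stem[:-1]
        return stem

    # 3. Nothing matched: assume the word is already in its base form.
    return w


for form in ("feet", "ran", "running"):
    print(form, "=>", base_form(form))
```

    A rule set this small gets many words wrong (for example, it does not restore a dropped final "e"), which is why a real stemmer uses more careful rules.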

    • Marked as answer by KJian_ Wednesday, July 14, 2010 5:17 AM
    Monday, July 05, 2010 5:56 PM

All replies

  • thanks
    Tuesday, July 06, 2010 9:05 AM