none
full-text search special character question

    Question

  • Hi there , at the moment i m doing something with full text index and found some funnny result with the 2 search

     

    1) SELECT strLocationName FROM Location WHERE CONTAINS(strLocationName,'"Port-of-Spain"')

     

    2) SELECT strLocationName FROM Location WHERE CONTAINS(strLocationName,'"Port-of-Spain*"')

    the first search brings back result as port of spain and the second row doens't brings back any result .

    sorry this is a bit of a rubbish way to ask a question , is the "-" dash some special character in the full text search or the star is just making it goes funny e.t.c ?

     

    thanks for the help

    Monday, February 07, 2011 12:07 PM

Answers

All replies

  • Your second search should wildcard everything. Basically most hyphenated words in English are broken by the word breaker and stored in the index like this

    original word:

    data-base

    indexed as:

    database

    data

    base

    So this should work.

    However you real problem is that you should be doing this:

    SELECT strLocationName FROM Location WHERE CONTAINS(strLocationName,'"Port-of-Spain*"')

    instead of this:

    SELECT strLocationName FROM Location WHERE CONTAINS(strLocationName,'Port-of-Spain*')


    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    • Marked as answer by KJian_ Tuesday, February 15, 2011 10:25 AM
    Monday, February 07, 2011 2:31 PM
  • Hi,

    I have a similar problem, i get the next SQL:

     

     SELECT DISTINCT Id, TITULO,

     FROM Table

    WHERE  CONTAINS(TITULO, '5/1999')

     

    and i get bad results. 

    I have tried with:  WHERE  CONTAINS(TITULO, '"5/1999"')

    and also i get bad results.

    i think that the problem is the special character '/' but i'm no sure.

    Can you help me, please?


    Thursday, May 26, 2011 8:23 AM
  • What bad results are you getting? Can you give me some examples? Have you removed numbers from your stop word list and rebuilt your catalog?
    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Thursday, May 26, 2011 10:00 AM
  • I have a similar situation, my contention is that some punctuation is never stored regardless of your stoplists.  We have a column with keywords in it with an index (English language).  The Stoplist is OFF.  If you do a search such as

    select * from item where keywords like '% #4%' and keywords like '% wi%'

    you will get 63 rows (I insert the spaces to mimic the fulltext functionality).

    If you do the following

        SELECT item.*
        FROM item
        WHERE CONTAINS(item.keywords, '("wi*") AND ("#4*")')

    you will get 3563 rows.

    And if you do the following

        SELECT item.*
        FROM item
        WHERE CONTAINS(item.keywords, '("wi*") AND ("4*")')

    you also get 3563 rows so it looks like the # character is just ignored.

    When I do the following

        SELECT item.*
        FROM item
        WHERE CONTAINS(item.keywords, '("wi*") AND ("#*")')

    I get no rows and the following message:

    Informational: The full-text search condition contained noise word(s).

    Can anyone confirm that these are hardcoded, non-removable stopwords?

     

    Friday, August 05, 2011 8:54 PM
  • * is the wildcard operator. This is hardcoded.

    # is thrown away because almost all non alpha-numeric characters are not indexed. There are some exceptions - like the hyphen which has special treatment, and the . character.


    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Sunday, August 07, 2011 10:04 PM
  • i have a simmilar kind of problem

    select titel    from tableName where CONTAINS (   titel  , '("Allez") And ("hop") And ("7")'  )
    select  titel    from tableName where CONTAINS (   titel  , '("Allez") And ("hop")'  )
    

    the 2nd query returns   "Allez hop  7"

    but the 1st query does not return any record

    Friday, April 13, 2012 2:20 PM
  • This is due to the number 7 in the query which is a noise word. All single digit numbers, alphabets and some words like at, on, in and many other words are marked as noise words in the SQL Server by default and when they are included in your full text search they do not return any result.
    • Proposed as answer by Somnath T Monday, April 22, 2013 2:12 PM
    Monday, April 22, 2013 1:54 PM
  • Hi Hilary,

    I have a question on stoplist file. I need to search for exact phrase string("this is the incident") which contains   noise words. As part of the FT search engine, during parsing it eliminates noise word and search on remaining string in the given phrase. 

    let us say there are 10 rows which contains the term "incident" in the FT table . and 1 row which has the exact phrase.i.e."this is the incident".

    if we use containstable() to search for "this is the incident", we are getting 10 rows instead of 1 row.

    To resolve the issue, we have 3 solutions

    1.either stoplist file needs to be modified to remove the words (this,is,the)

    2. set stoplist = OFF.

    3.empty stoplist.

    Apart from the above solutions, is there any better solution with out touching noise words file list.

    If any solution that provides flexibility to ignore noise words conditionally at one time and not to ignore them. 

    Please provide your suggestion.



    kkprasad

    Friday, February 14, 2014 9:36 AM