Unanswered search without accent

  • Monday, January 24, 2011 11:00 AM
     
     

    hi

    I have a field that contains accented characters, and I want searches on this field to ignore the accents.

    Thank you


    Bijan

All Replies

  • Monday, January 24, 2011 1:05 PM
     
     

    I am not totally sure what you mean by your question.

    Are you saying when you search on accented words, the search does not find the accented words in the indexed content? For example, if you search on a word like Résumé, does full text search not find full-text indexed content containing this word?

    Or are you saying that when you search on a word like Résumé, full text search does not find the word resume?


    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
  • Wednesday, January 26, 2011 2:03 PM
     
     

    hi

    I am searching a field that contains Arabic text, and when I search for the word "احد"

    I want SQL Server to find احد-اَحد-اُحد-اّحد

     


    Bijan
  • Wednesday, January 26, 2011 10:02 PM
     
     

    How is your Full Text Catalog defined?  Each Fulltext Index belongs to a catalog, which is where Accent Sensitivity is set.

    http://msdn.microsoft.com/en-us/library/ms176095.aspx

    ALTER FULLTEXT CATALOG ftCatalog
    REBUILD WITH ACCENT_SENSITIVITY=OFF;
    GO

    The Fulltext Index is where the data is defined with the  "Arabic" language.  Putting those two distinct settings together should give you what you want.
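
    Before rebuilding, it may help to confirm which setting the catalog currently has. A minimal sketch, assuming the catalog is named ftCatalog as in the statement above (substitute your own catalog name):

```sql
-- Returns 1 if the catalog is accent sensitive, 0 if it is accent insensitive.
SELECT FULLTEXTCATALOGPROPERTY(N'ftCatalog', 'AccentSensitivity') AS AccentSensitivity;

-- The same flag for every full-text catalog in the current database.
SELECT name, is_accent_sensitivity_on
FROM sys.fulltext_catalogs;
```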

    RLF

  • Saturday, January 29, 2011 10:02 AM
     
     

    hi

    I did that for my table, but it doesn't work for the Arabic language.


    Bijan
  • Monday, January 31, 2011 1:51 AM
     
     

    Then perhaps I misunderstood your request and you should try:

    ALTER FULLTEXT CATALOG ftCatalog
    REBUILD WITH ACCENT_SENSITIVITY=ON;

    If that does not work for you either, then it may be that what you view as an accent is not the same as the language definition of an accent.  Unfortunately, I am not able to make similar judgements on the Arabic script.

    Turkish, for example, has an 'i' with a dot and an 'i' without a dot.  These are not treated as accented variants of one letter, but as two different characters.
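
    One way to see how a given language's word breaker treats a mark is to run the same term through sys.dm_fts_parser with accent sensitivity off and then on, and compare the terms it produces. This is only a diagnostic sketch: it assumes SQL Server 2008 or later, the Arabic LCID 1025, and the system stoplist, and to my knowledge the function requires sysadmin membership.

```sql
-- Arguments: query string, LCID (1025 = Arabic),
-- stoplist id (0 = system stoplist), accent sensitivity (0 = off, 1 = on).
SELECT display_term, keyword, special_term
FROM sys.dm_fts_parser(N'"اَحد"', 1025, 0, 0);

-- Same term with accent sensitivity on; compare display_term with the result above.
SELECT display_term, keyword, special_term
FROM sys.dm_fts_parser(N'"اَحد"', 1025, 0, 1);
```

    If both calls return the term with the mark still attached, the mark is not being treated as an accent for that language.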

    RLF

  • Monday, January 31, 2011 2:47 PM
     

    Something is strange here as this works fine for me.

    USE [test]
    GO
    CREATE FULLTEXT CATALOG [test] WITH ACCENT_SENSITIVITY = ON
    AS DEFAULT
    AUTHORIZATION [dbo]
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[FULLTEXT](
    	[PK] [int] IDENTITY(1,1) CONSTRAINT FULLTEXTPK PRIMARY KEY NOT NULL,
    	[charcol] [nvarchar](100) NULL)
    GO
    CREATE FULLTEXT INDEX ON [dbo].[FULLTEXT](
    [charcol] LANGUAGE [Arabic])
    KEY INDEX [FULLTEXTPK] ON ([test], FILEGROUP [PRIMARY])
    WITH (CHANGE_TRACKING = AUTO, STOPLIST = SYSTEM)
    GO
    INSERT INTO FULLTEXT (charcol) VALUES (N'احد')
    INSERT INTO FULLTEXT (charcol) VALUES (N'احد-اَحد-اُحد-اّحد')
    GO
    SELECT * FROM FULLTEXT WHERE CONTAINS(*, N'احد')

    1	احد
    2	احد-اَحد-اُحد-اّحد
    
    


  • Tuesday, February 01, 2011 4:27 AM
     
     

    hi

    You put 'احد' in both records.

    In our data, each record has only one form of احد:

    INSERT INTO FULLTEXT(charcol) VALUES (N'اّحد')

    INSERT INTO FULLTEXT(charcol) VALUES (N'اَحد')

    INSERT INTO FULLTEXT(charcol) VALUES (N'اُحد')

    INSERT INTO FULLTEXT(charcol) VALUES (N'اِحد')

    INSERT INTO FULLTEXT(charcol) VALUES (N'احد')

    for example

    record 4 has only احد

    record 2 has only اَحد

    and record 3 has only اُحد

    and record 5 has only اِحد

    and when I search for احد without accents, I want SQL Server to find all four records

     

     


    Bijan
  • Tuesday, February 01, 2011 4:41 AM
     
     

    I should add that the accents (diacritical marks) in the Arabic and Persian languages are "  َ,ُ,ِ,ّ    "

    We call them erab (اعراب) in our language:

    accent (English) = erab (Arabic)

    or, better, "diacritical mark":

    diacritical mark (English) = erab (Arabic)


    Bijan
  • Wednesday, February 02, 2011 4:14 PM
     

    I ran Hilary's script with the following data and queries, with the Fulltext Catalog set first to ACCENT_SENSITIVITY = ON and then to ACCENT_SENSITIVITY = OFF.

    INSERT INTO FULLTEXT(charcol) VALUES (N'احد')
    INSERT INTO FULLTEXT(charcol) VALUES (N'اَحد')
    INSERT INTO FULLTEXT(charcol) VALUES (N'اُحد')
    INSERT INTO FULLTEXT(charcol) VALUES (N'اّحد')
    INSERT INTO FULLTEXT(charcol) VALUES (N'احد-اَحد-اُحد-اّحد')
    
     GO
     SELECT * FROM FULLTEXT WHERE CONTAINS(*, N'احد')
     SELECT * FROM FULLTEXT WHERE CONTAINS(*, N'اّحد')
     SELECT * FROM FULLTEXT WHERE CHARCOL LIKE N'%اّحد%'
    

    The results that I get are as follows:

    PK     charcol
    ----------- -------------------
    1      احد
    5      احد-اَحد-اُحد-اّحد
    (2 row(s) affected)
    
    PK     charcol
    ----------- -------------------
    4      اّحد
    5      احد-اَحد-اُحد-اّحد
    (2 row(s) affected)
    
    PK     charcol
    ----------- -------------------
    4      اّحد
    5      احد-اَحد-اُحد-اّحد
    (2 row(s) affected)

    Therefore, using the Arabic language for the fulltext index apparently does not view these diacritics as accents, but as individual characters.  (Similar to the dotted-i and undotted-i in Turkish.  Other languages have similar characters as well.)

    Accent Sensitivity is defined by the language.  As I mentioned, I do not read Arabic, so I cannot comment on the propriety of this or not, but that is apparently what is happening.  It looks like you will need to create a thesaurus for the characters that you want to match or else do some other manual effort to turn the one character into 1 OR 2 OR 3 OR 4.
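
    The manual "1 OR 2 OR 3 OR 4" approach could look roughly like the query below. It is only a sketch against Hilary's test table: the list of spellings is written out by hand, and it assumes each spelling is indexed as its own term, which is what the results above suggest.

```sql
-- Each quoted term is one spelling of the word;
-- OR returns rows that contain any of them.
SELECT PK, charcol
FROM dbo.[FULLTEXT]
WHERE CONTAINS(charcol, N'"احد" OR "اَحد" OR "اُحد" OR "اِحد" OR "اّحد"');
```

    A thesaurus expansion set would move the same list of spellings out of the query and into the language's thesaurus file.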

    Sorry,
    RLF

  • Saturday, February 05, 2011 4:31 AM
     
     

    I found a function, REPLACE, that can remove one character from a text so that I can compare the result with another text.

    However, this function replaces only one character at a time. I need a function that removes several characters from a text. If you know of one, please suggest it.

    With such a function I could remove the accent characters and then compare the result with a search text.

    Thank you


    Bijan
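
    One common way to remove several characters is to nest REPLACE calls, one per character. The sketch below wraps that in a scalar function; the name dbo.StripArabicDiacritics is hypothetical, and it removes only the four marks listed in this thread (fatha, damma, kasra, shadda). The binary collation is an assumption made so that REPLACE matches the marks by code point regardless of the database collation.

```sql
-- Hypothetical helper: strips four Arabic diacritical marks from a string.
CREATE FUNCTION dbo.StripArabicDiacritics (@text nvarchar(4000))
RETURNS nvarchar(4000)
AS
BEGIN
    RETURN REPLACE(REPLACE(REPLACE(REPLACE(
        @text COLLATE Latin1_General_BIN2,
        NCHAR(1614), N''),   -- U+064E fatha
        NCHAR(1615), N''),   -- U+064F damma
        NCHAR(1616), N''),   -- U+0650 kasra
        NCHAR(1617), N'');   -- U+0651 shadda
END
GO

-- Compare the stripped column with an unaccented search word.
SELECT PK, charcol
FROM dbo.[FULLTEXT]
WHERE dbo.StripArabicDiacritics(charcol) LIKE N'%احد%';
```

    A query like this cannot use the full-text index and has to scan the table. For larger tables, storing the stripped text in a separate column and full-text indexing that column may be a better fit.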