locked
Select phonetic records SQL server RRS feed

  • Question

  • User-375223641 posted

    I'm trying to search a table for users with similar names like John Smith and Joan Smith.

    I tried this SQL statement but I think it is so simple and will not give me an accurate result

    SELECT * FROM Profile Where DIFFERENCE(profile.name, 'John Smith') >=2

    Monday, March 19, 2018 1:26 PM

All replies

  • User753101303 posted

    Hi,

    Which kind of result do you want? What matters is how it sounds or the spelling?

    You have also https://devio.wordpress.com/2010/09/07/calculating-levenshtein-distance-in-tsql/ which is the number of edit operations you need to do to change a string into another string.

    Monday, March 19, 2018 2:42 PM
  • User-375223641 posted

    I want to get all names which their pronunciations are similar. 

    Monday, March 19, 2018 3:03 PM
  • User753101303 posted

    Ok and the unexpected result you have is ?

    SELECT DIFFERENCE('Joan Smith','John Smith') returns 4 here which seems pretty much correct given a 0 to 4 scale.

    The doc is not 100% clear but it seems to imply that the value may be different depending on the collation (as https://docs.microsoft.com/en-us/sql/t-sql/functions/difference-transact-sql tells which value is returned for the sample but tells that this is for a Latin1_General collation).

    Which collation do you use ?

    Edit: ah or if you mean you have to hard time to compare all names try :

    SELECT * FROM Profile a
    JOIN Profile b ON b.Name<>a.Name
    WHERE DIFFERENCE(a.name, b.name) >=2 -- I would start with 4
    ORDER BY DIFFERENCE(a.name,b.name) DESC,a.name
    

    Monday, March 19, 2018 3:37 PM
  • User347430248 posted

    Hi  elshorbagy,

    You can try to refer procedure below may help you to get desired result.

    CREATE PROCEDURE PhonicSearch 
    @TableName varchar (80)  -- table name
    , @ColumnList varchar(800) -- list of comma seperated column list to return
    , @MatchColumn varchar(80) -- compare column with
    , @SearchFor varchar(150) -- search string to compare with @MatchColumn
    , @MaxSuggest varchar(3) -- in case no exact match found the maximum number 
                             -- of suggested value in @MatchColumn column<script></script> 
    
    AS   
    
    Declare @strSQL varchar(1500)
    
    -- Build and execute SQL query to find exact match 
    Set @strSQL = 'SELECT ' + @ColumnList + ' FROM ' + @TableName + 
                  ' WHERE ' + @MatchColumn + ' = ''' + @SearchFor + ''''
    --Print @strSQL
    exec (@strSQL)
    
    -- If no exact match found build query with phonic search 
    IF @@ROWCOUNT<script></script> =0 
    Begin 
        -- Phonic Search query   
        Set @strSQL = 'SELECT ' + @ColumnList + ' FROM ' + @TableName + 
                      ' WHERE SOUNDEX(' + @MatchColumn + 
                      ') = SOUNDEX(''' + @SearchFor + ''')'
    
        exec(@strSQL)
        --print @strSQL
    
        -- Suggetested words for refin search 
        Set @strSQL = 'SELECT TOP ' + @MaxSuggest + ' ' + @MatchColumn + ', DIFFERENCE('+ 
                      @MatchColumn+ ', '''<script></script> + @SearchFor + 
                      ''') as Similarity FROM ' + @TableName + ' WHERE SOUNDEX(' + 
                      @MatchColumn + ') = SOUNDEX(''' + @SearchFor + 
                      ''') ORDER BY Similarity'
    
        exec(@strSQL)
        --print @strSQL
    END
    Exec PhonicSearch 'Customers', 'customerid, CompanyName, 
         ContactName, ContactTitle'<script></script> , 'ContactName', 'AANA',5 

    Reference:

    Simplified phonetic search with MS SQL Server

    Using SOUNDEX and DIFFERENCE to Standardize Data in SQL Server

    Comparing SOUNDEX and DIFFERENCE

    Regards

    Deepak

    Tuesday, March 20, 2018 6:53 AM