locked
syntax for whole word search RRS feed

  • Question

  • hi, i have an old web app that i would like to linq-ize.  It features an on-the-fly generated sql search query to identify whole word matches.  At the time the only way i could find do this was very inelegant (example below does not include checks for sql injection, literal apostrophe etc):

    Code Snippet

    whereFilter = String.Format("[{0}] = '{1}' OR [{0}] like '{1}[^a-z]%' or [{0}] like '%[^a-z]{1}' or [{0}] like '%[^a-z]{1}[^a-z]%'", searchColumn, keyword);


    it searches for an exact match, starts with phrase, ends with phrase, or word boundary before or after the phrase.  pretty ugly.  I know with Linq (or dynamic query) i could do something like
    Code Snippet

    DataContext.Table.Where(z => z.SearchColumn.Contains(searchString));


    but that will include substring searches (due to the %search% sql implementation) which users don't want.  e.g. searching for "IE" as in the country code, not "junkIEs".

    i could probably hack out the same solution in Linq but i'm hoping to do less hacking now that we have the wonderful world of linq Smile
    thanks for any help
    tim
    Friday, March 28, 2008 11:33 AM

Answers

  • LINQ to SQL also has the class SqlMethods with the function 'Like' on it that you can use to get the actual SQL LIKE command.

     

    var q = from x in db.Table where SqlMethods.Like(x.y, "%[^a-z]123[^a-z]%") select x;

     

     

    Friday, March 28, 2008 3:48 PM

All replies

  • With Contains, IndexOf, SubString, StartsWith end EndsWith you can do all you can do with LIKE.

    For example, if you want to do

    LIKE ‘%a%z%e%r%t%y%’

    You can do this LINQ request:

    Code Snippet

    from ta in context.TAs

    let indexOfA = ta.Name.IndexOf("a")

    where indexOfA != -1

    let sA = ta.Name.Substring(indexOfA + 1)

    let indexOfZ = sA.IndexOf("z")

    where indexOfZ != -1

    let sZ = sA.Substring(indexOfZ + 1)

    let indexOfE = sZ.IndexOf("e")

    where indexOfE != -1

    let sE = sZ.Substring(indexOfE + 1)

    let indexOfR = sE.IndexOf("r")

    where indexOfR != -1

    let sR = sE.Substring(indexOfR + 1)

    let indexOfT = sR.IndexOf("t")

    where indexOfT != -1 && sR.Substring(indexOfT + 1).Contains("y")

    select ta.Name;

     

     

    But IndexOf and SubString won't be traduced by LIKE in SQL.

    For example, prev LINQ request will do this SQL one:

    Code Snippet

    exec sp_executesql N'SELECT [t9].[Test]
    FROM (
        SELECT [t8].[Test], [t8].[value], [t8].[value3], [t8].[value4], [t8].[value5], [t8].[value2],
            (CASE
                WHEN (DATALENGTH(@p8) / 2) = 0 THEN 0
                ELSE CHARINDEX(@p8, [t8].[value2]) - 1
             END) AS [value6]
        FROM (
            SELECT [t7].[Test], [t7].[value], [t7].[value3], [t7].[value4], [t7].[value5], SUBSTRING([t7].[value2], [t7].[value5] + @p7 + 1, CONVERT(Int,DATALENGTH([t7].[value2]) / 2)) AS [value2]
            FROM (
                SELECT [t6].[Test], [t6].[value], [t6].[value3], [t6].[value4], [t6].[value2],
                    (CASE
                        WHEN (DATALENGTH(@p6) / 2) = 0 THEN 0
                        ELSE CHARINDEX(@p6, [t6].[value2]) - 1
                     END) AS [value5]
                FROM (
                    SELECT [t5].[Test], [t5].[value], [t5].[value3], [t5].[value4], SUBSTRING([t5].[value2], [t5].[value4] + @p5 + 1, CONVERT(Int,DATALENGTH([t5].[value2]) / 2)) AS [value2]
                    FROM (
                        SELECT [t4].[Test], [t4].[value], [t4].[value3], [t4].[value2],
                            (CASE
                                WHEN (DATALENGTH(@p4) / 2) = 0 THEN 0
                                ELSE CHARINDEX(@p4, [t4].[value2]) - 1
                             END) AS [value4]
                        FROM (
                            SELECT [t3].[Test], [t3].[value], [t3].[value3], SUBSTRING([t3].[value2], [t3].[value3] + @p3 + 1, CONVERT(Int,DATALENGTH([t3].[value2]) / 2)) AS [value2]
                            FROM (
                                SELECT [t2].[Test], [t2].[value], [t2].[value2],
                                    (CASE
                                        WHEN (DATALENGTH(@p2) / 2) = 0 THEN 0
                                        ELSE CHARINDEX(@p2, [t2].[value2]) - 1
                                     END) AS [value3]
                                FROM (
                                    SELECT [t1].[Test], [t1].[value], SUBSTRING([t1].[Test], [t1].[value] + @p1 + 1, CONVERT(Int,DATALENGTH([t1].[Test]) / 2)) AS [value2]
                                    FROM (
                                        SELECT [t0].[Test],
                                            (CASE
                                                WHEN (DATALENGTH(@p0) / 2) = 0 THEN 0
                                                ELSE CHARINDEX(@p0, [t0].[Test]) - 1
                                             END) AS [value]
                                        FROM [dbo].[A] AS [t0]
                                        ) AS [t1]
                                    ) AS [t2]
                                ) AS [t3]
                            ) AS [t4]
                        ) AS [t5]
                    ) AS [t6]
                ) AS [t7]
            ) AS [t8]
        ) AS [t9]
    WHERE ([t9].[value6] <> @p9) AND (SUBSTRING([t9].[value2], [t9].[value6] + @p10 + 1, CONVERT(Int,DATALENGTH([t9].[value2]) / 2)) LIKE @p11) AND ([t9].[value5] <> @p12) AND ([t9].[value4] <> @p13) AND ([t9].[value3] <> @p14) AND ([t9].[value] <> @p15)',N'@p0 nvarchar(1),@p1 int,@p2 nvarchar(1),@p3 int,@p4 nvarchar(1),@p5 int,@p6 nvarchar(1),@p7 int,@p8 nvarchar(1),@p9 int,@p10 int,@p11 nvarchar(3),@p12 int,@p13 int,@p14 int,@p15 int',@p0=N'a',@p1=1,@p2=N'z',@p3=1,@p4=N'e',@p5=1,@p6=N'r',@p7=1,@p8=N't',@p9=-1,@p10=1,@p11=N'%y%',@p12=-1,@p13=-1,@p14=-1,@p15=-1

     

     

    Friday, March 28, 2008 2:04 PM
  • LINQ to SQL also has the class SqlMethods with the function 'Like' on it that you can use to get the actual SQL LIKE command.

     

    var q = from x in db.Table where SqlMethods.Like(x.y, "%[^a-z]123[^a-z]%") select x;

     

     

    Friday, March 28, 2008 3:48 PM
  • Great!

    Moreover, in VB, you can use Like as LINQ Operator but not in C#.

     

    Friday, March 28, 2008 3:50 PM
  • hey Matt. thanks a million for pointing out these methods, i never came across them before.  This will definitely help, but having spent a lot of time trying it out, my conclusion is that a Linq implementation is too complicated (for me) and perhaps not worth it, i reckon it would involve a lot of DynamicExpression.ParseLambda stuff which i haven't gotten into yet.

    the basic requirement is to do a whole word keyword search over a run-time specified list of columns. 
    my first stab looked something like this:

    Code Snippet

    var ds = db.Table.Whatver....
    if(!String.IsNullOrEmpty(keyword))
    {
      foreach (string s in searchFields)
        ds = ds.Where(SqlMethods.Like("it." + s, keyword));  // for a complete field match
    }


    the problem with the above is that it is ANDing all the where conditions when it should OR them instead, but i can't OR them while iterating the search fields at the same time. The best example i can give of what i have in mind is as follows, taking a more management scenario of 2 columns (instead of N columns), and it doesn't compile because IEnumerable.Where() doesn't like the multiple conditions i'm giving it.  this is where i gave up! 

    Code Snippet

    var ds = db.Table.Whatver....
    if (!String.IsNullOrEmpty(keyword))
    {

        // statically declare 2 columns to search for, ideally support N columns at runtime

        string col1 = "Title";
        string col2 = "Abstract";
        ds = ds.Where
            (
                    // complete field match
                    SqlMethods.Like("it." + col1, keyword) ||
                    // match at start of field (followed by word boundary)
                    SqlMethods.Like("it." + col1, keyword + "[^a-z]%") ||
                    // match at end of field (preceeded by word boundary)
                    SqlMethods.Like("it." + col1, "%[^a-z]" + keyword) ||
                    // match mid-field (surrounded by word boundaries)
                    SqlMethods.Like("it." + col1, "%[^a-z]" + keyword + "[^a-z]%")
                    ||        

                    // same for col2...
                    SqlMethods.Like("it." + col2, keyword) ||
                    SqlMethods.Like("it." + col2, keyword + "[^a-z]%") ||
                    SqlMethods.Like("it." + col2, "%[^a-z]" + keyword) ||
                    SqlMethods.Like("it." + col2, "%[^a-z]" + keyword + "[^a-z]%")
            );
    }


    just posting this on the off-chance that you had a moment to point me a step further in the right direction.  thanks again for responding earlier. 
    tim
    Friday, March 28, 2008 5:16 PM
  • hi matthew. thanks for the post.  i came across this same example with %a%z%e%r%t%y% on the internet and quickly moved on because the SQL was so scary. 
    i doubt if it is possible to achieve a 'whole word' implementation using just those Linq/string methods, and in any case i expect the resulting SQL would be so bad, over large data at least, as to be unusable compared to
    the reasonably efficient direct sql query i posted originally.  i've spent ages grappling with this and much as i would like to upgrade this search facility to use LINQ, it just may not be worth the effort.
    thanks
    tim
    Friday, March 28, 2008 5:26 PM