locked
sql query to find most repeated word in each row RRS feed

  • Question

  • User372584113 posted

    I have a table in SQL-server called client below like :

    prediction     |         text

    0                 this is a sample sentence for sample sentence  

    1                this is a simple sentence this is a simple sentence and important sentence

    now i need a query that make a column to above table that find 2 words each row and show in third column below like :

    prediction     |         text                                                              |         top terms

    0                 this is a sample sentence for sample sentence                            |    simple, sentence

    1                this is a simple sentence this is a simple sentence and important sentence|    important, sentence       

    please help me to solve it

    Tuesday, January 19, 2021 2:47 PM

All replies

  • User-1545767719 posted

    Do you want to do it only by using T-SQL? If not, please write more details of your requirements especially in relation to ASP.NET.

    Wednesday, January 20, 2021 2:37 AM
  • User-939850651 posted

    Hi sep.merchantapp,

    According to your description, you could try to define a customer function and call it.

    I created a simple demo, you could refer to it:

    create function twoWords (@string varchar(max))
    returns table
    as
    return 
    WITH splitString(val) AS       
    (
        -- convert the string to xml, seperating the elements by spaces
        SELECT    CAST('<r><i>' + REPLACE(@string,' ','</i><i>') + '</i></r>' AS XML)
    )
    SELECT top 2 Word, CAST(COUNT(*) AS VARCHAR) AS  Number
    FROM    (   -- select all of the values from the xml created in the cte
    			SELECT  p.value('.','varchar(100)') AS Word
    			FROM    splitString
    					CROSS APPLY val.nodes('//i') t (p)) AS t
    GROUP BY Word
    ORDER BY Number desc
    -----------------------------------------------------------------------------------
    create table client(
    prediction int,
    text varchar(max)
    )
    
    insert into client values (0,'this is a sample sentence for sample sentence'),
    (1,'this is a important sentence this is a important sentence and important sentence')
    
    select prediction,text,(select STRING_AGG(Word, ', ') result from twoWords(text)) top_terms from client

    Result:

    Hope this can help you.

    Best regards,

    Xudong Peng

    Wednesday, January 20, 2021 3:02 AM
  • User372584113 posted

    thanks for your reply, the result is as i as expected but i can't use CROSS APPLY, please show me a sql query without CROSS APPLY

    Thanks

    Wednesday, January 20, 2021 5:09 AM
  • User372584113 posted

    thanks for your reply, yes I want to make it only by using T-SQL but with out Cross Apply 

    Wednesday, January 20, 2021 5:27 AM
  • User-939850651 posted

    Hi sep.merchantapp,

    Well, I modified my query statement, and it seems to be easily implemented.

    Something like this:

    create table client(
    prediction int,
    text varchar(max)
    )
    
    insert into client values (0,'this is a sample sentence for sample sentence'),
    (1,'this is a important sentence this is important sentence and important sentence')
    
    select  prediction, text, (select STRING_AGG(value, ', ') result from (
    			select top 2 value,count(*) number from STRING_SPLIT(text,' ') group by value
    			order by number desc
    	) t ) top_terms
    from client

    Result:

    Hope this can help you.

    Best regards,

    Xudong Peng

    Wednesday, January 20, 2021 10:05 AM
  • User372584113 posted

    thanks for your reply, Yes, that's the output I want , but Can you rewrite your query without STRING_AGG and STRING_SPLIT because I use from a third party and it only support simple sql syntax

    Thanks

    Wednesday, January 20, 2021 10:55 AM
  • User-939850651 posted

    Hi sep.merchantapp,

    It seems that your system does not support the built-in functions I have used.

    So could you provide more details, mainly including:

    • What is the third-party (sql server ?) version
    • What sql syntax are supported

    Best regards,

    Xudong Peng

    Thursday, January 21, 2021 1:55 AM
  • User372584113 posted

    thanks for yor reply, the sql version is 2008 and support basic syntax such as select, join, where, having, group by,.. it does not support function such as string_agg, string_split, cross apply,...

    Thursday, January 21, 2021 5:27 AM
  • User-939850651 posted

    Hi sep.merchantapp,

    Well, In your case, I think you have to define a customer function to split the column value again.

    This is my test:

    -- split string to words table
    create function f_split(@SourceSql varchar(8000),@StrSeprate varchar(10))
    returns @temp table(words varchar(100))
    as
    begin
    declare @i int
    set @SourceSql=rtrim(ltrim(@SourceSql))
    set @i=charindex(@StrSeprate,@SourceSql)
    while @i>=1
    begin
    insert @temp values(left(@SourceSql,@i-1))
    set @SourceSql=substring(@SourceSql,@i+1,len(@SourceSql)-@i)
    set @i=charindex(@StrSeprate,@SourceSql)
    end
    if @SourceSql<>''
    insert @temp values(@SourceSql)
    return
    end
    
    
    create table client(
    prediction int,
    text varchar(max)
    )
    
    insert into client values (0,'this is a sample sentence for sample sentence'),
    (1,'this is a important sentence this is important sentence and important sentence')
    
    select  prediction, text,(SELECT STUFF((SELECT ', ' + words as [text()] FROM (select top 2 words,count(*) number from f_split(text,' ') group by words
    			order by number desc ) t FOR XML PATH('')),1,1,'')) AS top_terms
    from client

    Result:

    Best regards,

    Xudong Peng

    Thursday, January 21, 2021 8:26 AM