none
How to count occurrences of two or more characters in a string

    Question

  • Hello all, I recently posted this in the SSRS forum and Parry2k helpfully suggested I tried here also:

    -------------------------------

    I know how to count occurences of a single character in SQL, using:

    LEN(<field>) - LEN(REPLACE(<field>,"N",""))

     

    But I wondering how to manipulate this to count occurrences of multiple characters in a string - i.e.

    mystring 'TTNFFNQQQTQRES'

    count of Q,S,F,T = 9.


    I know how to do this in VBA but not using SQL (at least without repeating the the Len statement 4 times which I assume is massively inefficient). Could anyone please offer any pointers? I thought it might be possible with a case statement??

     

    Many thanks :)

     

    PS using SSRS 2005. 

    Friday, March 26, 2010 7:25 PM

Answers

  • Ok, I wrote it just in case.
    declare @String varchar(100) = 'AAABABCDEDE'
    
    declare @Cnt int, @Pos int
    
    set @Cnt = 0
    set @Pos = PATINDEX('%[ADE]%', @String)
    
    --print @Pos
    
    while @Pos > 0
      begin
      set @Cnt = @Cnt + 1
      set @String = SUBSTRING(@String, @Pos + 1, 100)
      set @Pos = PATINDEX('%[ADE]%', @String)
      --print @pos
      --print @String
      end
    select @Cnt

    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Marked as answer by MrZeroPercent Saturday, March 27, 2010 12:14 PM
    Friday, March 26, 2010 8:19 PM
    Moderator
  • In the string 'AAABABCDEDE' I'd like to count the occurences of A,D and E to which there is no pattern.

     The result to this question would be 8, with there being 4 As, 2 Ds and 2 Es.


    In addition to Naomi's solution consider the following "loopless" solution.

    More or less loopless... a number of the T-SQL operators and functions have "built-in loops".

    -- The occurrence of A D & E
    DECLARE @String varchar(32)='AAABABCDEDE' 
    SELECT Occuring = LEN(@String)-
    				  LEN (REPLACE(REPLACE(REPLACE(
    					@String,
    					'A',''),
    					'D',''),
    					'E',''))
    /*
    Occuring
    8
    */
    

    Kalman Toth SQL 2008 GRAND SLAM
    Paperback/Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012


    Friday, March 26, 2010 8:51 PM
    Moderator
  • DECLARE	@String VARCHAR(100) = 'AAABABCDEDE'
    
    SELECT		SUBSTRING(@String, Number, 1) AS Character,
    		COUNT(*) AS Items,
    		SUM(COUNT(*)) OVER () AS Total
    FROM		master..spt_values
    WHERE		Number BETWEEN 1 AND LEN(@String)
    		AND Type = 'P'
    		AND SUBSTRING(@String, Number, 1) IN ('A', 'D', 'E')
    GROUP BY	SUBSTRING(@String, Number, 1)
    Friday, March 26, 2010 9:23 PM
  • That's flattering - if you'd knew how many times I wished to be somebody else and know at least half of what Plamen knows or Ted Kruger or other people I see here or in other places...
    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Marked as answer by MrZeroPercent Saturday, March 27, 2010 12:09 PM
    Friday, March 26, 2010 8:33 PM
    Moderator
  • Hi SQLUSA, 

    Great thanks! Another neat solution! You guys are brilliant. 

     Do you have any comment on the merits of each approach??


    My solution is very, very fast when compared to the other solutions offered. Generally a single statement nested string functions expression is extremely fast.



    Kalman Toth SQL 2008 GRAND SLAM
    Paperback/Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012


    Saturday, March 27, 2010 8:22 AM
    Moderator

All replies

  • You just do the same and divide by the length of the replacement string.

    See here http://wiki.lessthandot.com/index.php/Find_Out_How_Many_Occurrences_Of_A_Substring_Are_In_A_String


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog

    Friday, March 26, 2010 7:27 PM
    Moderator
  • Hey Naom,

     

    thanks for your reply. Yes, this is the approach that Parry2k helpfully recommended as well. I guess I just figured there'd be a more elegant, efficient method to use when counting several+ characters in the same string, using a case statement or loop perhaps?

     

    Thanks! :)

    Friday, March 26, 2010 7:43 PM
  • Do you want to count for different patterns in the string, e.g.

    This is a test string testing

    We want to find occurrences of 'test' and 'string' together?

    Otherwise, if, say, you only want to find occurrences of 'test' (2), then this approach is efficient enough and I don't think there is something better.


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Friday, March 26, 2010 7:50 PM
    Moderator
  • Hey Naom, 

     

    In the string 'AAABABCDEDE' I'd like to count the occurences of A,D and E to which there is no pattern.

     

    The result to this question would be 8, with there being 4 As, 2 Ds and 2 Es.

     

    I have no doubt you're correct in your approach, I think that with my background being slightly more rooted in programming I figured there would be a way to do this in a loop or through some variable manipulation, but i'm pretty new to SQL so bow to your superior knowledge!

     

    Many thanks! 

    Friday, March 26, 2010 8:02 PM
  • Yes, for this case we may need to use a loop, you're right. The approach from the above is just to find a single pattern.

    So, in this case we would need to use patindex in a loop until it returns 0. Do you want me to write the code or you'll figure it out?


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Friday, March 26, 2010 8:10 PM
    Moderator
  • Hey Naom,

     

    I've never used a loop in SQL so i'd certainly be grateful for any pointers you'd be prepared to give me! Only if you have time, however. 

     

    Many, many thanks. :D

    Friday, March 26, 2010 8:15 PM
  • Ok, I wrote it just in case.
    declare @String varchar(100) = 'AAABABCDEDE'
    
    declare @Cnt int, @Pos int
    
    set @Cnt = 0
    set @Pos = PATINDEX('%[ADE]%', @String)
    
    --print @Pos
    
    while @Pos > 0
      begin
      set @Cnt = @Cnt + 1
      set @String = SUBSTRING(@String, @Pos + 1, 100)
      set @Pos = PATINDEX('%[ADE]%', @String)
      --print @pos
      --print @String
      end
    select @Cnt

    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Marked as answer by MrZeroPercent Saturday, March 27, 2010 12:14 PM
    Friday, March 26, 2010 8:19 PM
    Moderator
  • Great thanks! Can't wait to give it a go!

     

    I wish I was as great as you!

     

     

    Friday, March 26, 2010 8:24 PM
  • That's flattering - if you'd knew how many times I wished to be somebody else and know at least half of what Plamen knows or Ted Kruger or other people I see here or in other places...
    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Marked as answer by MrZeroPercent Saturday, March 27, 2010 12:09 PM
    Friday, March 26, 2010 8:33 PM
    Moderator
  • I'm sure it won't be long till you're there (if you're not already!)
    Friday, March 26, 2010 8:40 PM
  • In the string 'AAABABCDEDE' I'd like to count the occurences of A,D and E to which there is no pattern.

     The result to this question would be 8, with there being 4 As, 2 Ds and 2 Es.


    In addition to Naomi's solution consider the following "loopless" solution.

    More or less loopless... a number of the T-SQL operators and functions have "built-in loops".

    -- The occurrence of A D & E
    DECLARE @String varchar(32)='AAABABCDEDE' 
    SELECT Occuring = LEN(@String)-
    				  LEN (REPLACE(REPLACE(REPLACE(
    					@String,
    					'A',''),
    					'D',''),
    					'E',''))
    /*
    Occuring
    8
    */
    

    Kalman Toth SQL 2008 GRAND SLAM
    Paperback/Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012


    Friday, March 26, 2010 8:51 PM
    Moderator
  • Hi SQLUSA, 

    Great thanks! Another neat solution! You guys are brilliant. 

     

    Do you have any comment on the merits of each approach??

    Friday, March 26, 2010 8:58 PM
  • DECLARE	@String VARCHAR(100) = 'AAABABCDEDE'
    
    SELECT		SUBSTRING(@String, Number, 1) AS Character,
    		COUNT(*) AS Items,
    		SUM(COUNT(*)) OVER () AS Total
    FROM		master..spt_values
    WHERE		Number BETWEEN 1 AND LEN(@String)
    		AND Type = 'P'
    		AND SUBSTRING(@String, Number, 1) IN ('A', 'D', 'E')
    GROUP BY	SUBSTRING(@String, Number, 1)
    Friday, March 26, 2010 9:23 PM
  • Hi SQLUSA, 

    Great thanks! Another neat solution! You guys are brilliant. 

     Do you have any comment on the merits of each approach??


    My solution is very, very fast when compared to the other solutions offered. Generally a single statement nested string functions expression is extremely fast.



    Kalman Toth SQL 2008 GRAND SLAM
    Paperback/Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012


    Saturday, March 27, 2010 8:22 AM
    Moderator