Find how many times a word repeats in a string


  • Friday, April 13, 2012 9:54 PM
     

    I have a text file like this

    TOM*SAID*THAT~
    TOM*SAID*THAT~

    I would like to count how many times TOM occurs. TOM should always be at the beginning of a line and will be followed by *.

    I am using the following code and I can get the number of occurrences. However, the matches collection shows the matched substring as ~TO; the M is missing.

    var matches = Regex.Matches(content, "~TOM*");
    

    Is there anything missing?

All Replies

  • Friday, April 13, 2012 10:23 PM
     

    You said you want to look for the word "TOM" and not "~TOM". There are no occurrences of "~TOM" in your example, as far as I can see. So for the pattern, use "TOM" only.

    Instead of using Regex, we can use LINQ to count the occurrences of an item:

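    // Requires: using System; using System.IO; using System.Linq;
    string filePath = @"C:\myfile.txt";
    string text = File.ReadAllText(filePath);
    // Split on the field and segment delimiters, then count the tokens equal to "TOM".
    int Tom_Repeats = text.Split(new[] { '*', '~', '\r', '\n', ' ' }, StringSplitOptions.RemoveEmptyEntries).Count(w => w == "TOM");
    MessageBox.Show(string.Format("In a file {0} a word TOM repeats {1} times.", filePath, Tom_Repeats));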

    Below is how to use Regex too.


    Mitja



  • Friday, April 13, 2012 10:26 PM
     
     
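    Part of your issue probably has to do with the fact that * is a reserved character in regular expressions (a quantifier meaning "zero or more of the preceding element"). To match a literal asterisk, you need to escape it with a \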
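
    To illustrate the escaping point, here is a minimal sketch. The class name EscapeDemo and the sample string are made up for illustration; Regex.Escape is the standard .NET helper for escaping metacharacters in a literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical sample input in the format from the question.
    public const string Sample = "TOM*SAID*THAT~";

    // Unescaped: '*' is a quantifier applied to 'M', so the literal asterisk is never matched.
    public static string Unescaped() => Regex.Match(Sample, "TOM*").Value;

    // Escaped with a backslash (the verbatim string keeps the backslash intact).
    public static string Escaped() => Regex.Match(Sample, @"TOM\*").Value;

    // Regex.Escape builds the escaped pattern from a literal string.
    public static string ViaEscape() => Regex.Match(Sample, Regex.Escape("TOM*")).Value;

    public static void Main()
    {
        Console.WriteLine(Unescaped()); // TOM
        Console.WriteLine(Escaped());   // TOM*
        Console.WriteLine(ViaEscape()); // TOM*
    }
}
```

    Putting the asterisk in a character class, as in TOM[*], is another way to match it literally.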


  • Friday, April 13, 2012 10:28 PM
     

    or using Regex:

    string filePath = @"C:\myfile.txt";
    string text = File.ReadAllText(filePath);
    MatchCollection matches = Regex.Matches(text, "TOM");
    
    MessageBox.Show(string.Format("In a file {0} a word TOM repeats {1} times.", filePath, matches.Count));
    


    Mitja

  • Saturday, April 14, 2012 6:28 AM
     
     

    This code may help you.

    var matches = Regex.Matches(content, "[\r\n]TOM[*]"); // TOM* right after a line break (misses a TOM* at the very start of the text)

    // Or, to match TOM* anywhere:
    var matches = Regex.Matches(content, "TOM[*]");

     
  • Saturday, April 14, 2012 10:33 AM
     
     

    Try this too:

    int count = Regex.Matches(text, @"^TOM[*]", RegexOptions.Multiline).Count;
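
    As a rough sketch of how the line-anchored approach behaves, assuming the file content is already in a string: the class LineStartCounter and the sample text are hypothetical, and only Regex.Matches, Regex.Escape and RegexOptions.Multiline come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineStartCounter
{
    // Counts lines that begin with the given word immediately followed by '*'.
    public static int Count(string text, string word)
    {
        // ^ matches at the start of every line only when RegexOptions.Multiline is set.
        string pattern = "^" + Regex.Escape(word) + @"\*";
        return Regex.Matches(text, pattern, RegexOptions.Multiline).Count;
    }

    public static void Main()
    {
        // Hypothetical sample: two lines start with TOM*, the third only contains it.
        string text = "TOM*SAID*THAT~\r\nTOM*SAID*THAT~\r\nBOB*MET*TOM*TODAY~";
        Console.WriteLine(Count(text, "TOM")); // 2
    }
}
```

    Without RegexOptions.Multiline, ^ matches only at the very start of the input, so at most one occurrence would be counted.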

  • Monday, April 16, 2012 8:50 AM
    Moderator
     
     

    Hi TravelMan,

      I suggest you read this thread; its topic is the same as yours.

      word count

    http://social.msdn.microsoft.com/Forums/en-US/csharpgeneral/thread/a4de4e46-329e-4ad5-b8b2-b22f0a58b7bf 

    Sincerely,

    Jason Wang


    Jason Wang [MSFT]
    MSDN Community Support | Feedback to us

  • Monday, April 16, 2012 3:52 PM
     
     Answered

    Thanks, everyone, for the responses. Maybe I didn't describe my case clearly. I don't see any reply that meets my requirement.

    The character ~ marks the end of a segment. It may be followed by multiple line breaks, or by a line break plus multiple spaces. I need to cover all of these cases when counting how many TOM strings in this specific format occur.

    It appears to cover all cases if I change the code to the following:

    var matches = Regex.Matches(content, @"~\s*TOM\*");

    It will cover the following inputs:
    TOM*......~    
    TOM*~<multiple line breaks>
    TOM*~TOM*~
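
    A quick check of this pattern against inputs like those above might look as follows. Note that ~\s*TOM\* requires a preceding ~, so a TOM* segment at the very start of the content is not counted. The (?:^|~) alternation in the second method is an assumed variation for the case where the first segment should count too; it is not part of the original answer, and the class name SegmentCounter is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SegmentCounter
{
    // Pattern from the post: a '~' terminator, optional whitespace, then TOM*.
    public static int CountAfterTerminator(string content) =>
        Regex.Matches(content, @"~\s*TOM\*").Count;

    // Assumed variation: also accept TOM* at the very start of the content.
    public static int CountIncludingFirst(string content) =>
        Regex.Matches(content, @"(?:^|~)\s*TOM\*").Count;

    public static void Main()
    {
        // Hypothetical sample: line breaks plus spaces after one '~', nothing after another.
        string content = "TOM*SAID*THAT~\r\n\r\n   TOM*SAID*THAT~TOM*~";
        Console.WriteLine(CountAfterTerminator(content)); // 2
        Console.WriteLine(CountIncludingFirst(content));  // 3
    }
}
```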

    Again, thanks for everyone's input.

    • Marked As Answer by TravelMan Monday, April 16, 2012 3:52 PM