Count the occurrences of a particular word/phrase in an email

  • Question

  • We are currently working on a feature in which customers ask us to provide keyword and keyword-phrase statistics for emails. Around one million emails are exchanged between users in the organization each day. For a subset of those emails, a reviewer uses our application to search by keyword or keyword phrase and review the matching emails for compliance purposes. We are investigating which tool best fits this use case. We tried the SQL full-text indexing feature, but it doesn't index the 'keyword phrase' at all. Here is some context on our requirement:


    Our requirement is to count the occurrences of a particular word/phrase in a file.

    A word/phrase can also contain a wildcard character.

    Examples of words:

    1. swim
    2. swim*


    Examples of phrases:

    1. swim* in the water
    2. swim in the water*
    3. won't do it
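
To make the intended matching concrete, here is a minimal C# sketch of one possible baseline using `System.Text.RegularExpressions`. It is an illustration only, not a tuned solution. The class and method names (`KeywordCounter`, `BuildPattern`, `Count`) are hypothetical, and the semantics are assumptions: `*` stands for zero or more word characters, words in a phrase may be separated by any run of whitespace, and matching is case-insensitive.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; names and matching rules are assumptions for illustration.
public static class KeywordCounter
{
    // Turns a keyword or phrase such as "swim* in the water" into a Regex.
    public static Regex BuildPattern(string keyword)
    {
        // Split the phrase into words, escape each literal piece,
        // and turn every '*' into "\w*" (zero or more word characters).
        var words = keyword
            .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => string.Join(@"\w*", w.Split('*').Select(p => Regex.Escape(p))));

        // Allow any run of whitespace (including line breaks) between words.
        string body = string.Join(@"\s+", words);

        // Lookarounds keep "swim" from matching inside "swimming".
        return new Regex(
            @"(?<!\w)" + body + @"(?!\w)",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
    }

    // Counts non-overlapping occurrences of the keyword or phrase in the text.
    // When scanning many emails, build the Regex once and reuse it instead.
    public static int Count(string text, string keyword)
    {
        return BuildPattern(keyword).Matches(text).Count;
    }
}
```

For example, under these assumptions `KeywordCounter.Count("She swims in the water", "swim*")` returns 1, while the plain keyword `swim` returns 0 for the same text. A straight apostrophe in `won't` is matched literally, so a typographic apostrophe in the email body would need to be normalized first.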


    Can anyone tell us the best (most performant) way to do this in C#/.NET, or in any other way?




    Tuesday, December 18, 2018 6:30 PM