Finding a sentence in a text?

  • Friday, February 24, 2012 1:46 AM
     
     

    Hi guys,

    I'm trying to find the sentences in a text. This code partly does what I want, but I can't get all the sentences in the text. Please see the code for an explanation.

    I need to do something with the i and j counters, but couldn't figure out what.

    Thanks.

    [code]

    public SentenceList(Text text)
    {
        string wordDelimeters = "!?.";
        int found = 0;
        int i = 0, j = 0;
        for (i = 0; i < text.Tokens.Count; i++)
        {
            found = text.Tokens[i].IndexOfAny(wordDelimeters.ToCharArray());
            if (found != -1)
            {
                for (j = 0; j < i; j++)
                {
                    token = text.Tokens[j] + token;
                }
                // I want j to continue from the index where it last stopped (j = i),
                // but for (j = 0; ...) resets j to 0 every time and messes up the result.

                Console.WriteLine(token);
            }
        }
    }

    [/code]

All Replies

  • Friday, February 24, 2012 2:41 AM
    Moderator
     

    Are you just trying to split the string into sentences?  If so, you can probably do this via an iterator a bit more easily than your code above:

    public static IEnumerable<string> MakeSentenceList(string text)
    {
        char[] wordDelimeters = new[] { '!', '?', '.' };
        int wordStart = 0;
    
        while (true)
        {
            while (wordStart < text.Length && char.IsWhiteSpace(text[wordStart]))
                ++wordStart;
                    
            int pos = text.IndexOfAny(wordDelimeters, wordStart);
            if (pos > -1)
            {
                yield return text.Substring(wordStart, pos - wordStart + 1);
                wordStart = pos + 1;
            }
            else if (wordStart < text.Length)
            {
                yield return text.Substring(wordStart, text.Length - wordStart);
                break;
            }
            else break;
        }
    }
    

    Using this, you can write:

                var sentences = MakeSentenceList("This is a test.  This is only a test.  This is just to see what happens. Foo");
                foreach (var sentence in sentences)
                {
                    Console.WriteLine(sentence);
                }
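
For the token-based loop in the question, the same idea can be applied directly: keep a start index that survives between sentences instead of resetting j to 0 each time. The following is only a minimal sketch. The `Text` class is not shown in the question, so `FromTokens` and its `tokens` parameter are hypothetical stand-ins for `text.Tokens`, assumed to be a list of words with their punctuation still attached.

```csharp
using System;
using System.Collections.Generic;

public static class TokenSentences
{
    // Hypothetical helper: 'tokens' stands in for text.Tokens from the question.
    public static List<string> FromTokens(IList<string> tokens)
    {
        char[] delimiters = { '!', '?', '.' };
        var sentences = new List<string>();
        int start = 0; // index of the first token of the current sentence

        for (int i = 0; i < tokens.Count; i++)
        {
            if (tokens[i].IndexOfAny(delimiters) != -1)
            {
                // Collect tokens start..i (inclusive) in their original order.
                var words = new List<string>();
                for (int j = start; j <= i; j++)
                    words.Add(tokens[j]);
                sentences.Add(string.Join(" ", words));

                start = i + 1; // the next sentence begins after this token
            }
        }
        return sentences;
    }
}
```

In this sketch, any trailing tokens that are not followed by a delimiter are simply left out; whether that is the right behavior depends on what the rest of the program expects.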
    


    Reed Copsey, Jr. - http://reedcopsey.com
    If a post answers your question, please click "Mark As Answer" on that post and "Mark as Helpful".

    • Marked As Answer by Lisyus35 Friday, February 24, 2012 3:36 AM
  • Friday, February 24, 2012 3:36 AM
     
     
    Thank you very much. I didn't want to display the punctuation marks, but it still works. Thanks.
  • Friday, February 24, 2012 5:00 PM
    Moderator
     

    If you want to leave the punctuation marks off, just change the first yield line to:

         yield return text.Substring(wordStart, pos - wordStart);

    By removing the +1, you won't show the separator character.
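
To make the effect of the +1 easier to see, here is a sketch of the same loop as MakeSentenceList with the choice exposed as a flag. The class name `SentenceSplitter`, the method name `Split`, and the `includeDelimiter` parameter are all hypothetical names added for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

public static class SentenceSplitter
{
    // 'includeDelimiter' is an added, hypothetical parameter that toggles the +1.
    public static IEnumerable<string> Split(string text, bool includeDelimiter)
    {
        char[] delimiters = { '!', '?', '.' };
        int start = 0;

        while (true)
        {
            // Skip whitespace between sentences.
            while (start < text.Length && char.IsWhiteSpace(text[start]))
                ++start;

            int pos = text.IndexOfAny(delimiters, start);
            if (pos > -1)
            {
                // With the +1 the delimiter at 'pos' is kept; without it, it is dropped.
                int length = pos - start + (includeDelimiter ? 1 : 0);
                yield return text.Substring(start, length);
                start = pos + 1;
            }
            else if (start < text.Length)
            {
                // Remaining text with no closing delimiter.
                yield return text.Substring(start);
                break;
            }
            else break;
        }
    }
}
```

One thing to be aware of: when the delimiter is dropped, consecutive delimiters such as "..." produce empty strings, so you may want to skip results whose length is zero.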


    Reed Copsey, Jr. - http://reedcopsey.com

    • Marked As Answer by Lisyus35 Saturday, February 25, 2012 10:34 PM