Searching a body of text

    Question

  • Hello,

    I'm creating a program that will display some text in a multiline text box and search through that text for an email address and display the email address in another text box.

    I can't figure out how to search the text and pull the email address out. All the information I'm sending to the multiline text box has one email address in it. Is there any way to parse the information to pull out that email address and display it in another text box? I'm not even sure if parsing is the way to go. I'm fairly new to C#. Any suggestions will help. Thank you.

    Marcie
    Monday, June 08, 2009 7:22 PM

Answers

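  • // Requires: using System; using System.Text.RegularExpressions;
    // Split on any whitespace (spaces, tabs, line breaks), not just ' ',
    // because the text comes from a multiline text box.
    string[] words = textBox1.Text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    string emailPattern = @"^(([^<>()[\]\\.,;:\s@\""]+"
          + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@"
          + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
          + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+"
          + @"[a-zA-Z]{2,}))$";
    Regex reg = new Regex(emailPattern);

    // Show the first word that is, in its entirety, an email address.
    foreach (string word in words)
    {
        if (reg.IsMatch(word))
        {
            textBoxEmail.Text = word;
            break;
        }
    }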



    Thanks,
    A.m.a.L
    .Net Goodies
    Remember to click "mark as answered" when you get a correct reply to your question
    • Marked as answer by Luxx Monday, June 08, 2009 10:32 PM
    Monday, June 08, 2009 7:41 PM
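One caveat with the word-by-word approach above: the pattern is anchored with ^ and $, so a word only matches if it is an address and nothing else. In free text an address is often wrapped in punctuation, such as <email@live.com> or a trailing comma, and such a word would be rejected. A minimal sketch of one way around that, assuming it is acceptable to trim a fixed set of wrapper characters before validating (the class name, method name, and wrapper list are hypothetical, not part of the original answer):

```csharp
using System;
using System.Text.RegularExpressions;

static class TokenEmailFinder
{
    // Punctuation that commonly surrounds an address in free text.
    // This list is an assumption for illustration; adjust it to your data.
    private static readonly char[] Wrappers = { '<', '>', '(', ')', ',', ';', ':', '.', '\'' };

    // Returns the first whitespace-separated token that the validator accepts
    // after wrapper punctuation is trimmed from both ends, or null if none does.
    public static string FindFirstEmail(string text, Regex validator)
    {
        if (string.IsNullOrEmpty(text)) return null;

        // A null separator array makes Split break on any whitespace.
        string[] tokens = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            string candidate = token.Trim(Wrappers);
            if (candidate.Length > 0 && validator.IsMatch(candidate))
                return candidate;
        }
        return null;
    }
}
```

With the reg object from the answer above, usage would be along the lines of textBoxEmail.Text = TokenEmailFinder.FindFirstEmail(textBox1.Text, reg) ?? string.Empty;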
  • I concur, John! Amal...why are you not using the power of Regex? ;-) Here is code to get a majority of email addresses for your spam engine ( ;-) )...err, I mean for your project:

    string text = @"abc-ghi@def.com Lorem ipsum dolor sit amet, 
    consectetur bob_Denver@microsoft.com adipiscing elit. 
    Nunc et sapien ac nulla congue consectetur. 
    Pellentesque habitant%@regex.sub.tv morbi tristique senectus 
    et netus et malesuada fames ac turpis egestas. 
    Aenean pulvinar, augue at aliquam ornare, 
    justo nulla aliquam magna, non dignissim augue 
    turpis sed ipsum. Integer at consequat lectus. 
    Sed sed tempor enim. Pellentesque eu facilisis 
    lectus alli@babba.com";
    
    string pattern =@"
    (?<=^|\s)                   # Preceded by whitespace or the beginning of the text.
    (?<Email>[^\s]+@([^\s]+\.)+\w{2,3}) # Local part, @, one or more dot-terminated domain labels, then a 2-3 character suffix (longer suffixes such as info are not matched).
    (?=[\s.;?]|$)               # Followed by whitespace, a period, a semicolon, a question mark, or the end of the text.
    ";
    
    var addys = from Match m in Regex.Matches( text, pattern, RegexOptions.IgnorePatternWhitespace )
                select m.Groups["Email"].Value;
    
    foreach ( string email in addys )
        Console.WriteLine( email );
    
    /* Output:
    abc-ghi@def.com
    bob_Denver@microsoft.com
    habitant%@regex.sub.tv
    alli@babba.com
    */

    To start you on your regular expression journey, check out the .Net regular expression forum (Regular Expressions) and its informative top-level post, .Net Regex Resources Reference, which is useful for beginners and experts alike.

    Also, one can use the free tool Expresso to test and learn about regex patterns outside of one's .Net code.
    William Wegerson (www.OmegaCoder.Com)
    • Proposed as answer by JohnGrove Monday, June 08, 2009 10:15 PM
    • Marked as answer by Luxx Monday, June 08, 2009 10:34 PM
    Monday, June 08, 2009 8:54 PM
    Moderator

All replies

  • Is your email separated from the rest of the text by some symbol? Is it on a separate line in the text box? Please show some sample text from the multiline textbox.

    If your text looks like somedatamyemail@live.comsomemoredata, it isn't possible to separate the email address from it.

    Ganesh Ranganathan
    [Please mark the post as answer if you find it helpful]
    Monday, June 08, 2009 7:35 PM
  • a sample would be:

    Your message did not reach some or all of the intended recipients.

          Subject: Message From xxxxxxxx. Registration confirmation on account xxxxxxxx
          Sent: 6/6/2009 9:01 AM

    The following recipient(s) cannot be reached:

          email@live.com on 6/8/2009 11:24 AM
                Could not deliver the message in the time limit specified.  Please retry or contact your administrator.
                <xxxxxxxxxxxxxxxxxxx>

    But sometimes the email address is in a different location, not the same one every time. Would I still be able to pull it out?

    Monday, June 08, 2009 8:16 PM
  • Why not just read the whole box, use the Multiline option, and get a MatchCollection, rather than creating a string array of words from the textBox? I guess either way would be fine.
    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
    Monday, June 08, 2009 8:24 PM
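The whole-text idea suggested here could be sketched roughly as follows. This is an illustration under stated assumptions, not the code anyone in the thread posted: the class and method names are hypothetical, and the pattern is a deliberately simplified one rather than a full address validator. Because the pattern has no ^ or $ anchors, it finds an address wherever it sits in the text, so no Multiline option is needed for this variant.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class WholeTextEmailFinder
{
    // Simplified, unanchored pattern (an assumption for illustration; it is
    // not a full RFC 5322 validator): local part, '@', dot-separated labels,
    // and an alphabetic top-level domain of two or more letters.
    private static readonly Regex EmailRegex = new Regex(
        @"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}");

    // Scans the whole text once and returns every address found,
    // in order of appearance. Returns an empty list if there is none.
    public static List<string> FindEmails(string text)
    {
        var found = new List<string>();
        MatchCollection matches = EmailRegex.Matches(text ?? string.Empty);
        foreach (Match m in matches)
            found.Add(m.Value);
        return found;
    }
}
```

Since each message is said to contain one address, the text box could then be filled with something like: var emails = WholeTextEmailFinder.FindEmails(textBox1.Text); textBoxEmail.Text = emails.Count > 0 ? emails[0] : string.Empty;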