Regex for Mailto links
-
Monday, February 06, 2012 2:46 PM
Hi,
I'm trying to create a regex to match email addresses in HTML pages.
I want to match two types of mailto link:
<a href="mailto:test@test.com">test@test.com</a>
and
<a href="mailto:test@test.com?subject=123">test@test.com</a>
I've used this regex but it didn't work:
Match match = Regex.Match("<a href=\"mailto:test@test.com?subject=1234", "mailto:(?<Email>.+)\\?|\"");
I think it has something to do with greedy matching...
Another question:
This is my string.
string x = "helloooooooooooooooooooooooooooooooooo";
How can I match only the first o?
I've tried this but it didn't work for me:
Match match = Regex.Match("helloooooooooooooooooooooooo","hell(?<GroupName>)o");
I want match.Groups["GroupName"].Value to return only the first o.
How do I do it?
All Replies
-
Monday, February 06, 2012 6:58 PM
Hello BRegex,
Below are the answers to your questions:
using System;
using System.Text.RegularExpressions;

namespace c64e13cb_0e36_482f_bc23_85d68bd3583f
{
    internal class Program
    {
        private static void Main()
        {
            // Email question
            const string emailPattern = @"([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})";

            // Two alternatives: the address is followed either by '?' (a subject follows)
            // or by the closing quote of the href attribute.
            const string pattern =
                "(" +
                @"(?<withsubject>" + @"(?<=mailto\:)" + emailPattern + @"(?=\?)" + ")" +
                "|" +
                @"(?<withoutsubject>" + @"(?<=mailto\:)" + emailPattern + @"(?="")" + ")" +
                ")";

            const string s =
                @"<a href=""mailto:test@test.com"">test@test.com</a>" +
                @"<a href=""mailto:test@test.com?subject=123"">test@test.com</a>";

            var r = new Regex(pattern);
            MatchCollection mc = r.Matches(s);
            foreach (Match m in mc)
            {
                string withsubject = m.Groups["withsubject"].Value;
                string withoutsubject = m.Groups["withoutsubject"].Value;
                if (!string.IsNullOrEmpty(withsubject))
                    Console.WriteLine("Email with subject: {0}", withsubject);
                if (!string.IsNullOrEmpty(withoutsubject))
                    Console.WriteLine("Email without subject: {0}", withoutsubject);
            }

            // The o question: the o has to be inside the named group.
            // In the original attempt the group was empty, so it captured nothing.
            Match match = Regex.Match("helloooooooooooooooooooooooo", "hell(?<GroupName>o)");
            // Print the group, not match.Value (which would be "hello").
            Console.WriteLine("The o question: {0}", match.Groups["GroupName"].Value);

            Console.ReadKey();
        }
    }
}
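As a side note on why the original attempt returned nothing useful: `|` has the lowest precedence in a regex, so `mailto:(?<Email>.+)\?|"` is read as "either `mailto:(?<Email>.+)\?` or a lone `"`". The quote in `href="` comes first in the input, so the second alternative wins and the `Email` group stays empty. If only the address is needed and no validation is required, a shorter alternative to the pattern above might be a negated character class that stops at the first `?` or `"`. This is only a sketch under that assumption; `MailtoSketch` and `ExtractEmails` are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MailtoSketch
{
    // Capture everything after "mailto:" up to (not including) the first '?' or '"'.
    // No validation of the address is attempted here.
    private static readonly Regex MailtoRegex =
        new Regex(@"mailto:(?<Email>[^?""]+)", RegexOptions.IgnoreCase);

    public static List<string> ExtractEmails(string html)
    {
        var result = new List<string>();
        foreach (Match m in MailtoRegex.Matches(html))
            result.Add(m.Groups["Email"].Value);
        return result;
    }

    public static void Main()
    {
        const string input = "<a href=\"mailto:test@test.com?subject=1234";

        // The original pattern: the lone '"' alternative matches first.
        Match original = Regex.Match(input, "mailto:(?<Email>.+)\\?|\"");
        Console.WriteLine("Original match: [{0}], Email group: [{1}]",
            original.Value, original.Groups["Email"].Value);
        // prints: Original match: ["], Email group: []

        const string html =
            "<a href=\"mailto:test@test.com\">test@test.com</a>" +
            "<a href=\"mailto:test@test.com?subject=123\">test@test.com</a>";

        foreach (string email in ExtractEmails(html))
            Console.WriteLine(email);
        // prints test@test.com twice
    }
}
```

Because the character class cannot cross a `?` or a `"`, greediness is no longer an issue and both link types are handled by the same pattern.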
Kind regards,
- Edited by Link.fr Monday, February 06, 2012 6:59 PM Minor
- Marked As Answer by Paul Zhou (Moderator) Thursday, February 16, 2012 8:22 AM
-
Monday, February 06, 2012 7:11 PM
On Mon, 6 Feb 2012 14:46:36 +0000, BRegex wrote:
> Hi,
> I'm trying to create a regex to match me email address from html pages.
> I want to match 2 types of email:
> <a href="mailto:test@test.com">test@test.com</a>
> and
> <a href="mailto:test@test.com?subject=123">test@test.com</a>
>
> I've used this regex but it didn't work:
> Match match = Regex.Match("<a href=\"mailto:test@test.com?subject=1234", "mailto:(?<Email>.+)\\?|\"");
>
> I think it has something with greedy...

It is always helpful if you lay out exactly what you want your regex to return (see the sticky about How to Ask A Regex Question).

Making an assumption that what you wish to return is:
test@test.com
test@test.com?subject=123
then try this regex:
"(?<=mailto:)[^\"]+(?=\">)"
(and I'm not sure if the lookahead at the end is really necessary).

> Another question:
> This is my string.
> string x = "helloooooooooooooooooooooooooooooooooo";
>
> How can i match only the first o?
> I've tried this but it didn't work for me:
> Match match = Regex.Match("helloooooooooooooooooooooooo","hell(?<GroupName>)o");
>
> I want that the math.Groups["GroupName"].Value will return me only the first o
> how do i do it?

\A(?:.(?<!o))*(?<First_o>o)
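For anyone who wants to try the two patterns above, here is a minimal sketch of how they might be wired into C#. The class and method names (`ReplyPatternsSketch`, `MailtoTargets`, `FirstO`) are hypothetical and only there for illustration; the patterns themselves are the ones given in this reply.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReplyPatternsSketch
{
    // Everything between "mailto:" and the closing "> of the tag,
    // so a ?subject=... part is included when present.
    public static List<string> MailtoTargets(string html)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(html, "(?<=mailto:)[^\"]+(?=\">)"))
            result.Add(m.Value);
        return result;
    }

    // From the start of the string, skip characters that are not 'o',
    // then capture the first 'o' in the group First_o.
    public static Group FirstO(string s)
    {
        Match m = Regex.Match(s, @"\A(?:.(?<!o))*(?<First_o>o)");
        return m.Groups["First_o"];
    }

    public static void Main()
    {
        const string html =
            "<a href=\"mailto:test@test.com\">test@test.com</a>" +
            "<a href=\"mailto:test@test.com?subject=123\">test@test.com</a>";

        foreach (string target in MailtoTargets(html))
            Console.WriteLine(target);
        // prints:
        // test@test.com
        // test@test.com?subject=123

        Group g = FirstO("helloooooooooooooooooooooooo");
        Console.WriteLine("{0} at index {1}", g.Value, g.Index);
        // prints: o at index 4
    }
}
```

Note that `.` does not match a newline by default, so the second pattern only finds the first o if it is on the first line of the input, unless `RegexOptions.Singleline` is used.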
Ron
- Marked As Answer by Paul Zhou (Moderator) Thursday, February 16, 2012 8:22 AM

