How to: Verify That Strings Are in Valid E-Mail Format
- This is HALF answered in the MSDN library with an example function called IsValidEmail.
Does anyone else think it's time the example function (which developers are likely to copy-paste) was changed to more closely represent what an email address can actually be?
Sure, it matches joe.blogs@example.com, or joe-blogs@example.com...
But how about joe.blogs+work@example.com, or joe.blogs+support@example.com, or e=mc^2@example.com, or many other valid email addresses?
The page http://msdn.microsoft.com/en-us/library/01escwtf.aspx has been updated for each new release of Visual Studio/.NET, but even the VS 2010/.NET 4.0 one isn't much better!
- Moved by nobugz (MVP, Moderator) Tuesday, June 02, 2009 11:41 AM: not a clr q (From: Common Language Runtime)
Answers
- I understand your frustration. I don't have a single email address, but use multiple ones based on the website I am logging into. I have a personal domain with a catch-all for all email addys w/o an account. That is how I can do something like this:
msdn_ww@MyDomain.com
NYTImes_ww@MyDomain.com
Amazon_ww@MyDomain.com
That way I know if my email addy has been sold to another company when I receive email from a non-originating source, or if a site gets hacked and I start receiving ____ spam. Has happened.
The best you can do is add a community comment to the MSDN page specifying what you go through; in the end it's up to the target web developers, who are enforcing the bad regex patterns on users out of ignorance of the situation.
Other suggestions:
- Possibly create a website specifically for this problem. If you maintain it long enough it might become the first item in the search engines' lists instead of the bad patterns.
- Put the pattern which best solves the problem on the Wikipedia page (E-mail address).
William Wegerson (www.OmegaCoder.Com)
- Marked As Answer by NastyBastard Thursday, June 04, 2009 6:38 AM
- Unmarked As Answer by NastyBastard Monday, June 08, 2009 10:17 AM
- Marked As Answer by NastyBastard Tuesday, June 09, 2009 4:27 AM
- In its current form, which is
^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$
the regular expression has a number of limitations. You've mentioned its support for only a subset of valid characters (it fails to support ! # $ % & ' * + = ? ^ ` { } | and ~), but there are a few other limitations worth noting as well:
1. The local-part of an email address can contain otherwise invalid characters if it is delimited by quotation marks. The IsValidEmail method does not recognize this convention and returns False.
2. Double dots in an email address are not allowed. The IsValidEmail method does not recognize this restriction and, if the email address is otherwise valid, returns True.
3. It does not recognize a domain that consists of an IP address. In such cases, it always returns False.
4. It does not validate the top-level domain. For example, it would recognize as valid an address like someone@validaddress.not.
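The limitations listed above are easy to check by hand. The following is only a rough sketch in JavaScript (the language of the verifyAddress snippets in this thread), assuming the pattern behaves the same there as in .NET for plain ASCII input; the helper name matchesCurrentPattern is made up for the illustration and is not part of the MSDN sample.

```javascript
// The "current form" pattern quoted above, transcribed as a JavaScript regex
// literal. Assumption: for ASCII input it behaves as it does in .NET.
const currentPattern =
  /^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$/;

// Hypothetical helper name, used only for this illustration.
function matchesCurrentPattern(address) {
  return currentPattern.test(address);
}

const samples = [
  "joe.blogs@example.com",       // accepted
  "joe.blogs+work@example.com",  // rejected: "+" is not in the allowed set
  '"joe blogs"@example.com',     // rejected: quoted local-part (limitation 1)
  "joe..blogs@example.com",      // accepted despite the double dot (limitation 2)
  "someone@[192.168.1.1]",       // rejected: IP address domain (limitation 3)
  "someone@validaddress.not",    // accepted: the TLD is not checked (limitation 4)
];

for (const address of samples) {
  console.log(address, "->", matchesCurrentPattern(address));
}
```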
Although we have revised the regular expression several times in the past, it seems in need of another update. This will be reflected in the next scheduled update of the Visual Studio 2008 documentation, as well as in the final Visual Studio 2010 documentation. A preliminary version of this regular expression is:
^(("".+?"")|([0-9a-zA-Z](((\.(?!\.))|([-!#\$%&'\*\+/=\?\^`\{\}\|~\w]))*[0-9a-zA-Z])*))@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9}$
It corrects three of the issues that I've listed above (it supports all valid characters listed in RFC 5322, recognizes a local-part that is delimited by quotation marks, and disallows double dots). It still doesn't recognize IP addresses (we'll fix this in the revised documentation), nor does it validate the top-level domain (a regex is not the appropriate tool to do this).
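As a quick sanity check of the preliminary pattern, here is a hedged JavaScript sketch. It assumes the doubled quotation marks ("") in the pattern above are the C# verbatim-string spelling of a single " character, so the transcription below uses a single quote mark; it also escapes the / inside the character class for the regex literal. The name matchesRevisedPattern is hypothetical.

```javascript
// Preliminary revised pattern from the post above, transcribed for JavaScript.
// Assumptions: "" in the original stands for one quotation mark (C# verbatim
// string escaping), and ASCII input behaves the same as in .NET.
const revisedPattern =
  /^((".+?")|([0-9a-zA-Z](((\.(?!\.))|([-!#\$%&'\*\+\/=\?\^`\{\}\|~\w]))*[0-9a-zA-Z])*))@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9}$/;

// Hypothetical helper name, used only for this illustration.
function matchesRevisedPattern(address) {
  return revisedPattern.test(address);
}

const samples = [
  "joe.blogs+work@example.com",  // accepted: "+" is now allowed
  "e=mc^2@example.com",          // accepted: "=" and "^" are now allowed
  '"joe blogs"@example.com',     // accepted: quoted local-part
  "joe..blogs@example.com",      // rejected: the lookahead blocks a double dot
  "someone@[192.168.1.1]",       // still rejected: IP address domain
];

for (const address of samples) {
  console.log(address, "->", matchesRevisedPattern(address));
}
```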
--Ron Petrusha
Developer Division User Education
Microsoft Corporation
- Marked As Answer by NastyBastard Saturday, November 21, 2009 2:59 PM
All Replies
- Ah, that brings back memories. I once saw a regular expression that filled an entire printed page with completely inscrutable regexp. The Unix Hater's Handbook on sendmail was a great joy too. It's a lively discussion; you can probably contribute if you are into that sort of stuff.
But it has very little to do with the CLR, the topic of this forum.
Hans Passant.
- // Requires: using System.Text.RegularExpressions;
public static bool isValidEmail(string inputEmail)
{
    // Accepts a dotted domain name or a loosely checked, optionally bracketed IPv4 address.
    string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" +
                      @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+" +
                      @"\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";
    return Regex.IsMatch(inputEmail, strRegex);
}
Here we go
Regards, Nikolay
- Proposed As Answer by Tao Liang Tuesday, June 02, 2009 6:52 AM
- Unproposed As Answer by OmegaMan (MVP, Moderator) Tuesday, June 02, 2009 6:32 PM
- I think you've missed the point. All you've really achieved is to allow for an IP address instead of a domain. Whilst useful, that's even less common than some of my examples.
If anyone can and cares to update the documentation, something as simple as the following would be a step in the right direction:
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
I lifted it from http://www.regular-expressions.info/email.html. They also have a more comprehensive one closer to RFC 2822, but all I'm looking for personally in this change is that amateur developers who just copy-paste the example validation code don't invalidate my joe.bloggs+support@example.com email address. Those who use Gmail probably understand my frustration, but may not realize it's also supported by other MTAs.
- ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
Doesn't work. It incorrectly reports that
joe..blogs+work@example.com
.joe.blogs+work@example.com
are valid addys. You can't have two consecutive periods or a period at the beginning.
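To make the objection concrete, here is a small JavaScript sketch, offered as an illustration only. The pattern is the one quoted above; because it is written with upper-case ranges it is assumed to be used with the case-insensitive flag. The localPartDotsOk and looksLikeEmail helpers are hypothetical additions, not something proposed in this thread, and they only make sense for unquoted local-parts.

```javascript
// Pattern from regular-expressions.info as quoted above. It uses upper-case
// ranges, so it needs the case-insensitive flag to accept lower-case input.
const simplePattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i;
const simplePatternCaseSensitive = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/;

// Hypothetical extra check (not from the thread): reject a leading, trailing,
// or doubled dot in an unquoted local-part. Expects an address containing "@".
function localPartDotsOk(address) {
  const local = address.slice(0, address.lastIndexOf("@"));
  return !local.startsWith(".") && !local.endsWith(".") && !local.includes("..");
}

// Hypothetical combination of the simple pattern with the dot check.
function looksLikeEmail(address) {
  return simplePattern.test(address) && localPartDotsOk(address);
}

console.log(simplePattern.test("joe..blogs+work@example.com")); // true: double dot slips through
console.log(simplePattern.test(".joe.blogs+work@example.com")); // true: leading dot slips through
console.log(looksLikeEmail("joe..blogs+work@example.com"));     // false
console.log(looksLikeEmail("joe.blogs+work@example.com"));      // true
```

Keeping the dot rules in a separate function leaves the permissive pattern readable, at the cost of not handling quoted local-parts.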
William Wegerson (www.OmegaCoder.Com)
- I understand this doesn't invalidate all technically invalid email addresses. But even an email address adhering perfectly to the RFC could still be invalid (e.g. the user doesn't exist, or the domain doesn't have an MX record).
I think, if you're not going to match exactly, you should err on the side of allowing MORE email addresses rather than FEWER. Because of this example (albeit indirectly) I can't sign up on many websites with my perfectly valid email address. It's an annoyance I often hear from developer peers.
Disallowing "+" or "%" is just as arbitrary as disallowing "e" or "t".
A very basic email address validation is testing for an @ character. If you don't like the semi-complicated ones, why not stick with that?
^.*@.*$
If you're anal about having a complex one that will invalidate on some principles (such as the double period "..") then use one of the more comprehensive ones found at http://www.regular-expressions.info/email.html.
- Phil Haack has a post which mirrors your sentiments: I Knew How To Validate An Email Address Until I Read The RFC
William Wegerson (www.OmegaCoder.Com)
- I took your advice and vented on http://gotgeek.co.nz/?p=33. Doing so prompted fellow developers to express equivalent pet peeves, from which I can learn so as not to make such mistakes. After enough trolls have helped me fill in all the holes and weed out the circumlocution and verbosity, I hope to end up with a helpful/educational post to which I can refer people.
- I liked this post so much, since we are getting a few email regex questions, that I made it a sticky in the regex forum for 2010.
William Wegerson (www.OmegaCoder.Com)
- I never consider such complex email address verification. I often use the following regex to verify email.
function verifyAddress(email)
{
    // The dot is escaped so it matches a literal "." rather than any character.
    var pattern = /^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+/;
    return pattern.test(email);
}
__________________________________
April
http://www.comm100.com/livechat/
Microsoft Certified Partner
- As you may have noticed from some of the above discussions, your validation rejects many popular and common addresses, such as joe.blogs@example.com (popular in business), or webmaster@localhost.
How about just:
function verifyAddress(email_address)
{
var pattern = /@/;
return pattern.test(email_address);
}
- Maybe this URL helps...
http://www.techartifact.com/blogs/2009/06/email-address-validation-in-csharp.html
Anky
- Thanks Ankit, this was helpful. Though I was not facing the same problem, I needed an elaborate idea on this, which I got.
faa practice test
- One small step for man. One giant leap for mankind.
Thank you.
Jeremy Lawson
Senior Developer
Doubledot Media
- Hi
I found a class file that may seem interesting to you.
Hope you find this helpful. Visit this page http://coderbuddy.wordpress.com/2009/10/31/coder-buddyc-code-to-extract-email/ for more description on this article.
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace Coderbuddy
{
    public class ExtractEmails
    {
        private string s;

        public ExtractEmails(string Text2Scrape)
        {
            this.s = Text2Scrape;
        }

        public string[] Extract_Emails()
        {
            string[] Email_List = new string[0];
            Regex r = new Regex(@"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}", RegexOptions.IgnoreCase);
            Match m;
            // Search for text that matches the regular expression above (which only matches email addresses)
            for (m = r.Match(s); m.Success; m = m.NextMatch())
            {
                // This section demonstrates dynamic arrays
                if (m.Value.Length > 0)
                {
                    // Grow Email_List by one element to hold the next result
                    Array.Resize(ref Email_List, Email_List.Length + 1);
                    Email_List[Email_List.Length - 1] = m.Value;
                }
            }
            return Email_List;
        }
    }
}
- Proposed As Answer by vamsi_krishna Saturday, November 21, 2009 7:39 AM
- Unproposed As Answer by NastyBastard Saturday, November 21, 2009 2:59 PM
- String mailRegex = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";


