Sticky: How to: Verify That Strings Are in Valid E-Mail Format

  • Thursday, May 28, 2009 10:39 PM NastyBastard
     
    This is HALF answered in the MSDN library with an example function called IsValidEmail.

    Does anyone else think it's time the example function (which developers are likely to copy-paste) was changed to more closely represent what an email address can actually be?

    Sure, it matches joe.blogs@example.com, or joe-blogs@example.com...

    But how about joe.blogs+work@example.com, or joe.blogs+support@example.com, or e=mc^2@example.com, or many other valid email addresses?

    The page http://msdn.microsoft.com/en-us/library/01escwtf.aspx has been updated for each new release of Visual Studio/.NET, but even the VS 2010/.NET 4.0 one isn't much better!
    • Moved by nobugz (MVP, Moderator), Tuesday, June 02, 2009 11:41 AM: not a clr q (From: Common Language Runtime)

All Replies

  • Friday, May 29, 2009 12:08 AM nobugz (MVP, Moderator)
     
    Ah, that brings back memories.  I once saw a regular expression that filled an entire printed page and was completely inscrutable.  The chapter on sendmail in The UNIX-HATERS Handbook was a great joy too.  It's a lively discussion, and you can probably contribute if you are into that sort of stuff.

    But it has very little to do with the CLR, the topic of this forum.

    Hans Passant.
  • Monday, June 01, 2009 2:06 PM Nikolay Podkolzin
     
    using System.Text.RegularExpressions;

    public static bool isValidEmail(string inputEmail)
    {
       // Local part, then either a bracketed IPv4 literal or a dotted host name.
       string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" +
             @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))" +
             @"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";
       // IsMatch already returns a bool, so no if/else is needed.
       return Regex.IsMatch(inputEmail, strRegex);
    }


    Here we go
    Regards, Nikolay
  • Tuesday, June 02, 2009 8:44 AM NastyBastard
     
    I think you've missed the point. All you've really achieved is to allow an IP address in place of a domain. Whilst useful, that's even less common than some of my examples.

    If anyone can and cares to update the documentation, something as simple as the following would be a step in the right direction

    ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$

    I lifted it from http://www.regular-expressions.info/email.html. They also have a more comprehensive one closer to RFC 2822, but all I'm personally looking for in this change is that amateur developers who just copy-paste the example validation code don't invalidate my joe.bloggs+support@example.com email address. Those who use Gmail probably understand my frustration, but may not realize that plus-addressing is also supported by other MTAs.
  • Tuesday, June 02, 2009 6:30 PM OmegaMan (MVP, Moderator)
     
    ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
    Doesn't work. It incorrectly reports that

    joe..blogs+work@example.com
    .joe.blogs+work@example.com

    are valid addys. You can't have two consecutive periods, or a period at the beginning.
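
    The behaviour described here is easy to reproduce. The following is a minimal sketch in JavaScript, not a recommended validator. The `i` flag is an assumption (the pattern as quoted lists only upper-case letters), and `hasBadDots` is a hypothetical helper written for this illustration, not part of any library.

```javascript
// The simple pattern quoted above. The `i` flag is an assumption so that
// lower-case addresses are accepted.
const simplePattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i;

// Hypothetical helper: true when the local part starts or ends with a dot,
// or contains two dots in a row.
function hasBadDots(address) {
  const at = address.lastIndexOf("@");
  const localPart = at === -1 ? address : address.slice(0, at);
  return (
    localPart.startsWith(".") ||
    localPart.endsWith(".") ||
    localPart.includes("..")
  );
}

const samples = [
  "joe.blogs+work@example.com",
  "joe..blogs+work@example.com",
  ".joe.blogs+work@example.com",
];

// Print each address with the pattern's verdict and the dot check's verdict.
for (const address of samples) {
  console.log(address, simplePattern.test(address), hasBadDots(address));
}
```

    All three samples pass the pattern; only the separate dot check tells the last two apart from the first.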

    William Wegerson (www.OmegaCoder.Com)
  • Wednesday, June 03, 2009 6:44 AM NastyBastard
     
    I understand this doesn't invalidate all technically-invalid email addresses. But even an email address adhering perfectly to the RFC could still be invalid (e.g. the user doesn't exist, or the domain doesn't have an MX record).
    I think, if you're not going to match exactly, you should err on the side of allowing MORE email addresses rather than FEWER. Because of this example (albeit indirectly) I can't sign up on many websites with my perfectly valid email address. It's an annoyance I often hear about from developer peers.

    Disallowing "+", or "%" is just as arbitrary as disallowing "e" or "t".

    A very basic email address validation is testing for an @ character. If you don't like the semi complicated ones, why not stick with that? ^.*@.*$
    If you're anal about having a complex one that will invalidate on some principles (such as the double period "..") then use one of the more comprehensive ones found at http://www.regular-expressions.info/email.html.
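
    To make the trade-off concrete, here is a minimal JavaScript sketch of the permissive check suggested above. The function name `looksLikeEmail` is hypothetical. It accepts every address with an "@" in it, including the plus-addressed ones, at the cost of also accepting strings that are clearly not deliverable.

```javascript
// Permissive check from the post above: anything containing "@" passes.
const permissivePattern = /^.*@.*$/;

// Hypothetical wrapper name, used only for this illustration.
function looksLikeEmail(address) {
  return permissivePattern.test(address);
}

console.log(looksLikeEmail("joe.blogs+support@example.com")); // true
console.log(looksLikeEmail("e=mc^2@example.com"));            // true
console.log(looksLikeEmail("@"));                             // true (nothing on either side)
console.log(looksLikeEmail("no at sign"));                    // false
```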
  • Wednesday, June 03, 2009 2:09 PM OmegaMan (MVP, Moderator)
     Answer
    I understand your frustration. I don't use a single email address; I use a different one for each website I log into. I have a personal domain with a catch-all for any email addy that has no account of its own, so I can do something like this:

    msdn_ww@MyDomain.com
    NYTImes_ww@MyDomain.com
    Amazon_ww@MyDomain.com

    That way I know if my email addy has been sold to another company when I receive email from the non originating source, or if a site gets hacked and I start receiving ____ spam. Has happened.

    The best you can do is add a community comment to the MSDN page describing what you go through; in the end it's up to the web developers who enforce the bad regex patterns on their users out of ignorance of the situation.

    Other suggestions

    1. Possibly create a website specifically for this problem. If you maintain it long enough it might become the first item in the search engines list instead of the bad patterns.
    2. Put the pattern that best solves the problem on the Wikipedia page (E-mail address).
    GL


    William Wegerson (www.OmegaCoder.Com)
  • Sunday, June 07, 2009 2:46 PM OmegaMan (MVP, Moderator)
     
    Phil Haack has a post which mirrors your sentiments: "I Knew How To Validate An Email Address Until I Read The RFC"
    William Wegerson (www.OmegaCoder.Com)
  • Tuesday, June 09, 2009 4:47 AM NastyBastard
     
    I took your advice and vented on http://gotgeek.co.nz/?p=33. Doing so prompted fellow developers to express equivalent pet peeves from which I can learn so as to not make such mistakes. After enough trolls have helped me fill in all the holes, and weed out the circumlocution and verbosity I hope to end up with a helpful/educational post to which I can refer people.
  • Tuesday, September 15, 2009 3:05 PM OmegaMan (MVP, Moderator)
     
    I liked this post so much, and we are getting enough email regex questions, that I made it a sticky in the regex forum for 2010.
    William Wegerson (www.OmegaCoder.Com)
  • Tuesday, September 29, 2009 11:09 AM april_123456
     

    I never consider such complex email address verification. I often use the following regex to verify email.

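    function verifyAddress(obj)
    {
      var email = obj;
      // The dot is escaped and the end is anchored so the whole string must match;
      // only letters, digits, "_" and "-" are allowed in each part.
      var pattern = /^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-]+)+$/;
      return pattern.test(email);
    }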

    __________________________________
    April
    http://www.comm100.com/livechat/

    Microsoft Certified Partner

  • Tuesday, September 29, 2009 11:19 AM NastyBastard
     
    As you may have noticed from some of the discussion above, your validation rejects many popular and common addresses, such as joe.blogs@example.com (popular in business) or webmaster@localhost.

    How about just:
    function verifyAddress(email_address)
    {
      var pattern = /@/;
      return pattern.test(email_address);
    }


  • Thursday, October 01, 2009 7:04 AM Ankit Goyal
     
  • Friday, October 02, 2009 5:36 PM keddy1
     
    Thanks, Ankit. This was helpful. I wasn't facing the same problem, but I needed a detailed explanation of this, and I got it.
  • Friday, October 23, 2009 8:49 PM R Petrusha - MSFT
     Answer
    In its current form, which is

    ^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$

    the regular expression has a number of limitations. You've mentioned that it supports only a subset of the valid characters (it fails to support ! # $ % & ' * + = ? ^ ` { } | and ~), but there are a few other limitations that are worth noting as well:

    1. The local-part of an email address can contain otherwise invalid characters if it is delimited by quotation marks. The IsValidEmail method does not recognize this convention and returns False.
    2. Double dots in an email address are not allowed. The IsValidEmail method does not recognize this restriction and, if the email address is otherwise valid, returns True.
    3. It does not recognize a domain that consists of an IP address. In such cases, it always returns False.
    4. It does not validate the top-level domain. For example, it would recognize as valid an address like someone@validaddress.not.

    Although we have revised the regular expression several times in the past, it seems in need of another update. This will be reflected in the next scheduled update of the Visual Studio 2008 documentation, as well as in the final Visual Studio 2010 documentation. A preliminary version of this regular expression is:

    ^(("".+?"")|([0-9a-zA-Z](((\.(?!\.))|([-!#\$%&'\*\+/=\?\^`\{\}\|~\w]))*[0-9a-zA-Z])*))@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9}$
    It corrects three of the issues that I've listed above (it supports all valid characters listed in RFC 5322, recognizes a local-part that is delimited by quotation marks, and disallows double dots). It still doesn't recognize IP addresses (we'll fix this in the revised documentation), nor does it validate the top-level domain (a regex is not the appropriate tool to do this).
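
    The differences listed above can be checked side by side. This is a rough JavaScript sketch rather than the documented .NET sample. The names `currentPattern`, `revisedPattern` and `compare` are hypothetical, and it assumes the doubled quotation marks ("") in the preliminary pattern are C# verbatim-string escaping for a single quotation mark.

```javascript
// Pattern described above as the current one.
const currentPattern =
  /^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$/;

// Preliminary revised pattern. Assumption: "" in the post stands for one
// quotation mark, as it would inside a C# verbatim string.
const revisedPattern =
  /^((".+?")|([0-9a-zA-Z](((\.(?!\.))|([-!#\$%&'\*\+\/=\?\^`\{\}\|~\w]))*[0-9a-zA-Z])*))@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9}$/;

// Hypothetical helper: run one address through both patterns.
function compare(address) {
  return {
    current: currentPattern.test(address),
    revised: revisedPattern.test(address),
  };
}

const samples = [
  "joe.blogs@example.com",      // plain address
  "joe.blogs+work@example.com", // "+" in the local-part
  "e=mc^2@example.com",         // "=" and "^" in the local-part
  '"joe blogs"@example.com',    // quoted local-part
  "joe..blogs@example.com",     // double dots
  "someone@[192.168.1.1]",      // IP address as the domain
  "someone@validaddress.not",   // top-level domain is not checked
];

for (const address of samples) {
  console.log(address, compare(address));
}
```

    Both patterns repeat a group that itself contains a repeated character class, so very long non-matching inputs may backtrack heavily; the samples here are short enough that this does not matter.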

    --Ron Petrusha
      Developer Division User Education
      Microsoft Corporation

    • Marked As Answer by NastyBastard, Saturday, November 21, 2009 2:59 PM
  • Tuesday, October 27, 2009 1:17 PM NastyBastard
     
    One small step for man. One giant leap for mankind.

    Thank you.

    Jeremy Lawson
    Senior Developer
    Doubledot Media
  • Saturday, November 21, 2009 7:39 AM vamsi_krishna
    Hi
    I found a class file that may seem interesting to you.
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Text.RegularExpressions;

    namespace Coderbuddy
    {
        public class ExtractEmails
        {
            private string s;

            public ExtractEmails(string Text2Scrape)
            {
                this.s = Text2Scrape;
            }

            public string[] Extract_Emails()
            {
                string[] Email_List = new string[0];
                Regex r = new Regex(@"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}", RegexOptions.IgnoreCase);
                Match m;
                // Walk through every substring that matches the pattern above
                // (a loose email pattern, not a full RFC check).
                for (m = r.Match(s); m.Success; m = m.NextMatch())
                {
                    // This section demonstrates dynamic arrays.
                    if (m.Value.Length > 0)
                    {
                        // Grow Email_List by one slot to hold the next result.
                        Array.Resize(ref Email_List, Email_List.Length + 1);
                        Email_List[Email_List.Length - 1] = m.Value;
                    }
                }
                return Email_List;
            }
        }
    }
    
    Hope you find this helpful. Visit this page http://coderbuddy.wordpress.com/2009/10/31/coder-buddyc-code-to-extract-email/ for more description on this article.
    • Proposed As Answer by vamsi_krishna, Saturday, November 21, 2009 7:39 AM
    • Unproposed As Answer by NastyBastard, Saturday, November 21, 2009 2:59 PM
  • Monday, November 23, 2009 7:56 PM Akram El Assas
    // Verbatim string (assuming C#) so that \b and \. reach the regex engine unchanged.
    String mailRegex = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";