Regular Expression and the Ubiquitous Null

    General discussion

  • I recently answered a post in which the user did not understand why his pattern was not returning a match at the first location where he expected one. The regex parser was faithfully returning valid matches that happened to be null. I have blogged about the issue in detail here:

    Regular Expression and the Ubiquitous Null

    I also filed a Connect issue suggesting that a flag, IgnoreNullMatchesAndCaptures, be added to the regex parser so that it does not report null matches or captures. That issue is here:

    Regular Expression (Regex) Improvements - Null Value Ignore

    Comments to my blog or here are appreciated.
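
    As a minimal sketch of the behaviour described above (the class name, method name, input, and pattern are illustrative assumptions, not taken from the original post), a pattern that can match zero characters reports an empty match at every position where nothing else matches:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // Returns "index:'value'" for every match, including zero-length ones.
    public static List<string> Describe(string input, string pattern)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            result.Add(m.Index + ":'" + m.Value + "'");
        }
        return result;
    }

    public static void Main()
    {
        // \d* is allowed to match nothing, so the first reported match
        // sits at index 0 and is empty; "123" only shows up later.
        foreach (string line in Describe("abc123", @"\d*"))
        {
            Console.WriteLine(line);
        }
    }
}
```

    The digits are found at index 3, but only after zero-length matches at the earlier positions, which is likely the kind of surprise the original poster ran into.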

    Friday, May 25, 2007 5:25 PM

All replies


    There are a few problems with the suggestion.


    1) The values returned are not null but empty strings, and very useful they are too.

    2) If you have the ExplicitCapture option set and an empty capture is omitted, then your code would crash at

    string someValue = myMatch.Groups["CouldBeEmpty"].Value;

    3) The values of \1, \2, etc. would differ depending on whether a capture was null.

    4) There would be problems with lookarounds that contain backreferences.
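
    To illustrate point 1, here is a small sketch (the patterns, input, and class name are assumptions made up for this example) showing that .NET already distinguishes a group that captured an empty string from a group that took no part in the match, via Group.Success:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyCaptureDemo
{
    // Reports whether the group "CouldBeEmpty" participated and how long its value is.
    public static string Describe(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.ExplicitCapture);
        Group g = m.Groups["CouldBeEmpty"];
        return "Success=" + g.Success + " Length=" + g.Value.Length;
    }

    public static void Main()
    {
        // The group takes part in the match but captures zero characters.
        Console.WriteLine(Describe("x", @"(?<CouldBeEmpty>\d*)x"));

        // The group is optional and does not take part in the match at all.
        Console.WriteLine(Describe("x", @"(?<CouldBeEmpty>\d+)?x"));
    }
}
```

    In both cases Value is an empty string rather than null, so only Success tells the two situations apart.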



    Thursday, December 27, 2007 4:18 PM
  • I agree with all your points. This operation would have unintended side effects, as you mentioned. The benefits of such a flag would pale next to the problems it could create elsewhere.
    Thursday, December 27, 2007 7:32 PM
  • Unfortunately, I find I made a mistake. I had always assumed, as per point 2 above, that a group name that was NOT defined in the pattern would cause a crash. As in...

    regex = (?<name1>.*)

    code = myMatch.Groups["badname"].Value

    Apparently not. Still, we live and learn.
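
    For anyone who wants to check this, a short sketch (the class and method names are made up for illustration) of what the lookup returns for a name the pattern never defines:

```csharp
using System;
using System.Text.RegularExpressions;

public static class MissingGroupDemo
{
    // Looks up a group by name in a match of the pattern from the post above.
    public static Group Lookup(string input, string groupName)
    {
        Match m = Regex.Match(input, @"(?<name1>.*)");
        return m.Groups[groupName];
    }

    public static void Main()
    {
        Group g = Lookup("hello", "badname");

        // No exception is thrown: the indexer hands back a Group
        // that reports failure and carries an empty string.
        Console.WriteLine(g.Success);
        Console.WriteLine(g.Value.Length);
    }
}
```

    So a misspelt group name fails silently, and checking Success is the only way to notice.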

    On a side issue, there are things that I would like in a regex. They may exist, but I just don't know how to implement them...

    1) (?"some stuff") would return a capture whose Value is "some stuff".
    2) An anchor meaning HERE, so I could capture a location:
    Regex = (?<LookHere>\H)
    code = myMatch.Groups["LookHere"].Index

    I know there are ways round this, but nothing really clean that is obvious to a person reading your code.
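
    One such workaround, sketched here under assumptions (the surrounding pattern foo...bar, the input, and the class name are invented for the example, and \H above is only the wished-for syntax, not a real .NET escape), is an empty named group. It captures zero characters, so its Index is simply the position the engine had reached:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PositionMarkerDemo
{
    // Returns the index recorded by the empty group, or -1 if there is no match.
    public static int FindMarker(string input)
    {
        // (?<LookHere>) matches the empty string between "foo" and "bar".
        Match m = Regex.Match(input, @"foo(?<LookHere>)bar");
        Group g = m.Groups["LookHere"];
        return g.Success ? g.Index : -1;
    }

    public static void Main()
    {
        Console.WriteLine(FindMarker("xxfoobar"));
        Console.WriteLine(FindMarker("no match here"));
    }
}
```

    This is also a case where an empty capture is exactly what is wanted, which ties back to point 1 in the first reply.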

    Then again, regexes are not usually obvious.

    Best regards - and keep up the good work helping people.

    Thursday, December 27, 2007 7:50 PM