Regex matching phone numbers with PowerShell

  • Question

  • Hey,

    I cannot for the life of me figure this one out. I have the following:

    'tel:+999999999999995599;ext=3434;ms-skip-rnl' -match 'tel:\+(\d+)(?:;ext=(\d+))?(?:;(\w+))?'; $Matches

    Basically, I am trying to separate out the number and, optionally, the extension and whatever other string follows it, if they exist.

     tel:+ indicates that a number will follow

    ;ext= indicates that an extension will follow (this may or may not exist)

    ;<any string> could be any number of things and I would like to also store as a match (this may or may not exist)

    I have 2 problems:

    1. The 3rd match needs to be "ms-skip-rnl" or any other string that may be in its place. If I remove the dashes it works as expected. Why do the dashes stop this working?

    Name                           Value
    ----                           -----
    3                              ms
    2                              3434
    1                              999999999999995599
    0                              tel:+999999999999995599;ext=3434;ms


    2. If a poorly formatted number string exists I don't want a match to be made. E.g. 'tel:+999999999999995599;ext='

    ;ext= doesn't have a number following it, so I want to ignore it completely, but it matches as "ext":

    Name                           Value
    ----                           -----
    3                              ext
    1                              999999999999995599
    0                              tel:+999999999999995599;ext

    Similarly, in 'tel:+999999999999995599;ext3434' the ";ext" is not followed by "=", so it should not match, but it matches as "ext3434":

    Name                           Value                                                                                                                                                  
    ----                           -----                                                                                                                                                  
    3                              ext3434                                                                                                                                                
    1                              999999999999995599
    0                              tel:+999999999999995599;ext3434

    I note that in the above 2 examples the match comes from the 3rd group, which is a word match (\w), so I understand why I get these results. I just don't know how to get around this to give me the desired result.

    Really appreciate some help to get this working correctly.

    Thanks,

    Andrew


    Andrew Morpeth
    Lync Server Specialist - Auckland, NZ
    Check out my blog


    Sunday, December 22, 2013 9:24 PM

Answers

  • \w doesn't match -

    That's the whole explanation.  \w matches any alphanumeric character or an underscore (_), but it does not match a dash/hyphen/minus sign (-).  (Technically, in .NET the underscore is just one member of the Unicode Pc, connector punctuation, category, all of which \w matches.)  You can change your \w to [\w-] to also match the dash.
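
    To make the \w behaviour concrete, here is a small sketch that can be pasted into a PowerShell session. The variable names are only illustrative; the test string is the one from the question.

```powershell
# \w covers letters, digits and the underscore, but not the hyphen.
$underscoreIsWord = '_' -match '^\w$'      # True
$hyphenIsWord     = '-' -match '^\w$'      # False
$hyphenInClass    = '-' -match '^[\w-]$'   # True: a hyphen placed last in a class is literal

# Unanchored, \w+ stops at the first hyphen, which is why capture 3 was only "ms".
$null = 'ms-skip-rnl' -match '\w+'
$wordOnly = $Matches[0]                    # ms
$null = 'ms-skip-rnl' -match '[\w-]+'
$withHyphen = $Matches[0]                  # ms-skip-rnl

$underscoreIsWord, $hyphenIsWord, $hyphenInClass, $wordOnly, $withHyphen
```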

    I also suggest matching $ at the end of your expression so that it doesn't stop matching prematurely.  Similarly, matching ^ at the beginning will help you detect junk at the start.  (If you are trying to capture multiple occurrences of those, then you need to decide what your delimiter is.)
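
    As a rough illustration of what the anchors change, the sketch below runs the malformed input from the question against an unanchored and an anchored version of the same pattern. $s, $loose and $strict are just illustrative names.

```powershell
$s = 'tel:+999999999999995599;ext='

# Without anchors the engine may stop early: it matches up to ";ext"
# and silently ignores the trailing "=".
$loose = 'tel:\+(\d+)(?:;ext=(\d+))?(?:;([\w-]+))?'
$looseOk = $s -match $loose
$looseMatched = $Matches[0]            # tel:+999999999999995599;ext

# With ^ and $ the whole string has to fit the pattern,
# so the malformed input is rejected outright.
$strict = '^' + $loose + '$'
$strictOk = $s -match $strict          # False

# The leading ^ also rejects junk in front of an otherwise valid value.
$junkLoose  = 'xxtel:+123' -match $loose    # True
$junkStrict = 'xxtel:+123' -match $strict   # False

$looseOk, $looseMatched, $strictOk, $junkLoose, $junkStrict
```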

    Try this:

    ^tel:\+(\d+)(?:;ext=(\d+))?(?:;([\w-]+))?$
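
    A quick way to check the pattern above is to run it over the three sample strings from the question. This is only a sketch: the property names Number, Extension and Parameter are made up for readability and are not part of the original post.

```powershell
# Anchored pattern with [\w-] for the trailing parameter.
$pattern = '^tel:\+(\d+)(?:;ext=(\d+))?(?:;([\w-]+))?$'

# Sample inputs taken from the question.
$samples = @(
    'tel:+999999999999995599;ext=3434;ms-skip-rnl',
    'tel:+999999999999995599;ext=',
    'tel:+999999999999995599;ext3434'
)

$results = foreach ($s in $samples) {
    if ($s -match $pattern) {
        # $Matches only holds the groups that took part in the match,
        # so a missing key simply yields $null here.
        [pscustomobject]@{
            Input     = $s
            Matched   = $true
            Number    = $Matches[1]
            Extension = $Matches[2]
            Parameter = $Matches[3]
        }
    } else {
        [pscustomobject]@{
            Input = $s; Matched = $false
            Number = $null; Extension = $null; Parameter = $null
        }
    }
}
$results | Format-Table -AutoSize
```

    The first string yields all three captures, and ';ext=' with no digits is rejected. Note that ';ext3434' still matches, with "ext3434" landing in the third capture, because it is a valid [\w-]+ token. If that case must be rejected too, one option (an assumption on top of the pattern above, not part of it) is a negative lookahead before the third capture, for example (?:;(?!ext)([\w-]+))?.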
    Monday, December 23, 2013 4:50 PM

All replies

  • Thanks very much for your detailed response. I'm a bit of a newbie so it really helps to understand the reasoning.

    I just gave it a test and that looks to have solved the issues :)


    Andrew Morpeth
    Lync Server Specialist - Auckland, NZ
    Check out my blog

    Monday, December 23, 2013 8:31 PM