Using regex to parse recursive template

  • Question

  • I have a string of the following general form (all on one line):

    <characters>From:<characters>To:<characters>Subject:<characters>

    <characters>From:<characters>To:<characters>Subject:<characters>

    <characters>From:<characters>To:<characters>Subject:<characters>

    I.e. the pattern

    <characters>From:<characters>To:<characters>Subject:<characters>

    repeats.

    I want the regular expression to match each From -> Subject block separately. I am using

    From:.*To:.*(Cc:)?.*Subject:

    as the pattern passed to Regex. However, it matches only once, spanning from the first From: to the last Subject:, because the string as a whole also fits that pattern.

    What pattern would return the inner matches instead of the single outer one?

    P.S. Here is the code snippet I am using:

    outerRegEx = New Regex("From:.*To:.*(Cc:)?.*Subject:", RegexOptions.Singleline)
    outerMatch = outerRegEx.Match(oMsg.BodyText)
    While outerMatch.Success
        ' Matched a From/To/Cc/Subject block.
        writeline(outerMatch.Groups(0).Value)
        outerMatch = outerMatch.NextMatch()
    End While



    • Edited by Heems Monday, April 29, 2013 10:49 PM
    Monday, April 29, 2013 9:58 PM

All replies

  • Try this:

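    ' The subject group is lazy, so it stops at the next From: (or at the end of the input)
    ' instead of running on to the last From: in the string.
    For Each m As Match In Regex.Matches(body_text, "From:(.*?)To:(.*?)(?:Cc:(.*?))?Subject:(.*?)(?=From:|$)")

        Dim from = m.Groups(1).Value
        Dim [to] = m.Groups(2).Value
        Dim cc = If(m.Groups(3).Success, m.Groups(3).Value, Nothing)
        Dim subject = m.Groups(4).Value

    Next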
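
    For anyone who wants to try this outside the original project, here is a minimal console sketch of the same idea: lazy groups for each field, and a lazy subject that stops at the next From: or at the end of the input. The module name, the ParseRecords helper and the sample string are made up for illustration; they are not from the original post.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    Module FromSubjectDemo
        ' Hypothetical helper: returns one {from, to, cc, subject} array per record.
        ' The cc slot is Nothing when the record has no Cc: part.
        Function ParseRecords(body_text As String) As List(Of String())
            ' Every .*? is lazy, so each field ends at the next keyword.
            ' The lookahead ends the subject at the next From: or at the end of the input.
            Dim pattern As String =
                "From:(.*?)To:(.*?)(?:Cc:(.*?))?Subject:(.*?)(?=From:|$)"
            Dim records As New List(Of String())
            For Each m As Match In Regex.Matches(body_text, pattern)
                Dim cc As String = If(m.Groups(3).Success, m.Groups(3).Value, Nothing)
                records.Add(New String() {m.Groups(1).Value, m.Groups(2).Value, cc, m.Groups(4).Value})
            Next
            Return records
        End Function

        Sub Main()
            ' Made-up sample: three records on one line, the second one with a Cc: part.
            Dim sample As String =
                "aaaFrom:aliceTo:bobSubject:hello " &
                "bbbFrom:carolTo:daveCc:erinSubject:re hello " &
                "cccFrom:frankTo:graceSubject:bye"
            For Each r As String() In ParseRecords(sample)
                Console.WriteLine("from={0} to={1} cc={2} subject={3}",
                                  r(0), r(1), If(r(2), "(none)"), r(3))
            Next
        End Sub
    End Module
    ```

    On this sample the loop should yield three records. Because the format has no delimiter between records, whatever precedes the next From: (here "bbb" and "ccc") ends up at the tail of the previous subject.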



    • Edited by Viorel_MVP Tuesday, April 30, 2013 5:44 AM
    • Marked as answer by Heems Tuesday, April 30, 2013 2:20 PM
    Tuesday, April 30, 2013 5:40 AM
  • Is there a way for each match simply to be the string beginning at "From:" and running through "Subject:"? I don't necessarily need the contents of each From/To/Cc/Subject section separately...

    Update: m.Groups(0).Value in your code above actually is the string I am looking for, so the solution works. I'm not sure the groupings for From/To/Cc/Subject are needed, though. If they can be removed or optimized, I'd appreciate your input. Either way, thanks!

    • Edited by Heems Tuesday, April 30, 2013 2:20 PM
    Tuesday, April 30, 2013 1:59 PM
  • If you do not need each component, then try this:

    ' No capture groups; the lazy tail stops at the next From: or at the end of the input.
    For Each m As Match In Regex.Matches(body_text, "From:.*?To:.*?(?:Cc:.*?)?Subject:.*?(?=From:|$)")
        Dim s = m.Value
        ' . . .
    Next
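
    To make the difference from the pattern in the question visible, here is a rough sketch that runs both patterns over the same made-up input and collects the whole-match strings. The module name, the MatchValues helper and the sample text are assumptions for illustration only. RegexOptions.Singleline is passed to both patterns, as in the question's snippet.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    Module GreedyVersusLazyDemo
        ' Hypothetical helper: returns the full text of every match of pattern in input.
        Function MatchValues(input As String, pattern As String) As List(Of String)
            Dim values As New List(Of String)
            For Each m As Match In Regex.Matches(input, pattern, RegexOptions.Singleline)
                values.Add(m.Value)
            Next
            Return values
        End Function

        Sub Main()
            ' Made-up sample: three records on one line.
            Dim sample As String =
                "aaaFrom:aliceTo:bobSubject:hello " &
                "bbbFrom:carolTo:daveCc:erinSubject:re hello " &
                "cccFrom:frankTo:graceSubject:bye"

            ' Pattern from the question: each greedy .* grabs as much as it can,
            ' so the single match runs from the first From: to the last Subject:.
            Dim greedyPattern As String = "From:.*To:.*(Cc:)?.*Subject:"

            ' Lazy version: each .*? grabs as little as it can, giving one match per record.
            Dim lazyPattern As String = "From:.*?To:.*?(?:Cc:.*?)?Subject:.*?(?=From:|$)"

            Console.WriteLine(MatchValues(sample, greedyPattern).Count) ' 1
            Console.WriteLine(MatchValues(sample, lazyPattern).Count)   ' 3
        End Sub
    End Module
    ```

    If only the text up to and including "Subject:" is wanted, dropping the trailing .*?(?=From:|$) from the lazy pattern should leave each match ending right at "Subject:".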

    • Marked as answer by Heems Tuesday, April 30, 2013 6:19 PM
    Tuesday, April 30, 2013 6:07 PM
  • Perfect.  Thank you.
    Tuesday, April 30, 2013 6:19 PM