Regex.Replace not working.

  • Question

  • User1433899244 posted

    Hi there,

     

    I am trying to replace text in my XSL file. I have to replace all the http and https URLs, but I must leave alone any links that are already inside an <a href> tag.

    I.e., http://www.google.com has to be replaced, but <a href="google.com">google.com</a> has to remain as it is.

     

    I am using regexes to find all the http(s) URLs and <a href> links in the text. I am stuck here.

    My code is,

           

    Dim href As New Regex("(<a.*?>.*?</a>)", RegexOptions.IgnoreCase)
    Dim hrefs As New Regex("http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&amp;\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?", RegexOptions.IgnoreCase)

    Dim matchahref As MatchCollection = href.Matches(StrIn)
    Dim matches As MatchCollection = hrefs.Matches(StrIn)

    For Each match In matches
        StrIn = Regex.Replace(StrIn, "(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&amp;:/~\+#]*[\w\-\@?^=%&amp;/~\+#])", _
                              String.Format("<a href=""{0}"">{0}</a>", match.ToString()))
    Next

    I am trying to use Regex.Replace to replace my http(s) URLs, but I am unable to write the evaluator function.

    My Regex.Replace call replaces all the values with a single value.

     

    Can someone please correct my Regex.Replace call? Thank you.

    Monday, August 15, 2011 2:24 PM

Answers

  • User210672981 posted

    Hi user2980,

    I hope I have understood your question correctly.
    From your code, I think you want to:

    Step 1: Remove http://

    http://www.google.com => www.google.com

    Step 2: Wrap the address in an <a> tag

    www.google.com => <a href="www.google.com">www.google.com</a>

     

    Here is my attempt:

    --------------------------------------------------------------------------

     

    Dim StrIn = "http://www.google.com"
    Dim ahref = New Regex("(<a.*?>.*?</a>)", RegexOptions.IgnoreCase)
    Dim hrefs = New Regex("(http(s)?://)(([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&amp;\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?)", RegexOptions.IgnoreCase)
    Dim ahrefmatch = ahref.Matches(StrIn)
    Dim matches = hrefs.Matches(StrIn)

    For Each match As Match In matches
        ' Group 3 holds everything after the scheme, e.g. "www.google.com"
        Dim groups As GroupCollection = match.Groups
        StrIn = Regex.Replace(StrIn, "(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&amp;:/~\+#]*[\w\-\@?^=%&amp;/~\+#])", [String].Format("<a href=""{0}"">{0}</a>", groups(3).ToString()))
    Next

    Console.WriteLine(StrIn)

     

    --------------------------------------------------------------------------

    It will print out "<a href="www.google.com">www.google.com</a>"

    I use a GroupCollection to solve the problem: groups(3).ToString() returns the substring "www.google.com" of "http://www.google.com".

    Hope this helps.
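
For input that contains several URLs, or URLs already wrapped in an anchor, a MatchEvaluator passed to Regex.Replace may be a better fit, because each match then gets its own replacement. The following is only a minimal sketch, not the code from this thread: the names LinkifyDemo, LinkOrUrl, Linkify and WrapUrl are hypothetical, the URL pattern is deliberately simplified (it stops at whitespace, quotes and angle brackets), and unlike the output above it keeps the http:// scheme in the href.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LinkifyDemo
    ' Matches either a whole existing anchor element or a bare http(s) URL.
    ' The anchor alternative comes first, so a URL inside an anchor is
    ' consumed together with the anchor and never matched on its own.
    ' (Simplified pattern; an assumption for illustration only.)
    Private ReadOnly LinkOrUrl As New Regex(
        "<a\b[^>]*>.*?</a>|(?<url>https?://[^\s<>""]+)",
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Wraps every bare URL in an anchor and leaves existing anchors alone.
    Function Linkify(input As String) As String
        Return LinkOrUrl.Replace(input, AddressOf WrapUrl)
    End Function

    ' Called once per match; the return value replaces that match only.
    Private Function WrapUrl(m As Match) As String
        If Not m.Groups("url").Success Then
            ' The match was an existing anchor: return it unchanged.
            Return m.Value
        End If
        Return String.Format("<a href=""{0}"">{0}</a>", m.Groups("url").Value)
    End Function

    Sub Main()
        Dim sample = "See http://www.google.com and <a href=""http://example.com"">example</a>"
        Console.WriteLine(Linkify(sample))
    End Sub
End Module
```

Because the evaluator runs once per match, two different URLs in the same string each get their own link instead of all being replaced with a single value.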

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Sunday, August 21, 2011 9:12 PM

All replies

  • User-952121411 posted

    The best forum for questions like this one, where the regex gurus keep watch, is below:

    Regular Expressions MSDN Forum:

    http://social.msdn.microsoft.com/Forums/en-US/regexp/threads

    I recommend re-posting to that forum.

    Thursday, August 18, 2011 1:57 PM
  • User1105670391 posted

    Change this

    Dim href As New Regex("(<a.*?>.*?</a>)", RegexOptions.IgnoreCase)

    to this

    Dim href As New Regex("(<a[^>]*>[^<]*</a>)", RegexOptions.IgnoreCase)

    Sunday, August 21, 2011 1:51 PM
  • User1433899244 posted

    Thank you. This did work.

    Monday, August 22, 2011 4:48 PM