Regular Expression to change the font in html string

  • Question

  • User-279680092 posted

    Hi,

    I need to change the font in an HTML string.

    For example, I have a string like the one below:

    string html = "<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";

    (This is not the actual HTML string.)

    I want to get the following result:

    html = "<Style font-family: "Arial Unicode MS"; Size:10; font-family:Arial Unicode MS; font-family: Arial Unicode MS> 2222222 <FONT size=3 face="Arial Unicode MS"> </FONT> <FONT Size=3 face=Arial Unicode MS color=RED></FONT>";

    Please help me with the appropriate regular expression.

    Thanks

    Bikka

    Thursday, June 19, 2014 1:42 AM

All replies

  • User1140095199 posted

    Hi,

    bikka

    please help me with the appropriate regular expression.

    Actually, the HTML string is rendered at runtime as you wrote it above. If you set a breakpoint and check the string in the Data Visualizer at runtime, you will see it is rendered exactly as you want.

    As you may see in the above picture.

    If you simply want to replace a fixed piece of text in the HTML, you may use the following code:

            string html = "<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
            string newhtml = html.Replace("font-family:Verdana; font-family: Times New Roman", "font-family:Arial Unicode MS; font-family: Arial Unicode MS");
    

    For more reference:

    http://msdn.microsoft.com/en-us/library/fk49wtc1(v=vs.110).aspx

    You may also use Regex:

    string newhtml = Regex.Replace(html, "font-family:Verdana; font-family: Times New Roman", "font-family:Arial Unicode MS; font-family: Arial Unicode MS");

    For more reference:

    http://stackoverflow.com/questions/8143811/find-and-replace-content-within-string-c

    Hope it helps!

    Best Regards!

    Friday, June 20, 2014 3:04 AM
  • User-279680092 posted

    Hi Sam,

    Thanks for spending time on this issue.

    Actually, the font is not fixed. It can be different every time, and the order of the attributes can be different too.

    e.g.

    string html1 = "<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
    string html2 = "<Style font-family: \"Comic Sans MS\"; Size:10; font-family:Verdana> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";

    string html3 = "<Style Size:10; font-family:Verdana> 2222222 <FONT size=3 face=\"Times New Roman\"> </FONT> <FONT Size=3 face=Times New Roman color=RED></FONT>";
    These are just 3 examples.
    So the final requirement is:
    Find the font (e.g. Font-Family, Font-Face) and replace it with 'Arial Unicode MS'.

    Hint: anything between "Font-Family:" and '>' or ';' is going to be the font,
    e.g. if the string is <style Font-Family:Verdana size:20> or <style Font-Family:Verdana, Comic Sans MS size:20> or <style size:20 Font-Family:"Verdana">
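
    The hint above can be sketched as a capture group. This is only a rough illustration, assuming the font value really does end at the next ';' or '>'. The names FontFinder and FindFonts are made up for the example:

```csharp
using System;
using System.Text.RegularExpressions;

static class FontFinder
{
    // Assumption taken from the hint: the font value is everything after
    // "font-family:" up to (but not including) the next ';' or '>'.
    static readonly Regex FontFamily = new Regex(
        @"font-family\s*:\s*(?<font>[^;>]*)",
        RegexOptions.IgnoreCase);

    // Returns the raw font values found in the string, trimmed of whitespace.
    public static string[] FindFonts(string html)
    {
        MatchCollection matches = FontFamily.Matches(html);
        var fonts = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            fonts[i] = matches[i].Groups["font"].Value.Trim();
        }
        return fonts;
    }

    static void Main()
    {
        string html = "<Style font-family: \"arial\"; Size:10; font-family:Verdana; font-family: Times New Roman> 2222222";
        foreach (string font in FindFonts(html))
        {
            Console.WriteLine(font);
        }
        // Prints:
        // "arial"
        // Verdana
        // Times New Roman
    }
}
```

    Note that with this rule a string like <style Font-Family:Verdana size:20> yields "Verdana size:20" as the font, because nothing separates the font from the next property.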

    A hard-coded find and replace would not work.


    I have tried the following code:

    Dim reg1 As Regex = New Regex("font-family:(.)*?(;|>)", RegexOptions.IgnoreCase)
    Dim matchFont As MatchCollection = reg1.Matches(htmlText, 0)
    Dim reg2 As Regex = New Regex("face=(.)*?(;|>)", RegexOptions.IgnoreCase)
    Dim matchFace As MatchCollection = reg2.Matches(htmlText, 0)
    For Each v As Match In matchFont
        htmlText = htmlText.Replace(v.ToString().Substring(0, v.ToString().Length - 1), "font-family:'Arial Unicode MS'")
    Next

    For Each v As Match In matchFace
        htmlText = htmlText.Replace(v.ToString().Substring(0, v.ToString().Length - 1), "face='Arial Unicode MS'")
    Next

    This code works fine. Now I want to refactor it so that it uses a single regular expression and I can avoid the For Each loops.
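
    One possible shape for such a refactor, written in C# like the earlier reply, is a single Regex.Replace call with an alternation for the two prefixes and a lookahead for the terminator. This is a sketch under the same assumption as the code above (the value runs up to the next ';' or '>'). FontRewriter and ReplaceFonts are hypothetical names:

```csharp
using System;
using System.Text.RegularExpressions;

static class FontRewriter
{
    // Group 1 captures the prefix ("font-family:" or "face=").
    // .*? lazily consumes the old value, and the lookahead (?=[;>]) stops
    // just before the terminating ';' or '>' so it is left in place.
    static readonly Regex FontValue = new Regex(
        @"(font-family:|face=).*?(?=[;>])",
        RegexOptions.IgnoreCase);

    public static string ReplaceFonts(string html, string newFont)
    {
        // A MatchEvaluator is used instead of a "$1..." replacement string
        // so that a '$' inside newFont is never treated as a substitution.
        return FontValue.Replace(
            html,
            m => m.Groups[1].Value + "'" + newFont + "'");
    }

    static void Main()
    {
        string html = "<style size:20 Font-Family:\"Verdana\">";
        Console.WriteLine(ReplaceFonts(html, "Arial Unicode MS"));
        // Prints: <style size:20 Font-Family:'Arial Unicode MS'>
    }
}
```

    Unlike the loop version, this keeps the original casing of the prefix. It shares one limitation with it: for an unquoted value such as face=Times New Roman color=RED>, everything up to the '>' is treated as the font, so the color attribute is replaced as well.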



    Thanks
    Bikka
    Wednesday, June 25, 2014 2:48 AM