Remove HTML formatting

  • Question

  • User-1786262203 posted

    Does anyone know how to convert an HTML-formatted string to plain text programmatically?

    I want to get the same result as if I pressed Ctrl+A on a page in a browser, copied it, and then pressed Ctrl+V in Notepad.

    I googled how to strip HTML from a string, but didn't find a good solution.

    Removing tags using the regex "<(.|\n)*?>" causes some problems:

    * An HTML table is not converted nicely to tab-separated plain text. It produces a *crazy* table in plain text.

    * The alternate (alt) text of images is lost.

    * Lists ( <ul> & <ol> ) are not converted well. There are no numbers or bullets in the plain text.
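
    To make the three problems above concrete, here is a minimal sketch of a parser-based conversion instead of a regex. It is written in Python purely for illustration, using only the standard library's html.parser module. The names PlainTextExtractor and html_to_text are hypothetical, and the output only approximates what a browser copy-and-paste produces: cells are joined with tabs, <ol> items are numbered, <ul> items get a "* " bullet, and an image is replaced by its alt text.

```python
import re
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Hypothetical helper: collects text while keeping table, list and alt-text structure."""

    def __init__(self):
        super().__init__()
        self.parts = []         # output fragments, joined at the end
        self.lists = []         # stack: running counter for <ol>, None for <ul>
        self.first_cell = True  # True until the first cell of the current row is seen
        self.skip = 0           # depth inside <script>/<style>, whose text is dropped

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
        elif tag in ("ul", "ol"):
            self.lists.append(0 if tag == "ol" else None)
        elif tag == "li":
            self.parts.append("\n")
            if self.lists and self.lists[-1] is not None:
                self.lists[-1] += 1
                self.parts.append(f"{self.lists[-1]}. ")  # numbered item
            else:
                self.parts.append("* ")                   # bulleted item
        elif tag == "tr":
            self.parts.append("\n")
            self.first_cell = True
        elif tag in ("td", "th"):
            if not self.first_cell:
                self.parts.append("\t")  # tab between cells of the same row
            self.first_cell = False
        elif tag == "img":
            self.parts.append(dict(attrs).get("alt") or "")  # keep the alt text
        elif tag in ("br", "p", "div"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1
        if tag in ("ul", "ol") and self.lists:
            self.lists.pop()
        if tag in ("ul", "ol", "table", "p", "div"):
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip:
            # Collapse runs of whitespace to one space, roughly as a browser does.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    """Convert an HTML string to plain text (an approximation, not a browser's exact output)."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for line in "".join(parser.parts).split("\n"):
        # Trim each cell separately so the tabs between cells survive.
        cleaned = "\t".join(cell.strip() for cell in line.split("\t"))
        if cleaned.strip():
            lines.append(cleaned)
    return "\n".join(lines)


print(html_to_text("<ol><li>one</li><li>two</li></ol>"))
```

    The sketch ignores nested tables, colspan/rowspan, and CSS-driven layout, so it should be treated as a starting point rather than a complete converter.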

    Monday, August 23, 2010 6:27 PM

All replies