Converting HTML text to plain text

  • Question

  • User2054871671 posted

    I have a SharePoint website where, when a client record is edited, a text area is shown in which the user can edit the record. For some fields the text area shows raw HTML, with tags that the user does not know how or where to change. How can I convert this HTML to plain text so that the user can edit it easily?

    Wednesday, June 7, 2017 7:43 PM

Answers

  • User2054871671 posted

    This got resolved. It had some compatibility issue on the browser. When it was set , the error vanished...

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Wednesday, January 24, 2018 6:27 PM

All replies

  • User2103319870 posted

    How can I convert this HTML to plain text so that the user can edit it easily?

    If you want to remove the HTML tags, one option is to use a Regex like the one given below:

    // Requires: using System.Text.RegularExpressions;
    // Your HTML input
    string htmlsource = "<b>Sample Code!</b><br /><i>this text! </i>";
    // Regex to remove the HTML tags
    string htmltagsremoved = Regex.Replace(htmlsource, "<[^>]*>", string.Empty);
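
    To show what the one-line replacement does and does not cover, here is a minimal sketch that wraps it in a helper. The class name `TagStripper` and the method name `StripTags` are hypothetical, chosen only for illustration; the pattern is the one from the reply. Note that this approach only deletes tags: entities such as `&amp;` are left untouched, and text inside `<script>` elements is kept.

    ```csharp
    using System.Text.RegularExpressions;

    // Hypothetical helper, not part of any library.
    public static class TagStripper
    {
        public static string StripTags(string html)
        {
            // Treat null or empty input as "nothing to strip".
            if (string.IsNullOrEmpty(html))
            {
                return string.Empty;
            }

            // Delete anything shaped like a tag: "<", then any run of
            // characters other than ">", then ">".
            return Regex.Replace(html, "<[^>]*>", string.Empty);
        }
    }

    // Example calls:
    //   TagStripper.StripTags("<b>Sample Code!</b><br /><i>this text! </i>")
    //       returns "Sample Code!this text! "
    //   TagStripper.StripTags("<p>Fish &amp; chips</p>")
    //       returns "Fish &amp; chips"   (the entity is not decoded)
    ```
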
    Wednesday, June 7, 2017 8:08 PM
  • User991499041 posted

    Hi slm3003,

    I have a SharePoint website where, when a client record is edited, a text area is shown in which the user can edit the record. For some fields the text area shows raw HTML, with tags that the user does not know how or where to change. How can I convert this HTML to plain text so that the user can edit it easily?

    This function converts HTML code to plain text.

    Each step is commented to explain what it does.

    You can change or remove unnecessary parts to suit your needs.

    // Requires: using System.Text; using System.Text.RegularExpressions;
    public string HTMLToText(string HTMLCode)
    {
        // Collapse new lines, tabs and repeated white space into single
        // spaces, since they are not visible in rendered HTML
        HTMLCode = Regex.Replace(HTMLCode, "\\s+", " ");

        // Remove HEAD tag
        HTMLCode = Regex.Replace(HTMLCode, "<head\\b.*?</head>", ""
            , RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Remove any JavaScript
        HTMLCode = Regex.Replace(HTMLCode, "<script\\b.*?</script>", ""
            , RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Start a new line at line breaks (<br>, <br/>, <br />) and
        // paragraphs (<p>, <p ...>), whatever their letter case
        HTMLCode = Regex.Replace(HTMLCode, "<(br|p)\\b[^>]*>", "\n"
            , RegexOptions.IgnoreCase);

        // Remove all remaining HTML tags
        HTMLCode = Regex.Replace(HTMLCode, "<[^>]*>", "");

        // Finally, replace special characters like &, <, >, " etc.
        // This is done after the tags are removed so that escaped text such
        // as "&lt;b&gt;" is kept as the literal text "<b>" and not stripped
        StringBuilder sbHTML = new StringBuilder(HTMLCode);
        // Note: There are many more special characters, these are just the
        // most common. You can add new entries to these arrays if needed.
        // "&amp;" must stay last so that "&amp;lt;" becomes "&lt;", not "<"
        string[] OldWords = {"&nbsp;", "&quot;", "&lt;", "&gt;",
            "&reg;", "&copy;", "&bull;", "&trade;", "&amp;"};
        string[] NewWords = { " ", "\"", "<", ">", "®", "©", "•", "™", "&" };
        for (int i = 0; i < OldWords.Length; i++)
        {
            sbHTML.Replace(OldWords[i], NewWords[i]);
        }

        return sbHTML.ToString();
    }
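
    The entity table in the function above lists only a handful of names and does not handle numeric entities such as `&#169;`. As an alternative sketch (not part of the original answer), the framework's built-in decoder `System.Net.WebUtility.HtmlDecode` can be used for that step. The class name `EntityDecoder` and the method name `DecodeEntities` are hypothetical, and mapping the non-breaking space to an ordinary space is an assumption made here to mirror what the table does with `&nbsp;`.

    ```csharp
    using System.Net;

    // Hypothetical helper, not part of any library.
    public static class EntityDecoder
    {
        public static string DecodeEntities(string text)
        {
            // Treat null or empty input as "nothing to decode".
            if (string.IsNullOrEmpty(text))
            {
                return string.Empty;
            }

            // HtmlDecode handles named and numeric entities in one pass.
            // It turns &nbsp; into U+00A0 (non-breaking space), so map that
            // to an ordinary space to match the hand-written table.
            return WebUtility.HtmlDecode(text).Replace('\u00A0', ' ');
        }
    }

    // Example calls:
    //   EntityDecoder.DecodeEntities("Fish &amp; chips &#169; 2017")
    //       returns "Fish & chips © 2017"
    //   EntityDecoder.DecodeEntities("&amp;lt;")
    //       returns "&lt;"   (decoded once, not twice)
    ```

    Call it on text whose tags have already been removed, so that escaped markup is not mistaken for real tags.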

    Regards,

    zxj

    Thursday, June 8, 2017 2:01 AM