Replace \r\n with p tag - not between other html tags

  • Question

  • User-352524747 posted

    Hello,

    I'm using this line of code to replace each new line with a p tag.

    "<p>" + Regex.Replace(s, @"(?:\r\n *){1,1} *", "</p><p>") + "</p>"

    The problem here is that it adds new <p></p> elements, even between other html elements.

    So, if I write this in the textarea:

    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    
    <ol>
    <li>Lorem ipsum dolor sit amet</li>
    <li>consectetur adipiscing elit</li>
    </ol>
    
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

    The result is this:

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
    <p></p>
    <p></p>
    <ol>
    <p></p>
    <p></p>
    <li>Lorem ipsum dolor sit amet</li>
    <p></p>
    <p></p>
    <li>consectetur adipiscing elit</li>
    <p></p>
    <p></p>
    </ol>
    <p></p>
    <p></p>
    <p>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    </p>

    I'm looking for a way to wrap only the text paragraphs in a single p tag, leaving the other HTML elements alone, as below:

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    </p>
    <ol>
    <li>Lorem ipsum dolor sit amet</li>
    <li>consectetur adipiscing elit</li>
    </ol>
    <p>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    </p>

    Sunday, November 30, 2014 4:12 PM
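
A note on why the empty paragraphs appear: the pattern in the question matches every single `\r\n`, so a blank line (two consecutive `\r\n`) and each line break inside the `<ol>` markup all turn into `</p><p>`. Below is a minimal sketch that reproduces this on a shortened input. The class name `NewlineDemo` and the sample string are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NewlineDemo
{
    // Same expression as in the question: every \r\n (plus any
    // following spaces) is replaced by "</p><p>".
    public static string Wrap(string s)
    {
        return "<p>" + Regex.Replace(s, @"(?:\r\n *){1,1} *", "</p><p>") + "</p>";
    }
}
```

Calling `NewlineDemo.Wrap("A\r\n\r\n<ol>\r\n</ol>")` returns `<p>A</p><p></p><p><ol></p><p></ol></p>`. Since a `<p>` cannot contain an `<ol>`, the browser likely repairs the nesting when it builds the DOM, which would account for the pairs of empty `<p></p>` elements shown in the question.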

Answers

  • User-821857111 posted

    So, if I write this in the textarea:

    Why don't you just use a rich text editor like CKEditor or TinyMCE?

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Monday, December 1, 2014 2:15 AM

All replies

  • User-352524747 posted

    Why don't you just use a rich text editor like CKEditor or TinyMCE?

    As a learning exercise, I'm trying to write a simple C# helper.

    So far I have this code. The problem is that it removes all the text and keeps only the p tags in the DOM structure.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Web;
    
    public static class ToHtmlHelper
    {
        private static string _paraBreak = "\r\n\r\n";
    
        public static string ConvertToHtml(string source)
        {
            StringBuilder sb = new StringBuilder();
    
            int pos = 0;
            while (pos < source.Length)
            {
                // Extract next paragraph
                int start = pos;
                pos = source.IndexOf(_paraBreak, start);
                if (pos < 0)
                    pos = source.Length;
                string para = source.Substring(start, pos - start).Trim();
    
                // Encode non-empty paragraph
                if (para.Length > 0)
                    EncodeParagraph(para, sb);
    
                // Skip over paragraph break
                pos += _paraBreak.Length;
            }
    
            // Get HTML Version
            string html = sb.ToString();
    
            // Return HTML Version
            return html;
        }
    
        private static void EncodeParagraph(string s, StringBuilder sb)
        {
            // Start new paragraph
            sb.AppendLine("<p>");
    
            // HTML encode text
            s = HttpUtility.HtmlEncode(s);
    
            // Convert single newlines to <br>
            s = s.Replace(Environment.NewLine, "<br>\r\n");
    
            // Close paragraph
            sb.AppendLine("\r\n</p>");
        }
    }

    All I get in the database are <p> </p> tags.

    Monday, December 1, 2014 5:19 PM
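
Looking at the helper above, `EncodeParagraph` encodes `s` and converts its line breaks, but never appends `s` to the `StringBuilder`. That would explain output consisting only of `<p>` and `</p>`. Here is a minimal sketch with the missing append, under two assumptions: it uses `System.Net.WebUtility.HtmlEncode` instead of `HttpUtility` so it compiles without a `System.Web` reference, and it replaces `\r\n` explicitly rather than `Environment.NewLine`. The class name `ParagraphEncoder` is hypothetical.

```csharp
using System;
using System.Net;
using System.Text;

public static class ParagraphEncoder
{
    public static string Encode(string para)
    {
        var sb = new StringBuilder();

        // Start new paragraph
        sb.Append("<p>");

        // HTML-encode the text so that it is shown literally
        string s = WebUtility.HtmlEncode(para);

        // Convert single line breaks inside the paragraph to <br>
        s = s.Replace("\r\n", "<br>\r\n");

        // The step missing from the original: write the text to the output
        sb.Append(s);

        // Close paragraph
        sb.Append("</p>");
        return sb.ToString();
    }
}
```

With this, `ParagraphEncoder.Encode("a\r\nb")` returns `<p>a<br>\r\nb</p>`. Note that encoding still turns embedded markup such as `<ol>` into `&lt;ol&gt;`, so this alone does not keep existing HTML elements intact.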
  • User-821857111 posted

    The result is this:

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
    <p></p>
    <p></p>

    Looks to me like you need to do a final Replace operation which removes the empty <p> tags.

    Tuesday, December 2, 2014 2:10 AM
  • User1049502825 posted

    My suggestion is to replace with &lt;p&gt;

    Once all the replacements in the string are done, find ">&lt;p&gt;<" and replace it with "><".

    Tuesday, December 2, 2014 4:18 AM
  • User-352524747 posted

    Mikesdotnetting

    Looks to me like you need to do a final Replace operation which removes the empty <p> tags.

    That's what I'll do, but I can't find the correct regex to match empty <p> tags that look like:

    <p></p>

    or even like this:

    <p>
    </p>

    I use this Regex to match any empty html tag.

    Regex.Replace(s, @"<([\w:]+)>(\s|&nbsp;)*</\1>", string.Empty);

    This doesn't remove the empty <p> tags. It may have something to do with the class published above (LINK).

    When I use HttpUtility.HtmlEncode(s), new lines are replaced with <br> and no empty p tags are displayed. If I remove it, new lines are not replaced with <br> and empty p tags are added. But if I keep it, other HTML elements that are part of the text, like <ol>, <li> and <img>, are encoded (displayed as text in the browser).

    Any idea what to change?

    Tuesday, December 2, 2014 5:47 PM
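
For what it's worth, the pattern quoted above does match both forms of empty `<p>` shown, because `\s` covers `\r` and `\n`. One possible reason for it appearing to do nothing (an assumption, since the calling code is not shown) is that .NET strings are immutable: `Regex.Replace` returns a new string and leaves `s` unchanged, so the result has to be assigned or returned. A small sketch follows; `EmptyTagCleaner` is a made-up name.

```csharp
using System.Text.RegularExpressions;

public static class EmptyTagCleaner
{
    public static string RemoveEmptyTags(string s)
    {
        // Regex.Replace does not modify s; the cleaned text is the return value.
        return Regex.Replace(s, @"<([\w:]+)>(\s|&nbsp;)*</\1>", string.Empty);
    }
}
```

For example, `EmptyTagCleaner.RemoveEmptyTags("<p>a</p><p></p><p>\r\n</p>")` returns `<p>a</p>`. The pattern removes any empty element without attributes, not just `<p>`.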
  • User-821857111 posted

    but can't find the correct regex to match empty <p> tags that looks like:

    <p></p>

    What do you want to use a Regex for? You can use string.Replace for that:

    input = input.Replace("<p></p>", "");

    Wednesday, December 3, 2014 2:07 AM
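
Note that `string.Replace` only removes the exact sequence `<p></p>`. The variant with a line break between the tags, mentioned in the previous post, is left alone. A short sketch of the difference (the class name `ReplaceDemo` is made up):

```csharp
public static class ReplaceDemo
{
    public static string StripExact(string input)
    {
        // Removes "<p></p>" only when nothing at all sits between the tags.
        return input.Replace("<p></p>", "");
    }
}
```

Here `ReplaceDemo.StripExact("x<p></p>y<p>\r\n</p>")` returns `xy<p>\r\n</p>`, so the second empty paragraph survives.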
  • User-352524747 posted

    Mikesdotnetting

    input = input.Replace("<p></p>", "");

    It doesn't work; it does not replace the empty <p> tags.

    Anyway, I'm thinking of using another function, posted below.

    I want to find the words that are not part of any HTML tag by using foreach, but how can I select the words that do not match HTML tags and wrap them in a p tag?

    @functions{
        public static string tohtml(string s)
        {
            string _tags = @"<[^>].*(>|$)";
    
            Match output = Regex.Match(s, _tags, RegexOptions.ExplicitCapture | RegexOptions.Compiled);
            
            if (String.IsNullOrEmpty(s))
                throw new ArgumentNullException(s);
            var words = s.Split(new[] { "\r\n\r\n" }, StringSplitOptions.RemoveEmptyEntries);
            var sb = new StringBuilder();
            foreach (var word in words.Where(word => !word.Contains(output.ToString())))
            {
                sb.Append("<p>" + word.Replace("\r", "<br>") + "</p>");
            }
            return sb.ToString();
        }
    }

    So, let's say I have this content:

    <img src="/download?png=03122014-1242-02.png" alt="t"> Lorem ipsum dolor sit amet, 
    consectetur adipiscing elit, 
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    
    
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, 
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. 
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip 
    ex ea commodo consequat.
    <h2>Header</h2>
    <ul>
    <li>list n 1</li>
    </ul>
    
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, 
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

    With this regex pattern @"<[^>].*(>|$)" I match every HTML tag (test here: http://regexr.com/3a11p)

    I want to modify the content like this:

    <img src="/download?p=image.png" alt="p"> Lorem ipsum dolor sit amet,<br> 
    consectetur adipiscing elit,<br>
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    <p>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit,<br>
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br>
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip <br>
    ex ea commodo consequat.
    </p>
    <h2>Header</h2>
    <ul>
    <li>list n 1</li>
    </ul>
    <p>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, <br>
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    </p>

    I would appreciate your help.

    Wednesday, December 3, 2014 11:14 AM
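
One way to approach the last question is to work line by line instead of filtering whole blocks with a regex match. This is a sketch under stated assumptions, not a definitive implementation. Lines that start with a block-level tag pass through unchanged. Consecutive text lines are collected into one `<p>`, joined with `<br>`. A blank line or a block-level line ends the current paragraph. The tag list, the class name `TextToHtml` and the method name `ToParagraphs` are all assumptions. The text is not HTML-encoded, so the input is assumed to be trusted.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class TextToHtml
{
    // Assumed list of block-level tags that must stay outside <p>.
    private static readonly Regex BlockLine = new Regex(
        @"^\s*</?(h[1-6]|ul|ol|li|p|div|table|tr|td|th|blockquote|pre)\b",
        RegexOptions.IgnoreCase);

    public static string ToParagraphs(string source)
    {
        var sb = new StringBuilder();
        var para = new List<string>();   // text lines of the paragraph being collected

        foreach (string raw in source.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None))
        {
            string line = raw.Trim();
            if (line.Length == 0)
            {
                Flush(para, sb);         // a blank line ends the paragraph
            }
            else if (BlockLine.IsMatch(line))
            {
                Flush(para, sb);         // block markup ends it as well
                sb.Append(line).Append("\r\n");
            }
            else
            {
                para.Add(line);          // plain text or inline markup such as <img>
            }
        }

        Flush(para, sb);
        return sb.ToString();
    }

    // Writes the collected lines as one <p>, with <br> between lines.
    private static void Flush(List<string> para, StringBuilder sb)
    {
        if (para.Count == 0) return;
        sb.Append("<p>").Append(string.Join("<br>\r\n", para)).Append("</p>\r\n");
        para.Clear();
    }
}
```

For the input `"a\r\nb\r\n\r\n<ul>\r\n<li>x</li>\r\n</ul>\r\nc"` this produces `<p>a<br>` / `b</p>`, then the three list lines unchanged, then `<p>c</p>`. One difference from the desired output above: a first block that starts with `<img>` is treated as text and wrapped in `<p>`, because `<img>` is not in the assumed block-level list.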