Why does SharePoint put in character code 8203 in a richtext field?

    Question

  • I use some RichHtmlField controls (PublishingWebControls) in different pagelayouts. I edit the pages, put some text in the fields and publish. It all seems to work fine, but I've noticed that SharePoint saves an extra character to my string. Usually it's added at the beginning of the string, but sometimes at the end.

    You cannot see it using the ordinary browser window, because it's a zero width space character. But if you right click and select View source, it's visible as a big space. 

    It's possible to copy the text from the view source window to a text editor preserving this character, so I pasted it as a string in a C# program. When I loop through every character in the string to check its character code, this particular character shows as 8203.

    I used CAML Builder to check what my string looked like in the database. I couldn't see anything strange, but when I copied the string from the CAML Builder result tab and pasted it into a hex editor, I could clearly see that the strange character was there.

    The problem is that we translate our pages to different languages and this character makes the translation engine go bananas.

    Has anyone experienced this before or has any idea how this could be solved?
    Monday, June 10, 2013 1:23 PM

Answers

  • Actually, I just simplified my fix down to this...

    jQuery('#s4-bodyContainer').html(jQuery('#s4-bodyContainer').html().replace(/\u200B/g,''));

    Wednesday, March 12, 2014 5:25 PM
  • I wrote this little gem which is a bit DOM-intensive but it does the job...

    function spCleanup(code){
      if(code.children().length > 0){
        code.html(jQuery('<div>').append(code.children().clone()).html().replace(/&nbsp;|&#160;|\r\n|\n|\r|\t/g,'').replace(/\s{2,}/g,' '));
        code.children().each(function(){
          spCleanup(jQuery(this));
        });
      }
    }

    spCleanup(jQuery('#s4-bodyContainer'));

    Wednesday, March 12, 2014 2:49 PM
  • Hello,

    The character code 8203 (U+200B in Unicode) stands for the zero width space: http://www.fileformat.info/info/unicode/char/200b/index.htm

    It is commonly abbreviated ZWSP. This character is intended for invisible word separation and for line break control; it has no width, but its presence between two characters does not prevent increased letter spacing in justification.
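
    Because the character is a format character rather than ordinary whitespace, the usual whitespace cleanup tends to miss it. The sketch below illustrates this in JavaScript under the assumption of a current engine; the variable names are illustrative only.

```javascript
// U+200B is decimal 8203, the zero width space (ZWSP).
const ZWSP = String.fromCharCode(8203);
const sample = ZWSP + 'Hello' + ZWSP;

// trim() only removes whitespace and line terminators, so the ZWSP stays.
const trimmed = sample.trim();

// The \s class does not include U+200B in current engines.
const matchesWhitespace = /\s/.test(ZWSP);

// Targeting the code point explicitly does remove it.
const stripped = sample.replace(/\u200B/g, '');

console.log(sample.length, trimmed.length, matchesWhitespace, stripped);
```

    This is why a replace that names \u200B directly is needed instead of a generic whitespace pattern.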

    A third-party editor may be adding them, if they are not originally present in your script/code.

    Are you using any such editor tools?

    It will be helpful to open a troubleshooting ticket with Microsoft so that we can look at the issue in more depth.


    Regards, Dhiraj(DJ)-MSFT |Microsoft Online Community Support

    Thursday, July 25, 2013 5:43 PM

All replies

  • Hi ,

    I am trying to involve someone familiar with this topic to further look at this issue.

    Thanks


    Daniel Yang
    TechNet Community Support

    Wednesday, June 12, 2013 9:47 AM
    Moderator
  • Thank you.

    Please let me know if you need more information or the HTML being rendered

    Wednesday, June 12, 2013 9:18 PM
  • Try this free rich text editor:

    http://ckeditor.com/blog/CKEditor-for-SharePoint-Ultimate-Editing-Solution

    It might solve your problem :)

    Second, are you copying the content into the RichHtmlField from somewhere else, such as the internet or Word?

    Thursday, June 13, 2013 3:58 AM
  • Please share the piece of HTML that contains the Unicode characters.
    Thursday, June 13, 2013 3:59 AM
  • The problem appears in all my richtext fields. It doesn't matter if I copy/paste text into these fields or if I write a simple text manually in the fields.

    I tried to put a string containing the character in a code block above, but the character was lost during this process. This is what happens when I paste the string directly in this editor:

    <

    p>This is my lead text for my news item.</p>


    When I check the character codes for this string (as it appears in my HTML page), I get this:

    60      (which is <)
    112     (which is p)
    62      (which is >)
    8203    (here it is)
    84      (which is T) and so on...
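
    The check described above can be reproduced in a few lines of JavaScript. This is a minimal sketch: the helper name listCharCodes is made up for illustration, and the sample string is written with an explicit \u200B escape because the real character does not survive pasting.

```javascript
// Return the numeric code point of every character in the string.
function listCharCodes(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

// Sample mirroring the field value, with the ZWSP written as an escape.
const sample = '<p>\u200BThis is my lead text for my news item.</p>';
const codes = listCharCodes(sample);

console.log(codes.slice(0, 5));
```

    The first five entries correspond to <, p, >, the zero width space and T.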


    Thursday, June 13, 2013 8:40 AM
  • I am running into this same problem.  In our case, we have a search results web part inside a rich HTML zone.  The web part displays pictures from a picture library.  For whatever reason, SharePoint is adding 21 of these &#8203; Unicode characters to the very top of the rich HTML zone (before the web part).  Aside from the web part, the rich HTML zone does not have any content, so I don't think it is coming from a 3rd party text editor.

    This wouldn't be an issue except that these characters create an ugly and unwanted margin above the web part.

    Thursday, October 3, 2013 10:49 PM
  • Nice to see I'm not the only one with this problem. I didn't get a solution to this. When I'm using my custom webparts, I can always filter in code behind by doing this:

    description = description.Replace(((char)8203).ToString(), "");

    Unfortunately, this isn't possible when you're putting the rich text fields directly on your page layout.
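
    For text that is processed outside code-behind, for example before it is handed to a translation engine, the same replacement can be expressed in JavaScript. This is only a sketch; the function name stripZeroWidthSpaces and the sample value are assumptions, not part of SharePoint.

```javascript
// Hypothetical helper: removes every U+200B (decimal 8203) from a string.
function stripZeroWidthSpaces(text) {
  return text.replace(/\u200B/g, '');
}

// Illustrative field value with the character at the start, as described in the question.
const fieldValue = '\u200B<p>This is my lead text for my news item.</p>';
const cleaned = stripZeroWidthSpaces(fieldValue);

console.log(fieldValue.length - cleaned.length);
```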

    Friday, October 4, 2013 8:05 AM
  • We tried running some JavaScript on the master page to get rid of them. It worked, but it ended up breaking a jQuery plugin we're using in a custom display template.  This might work for you though.

    The plan now is to not use rich HTML zones (or at least not put web parts in them if we can avoid it) and just use web part zones instead.

    Friday, October 4, 2013 2:29 PM
  • FWIW - I know this is months later, but I just ran into the same problem.  I found that concatenating my code onto a single line before inserting it via the editor prevented the zero width spaces (8203s).  I run a find and replace in Sublime Text 2 that replaces the regex \n with nothing.
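
    The find-and-replace step can also be scripted. A minimal sketch, assuming the markup is available as a string before it is pasted into the editor; collapseLines and the image file names are hypothetical, and other replies in this thread report mixed results with this approach.

```javascript
// Remove every line break so the markup ends up on a single line.
// \r?\n also covers Windows-style line endings.
function collapseLines(markup) {
  return markup.replace(/\r?\n/g, '');
}

const source = '<div>\n<img src="a.png">\r\n<img src="b.png">\n</div>';
const singleLine = collapseLines(source);

console.log(singleLine);
```

    Note that replacing line breaks with nothing, as described above, joins words that were separated only by a newline, so it suits markup better than running text.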
    Thursday, February 20, 2014 10:55 PM
  • I had the same issue and resolved it with a little trick using my browser's "developer tools"

    1. Go to edit page
    2. Turn on "Developer tools"
    3. Inspect unwanted characters
    4. Right click > delete node/element

    Save your page and changes will be saved also ;)
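
    Since the character is invisible, step 3 can be hard to do by eye. The sketch below shows one possible way to make it visible in a string, for instance markup copied out of the developer tools; both function names and the [ZWSP] marker are made up for illustration.

```javascript
// Replace each zero width space with a visible marker.
function revealZeroWidthSpaces(text, marker = '[ZWSP]') {
  return text.replace(/\u200B/g, marker);
}

// Count how many zero width spaces a string contains.
function countZeroWidthSpaces(text) {
  return (text.match(/\u200B/g) || []).length;
}

const copied = '\u200B\u200B<div>web part</div>';
console.log(countZeroWidthSpaces(copied), revealZeroWidthSpaces(copied));
```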




    Friday, March 28, 2014 9:36 AM
  • webninjataylor, I can't thank you enough for this jQuery solution you put together.  Thank you!!! This answer came at the right time!
    Monday, July 21, 2014 11:59 AM
  • You're welcome.  :)

    BTW, I recently found out the script solution doesn't play well with some Bootstrap scenarios.  For me, I've seen parts of the DOM removed.  :(

    Monday, July 21, 2014 1:53 PM
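
    One possible reason for that kind of breakage is that rewriting a container with .html() rebuilds every element inside it, which discards attached event handlers. A gentler variant, sketched below as an assumption and not tested against the Bootstrap scenario mentioned above, edits only the text nodes and leaves the elements in place. The function name and the stand-in objects are illustrative.

```javascript
// Sketch only: strips U+200B from text nodes without rebuilding elements.
const TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in browsers

function stripZwspFromTextNodes(node) {
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = node.nodeValue.replace(/\u200B/g, '');
    return;
  }
  // Copy the child list first so a live NodeList cannot shift under the loop.
  for (const child of Array.from(node.childNodes || [])) {
    stripZwspFromTextNodes(child);
  }
}

// Stand-in objects shaped like DOM nodes, so the sketch also runs outside a browser.
const fakeRoot = {
  nodeType: 1,
  childNodes: [
    { nodeType: TEXT_NODE, nodeValue: '\u200BLead text', childNodes: [] },
    {
      nodeType: 1,
      childNodes: [{ nodeType: TEXT_NODE, nodeValue: 'More\u200B text', childNodes: [] }]
    }
  ]
};

stripZwspFromTextNodes(fakeRoot);
console.log(fakeRoot.childNodes[0].nodeValue);
```

    In a page, the call would be stripZwspFromTextNodes(document.getElementById('s4-bodyContainer')), or the same call with a narrower container.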
  • I'm discovering right now that you just have to be specific about where you want the cleanup to occur.  If you wrap your custom content areas (whether it's a reusable content item or a content section on your page) in an element with its own id, use "that" id selector instead of #s4-bodyContainer to remove the characters.  I don't think SharePoint plays nice when we try to remove these characters from its parent-level id selectors...

     
    • Edited by blackhawx Monday, July 21, 2014 2:29 PM
    Monday, July 21, 2014 2:29 PM
  • This is helping me out so far with the Oslo template; this way I can simply load all the ID selectors I really care about...and clean them up!

    function removecharacters() {
      /* REMOVE THE 8203 CHARACTER FROM DOM */
      var obj = {
        "reusable1": "features",
        "reusable2": "slider-support"
      };
      $.each(obj, function(key, val) {
        var target = $('#' + val);
        /* Skip ids that are not on the current page; .html() would return undefined */
        if (target.length) {
          target.html(target.html().replace(/\u200B/g, ''));
        }
      });
    }

    /* CALL ALL JQUERY FUNCTIONS */
    $(window).on('load', function() {
      removecharacters();
    });





    • Edited by blackhawx Monday, July 21, 2014 2:46 PM
    Monday, July 21, 2014 2:42 PM
  • I had the same issue and resolved it with a little trick using my browser's "developer tools"

    1. Go to edit page
    2. Turn on "Developer tools"
    3. Inspect unwanted characters
    4. Right click > delete node/element

    Save your page and changes will be saved also ;)

    This is a good plan too, thanks for sharing this.
    Monday, July 21, 2014 3:02 PM
  • Excellent, blackhawx.  I'll give that a shot when I need it next.  :)
    Monday, July 21, 2014 3:05 PM
  • Cool.  Thanks!  :)
    Monday, July 21, 2014 3:08 PM
  • Nice and simple solution. Thanks.
    Friday, April 3, 2015 2:42 PM
  • All of the solutions proposed here are wonderful, but the elephant is still in the room. Why does the RichHtmlField behave like this? Please don't tell me that this behavior is intended? I'm supposed to tell my site content editors that they can't create clean HTML inside a RichHtmlField in source code view and count on it still being there after the file is checked in?? That's ridiculous. 

    In source code view, I created a simple DIV containing a series of stacked images, and went so far as to remove ALL line breaks and spaces between everything - all the HTML was on the same line. Upon check in, SharePoint inserts these characters no matter what.

    It's great that we have both server and client side work-arounds, but seriously, shouldn't it simply work properly in the first place? There are dozens of FREE WYSIWYG editors that work beautifully. But the (expensive) out of the box SharePoint RichHtmlField doesn't. WHY?

    Tuesday, June 30, 2015 11:41 PM
  • What I find most disturbing is the fact that not a single Microsoft representative / responsible person is responding to this issue!? It's been almost 2 years now and this issue still exists.

    All different kinds of unwanted characters are added when you try to save your clean RichHTML part :/ It's highly annoying and for quite a few developers, including me, an issue which keeps me away from developing / designing for SharePoint.

    Thursday, July 16, 2015 8:37 AM
  • I found a simple, no-JS solution.  Edit the source of the RichHtmlField and paste in

    <span style="display:none;">dummy span

    SharePoint will close the malformed code block and insert the 8203 character, but it will do so within the hidden span, eliminating the space.

    I wish someone at MS would respond about actually resolving the issue, but this workaround is simple and shouldn't have unanticipated repercussions.

    Thursday, November 5, 2015 3:32 PM
  • I've found a very simple solution. SharePoint uses UTF-8 encoding, but Chrome is defaulted to Western (Windows-1252). 

    In Chrome, visit chrome://settings/fonts and click "Customize Fonts". Change the encoding to UTF-8, and the junk characters will not be inserted.

    • Proposed as answer by PC JoeGrammer Friday, December 4, 2015 8:17 PM
    Friday, December 4, 2015 8:13 PM
  • Here we are with SharePoint 2016 released, and the issue is still not resolved in 2013.
    Tuesday, October 25, 2016 6:23 PM
  • This fix is not working for me.

    Can anyone help to resolve this issue?

    I have a custom SharePoint web part that displays content from the SharePoint rich text editor (the HTML contains only simple images, no text). On render, a space/line break is injected automatically below the web part content.

    Is the SharePoint rich text editor the cause, or is it some jQuery methods I am using in my web part?

    Please share your comments, with working solutions if any.

    Thanks,

    Shruti

    Friday, December 8, 2017 8:40 AM