Why does editing the text in <Paragraph> generate multiple <Text> elements?

  • Question

  • Hello,

    I noticed that if I edit the content of a Table cell, modifying the text inside, my text ends up split across multiple <Text> elements.
    Is there any explanation for this behavior?

    For example, my Table cell contains the text "<MyItem>". If I edit it (for example, changing it to "<MyItem1>"), I get 3 <Text> elements inside the <Run> container:
    • <Text> with "<"
    • <Text> with "MyItem1"
    • <Text> with ">"
    Why is my content split across several Text elements?

    SharePoint 2007 - 2010 Tips & Tricks Portal | Microsoft MVP | My Blog about Information Management | My twitter
    Wednesday, March 10, 2010 5:35 AM

All replies

  • Hi Michael,

    Thanks for your question.

    For your scenario, could you show us the XML in which the content of the Run is split into three Text parts? I suspect the Text parts have different settings. We may be able to figure out the problem once we have seen the XML.

    Thanks,

    Lu 
    Wednesday, March 10, 2010 5:54 AM
    Here is the section that is generated inside the <Paragraph> for the "Results vs Content" text.
    I have 3 <Text> sections separated by ProofError elements.

    Why do I get this behavior? How can I keep the text in a single <Text> element?


    public Paragraph GenerateParagraph()
    {
        Paragraph paragraph1 = new Paragraph(){ RsidParagraphAddition = "003C2261", RsidParagraphProperties = "00106DE2", RsidRunAdditionDefault = "008D54FC" };

        ParagraphProperties paragraphProperties1 = new ParagraphProperties();
        SpacingBetweenLines spacingBetweenLines1 = new SpacingBetweenLines(){ After = "120" };
        Indentation indentation1 = new Indentation(){ Right = "304" };

        ParagraphMarkRunProperties paragraphMarkRunProperties1 = new ParagraphMarkRunProperties();
        RunFonts runFonts1 = new RunFonts(){ Ascii = "Univers", HighAnsi = "Univers", ComplexScript = "Arial" };
        Bold bold1 = new Bold();

        paragraphMarkRunProperties1.Append(runFonts1);
        paragraphMarkRunProperties1.Append(bold1);

        paragraphProperties1.Append(spacingBetweenLines1);
        paragraphProperties1.Append(indentation1);
        paragraphProperties1.Append(paragraphMarkRunProperties1);

        // First run: "Results "
        Run run1 = new Run(){ RsidRunProperties = "00724534" };

        RunProperties runProperties1 = new RunProperties();
        RunFonts runFonts2 = new RunFonts(){ Ascii = "Univers", HighAnsi = "Univers", ComplexScript = "Arial" };
        Bold bold2 = new Bold();

        runProperties1.Append(runFonts2);
        runProperties1.Append(bold2);
        Text text1 = new Text(){ Space = SpaceProcessingModeValues.Preserve };
        text1.Text = "Results ";

        run1.Append(runProperties1);
        run1.Append(text1);
        ProofError proofError1 = new ProofError(){ Type = ProofingErrorValues.SpellStart };

        // Second run: "vs", wrapped in the SpellStart/SpellEnd markers
        Run run2 = new Run(){ RsidRunProperties = "00724534" };

        RunProperties runProperties2 = new RunProperties();
        RunFonts runFonts3 = new RunFonts(){ Ascii = "Univers", HighAnsi = "Univers", ComplexScript = "Arial" };
        Bold bold3 = new Bold();

        runProperties2.Append(runFonts3);
        runProperties2.Append(bold3);
        Text text2 = new Text();
        text2.Text = "vs";

        run2.Append(runProperties2);
        run2.Append(text2);
        ProofError proofError2 = new ProofError(){ Type = ProofingErrorValues.SpellEnd };

        // Third run: " Target"
        Run run3 = new Run(){ RsidRunProperties = "00724534" };

        RunProperties runProperties3 = new RunProperties();
        RunFonts runFonts4 = new RunFonts(){ Ascii = "Univers", HighAnsi = "Univers", ComplexScript = "Arial" };
        Bold bold4 = new Bold();

        runProperties3.Append(runFonts4);
        runProperties3.Append(bold4);
        Text text3 = new Text(){ Space = SpaceProcessingModeValues.Preserve };
        text3.Text = " Target";

        run3.Append(runProperties3);
        run3.Append(text3);

        paragraph1.Append(paragraphProperties1);
        paragraph1.Append(run1);
        paragraph1.Append(proofError1);
        paragraph1.Append(run2);
        paragraph1.Append(proofError2);
        paragraph1.Append(run3);
        return paragraph1;
    }



    Wednesday, March 10, 2010 1:09 PM
  • Hi Michael,

    Thanks for sharing the code. From the generated code we can see that there are three Runs, each containing one Text element, rather than one Run containing three Text elements. The RunProperties are identical, and the three Runs are separated by ProofError elements, which specify the start or end anchor of a single proofing error within a WordprocessingML document. You created this document and then reflected the code with the Tool, right? The markers are probably caused by the Word UI, which automatically checks for spelling and grammatical errors. So you could try removing these ProofErrors and merging the text "Results vs Content" into one Run.


    Below is an example of ProofError:

    [Example: Consider the following sentence with a grammatical error in its subject/verb agreement. If an application recognized this error and wished to persist it to the document, this paragraph would consist of the following WordprocessingML markup:

    <w:p>
    <w:proofErr w:type="gramStart"/>
    <w:r>
    <w:t>This are</w:t>
    </w:r>
    <w:proofErr w:type="gramEnd"/>
    <w:r>
    <w:t xml:space="preserve"> an error.</w:t>
    </w:r>
    </w:p>

    The proofErr elements, with a type attribute value of gramStart and gramEnd respectively, delimit the start and end of the content in this paragraph which is stored as a grammatical error. end example]
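
    The suggestion above (remove the ProofErrors, then merge the text into one Run) could be sketched roughly as follows. This is only a minimal sketch, not an official SDK feature: RunMerger and MergeRuns are hypothetical names, and it assumes that two runs may be merged only when they sit next to each other, have identical RunProperties, and each hold exactly one Text element.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class RunMerger
{
    // Hypothetical helper: strips proofing markers from a paragraph, then
    // folds each run into the preceding run when both look the same.
    public static void MergeRuns(Paragraph paragraph)
    {
        // Remove the spell/grammar anchors that sit between the runs.
        foreach (ProofError marker in paragraph.Elements<ProofError>().ToList())
            marker.Remove();

        Run previous = null;
        foreach (OpenXmlElement child in paragraph.ChildElements.ToList())
        {
            Run current = child as Run;
            if (current == null)
            {
                // Anything else (bookmarks, paragraph properties, ...) breaks the chain.
                previous = null;
                continue;
            }

            if (previous != null && IsPlainText(previous) && IsPlainText(current)
                && SameFormatting(previous, current))
            {
                Text target = previous.GetFirstChild<Text>();
                target.Text += current.GetFirstChild<Text>().Text;
                // Keep leading/trailing spaces of the merged text.
                target.Space = SpaceProcessingModeValues.Preserve;
                current.Remove();
            }
            else
            {
                previous = current;
            }
        }
    }

    // Assumption: only runs made of optional RunProperties plus one Text are merged.
    private static bool IsPlainText(Run run) =>
        run.ChildElements.All(e => e is RunProperties || e is Text)
        && run.Elements<Text>().Count() == 1;

    // Formatting is compared by the serialized RunProperties markup.
    private static bool SameFormatting(Run a, Run b) =>
        a.RunProperties?.OuterXml == b.RunProperties?.OuterXml;
}

public static class Program
{
    public static void Main()
    {
        // Same shape as the generated code above, without the formatting.
        Paragraph paragraph = new Paragraph(
            new Run(new Text("Results ") { Space = SpaceProcessingModeValues.Preserve }),
            new ProofError { Type = ProofingErrorValues.SpellStart },
            new Run(new Text("vs")),
            new ProofError { Type = ProofingErrorValues.SpellEnd },
            new Run(new Text(" Target") { Space = SpaceProcessingModeValues.Preserve }));

        RunMerger.MergeRuns(paragraph);

        Console.WriteLine(paragraph.Elements<Run>().Count());
        Console.WriteLine(paragraph.InnerText);
    }
}
```

    Runs separated by any other element, such as a bookmark, are deliberately left alone here, since merging across them would change where that element sits in the text.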

    Hope this helps. If you have any question, please let me know.

    Thanks,

    Lu
    Thursday, March 11, 2010 1:42 AM
  • Hello,

    Yes, you are correct: three Run elements with text inside.

    The question is how to avoid the "ProofError" elements being generated.

    Our current task is to read the cell content programmatically and replace it. With multiple <Run> elements, and consequently multiple <Text> elements, we can't get the whole content, only pieces of it, because it is spread across several sections.


    Thursday, March 11, 2010 3:05 AM
  • Hi Michael,

    One question about your scenario: how do you generate the document, with Office Word or with the Open XML SDK? If you generate it with the Open XML SDK, you could modify the code as I described in my last post. If you generate the document in Word, the "ProofError" element is generated because Word's spelling checker marks "vs" as a possible error. You could "ignore" the spelling error in Word and then reflect the code in the tool again; you should find that the "ProofError" element has been removed (though there may still be a "BookmarkStart"/"BookmarkEnd" element between the text "Results vs" and "Content"). So we suggest generating the document with the Open XML SDK, which makes it easy to control which elements are added to the document.

    Hope this helps. If you have any question, please let me know.

    Thanks,

    Lu
    Thursday, March 11, 2010 4:48 AM
  • The document is created by users. They define the cells they want to be updated from the server.
    They specify what they want to update by putting "<" and ">" symbols around the words.
    For example, they want the Price in one of the cells to be updated from the server. They put angle brackets around that word, like this: "<Price>"

    The number of such updatable words is limited, but we can't control their location in the document. Users use the Glossary for those words.

    When we parse the document, we iterate over all cells looking for "<*>" matches, but this won't work in 
    The problem is in the words: when users type a word, Open XML gives us multiple <Text> sections rather than the whole word. The only workaround we have found is not to type the word but to copy and paste it from somewhere (from Notepad, for example). In that case the word is inserted as a single <Text> element.


    Thursday, March 11, 2010 5:49 AM
  • Hi Michael,

    Thanks for your description.

    From your description, we can conclude that you cannot control how the customers create the documents. I think one possible solution is to pre-process the document before extracting the information you need. That is to say, you may have to merge the text that appears in different Runs into a single one, guided by the special tags "<" and ">". As to how to achieve this, you could make use of the Tool's "Compare Files" (See Open XML SDK v2 FAQ Part2), and you may need to add some checks based on the generated code.
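
    One way the pre-processing idea above might look in code is sketched below. It is a simplified illustration under stated assumptions, not a recommended implementation: PlaceholderReplacer is a hypothetical helper, placeholders are assumed to match "<" + word characters + ">", and the replaced text is written into the first Text element of the paragraph, so any formatting differences between the original runs are lost.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class PlaceholderReplacer
{
    // Assumed placeholder syntax: <Name>, where Name consists of word characters.
    private static readonly Regex Placeholder = new Regex(@"<(\w+)>");

    // Hypothetical helper: joins the text of all Text elements in the paragraph,
    // replaces known placeholders, and stores the result in the first Text element.
    public static void Replace(Paragraph paragraph, IDictionary<string, string> values)
    {
        List<Text> texts = paragraph.Descendants<Text>().ToList();
        if (texts.Count == 0) return;

        // Joining first makes a placeholder visible even when it spans several runs.
        string joined = string.Concat(texts.Select(t => t.Text));
        string replaced = Placeholder.Replace(joined, match =>
            values.TryGetValue(match.Groups[1].Value, out string value) ? value : match.Value);

        // Leave the paragraph untouched when nothing was replaced.
        if (replaced == joined) return;

        texts[0].Text = replaced;
        texts[0].Space = SpaceProcessingModeValues.Preserve;
        foreach (Text extra in texts.Skip(1))
            extra.Text = string.Empty;
    }
}

public static class Program
{
    public static void Main()
    {
        // The placeholder "<Price>" split across three runs, as Word may store it.
        Paragraph paragraph = new Paragraph(
            new Run(new Text("<")),
            new Run(new Text("Price")),
            new Run(new Text(">")));

        var values = new Dictionary<string, string> { { "Price", "42.00" } };
        PlaceholderReplacer.Replace(paragraph, values);

        Console.WriteLine(paragraph.InnerText);
    }
}
```

    Paragraphs without a known placeholder are not modified, so their run structure and formatting stay exactly as the user created them.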

    Hope this helps. If you have any question, please let me know.

    Thanks,

    Lu
    Thursday, March 11, 2010 8:30 AM
  • Are there any plans to simplify this by providing an Open XML API that returns the content without the ProofError elements and the separation?

    I find it conceptually quite odd that the text appears as a single entity visually in Word, while its storage does not correspond one-to-one to that presentation.

    It would be nice if you provided an API, such as Paragraph.GetContentAsText, that returns the contents of all the <Text> elements as a single string.

    Thursday, March 11, 2010 12:33 PM
  • Hi Michael,

    In your scenario, I think you could try the "InnerText" property (for example, Paragraph.InnerText) if you need to get the text content of a Paragraph. But if you want to modify this text, you will have to know the detailed file format (Run, Text, and so on) to deal with it.
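
    As a rough illustration of the InnerText suggestion above, the sketch below lists the placeholder names found in a table cell. PlaceholderFinder is a hypothetical helper, and the "<Name>" pattern is an assumption based on the examples in this thread. It reads InnerText per paragraph so that text from two different paragraphs is never joined into one match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Wordprocessing;

public static class PlaceholderFinder
{
    // Assumed placeholder syntax: <Name>, where Name consists of word characters.
    private static readonly Regex Placeholder = new Regex(@"<(\w+)>");

    // Hypothetical helper: returns the placeholder names found in each
    // paragraph of the cell, regardless of how the text is split into runs.
    public static List<string> Find(TableCell cell)
    {
        var names = new List<string>();
        foreach (Paragraph paragraph in cell.Elements<Paragraph>())
        {
            // InnerText concatenates the text of all runs in the paragraph.
            foreach (Match match in Placeholder.Matches(paragraph.InnerText))
                names.Add(match.Groups[1].Value);
        }
        return names;
    }
}

public static class Program
{
    public static void Main()
    {
        // "<MyItem1>" stored as three runs, as described in the question.
        TableCell cell = new TableCell(
            new Paragraph(
                new Run(new Text("<")),
                new Run(new Text("MyItem1")),
                new Run(new Text(">"))));

        foreach (string name in PlaceholderFinder.Find(cell))
            Console.WriteLine(name);
    }
}
```

    This only reads the text; writing a replacement back still requires working with the individual Run and Text elements.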

    Hope this helps. If you have any question, please let me know.

    Thanks,

    Lu
    Friday, March 12, 2010 5:50 AM
  • Found an alternative solution to this issue: just add the words to the dictionary. For each underlined (ProofError-ed) word, place the mouse over the word, right-click, and choose Add to Dictionary. This removes the ProofError elements and merges the words into a single Text element.
    Tuesday, March 16, 2010 4:05 AM