How to 'parse' raw HTML text to Word format using OpenXML?

  • Question

  • Hello,

    I'm fairly new to OpenXML. I have a site that uses a rich-text editor (CKEditor, IIRC), which outputs its content as raw HTML. Below is a sample of simple text with an inline image:

    <p>PHD Abstract</p>
    
    <p><strong>Title</strong></p>
    
    <ul>
        <li>Item 1</li>
        <li>Item 2</li>
    </ul>
    
    <p><img alt="" border="0" hspace="0" src="" style="border:0px solid black; height:174px; margin-bottom:0px; margin-left:0px; margin-right:0px; margin-top:0px; width:290px" vspace="0" /></p>


    I store the HTML above in a database for two main purposes: 1) loading it back for display, and 2) generating Word documents. For the HTML-to-OpenXML conversion I currently use a library I found online called HTML2OpenXML.

    The problem is that this library cannot convert the stored data completely. In the simplest example, neither the image nor the bulleted items appear in the generated document.

    Is there a recommended way of converting HTML back to Open XML? CKEditor can already do the reverse (you can copy and paste content from Word into the CKEditor text area in the browser).
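
    To make the expected mapping concrete, here is a rough sketch in Python (standard library only, not the Open XML SDK) of how the sample's tags could correspond to WordprocessingML. The class name HtmlToWordML and the function html_to_body_xml are made up for illustration. It also assumes that a bullet numbering definition with numId 1 already exists in the target document's numbering part. Images are only recorded as skipped, since embedding them additionally requires an image part and a drawing element that references it.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# WordprocessingML main namespace
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


class HtmlToWordML(HTMLParser):
    """Hypothetical converter: maps <p>, <li> and <strong>/<b> to w:p / w:r."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []      # finished <w:p> fragments
        self.runs = []            # runs of the paragraph being built
        self.in_block = False     # inside <p> or <li>
        self.in_list_item = False
        self.bold = False
        self.skipped = []         # tags this sketch does not convert

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "li"):
            self.runs = []
            self.in_block = True
            self.in_list_item = tag == "li"
        elif tag in ("strong", "b"):
            self.bold = True
        elif tag not in ("ul", "ol"):
            self.skipped.append(tag)  # e.g. img: needs an image part

    def handle_endtag(self, tag):
        if tag in ("p", "li"):
            self._flush()
        elif tag in ("strong", "b"):
            self.bold = False

    def handle_data(self, data):
        # Ignore text (mostly whitespace) that sits between block elements.
        if not self.in_block or not data:
            return
        rpr = "<w:rPr><w:b/></w:rPr>" if self.bold else ""
        self.runs.append(
            f'<w:r>{rpr}<w:t xml:space="preserve">{escape(data)}</w:t></w:r>'
        )

    def _flush(self):
        if self.runs:
            ppr = ""
            if self.in_list_item:
                # Assumption: numId 1 is a bullet list defined elsewhere.
                ppr = ('<w:pPr><w:numPr><w:ilvl w:val="0"/>'
                       '<w:numId w:val="1"/></w:numPr></w:pPr>')
            self.paragraphs.append(f"<w:p>{ppr}{''.join(self.runs)}</w:p>")
        self.runs = []
        self.in_block = False
        self.in_list_item = False


def html_to_body_xml(html):
    """Return (w:body XML string, list of tags that were not converted)."""
    parser = HtmlToWordML()
    parser.feed(html)
    parser.close()
    body = "".join(parser.paragraphs)
    return f'<w:body xmlns:w="{W_NS}">{body}</w:body>', parser.skipped
```

    Running html_to_body_xml on the sample above should give one paragraph per <p> or <li> that contains text, with the list items carrying a w:numPr, and should report img among the skipped tags. A real converter would also need to handle nested lists, inline styles and the image itself.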

    Thank you!





    • Edited by OCS.New Sunday, August 23, 2015 11:03 AM
    Sunday, August 23, 2015 11:00 AM

Answers

All replies