Convert HTML to "word string"?

  • Question

  • Hi, I am using C#. I have a template Word document that contains key tags of the format [FIELD_0], [FIELD_1], etc.
    I load this document into memory and then replace those tags with the correct string values.
    When I am done, I return it to the user.
    This is working fine.

    However, I now have a case where one of my strings is in HTML format (bold, italic, bullets). How can I convert this "HTML string" into a "Word doc string"?

    Thanks.

    Thursday, April 30, 2015 7:33 AM

Answers

  • Hi Ulyadam,

    Based on my understanding, the Word object model has no API that converts HTML to Word content directly.

    As a workaround, we can open the HTML document in the Word application and then copy its content into the target document.
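
One possible shape for this workaround, as a rough sketch only: it assumes Word is installed, that the project references Microsoft.Office.Interop.Word, and that the HTML fragment can be written to a temporary file. The class and method names (WordHtmlPaste, ReplaceTagWithHtml) are hypothetical. Instead of copy and paste through the clipboard, the sketch uses Range.InsertFile, which lets Word convert the HTML file at the position of the tag. Word may add an extra paragraph mark after the inserted content.

```csharp
using System.IO;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of any library.
static class WordHtmlPaste
{
    // Replaces the first occurrence of `tag` in the document with rendered HTML.
    // Returns true if the tag was found.
    public static bool ReplaceTagWithHtml(string docPath, string tag, string html)
    {
        // Word converts HTML from a file, so write the fragment to a temp file.
        string htmlPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".html");
        File.WriteAllText(htmlPath, "<html><body>" + html + "</body></html>");

        var app = new Word.Application();
        app.DisplayAlerts = Word.WdAlertLevel.wdAlertsNone;
        try
        {
            Word.Document doc = app.Documents.Open(docPath);
            Word.Range range = doc.Content;

            // On success, Find.Execute narrows `range` to the matched text.
            bool found = range.Find.Execute(FindText: tag, MatchWildcards: false);
            if (found)
            {
                range.Text = "";                                       // delete the tag
                range.InsertFile(htmlPath, ConfirmConversions: false); // insert converted HTML
            }

            doc.Save();
            doc.Close();
            return found;
        }
        finally
        {
            app.Quit();
            File.Delete(htmlPath);
        }
    }
}
```

Automating Word this way is usually discouraged on a server, so for a web application one of the library-based approaches further down the thread may be a better fit.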

    Regards & Fei

    Friday, May 1, 2015 5:22 AM
    Moderator

All replies

  • However, I now have a case where one of my strings is in HTML format (bold, italic, bullets). How can I convert this "HTML string" into a "Word doc string"?
    You mean importing HTML code with its tags and then showing it in Word as formatted text?
    Thursday, April 30, 2015 7:34 AM
  • Yes. For example, I put in the string <b>string</b> and it shows as a bold string.
    Thursday, April 30, 2015 7:40 AM
  • Then you can try this free DOC library (http://www.e-iceblue.com/Introduce/word-for-net-introduce.html). Download it, add a reference to your project, and then add the following code:

     // Needs System.IO for StringReader, plus the library's own namespaces.
     Document doc = new Document("sample.docx");

     // Locate the placeholder tag and get it as a single text range.
     TextSelection selection = doc.FindString("[FIELD_0]", true, true);
     TextRange range = selection.GetAsOneRange();
     int index = range.OwnerParagraph.ChildObjects.IndexOf(range);

     // Parse the HTML string into a temporary in-memory document.
     Document htmldoc = new Document();
     StringReader sr = new StringReader("<b>hello </b><i>world</i>");
     htmldoc.LoadHTML(sr, XHTMLValidationType.None);

     // Copy every parsed object into the target paragraph, in front of the tag.
     foreach (Section section in htmldoc.Sections)
     {
         foreach (Paragraph p in section.Paragraphs)
         {
             foreach (DocumentObject obj in p.ChildObjects)
             {
                 range.OwnerParagraph.ChildObjects.Insert(index++, obj.Clone());
             }
         }
     }

     // Remove the tag itself and save the result.
     range.OwnerParagraph.ChildObjects.Remove(range);
     doc.SaveToFile("replace1.docx", FileFormat.Docx);

    Thursday, April 30, 2015 7:42 AM
  • Here is something that I have used before: you can place altChunk elements where you have those custom tags. An altChunk is just a placeholder inside a document that points to a file (which, among other formats, can be HTML) stored inside the Word document itself.

    MS Word itself does not let you create altChunk elements; however, you can accomplish this with the Open XML SDK:
    http://blogs.msdn.com/b/ericwhite/archive/2008/10/27/how-to-use-altchunk-for-document-assembly.aspx
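
As a minimal sketch of the altChunk idea with the Open XML SDK (assuming the DocumentFormat.OpenXml package is referenced): the class and method names (AltChunkHelper, ReplaceTagWithHtml) are hypothetical. Because an altChunk is a block-level element, this sketch replaces the whole paragraph that contains the tag, so it fits best when each tag sits in a paragraph of its own. Word converts the embedded HTML into regular content when it opens the document.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
static class AltChunkHelper
{
    // Replaces the first paragraph containing `tag` with an altChunk that
    // points to an embedded HTML part. Returns true if the tag was found.
    public static bool ReplaceTagWithHtml(string docPath, string tag, string html)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docPath, true))
        {
            MainDocumentPart main = doc.MainDocumentPart;

            // InnerText joins all runs, so a tag split across runs still matches.
            Paragraph target = main.Document.Body
                .Descendants<Paragraph>()
                .FirstOrDefault(p => p.InnerText.Contains(tag));
            if (target == null)
            {
                return false;
            }

            // Store the HTML inside the package as an alternative format part.
            string chunkId = "html" + Guid.NewGuid().ToString("N");
            AlternativeFormatImportPart chunk = main.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.Html, chunkId);
            string page = "<html><head><meta charset=\"utf-8\"></head><body>" + html + "</body></html>";
            using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(page)))
            {
                chunk.FeedData(stream);
            }

            // Swap the tag paragraph for an altChunk that references the part.
            target.InsertAfterSelf(new AltChunk { Id = chunkId });
            target.Remove();
            main.Document.Save();
            return true;
        }
    }
}
```

A call might look like AltChunkHelper.ReplaceTagWithHtml("Sample.docx", "[FIELD_0]", "<b>Sample</b>"). This approach does not need Word on the machine that generates the file, but viewers other than Word may not render altChunk content.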

    Alternatively, you could use this C# library for Word documents. For example, try the following:

    DocumentModel document = DocumentModel.Load("Sample.docx");
    
    document.Content.Find("[FIELD_0]")
                    .First()
                    .LoadText("<b>Sample </b><i>Sample </i><u>Sample </u>", LoadOptions.HtmlDefault);
    
    document.Save("Sample Out.docx");
    That is how you can insert HTML content at any location in a document. You can also convert a whole HTML file into a Word document in C#.
    Monday, November 2, 2015 9:23 AM
  • There are also dedicated converters, such as this one: http://www.coolutils.com/TotalHTMLConverter. They offer a free trial download, so you can check whether it works for you.
    Thursday, November 12, 2015 8:01 AM