Generating XSL stylesheet (XSLT) from DOCX file?

  • Question

  • Hi,

    I need to generate an XSLT file from a DOCX file (document.xml, fonttable.xml, numbering.xml, settings.xml, styles.xml, websettings.xml). Once the complete XSLT file is generated, I can transform document.xml into an HTML string. Please help me generate a complete XSLT file (covering all Word 2007 to Word 2013 features) for a given DOCX file.

    I have tried docx2html.xslt and was able to generate HTML. But docx2html.xslt is not a complete XSL stylesheet, so the HTML does not render images, ordered lists, and fonts properly.

    Kindly let me know how to generate a complete XSLT file (covering all Word 2007 to Word 2013 features) from a DOCX file, or whether such a file is available to download.

    Ashok

    Thursday, March 13, 2014 4:35 AM

Answers

  • Hi Ashok,

    As far as I know, there is no complete XSLT file that can convert Word to HTML.

    To view a Word document online (as HTML), I suggest using an Office Online product.

    By the way, the Open XML SDK does not have a feature for converting a Word document to an *.html file.



    Thursday, March 20, 2014 10:39 AM
    Moderator

All replies

  • Hi Ashok,

    I noticed that you want to develop a Windows app.

    >> Once the complete XSLT file is generated, I can transform document.xml into an HTML string. <<

    From the Office side, a DOCX file can be converted to an HTML file directly.

    One way is to use the Document.SaveAs method (with the format set to wdFormatHTML or wdFormatFilteredHTML from the WdSaveFormat enumeration) to generate an HTML file from a Word document, if you can use the Word 2010 Primary Interop Assembly in your application. This approach requires Office to be installed on the client.

    Another way is to extract the XML from Word documents using the Open XML SDK 2.5 for Office, which you can download here.
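
    To illustrate what extracting the XML involves: a .docx file is a ZIP package, and parts such as word/document.xml and word/styles.xml are plain XML entries inside it. The sketch below uses only the Python standard library, not the Open XML SDK. It builds a tiny stand-in package in memory so that it runs without a real document; the helper names build_sample_docx and read_parts are hypothetical, and the stand-in is not a complete, Word-valid file.

```python
import io
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_sample_docx() -> bytes:
    """Create a minimal stand-in for a .docx package (assumption: not Word-valid)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    styles_xml = f'<w:styles xmlns:w="{W_NS}"/>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document_xml)
        zf.writestr("word/styles.xml", styles_xml)
    return buf.getvalue()


def read_parts(docx_bytes: bytes, prefix: str = "word/") -> dict:
    """Return {part name: XML text} for every .xml entry under the given prefix."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        return {
            name: zf.read(name).decode("utf-8")
            for name in zf.namelist()
            if name.startswith(prefix) and name.endswith(".xml")
        }


parts = read_parts(build_sample_docx())
print(sorted(parts))
```

    With a real document, the bytes would come from reading the .docx file from disk instead of build_sample_docx.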

    Here are some useful articles for your reference:

    Working with WordprocessingML documents (Open XML SDK)

    Word processing (Open XML SDK)

    Here is a sample to convert DOCX to XHTML.

    Transforming Open XML WordprocessingML to XHTML Using the Open XML SDK 2.0

    You can use the Open XML SDK Productivity Tool for Microsoft Office (OpenXmlSdkTool.exe in the installation path) to find the relevant elements.

    >> But docx2html.xslt is not a complete XSL stylesheet, so the HTML does not render images, ordered lists, and fonts properly. <<

    How do you convert an XML file to an XSLT file?

    In fact, the Office Open XML format is not generic XML; it defines its own specific elements. So you need to identify these elements and write XSLT templates that handle them.

    For more information, please refer to Structure of a WordprocessingML document (Open XML SDK).
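
    As a rough illustration of that element-by-element mapping, here is a minimal Python sketch (standard library only, not XSLT and not the Open XML SDK). It handles only paragraphs (w:p), runs (w:r), text (w:t), and the bold flag (w:b); the sample XML and the function names are made up for illustration. Images, numbering, and fonts would each need their own handling, which is why a complete stylesheet is a large piece of work.

```python
import xml.etree.ElementTree as ET
from html import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"  # ElementTree's "{namespace}tag" prefix

# Hypothetical sample; a real document.xml contains far more markup.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Plain </w:t></w:r>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>bold</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def run_to_html(run):
    """Map one w:r to HTML, wrapping it in <strong> when w:rPr contains w:b."""
    text = "".join(escape(t.text or "") for t in run.iter(f"{W}t"))
    props = run.find(f"{W}rPr")
    # Simplification: any w:b counts as bold; w:val="0" is not checked.
    is_bold = props is not None and props.find(f"{W}b") is not None
    return f"<strong>{text}</strong>" if is_bold else text


def document_to_html(xml_text):
    """Map every w:p in the document to an HTML <p> element."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.iter(f"{W}p"):
        body = "".join(run_to_html(r) for r in p.iter(f"{W}r"))
        paragraphs.append(f"<p>{body}</p>")
    return "\n".join(paragraphs)


html_out = document_to_html(SAMPLE)
print(html_out)
```

    An XSLT stylesheet would express the same mapping as one template per WordprocessingML element.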



    Friday, March 14, 2014 7:00 AM
    Moderator
  • Hi,

    Thanks for your reply.

    I have learned many things related to word processing from the above reply.

    I meant that the docx2html.xslt file comes with SharePoint 2010; it can be used for the XSL transformation of the XML, and the resulting HTML is displayed in the browser. But docx2html.xslt does not have templates for all the features of Word.

    I need to know whether a complete XSLT file (covering all Word features) is available to download.

    Ashok

    Friday, March 14, 2014 9:44 AM
  • Hi Ashok,

    There is now a project that aims to create a perfect WYSIWYG XSLT transform from DOCX to HTML.

    Please see the docx to html converter on GitHub.

    So far it supports the things you are asking about: images, ordered lists, and fonts.

    Tuesday, September 15, 2015 1:36 PM