Docx to HTML - Content of the Header and Footer missing

  • Question

  • Hi,

    I am working on a content migration project in which we are migrating documents (.doc) to SharePoint. Here is the approach we are following:

    • Each .doc file is converted to DOCX and then to HTML.
    • We use a tool to convert the HTML to ASPX.
    • The ASPX is pushed into SharePoint.

    My problem is that the content of the document's header/footer section does not appear in the HTML. I also looked at the HTML source, and the header/footer content is not present there either.

     

    I have converted the DOCX to HTML both manually and through a C# program. In both cases the header/footer content is missing.

    Can anyone help me with this? It is important to me because the header/footer contains data that I need to show in the converted web page.

     

    Regards,

    Abinash


    Sunday, July 3, 2011 6:12 AM

Answers

  • Hi Abinash

    Since HTML doesn't have the concept of header/footer, it's logical that this information is not converted to HTML file format. A header/footer contains text that should appear at the top/bottom of every page of a printed document. Since the HTML result is a single page, no header/footer (with information such as page numbers) is "necessary".

    If you want this information to appear somewhere in the HTML result, you'll first need to put (some of) the information somewhere in the main body of the document. I should think you first need to analyse which information would appropriately belong in the HTML result, and where you'd want to see it.
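One way to approach this outside of Word is to read the header/footer text straight from the DOCX package and place it around the converted HTML body. The following is only a minimal sketch in Python, not the C# program mentioned in the question. It assumes the usual package layout in which Word stores headers and footers as `word/header*.xml` and `word/footer*.xml`; a more robust version would follow the relationships in `word/_rels/document.xml.rels`. The function names and the `doc-header` / `doc-footer` class names are hypothetical, chosen for illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML main namespace used by header, footer and body parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def part_paragraphs(docx, part_name):
    """Return the non-empty paragraph texts of one XML part in the package."""
    root = ET.fromstring(docx.read(part_name))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join all <w:t> pieces.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def header_footer_text(docx_path):
    """Collect paragraph text from every header and footer part.

    Assumption: parts are named word/header*.xml and word/footer*.xml.
    First-page, odd and even variants are all concatenated here.
    """
    headers, footers = [], []
    with zipfile.ZipFile(docx_path) as docx:
        for name in sorted(docx.namelist()):
            if re.fullmatch(r"word/header\d*\.xml", name):
                headers.extend(part_paragraphs(docx, name))
            elif re.fullmatch(r"word/footer\d*\.xml", name):
                footers.extend(part_paragraphs(docx, name))
    return headers, footers


def wrap_with_header_footer(body_html, headers, footers):
    """Place header text above and footer text below the converted body HTML."""
    top = "".join(f"<p>{escape(t)}</p>" for t in headers)
    bottom = "".join(f"<p>{escape(t)}</p>" for t in footers)
    return (
        f'<div class="doc-header">{top}</div>'
        f"{body_html}"
        f'<div class="doc-footer">{bottom}</div>'
    )
```

Because a document can carry several headers and footers (first page, odd/even pages, one set per section), this sketch simply collects them all. Deciding which of them belong in the web page, and where, is the analysis step described above. Fields such as page numbers only contribute whatever cached result Word last saved, so they are usually best dropped.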

    As this task is Word-specific, I'm going to move your question to the Word for Developers forum. Once you've analysed the required steps, there will be more specialists there who can help you with the coding, but you should also include the version of Word you're using.


    Cindy Meister, VSTO/Word MVP
    • Marked as answer by Bruce Song Monday, July 18, 2011 11:50 AM
    Monday, July 4, 2011 8:39 AM
    Moderator