Get Doc Formatting on web page

  • Question

  • Hi to all
    I am successfully able to read a doc file and show its content on a web page:

     // Requires a reference to Microsoft.Office.Interop.Word
     object file = @"C:\Documents and Settings\user\Desktop\test.docx";
     object nullobj = System.Reflection.Missing.Value;

     Microsoft.Office.Interop.Word.Application wordApp =
         new Microsoft.Office.Interop.Word.Application();

     Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Open(
         ref file, ref nullobj, ref nullobj,
         ref nullobj, ref nullobj, ref nullobj,
         ref nullobj, ref nullobj, ref nullobj,
         ref nullobj, ref nullobj, ref nullobj);

     // Select and copy the whole document (not needed for Content.Text below)
     doc.ActiveWindow.Selection.WholeStory();
     doc.ActiveWindow.Selection.Copy();

     // Content.Text returns plain text only, so all formatting is lost here
     string sFileText = doc.Content.Text;
     doc.Close(ref nullobj, ref nullobj, ref nullobj);

     // Quit Word so that no WINWORD.EXE process is left running
     wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);

     context.Response.Write(sFileText);


     Now the problem is that the content on the web page does not get the default formatting (CSS) of the doc file. Can anybody please suggest how to get the document's default formatting on my web page?

     

    thanks in advance!!!


    Sumit Kumar
    Thursday, January 5, 2012 11:23 AM

Answers

  • Hi Sumit

    A Word document is not made up of HTML. There's no way to directly take the content of a Word document with formatting and just "drop" it on your web-page. It will require some kind of conversion.

    If this is Word 2003, 2007 or 2010 you can use the Document.Content.XML (2003, 2007, 2010) or Document.Content.WordOpenXML (2007, 2010) to get the XML mark-up of the document, which includes formatting. You'd then need to transform that to HTML.
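
    As a rough illustration of the WordOpenXML route, here is a minimal sketch, assuming Word 2007 or later, C# 4 or later, and a reference to Microsoft.Office.Interop.Word. The class and method names are made up for the example. It only fetches the XML; the transformation to HTML is still up to you.

    ```csharp
    using Word = Microsoft.Office.Interop.Word;

    static class WordXmlReader   // hypothetical helper, not part of any library
    {
        // Returns the document's markup, including formatting, as one XML string.
        public static string ReadWordOpenXml(string path)
        {
            // The _Application / _Document interfaces avoid the compiler warning
            // about Quit and Close being both methods and events.
            Word._Application wordApp = new Word.Application();
            try
            {
                Word._Document doc = wordApp.Documents.Open(path, ReadOnly: true);
                string xml = doc.Content.WordOpenXML;
                doc.Close(SaveChanges: false);
                return xml;
            }
            finally
            {
                // Always shut Word down, even if opening the file failed.
                wordApp.Quit(SaveChanges: false);
            }
        }
    }
    ```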

    A variation on this, assuming a Word 2007/2010 file format, would be to not open the document in the Word application at all. The document is already in a ZIP-Package containing XML files, one of which is the document and its markup. Using the standard .NET Packaging and XML namespaces you can extract that information directly from the document file, without opening it in Word (Word doesn't even have to be installed on the machine).
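
    A minimal sketch of that idea, assuming .NET 3.5 or later with a reference to WindowsBase.dll (for System.IO.Packaging). The class and method names are hypothetical. It locates the main document part through the package relationship rather than hard-coding its path, and loads it as XML.

    ```csharp
    using System;
    using System.IO;
    using System.IO.Packaging;   // WindowsBase.dll
    using System.Linq;
    using System.Xml.Linq;

    static class DocxReader   // hypothetical helper, not part of any library
    {
        // Relationship type that points from the package root to the main document part.
        const string OfficeDocumentRelType =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

        public static XDocument LoadMainDocumentXml(string path)
        {
            using (Package package = Package.Open(path, FileMode.Open, FileAccess.Read))
            {
                PackageRelationship rel =
                    package.GetRelationshipsByType(OfficeDocumentRelType).First();

                // Resolve the relationship target against the package root.
                Uri partUri = PackUriHelper.ResolvePartUri(
                    new Uri("/", UriKind.Relative), rel.TargetUri);

                PackagePart part = package.GetPart(partUri);
                using (Stream stream = part.GetStream(FileMode.Open, FileAccess.Read))
                {
                    return XDocument.Load(stream);
                }
            }
        }
    }
    ```

    Note that the style definitions normally live in a separate part of the package (typically /word/styles.xml), so a faithful conversion to HTML would need to read that part as well.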

    The other possibility would be to copy the content then take that from the Clipboard as HTML. When you copy in Word, the content is put on the Clipboard in multiple formats, one of which is HTML.
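
    A sketch of the Clipboard route, under several assumptions: a reference to System.Windows.Forms, C# 4 or later, and a process that is allowed to use the Clipboard at all (which is often not the case for a web application running as a service). The Clipboard must be accessed from an STA thread, so the work is done on a dedicated thread. The class and method names are hypothetical.

    ```csharp
    using System.Threading;
    using System.Windows.Forms;
    using Word = Microsoft.Office.Interop.Word;

    static class WordClipboardHtml   // hypothetical helper, not part of any library
    {
        public static string CopyAsHtml(string path)
        {
            string html = null;
            Thread worker = new Thread(() =>
            {
                Word._Application wordApp = new Word.Application();
                try
                {
                    Word._Document doc = wordApp.Documents.Open(path, ReadOnly: true);
                    doc.Content.Copy();   // puts several formats on the Clipboard

                    // Ask for the HTML flavour of what Word just copied.
                    html = Clipboard.GetText(TextDataFormat.Html);
                    doc.Close(SaveChanges: false);
                }
                finally
                {
                    wordApp.Quit(SaveChanges: false);
                }
            });
            worker.SetApartmentState(ApartmentState.STA);   // required for Clipboard access
            worker.Start();
            worker.Join();
            return html;
        }
    }
    ```

    The returned string is in the Clipboard's HTML format, so it starts with a few header lines (Version, StartHTML, and so on) before the actual markup. You would need to strip those before writing it to the response.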

    Another possible approach is to save the Word file to disk as HTML and use that.
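
    A sketch of the save-as-HTML route, again assuming C# 4 or later and a reference to Microsoft.Office.Interop.Word. The class and method names, and the choice of the filtered HTML format, are just for illustration.

    ```csharp
    using Word = Microsoft.Office.Interop.Word;

    static class WordHtmlExport   // hypothetical helper, not part of any library
    {
        // Saves a copy of the document as HTML and returns the path of the HTML file.
        public static string SaveAsFilteredHtml(string docPath, string htmlPath)
        {
            Word._Application wordApp = new Word.Application();
            try
            {
                Word._Document doc = wordApp.Documents.Open(docPath, ReadOnly: true);

                // Filtered HTML leaves out most of the Office-specific markup.
                doc.SaveAs(FileName: htmlPath,
                           FileFormat: Word.WdSaveFormat.wdFormatFilteredHTML);
                doc.Close(SaveChanges: false);
            }
            finally
            {
                wordApp.Quit(SaveChanges: false);
            }
            return htmlPath;
        }
    }
    ```

    You could then read the file and write it to the response. Keep in mind that Word chooses the file's encoding and stores any images in a separate folder next to the HTML file, so both need handling on the web side.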


    Cindy Meister, VSTO/Word MVP
    • Marked as answer by Bruce Song Wednesday, January 18, 2012 5:58 AM
    Friday, January 6, 2012 8:03 AM
    Moderator

All replies

  • I don't know exactly what type your 'context' variable is, but you will probably use the StyleSheets property of the document.

    StyleSheets css = doc.StyleSheets;

    Thursday, January 5, 2012 3:48 PM
  • Sorry @JosephFox, but you may have missed my point: I am showing this doc content on a web page using the above code, and context.Response.Write is the line that puts the doc content on my web/HTML page.
    If my .docx file has some formatting, it should be preserved on my web page too. That is what is troubling me: I am not able to show the formatting of the docx file on the web/HTML page.

    Sumit Kumar
    Friday, January 6, 2012 2:57 AM