Converting Microsoft Word files to HTML programmatically

  • Question

  • Hello Everyone,

    I have a website written in C#, and I was wondering if there is a way to convert Microsoft .doc or .docx files to HTML programmatically.

    I have searched for this and wasn't able to come up with a solid solution.

    Ideally I would like to use .NET to accomplish this, but input using any framework is appreciated.


    Best of regards,
    Amir


    • Edited by Amir G Saturday, June 28, 2008 11:05 PM grammer
    Saturday, June 28, 2008 11:04 PM

All replies

  • You can automate Word via COM to save the document as HTML.


    MSMVP VC++
    Sunday, June 29, 2008 2:50 AM
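A minimal sketch of the COM automation route mentioned above, assuming a Windows machine with Microsoft Word installed. It uses late binding through `dynamic` so that no interop assembly reference is needed; with a reference to Microsoft.Office.Interop.Word the same calls would be strongly typed. The class name `WordAutomationSketch` and the helper `HtmlPathFor` are hypothetical names chosen for illustration, and the choice of filtered HTML as the output format is an assumption.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class WordAutomationSketch
{
    // Numeric value of WdSaveFormat.wdFormatFilteredHTML in the Word object model.
    private const int WdFormatFilteredHtml = 10;

    // Hypothetical helper: place the .html file next to the source document.
    public static string HtmlPathFor(string documentPath)
    {
        return Path.ChangeExtension(documentPath, ".html");
    }

    // Opens the document in a hidden Word instance and saves it as filtered HTML.
    // Only works where Word is installed and registered for COM.
    public static string ConvertToHtml(string documentPath)
    {
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
            throw new InvalidOperationException("Microsoft Word is not installed.");

        // Word resolves relative paths against its own working directory,
        // so pass absolute paths.
        string inputPath = Path.GetFullPath(documentPath);
        string outputPath = HtmlPathFor(inputPath);

        dynamic word = Activator.CreateInstance(wordType);
        try
        {
            word.Visible = false;
            // Positional arguments: FileName, ConfirmConversions, ReadOnly.
            dynamic doc = word.Documents.Open(inputPath, false, true);
            try
            {
                doc.SaveAs(outputPath, WdFormatFilteredHtml);
            }
            finally
            {
                doc.Close(false); // do not save changes to the source file
            }
        }
        finally
        {
            word.Quit();
            Marshal.ReleaseComObject((object)word);
        }
        return outputPath;
    }
}
```

Usage would look like `WordAutomationSketch.ConvertToHtml(@"C:\docs\report.docx")`, which writes `report.html` next to the source file. Each call starts and stops a Word process, which is one reason this route is awkward on a server.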
  • Hi Amir,

    Can you explain what you want to achieve by converting DOC to HTML?

    Regards,
    Hemanth.

    Hemanth .NET Professional
    Sunday, June 29, 2008 11:15 AM
  • Hey guys,

    Sheng, thanks for the reply, but to follow the method you mention I would need a copy of Microsoft Word installed on the server. Is that correct? If so, that would be inconvenient for our company. Is there a better way?


    Hemanth, thanks also for the reply. We are running a system for employees where specific Word documentation coming from headquarters needs to be parsed and displayed online, among other things. The problem is that the documents sometimes come in .doc format and sometimes in .docx format. I haven't been able to find a solid solution that can properly parse these files. There are some open source Word readers, but they break easily, for example as soon as there is a table in the document.

    Thanks,
    Amir
    Software Developer
    Sunday, June 29, 2008 2:16 PM
  • See the "Alternatives to server-side Automation" section in "Considerations for server-side Automation of Office".
    MSMVP VC++
    • Marked as answer by Amir G Monday, June 30, 2008 6:55 PM
    Monday, June 30, 2008 6:48 PM
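One way to avoid running Word on the server, for .docx files only, is to read the file directly: a .docx is a ZIP package whose main text conventionally lives in the part `word/document.xml`. The sketch below is an illustration under those assumptions and is not taken from the referenced article. It uses only .NET base class library types (on .NET Framework 4.5+ this needs references to System.IO.Compression and System.IO.Compression.FileSystem). The class name `DocxToHtmlSketch` is hypothetical. It emits one `<p>` per paragraph and drops formatting, images, and table structure (table cell text comes out as plain paragraphs), and it does not handle the binary .doc format at all.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net;
using System.Text;
using System.Xml.Linq;

public static class DocxToHtmlSketch
{
    // Namespace of WordprocessingML elements such as w:p and w:t.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Turns the XML of the main document part into a sequence of <p> elements.
    public static string DocumentXmlToHtml(string documentXml)
    {
        XDocument xml = XDocument.Parse(documentXml);
        var html = new StringBuilder();
        foreach (XElement paragraph in xml.Descendants(W + "p"))
        {
            // Concatenate the text runs (w:t) found inside this paragraph.
            string text = string.Concat(
                paragraph.Descendants(W + "t").Select(t => t.Value));
            html.Append("<p>").Append(WebUtility.HtmlEncode(text)).Append("</p>\n");
        }
        return html.ToString();
    }

    // Reads word/document.xml out of the .docx package and converts it.
    public static string ConvertFile(string docxPath)
    {
        using (ZipArchive package = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = package.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("No word/document.xml part found.");
            using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
            {
                return DocumentXmlToHtml(reader.ReadToEnd());
            }
        }
    }
}
```

Calling `DocxToHtmlSketch.ConvertFile("test.docx")` returns an HTML fragment that can be placed inside a page body. Anything beyond plain paragraphs, such as the tables mentioned earlier in the thread, would need extra handling of elements like `w:tbl`, which is where a dedicated library becomes worthwhile.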
  • Thanks a lot

    Amir
    Software Developer
    Monday, June 30, 2008 6:55 PM
  • I googled and found "Spire.Doc for .NET", which is made by E-iceblue. It makes it easy to convert HTML to Word from C#/VB.NET.

    Step 1

    Create a project in Visual Studio and add Spire.Doc as reference.

    Step 2

    Load the HTML file that will be converted to a Word .doc file by using the following code:

                Document document = new Document();
                document.LoadFromFile(@"D:\Work\Stephen\2011.12.06\test.html", FileFormat.Html, XHTMLValidationType.None);

    Step 3

    The following code converts the loaded HTML file to a Word .doc file. Spire.Doc can also convert HTML to PDF, XML, ePub, Text, Dot, etc.

    document.SaveToFile("test.doc", FileFormat.Doc);

    Step 4

    Put the complete code into the project and press F5 to start the conversion.

    Full code:

    C#

    using System;
    using Spire.Doc;
    using Spire.Doc.Documents;

    namespace Html2Doc
    {
        class Program
        {
            static void Main(string[] args)
            {
                Document document = new Document();
                document.LoadFromFile(@"D:\test.html", FileFormat.Html, XHTMLValidationType.None);
                document.SaveToFile("test.doc", FileFormat.Doc);
            }
        }
    }


    VB.NET

    Imports System
    Imports Spire.Doc
    Imports Spire.Doc.Documents

    Namespace Html2Doc
        Friend Class Program
            Shared Sub Main(ByVal args() As String)
                Dim document As New Document()
                document.LoadFromFile("D:\test.html", FileFormat.Html, XHTMLValidationType.None)
                document.SaveToFile("test.doc", FileFormat.Doc)
            End Sub
        End Class
    End Namespace

    Thursday, August 15, 2013 2:07 AM
  • Hi Amir,

    Spire.Doc is also able to convert Word (.doc, .docx) files to HTML, so you could evaluate the component. The code is below:


    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using Spire.Doc;
    using System.Drawing;
    using System.Drawing.Imaging;

    namespace test
    {
        class Program
        {
            static void Main(string[] args)
            {
                Document doc = new Document();
                doc.LoadFromFile("test.docx");
                doc.SaveToFile("test.html", FileFormat.Html);
                doc.Close();
            }
        }
    }
    • Edited by Ringinter Saturday, August 17, 2013 7:47 AM
    Saturday, August 17, 2013 7:46 AM