Cutoff issue with printing MHT file

  • Question

  • Hello, 

    We are using C# .NET to print an MHT file through Word 2013 Interop.

    The input MHT file does not end with the proper HTML closing tags, so opening it in Word results in missing or cut-off content.

    Here is a shared link to an MHT file that has the issue:
    https://drive.google.com/open?id=0B12pIBigSgsfdThQTTZXelp3UWM

    How can we correct the tags in this kind of MHT file before printing?
    Is there any option in Word Interop that will take care of printing all of the MHT file's data?

    If we open this MHT file in Word, save it again as MHT, and then reopen it in Word, the data is displayed properly. Can we do the same from C# when opening the file in Word?

    https://www.coolutils.com/online/MHT-to-DOC#

    When I convert the file with the online tool linked above, the result opens in Word without any missing content.

    Is there any available .NET DLL that identifies MHT files with this kind of ending-tag issue and can modify them before printing?

    Note: opening the file in IE displays the MHT data properly.

    Thanks

    Thursday, December 1, 2016 1:48 PM

All replies

  • Hi Viral84,

    MHT is a Web page archive file format. The archived Web page is an MHTML (short for MIME HTML) document. MHTML saves the Web page content and incorporates external resources, such as images, applets, Flash animations and so on, into HTML documents.

    In Internet Explorer, when you save a Web page as a Web archive, the page is saved as an MHT file. Any relative links in the HTML (those that don’t include all information about the location of the content but assume all content is in a directory on the host server) will be remapped so the content can be located.

    MHT files open in Internet Explorer or, with an add-on, in Firefox and some other browsers. MHTML Converter is one program that converts MHTML files to regular HTML.

    You asked, "How to correct tags for this kind of MHT file before printing ?
    Is there any option in word interop which will take care of printing all MHT file data ?"

    Word Interop doesn't provide anything that can correct the tags of an MHT file.

    There is also no option in Word Interop that takes care of printing all of the MHT file's data.

    However, you can try converting your MHT file to an HTML file. With Word Interop you can then convert that HTML file to a Word file and print it. That may solve your issue. Alternatively, you can generate the HTML file directly.
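
    For the MHT-to-HTML step, here is a minimal sketch of how the HTML part could be pulled out of the MHT file and given its missing closing tags. The names MhtHtmlExtractor, ExtractHtml and EnsureClosingTags are hypothetical, not part of any library. The sketch assumes a multipart MHT file whose text/html part is encoded as quoted-printable or base64, and it only appends a missing </body> and </html>. I have not checked it against your shared file, so treat it as a starting point.

    ```csharp
    using System;
    using System.IO;
    using System.Text;
    using System.Text.RegularExpressions;

    public static class MhtHtmlExtractor
    {
        // Returns the decoded text/html part of an MHT document, or null if none is found.
        public static string ExtractHtml(string mht)
        {
            // The top-level Content-Type header names the boundary that separates the parts.
            Match boundary = Regex.Match(mht, "boundary=\"?([^\"\\r\\n;]+)\"?", RegexOptions.IgnoreCase);
            if (!boundary.Success) return null;

            string[] parts = mht.Split(new[] { "--" + boundary.Groups[1].Value }, StringSplitOptions.None);
            foreach (string rawPart in parts)
            {
                // Each part is a header block, an empty line, then the body.
                string part = rawPart.TrimStart('\r', '\n');
                int separatorLength = 4;
                int split = part.IndexOf("\r\n\r\n", StringComparison.Ordinal);
                if (split < 0) { split = part.IndexOf("\n\n", StringComparison.Ordinal); separatorLength = 2; }
                if (split < 0) continue;

                string headers = part.Substring(0, split);
                string body = part.Substring(split + separatorLength);
                if (!Regex.IsMatch(headers, @"Content-Type:\s*text/html", RegexOptions.IgnoreCase)) continue;

                Encoding charset = Encoding.UTF8; // assumed default when no usable charset is declared
                Match cs = Regex.Match(headers, "charset=\"?([^\"\\s;]+)", RegexOptions.IgnoreCase);
                if (cs.Success)
                {
                    try { charset = Encoding.GetEncoding(cs.Groups[1].Value); }
                    catch (ArgumentException) { /* unknown charset name: keep UTF-8 */ }
                }

                Match te = Regex.Match(headers, @"Content-Transfer-Encoding:\s*([\w-]+)", RegexOptions.IgnoreCase);
                string transferEncoding = te.Success ? te.Groups[1].Value.ToLowerInvariant() : "7bit";
                if (transferEncoding == "base64") return charset.GetString(Convert.FromBase64String(body));
                if (transferEncoding == "quoted-printable") return charset.GetString(DecodeQuotedPrintable(body));
                return body; // 7bit, 8bit, binary: use the text as read
            }
            return null;
        }

        // Appends </body> and </html> when the markup stops before them.
        public static string EnsureClosingTags(string html)
        {
            if (html.IndexOf("</html>", StringComparison.OrdinalIgnoreCase) >= 0) return html;

            StringBuilder repaired = new StringBuilder(html.TrimEnd());
            bool bodyOpened = html.IndexOf("<body", StringComparison.OrdinalIgnoreCase) >= 0;
            bool bodyClosed = html.IndexOf("</body>", StringComparison.OrdinalIgnoreCase) >= 0;
            if (bodyOpened && !bodyClosed) repaired.Append("\r\n</body>");
            repaired.Append("\r\n</html>");
            return repaired.ToString();
        }

        private static byte[] DecodeQuotedPrintable(string text)
        {
            // A trailing "=" marks a soft line break that is not part of the content.
            string joined = Regex.Replace(text, "=\r?\n", "");
            using (MemoryStream bytes = new MemoryStream())
            {
                for (int i = 0; i < joined.Length; i++)
                {
                    char c = joined[i];
                    if (c == '=' && i + 2 < joined.Length
                        && Uri.IsHexDigit(joined[i + 1]) && Uri.IsHexDigit(joined[i + 2]))
                    {
                        // "=XX" is one byte written as two hex digits, for example "=3D" for '='
                        bytes.WriteByte(Convert.ToByte(joined.Substring(i + 1, 2), 16));
                        i += 2;
                    }
                    else
                    {
                        bytes.WriteByte((byte)c);
                    }
                }
                return bytes.ToArray();
            }
        }
    }
    ```

    You could read the file with File.ReadAllText, pass the text through ExtractHtml and EnsureClosingTags, and write the result to an .html file for Word to open. Images and other resources are stored in separate parts of the MHT file, which this sketch does not extract, so they may be missing from the output.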

    Below is example code for converting HTML to Word.

    using System;
    using System.IO;
    using Microsoft.Office.Interop.Word;

    namespace Utilities
    {
        /// <summary>
        /// A static class to convert from HTML document to Word document
        /// </summary>
        public static class HTML2WordConverter
        {
            /// <summary>
            /// Converts an HTML file to a Word file
            /// </summary>
            /// <param name="htmlSrcFilePath">the path to the source HTML file</param>
            /// <param name="wordDestFilePath">the path of the destination Word file</param>
            /// <param name="embedImages">true to store linked images inside the document</param>
            public static void Convert(string htmlSrcFilePath, string wordDestFilePath, bool embedImages)
            {
                FileInfo srcFile = new FileInfo(htmlSrcFilePath);
                FileInfo destFile = new FileInfo(wordDestFilePath);
                if (!srcFile.Exists)
                {
                    throw new FileNotFoundException(htmlSrcFilePath + " doesn't exist.", htmlSrcFilePath);
                }

                Application word = new Application();
                Document document = null;
                try
                {
                    word.Visible = false;

                    // open the HTML file directly; no blank document is needed first
                    document = word.Documents.Open(srcFile.FullName);
                    document.Activate();

                    if (embedImages)
                    {
                        // embed linked inline images in the document
                        foreach (InlineShape image in document.InlineShapes)
                        {
                            if (image.LinkFormat != null)
                            {
                                try
                                {
                                    image.LinkFormat.SavePictureWithDocument = true;
                                    image.LinkFormat.BreakLink();
                                }
                                catch (Exception) { /* skip images whose link cannot be broken */ }
                            }
                        }
                    }

                    document.ActiveWindow.View.Type = WdViewType.wdPrintView;

                    document.SaveAs(destFile.FullName, WdSaveFormat.wdFormatDocument);
                }
                finally
                {
                    // always close the document and quit Word, even when an error occurs
                    if (document != null)
                    {
                        document.Close(false);
                    }
                    word.Quit();
                }
            }
        }
    }
    

    Reference:

    Convert HTML to WORD in .Net
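
    For the printing step itself, a minimal sketch with Word Interop might look like the following. WordPrinter and Print are hypothetical names. The sketch assumes Word is installed and a default printer is configured. It passes Background: false so that PrintOut returns only after the job has been handed to the spooler. With background printing, quitting Word right away may interrupt the job.

    ```csharp
    using System;
    using Word = Microsoft.Office.Interop.Word;

    public static class WordPrinter
    {
        // Opens a file that Word can read (.doc, .html, .mht) and sends it to the default printer.
        public static void Print(string filePath)
        {
            Word.Application word = new Word.Application();
            Word.Document document = null;
            try
            {
                word.Visible = false;
                document = word.Documents.Open(filePath, ReadOnly: true);

                // Foreground printing: the call blocks until the document is spooled.
                document.PrintOut(Background: false);
            }
            finally
            {
                // Close without saving and shut Word down even if printing failed.
                if (document != null)
                {
                    document.Close(false);
                }
                word.Quit();
            }
        }
    }
    ```

    You would call WordPrinter.Print with the full path of the converted file.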

    If you just want to print the Web page content, you can try saving the Web page as a PDF.

    Regards

    Deepak



    Friday, December 2, 2016 1:24 AM
    Moderator