Using Interop.Word in C#, programmatically add the whole content of a PDF document to a Word document

  • Question

  • I have a reference document in PDF format, and I need to add its whole content to the end of a Word document. I want to take an image of the PDF document and paste that image at the end of the Word document.
    Please suggest the best approach for this.

    Thanks Santosh


    • Edited by santoshsp Thursday, February 1, 2018 6:01 AM
    Thursday, February 1, 2018 6:01 AM

All replies

  • Hi santoshsp,

    First, you need to use a PDF library such as Spire.PDF to read the text and image content from the PDF file, and then insert the text and images at the specified location in the Word document. You can get Spire.PDF from NuGet and refer to the following code:

    // Requires: using System.Collections.Generic; using System.Drawing;
    //           using System.Text; using Spire.Pdf;
    PdfDocument doc = new PdfDocument();
    doc.LoadFromFile(@"..\Sample_image.pdf");
    
    // Collect the text and the images from every page of the PDF.
    StringBuilder buffer = new StringBuilder();
    IList<Image> images = new List<Image>();
    
    foreach (PdfPageBase page in doc.Pages)
    {
        buffer.Append(page.ExtractText());
        foreach (Image image in page.ExtractImages())
        {
            images.Add(image);
        }
    }
    
    doc.Close();
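
    The code above only reads the PDF. The second step, writing the extracted text and images to the end of the Word document, could be sketched with Interop.Word roughly as follows. This is a minimal sketch, not a tested implementation: the class name WordAppender and the method AppendToEnd are hypothetical, and it assumes that Word is installed, that the project references Microsoft.Office.Interop.Word and System.Drawing, and that wordPath is a full path to an existing document.

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

public static class WordAppender // hypothetical helper, not part of any library
{
    // Appends the given text, then each image, to the end of an existing Word document.
    public static void AppendToEnd(string wordPath, string text, IEnumerable<Image> images)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            app.Visible = false;
            doc = app.Documents.Open(wordPath);

            // Start a new paragraph at the end and append the extracted text.
            doc.Content.InsertParagraphAfter();
            doc.Content.InsertAfter(text);

            foreach (Image image in images)
            {
                // AddPicture takes a file name, so save each image to a temporary PNG first.
                string tempFile = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".png");
                image.Save(tempFile, ImageFormat.Png);
                try
                {
                    // Build an empty range just before the final paragraph mark.
                    doc.Content.InsertParagraphAfter();
                    int end = doc.Content.End - 1;
                    Word.Range endOfDocument = doc.Range(end, end);

                    // LinkToFile = false and SaveWithDocument = true embed the picture in the document.
                    doc.InlineShapes.AddPicture(tempFile, false, true, endOfDocument);
                }
                finally
                {
                    File.Delete(tempFile);
                }
            }

            doc.Save();
        }
        finally
        {
            // Close the document if it was opened, and always quit the hidden Word instance.
            if (doc != null)
            {
                doc.Close();
            }
            app.Quit();
        }
    }
}
```

    With the variables from the snippet above, a call might look like WordAppender.AppendToEnd(@"C:\docs\report.docx", buffer.ToString(), images); where the path is only a placeholder. The same AddPicture call should also work for a page snapshot that already exists as an image file, so the temporary-file step is needed only for in-memory Image objects.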
    

    For more information, check the following link:

    Read PDF Images and Text in C#, VB.NET

    Thursday, February 1, 2018 7:56 AM
  • I cannot use anything other than Microsoft libraries, so paid or open source third-party libraries are not an option. It may also be possible to take a snapshot of the PDF document and then insert it as an image in the Word document.

    Please suggest an approach for this.


    Thanks Santosh

    Friday, February 2, 2018 3:54 AM
  • Hello santoshsp,

    >>Using Interop.Word in C#, programmatically add the whole content of a PDF document to a Word document

    As far as I know, the Interop.Word namespace doesn't provide a way to convert PDF to Word. The PDF format was designed by Adobe and is not part of Microsoft Office, and Interop.Word only exposes what Office itself provides. If you want to go the other way and convert a Word document to another format such as PDF, you could try the code below.

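    // Requires: using System; using Microsoft.Office.Interop.Word;
    public bool WordTo(string sourcePath, string targetPath)
    {
        bool result = false;
        Microsoft.Office.Interop.Word.Application application = new Microsoft.Office.Interop.Word.Application();
        Document document = null;
        try
        {
            application.Visible = false;
            document = application.Documents.Open(sourcePath);

            // Export the opened Word document as a PDF file.
            document.ExportAsFixedFormat(targetPath, WdExportFormat.wdExportFormatPDF);
            result = true;
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
            result = false;
        }
        finally
        {
            // document is still null if Open failed, so guard before closing.
            if (document != null)
            {
                document.Close();
            }
            // Quit Word so the hidden WINWORD process does not keep running.
            application.Quit();
        }
        return result;
    }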

    Best regards,

    Neil Hu



    Thursday, February 8, 2018 11:33 AM
    Moderator
  • "I can not use any other thing like paid or open source libraries except Microsoft Libraries"

    Then you cannot build a modern .NET app. It is very hard to build a .NET app today that doesn't rely on at least one non-MS library. For example, pretty much everything is REST-based today, and REST uses JSON. .NET doesn't have a JSON parser (that isn't deprecated). Instead, they always use JSON.NET, which is a third-party library.

    I think you need to reevaluate this rule, as it doesn't make sense in a modern world. If you're building a web app, then it is pretty much impossible without third-party library support. No usable web app exists today that doesn't rely on at least one third-party JavaScript library.

    Most companies tend to just have a requirement that you cannot use libraries that need to be installed. That is fine, because you can get everything you need from NuGet. There are no installation requirements. This is how .NET apps are built these days.


    Michael Taylor http://www.michaeltaylorp3.net

    Thursday, February 8, 2018 3:08 PM
    Moderator