Programmatically Convert .docx to .pdf

  • Question

  • Please excuse this question, which isn't strictly a VSTO question; I figure there may be several people on this forum who can point me in the right direction or provide some helpful thoughts nonetheless.

     

    Our requirement is to be able to create an installation on a workstation that does not have Office 2007 installed, that installs our application and sufficient additional libraries such that we can programmatically convert a Word 2007 .docx file to a .pdf file.   One programmatic solution, which Microsoft offers sample code for, relies on Word 2007 being installed on the workstation along with a Save As PDF add-on also being installed.  But we can't presume Office 2007 is already installed on the target machine.  3rd-party convert-to-pdf libraries exist, including those that can convert a Word 2007 document to pdf, but they appear to rely on Word 2007 being already installed on the workstation.

     

    We are currently using Framework 3.0's System.IO.Packaging to program against the new Word 2007 format, but I'm unaware of it providing a convert-to-pdf facility within itself, such that deploying Framework 3.0 would be enough to meet our needs.  We're planning to deploy Framework 3.0, so if that's all that's needed, please let me know. 

     

    Does Microsoft offer some sort of Office 2007 redistributable DLL(s) that we could include on an install, along with perhaps one of the 3rd-party print-to-PDF SDKs, that would give us what we need?

     

    Thank you.

    Tuesday, May 1, 2007 1:12 AM

Answers

  • No, Microsoft doesn't supply anything like that. But I believe there are third-party products. One such posts regularly in one of the office.development newsgroups (Altsoft, I believe) when a new product becomes available.

     

    Or you can roll your own transform. The PDF file format specification is publicly available.

    Tuesday, May 1, 2007 4:54 PM
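
To give a feel for what "rolling your own" involves at the file-format level, here is a minimal sketch in Python that writes a one-page PDF by hand. It does not read or convert a .docx; the function name `build_minimal_pdf`, the page size, and the text position are assumptions chosen for illustration. A real transform would also have to lay out the Word content (fonts, line breaking, tables, images), which is where most of the work lies.

```python
def build_minimal_pdf(text: str) -> bytes:
    """Return a one-page PDF that shows `text` in 12 pt Helvetica.

    Illustrative only: `text` must be encodable as Latin-1, and no
    line wrapping or layout is attempted.
    """
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    stream = f"BT /F1 12 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")

    # Object bodies, numbered 1..5 in this order.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, needed by xref
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one 20-byte entry per object, plus the free entry 0.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the returned bytes to a file (for example with `open("hello.pdf", "wb")`) should give a document that a PDF viewer can open.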

All replies

  • Word.Application can convert document into .pdf format. When we are saving manually, we can change quality of printing. Programmatically we use method SaveAs with several parameters. But I can not find a parameter assigned to quality of printing. So, how can I change a print quality when printing into .pdf?

    Thanks.
    Tuesday, August 7, 2007 2:10 PM
  • If there's a PDF printer installed, or for Word 2007 the PDF converter has been downloaded and installed (it's not there by default), then yes, Word can save to PDF format.

     

    But the Word object model has no way to work with printer settings or to change the parameters of printing to PDF. Keep in mind: the PDF format belongs to Adobe. Adobe refused to let Microsoft distribute it as part of Office, so these things couldn't be built into the Office object models.

    Tuesday, August 7, 2007 2:26 PM
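
With the Word 2007 PDF/XPS exporter installed, `Document.ExportAsFixedFormat` (the method used in the C# reply later in this thread) takes an `OptimizeFor` argument, which may be the closest the object model comes to a quality setting. The sketch below is in Python and assumes Word is automated through COM; the helper name `export_pdf` is hypothetical, and the `word` argument is expected to be a `Word.Application` COM object supplied by the caller.

```python
# Values of the Word enums involved (from the Word object model).
WD_EXPORT_FORMAT_PDF = 17             # WdExportFormat.wdExportFormatPDF
WD_EXPORT_OPTIMIZE_FOR_PRINT = 0      # WdExportOptimizeFor.wdExportOptimizeForPrint
WD_EXPORT_OPTIMIZE_FOR_ONSCREEN = 1   # WdExportOptimizeFor.wdExportOptimizeForOnScreen


def export_pdf(word, source_path, pdf_path,
               optimize_for=WD_EXPORT_OPTIMIZE_FOR_PRINT):
    """Open `source_path` in Word and export it to `pdf_path`.

    `word` is assumed to be a running Word.Application COM object.
    """
    doc = word.Documents.Open(source_path)
    try:
        # Positional arguments, in order:
        # OutputFileName, ExportFormat, OpenAfterExport, OptimizeFor
        doc.ExportAsFixedFormat(pdf_path, WD_EXPORT_FORMAT_PDF, False, optimize_for)
    finally:
        doc.Close(False)  # close the document without saving changes
```

On a Windows machine with the pywin32 package installed (an assumption, not something this thread uses), `word` could be obtained with `win32com.client.Dispatch("Word.Application")` and released afterwards with `word.Quit()`. Passing `WD_EXPORT_OPTIMIZE_FOR_ONSCREEN` requests a smaller, lower-quality file; finer control over image resolution is not exposed by this call.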
  • Hi,

    Did this go anywhere towards a solution? I have the same problem: I have (fabulously) created a complex Word 2007 .docx file on the web server (via content controls and some XPath querying to replicate rows for repeating 'lists' of table data), and now want to send it to the browser. Word compatibility issues and the problem of it not opening inline are now forcing me to try to convert this .docx file to a PDF before redirecting it to the browser. I have looked at one 3rd-party tool so far to achieve this, but it fails miserably when trying to render the .docx as PDF, I suspect because its samples are really simple while a real Word 2007 document is much bigger. Is there a way of reliably performing this conversion? I do not want to do this via any Word object model, as that does not seem the right way to go. Any thoughts, anyone?

    Thanks
    Monday, December 31, 2007 4:35 PM
  • Once you have the PDF/XPS exporter installed, you can .SaveAs anything Word can open into pdf.

     

    Here is some sample JScript code (run under Windows Script Host):

    var filename = "c:\\docs\\myfile.docx";

    var msword = WScript.CreateObject("Word.Application");
    msword.Visible = false;
    msword.WindowState = 2; // wdWindowStateMinimize

    var doc = msword.Documents.Open(filename);
    doc.SaveAs(filename + ".pdf", 17); // 17 is the magic number for wdFormatPDF
    doc.Close(false); // close without saving changes

    msword.Quit();

     

    Wednesday, March 19, 2008 1:01 AM
  • Yo, Goat my man, love to know more about your generated Word 2007 document. I'm trying to do the same with Word 2008/Mac, but I run into a problem... any rel, image, or XML file in the package that I change (other than the document.xml itself) is seen as corrupt by Word when it tries to open the file. Sure enough, Word is clever enough to recover the remaining elements, but anything I change gets nuked.

    That prevents me from swapping in new images (charts in my case), or modifying any hyperlinks (which exist in .rel files).

    FYI for others out there, I found that I can open up a .docx file directly in Stuffit Archive Manager (without even having to change the extension), which eliminates the need to re-zip the .docx files from scratch (which seems to blow my entire document when I try). Using SAM, I can extract only the file I choose, edit it, put it back, and the other elements remain intact.

    Goat, how did you succeed at altering the document without generating a "document is corrupt" message? Or did you edit only the document.xml file?

    It seems to be a checksum of some sort that Microsoft is imposing; kinda blows the value of having an open XML document, don't you think? Thanks, MS.

    Anyone, please, this annoying issue stands in the way of a pretty cool document generation system.

    thanks.
    Saturday, October 10, 2009 4:57 PM
  • I've figured out a little more on this issue. The key is the packaging: a .docx isn't just a Plain Old Zip (POZ) file; it's an Open Packaging Conventions (OPC) package, which is a ZIP archive that must follow extra rules (for example, a [Content_Types].xml part and relationship parts):

    http://en.wikipedia.org/wiki/Open_Packaging_Conventions

    Now, if I could only find an OPC creator/manager for the Mac... a GUI would be great, but a command line would do as well. I thought I had found such a tool (Porticus), but it doesn't seem to do anything except manage MacPorts:
    http://www.versiontracker.com/dyn/moreinfo/macosx/32608

    FYI, Porticus needs MacPorts installed as well:
    http://www.macports.org/install.php

    However, I can't seem to see how Porticus helps me with the OPC file management... I'm guessing that there are Mac developers here more informed than me, hoping someone can shed light on my OPC requirements.

    Hope this helps others. Thanks.
    Sunday, October 11, 2009 4:28 PM
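
Because an OPC package is ZIP-based, a general-purpose zip library can rewrite one, provided every part keeps its exact name and no extra entries or top-level folders are introduced. The following is a minimal sketch in Python using only the standard library; the function name `replace_part` is hypothetical, and it has not been checked against the Word 2008/Mac problem described above, whose actual cause is not established in this thread.

```python
import io
import zipfile


def replace_part(package: bytes, part_name: str, new_data: bytes) -> bytes:
    """Return a copy of an OPC package (.docx) with one part replaced.

    All other parts are copied unchanged, in their original order and
    under their original names.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        if part_name not in src.namelist():
            raise KeyError(part_name)
        for info in src.infolist():
            if info.filename == part_name:
                data = new_data                  # the edited part
            else:
                data = src.read(info.filename)   # untouched part
            dst.writestr(info.filename, data)
    return out.getvalue()
```

A typical use would be to read the .docx with `open(path, "rb").read()`, call `replace_part(data, "word/_rels/document.xml.rels", edited_xml)`, and write the result to a new file. The new part content still has to be valid for its type (for example, well-formed relationship XML), or Word may still report the file as corrupt.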
  • Thanks a lot. Works superbly. I'm running it in VB6; the lines below do it.

    I already have a document open, created from a template, which is the active document.

    Set oDoc = oWord.Documents.Open(docFile)

    '### Convert DOCX to PDF

    oWord.Visible = False

    oWord.WindowState = wdWindowStateMinimize

    oWord.Documents.Open (docFile1)
    oWord.ActiveDocument.SaveAs docFile2, 17

    '### End of Convert DOCX to PDF

     

    Tuesday, July 12, 2011 10:35 AM
  • Do you have code for VB.NET?

     


    Monday, August 29, 2011 7:06 AM
  • This is basically it, in C# anyway:

    var appWord = new Microsoft.Office.Interop.Word.Application();
    var wordDocument = appWord.Documents.Open("DocFrom.docx"); // use a full path in practice
    wordDocument.ExportAsFixedFormat("DocTo.pdf", Microsoft.Office.Interop.Word.WdExportFormat.wdExportFormatPDF);
    wordDocument.Close(false); // close without saving changes
    appWord.Quit();
    

     

    Friday, September 30, 2011 6:17 PM
  • Hello :)

    If you have MS Office installed and are able to use it, you can use "UseOffice .Net" for the conversion. The library converts DOC/DOCX to PDF and supports many other conversions between formats.

    Sample code that may be helpful:

    SautinSoft.UseOffice u = new SautinSoft.UseOffice();
    if (u.InitWord() == 0)
    {
        // Convert Word (RTF, DOC, DOCX) to PDF
        u.ConvertFile(@"d:\Brochure.docx", @"e:\Brochure.pdf", SautinSoft.UseOffice.eDirection.DOC_to_PDF);
    }
    u.CloseOffice();

    Friday, March 9, 2012 10:58 AM
  • Magnet, that solution also requires MS Word to be installed. Like Eric, I was looking for something that would not have this dependency. I ended up using a 3rd-party Word component for VB.NET.

    Here is how to use it to convert a DOCX document into a PDF file with VB.NET:

    Dim loadOption = LoadOptions.DocxDefault
    Dim document = DocumentModel.Load("Input File.docx", loadOption)
    
    Dim saveOption = SaveOptions.PdfDefault
    document.Save("Output File.pdf", saveOption)
    Tuesday, November 24, 2015 7:21 AM