Text border missing when exporting from Word to PDF via COM interop

  • Question

  • I have created a simple VB.NET application that converts Word documents to PDF. It does this via the ExportAsFixedFormat API. Usually this works fine, but for some reason, borders around text are not rendered.

    When I try COM interop via the old VB (using cscript.exe), the problem does not occur. Also, when I manually save as PDF in Word, the borders are present. I have no idea what the problem might be.

    The only other reference to this problem on the web is this one: 

    http://stackoverflow.com/questions/13039884/word-2010-interop-pdf-export-missing-border-line

    My environment:

    Office 2010 x86
    Server 2008 R2

    Visual Studio 2010

    Any help appreciated!




    Thursday, January 17, 2013 10:28 AM

Answers

  • Hallo Michael

    An MVP colleague who works in the end-user/accessibility tier of Office and Word has apparently had some experience with this type of problem and gives me permission to share her views:

    "This is a bug that started with Acrobat 9 and Office 2007. If you used a plain border the border would convert nicely to a tagged/or untagged PDF. If you use anything but the simple straight line border, it went missing or got tagged as individual images without Alt text.

    I think that the bug that tags table gridlines as images without Alt Text or elements not in the structure tree of a tagged PDF might be associated with this since the gridlines of a table are “borders.”

    I’m not sure if it is a problem with Acrobat or Microsoft. I know both conversion tools use the PDF 1.8 specifications which are the latest ones so am thinking this is an issue with the DOCX environment because this happens in Office 2007 and later even if you use a DOC document.

    Although I don’t use VBA I do know this is an issue and I’ve bugged it with both PDF teams (Adobe and Microsoft) several times."

    Cindy Meister, VSTO/Word MVP, my blog

    Thursday, January 24, 2013 4:14 PM
    Moderator

All replies

  • Hi Michael,

    Thanks for posting in MSDN Forum.

    You mentioned:

     When I try COM interop via the old VB (using cscript.exe), the problem does not occur. 

    Why not try invoking the native COM interface directly?

    Have a good day,

    Tom


    Tom Xu [MSFT]

    Monday, January 21, 2013 7:02 AM
    Moderator
  • Hi Michael

    Have you tried Document.SaveAs and specifying the Format type for PDF, rather than ExportAsFixedFormat? SaveAs will come closer to what you get when working in Word as a user...


    Cindy Meister, VSTO/Word MVP, my blog

    Monday, January 21, 2013 4:34 PM
    Moderator
  • We can't use legacy VB because it has its own issues: some files cannot be converted at all, for some odd reason.
    Monday, January 21, 2013 5:26 PM
  • Unfortunately, we need to be able to save as PDF/A and to switch between print and screen resolution, and those options aren't available with SaveAs().
    Monday, January 21, 2013 5:28 PM
  • Please show us the details for:
    - the code giving you problems
    - what you mean by using the "old VB"
    - the steps you follow when saving "by hand".


    Cindy Meister, VSTO/Word MVP, my blog

    Tuesday, January 22, 2013 7:24 AM
    Moderator
  • Here it is. The VB code is complete; the VB.NET code shows the important parts. I feel uncomfortable pasting the full code (it's also too large for the forum), but I can give it to you via mail.

    The VB code (executed via cscript.exe):

    ' Arguments: source document path, target PDF path
    source = WScript.Arguments(0)
    target = WScript.Arguments(1)

    Set application = CreateObject("Word.Application")
    application.Visible = False
    Set document = application.Documents.Open(source)

    ExportFormat = 17 ' wdExportFormatPDF
    OpenAfterExport = False
    OptimizeFor = 0 ' wdExportOptimizeForPrint
    application.WordBasic.DisableAutoMacros(1)
    document.ExportAsFixedFormat target, ExportFormat, OpenAfterExport, OptimizeFor
    document.Close(0) ' wdDoNotSaveChanges

    application.Quit

    The VB.NET code (simplified):

    ' create app
    Dim application As word.Application = New word.Application

    application.Visible = False
    application.AutomationSecurity = MsoAutomationSecurity.msoAutomationSecurityForceDisable
    application.DisplayAlerts = word.WdAlertLevel.wdAlertsNone
    application.ScreenUpdating = False
    application.FileValidation = MsoFileValidationMode.msoFileValidationSkip
    StrictOffHelper.DisableAutoMacros(application)

    ' open doc
    Dim SourcePath As Object = source
    Dim ConfirmConversions As Object = False
    Dim OpenReadOnly As Object = True
    Dim AddToRecentFiles As Object = False
    Dim PasswordDocument As Object = System.Reflection.Missing.Value
    Dim PasswordTemplate As Object = System.Reflection.Missing.Value
    Dim Revert As Object = True
    Dim WritePasswordDocument As Object = System.Reflection.Missing.Value
    Dim WritePasswordTemplate As Object = System.Reflection.Missing.Value
    Dim Format As Object = System.Reflection.Missing.Value
    Dim Encoding As Object = System.Reflection.Missing.Value
    Dim Visible As Object = False
    Dim OpenAndRepair As Object = True
    Dim DocumentDirection As Object = System.Reflection.Missing.Value
    Dim NoEncodingDialog As Object = True
    Dim XMLTransform As Object = System.Reflection.Missing.Value

    ' export as pdf
    Dim OutputFileName = targetPath
    Dim ExportFormat = word.WdExportFormat.wdExportFormatPDF
    Dim OpenAfterExport = False
    Dim OptimizeFor = word.WdExportOptimizeFor.wdExportOptimizeForPrint
    If ("size" Like quality) Then
        OptimizeFor = word.WdExportOptimizeFor.wdExportOptimizeForOnScreen
    End If

    Dim Range = word.WdExportRange.wdExportAllDocument
    If (toPage IsNot Nothing) Then
        Range = word.WdExportRange.wdExportFromTo
    End If

    Dim PageFrom As Integer = 0
    If (fromPage IsNot Nothing) Then
        PageFrom = Integer.Parse(fromPage)
    End If

    Dim PageTo As Integer = 0
    If (toPage IsNot Nothing) Then
        PageTo = Integer.Parse(toPage)
    End If

    Dim Item = word.WdExportItem.wdExportDocumentWithMarkup
    Dim IncludeDocProps = True
    Dim KeepIRM = False
    Dim CreateBookmarks = word.WdExportCreateBookmarks.wdExportCreateNoBookmarks
    Dim DocStructureTags = True
    Dim BitmapMissingFonts = True
    Dim UseISO19005_1 = False
    Dim FixedFormatExtClassPtr As Object = System.Reflection.Missing.Value

    document.ExportAsFixedFormat(OutputFileName, ExportFormat, OpenAfterExport, OptimizeFor,
                                 Range, PageFrom, PageTo, Item, IncludeDocProps, KeepIRM,
                                 CreateBookmarks, DocStructureTags, BitmapMissingFonts,
                                 UseISO19005_1, FixedFormatExtClassPtr)
    
    
    

    Tuesday, January 22, 2013 11:33 AM
  • One major difference I see is that you're passing a lot more parameters in the VB.NET version than in the script version. VB.NET isn't C#; you should be able to let it do the same thing as the script version and pass only the parameters you actually want or need.

    If you tell me you need some of these additional parameters, I'd say plug them into your old script and see if that doesn't give you the same "bad" result.
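
    A minimal sketch of what "pass only the parameters you need" could look like in VB.NET, using named arguments so that everything else falls back to Word's defaults. It assumes the word import alias used in the code posted above; the ExportPdf helper and its pdfA parameter are hypothetical names introduced only for illustration.

    ```vbnet
    Imports word = Microsoft.Office.Interop.Word

    Module PdfExportSketch
        ' Hypothetical helper: exports with only the arguments that matter here.
        Sub ExportPdf(document As word.Document, targetPath As String,
                      quality As String, pdfA As Boolean)
            ' Print quality by default; screen quality when a small file is requested.
            Dim optimize = word.WdExportOptimizeFor.wdExportOptimizeForPrint
            If quality = "size" Then
                optimize = word.WdExportOptimizeFor.wdExportOptimizeForOnScreen
            End If

            ' Arguments that are not named here keep Word's default values.
            document.ExportAsFixedFormat(
                OutputFileName:=targetPath,
                ExportFormat:=word.WdExportFormat.wdExportFormatPDF,
                OpenAfterExport:=False,
                OptimizeFor:=optimize,
                UseISO19005_1:=pdfA)
        End Sub
    End Module
    ```

    Starting from a short call like this and adding one setting at a time makes it easier to see which setting changes the output.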


    Cindy Meister, VSTO/Word MVP, my blog

    Tuesday, January 22, 2013 8:30 PM
    Moderator
  • Hi Cindy,

    thanks for the hint; some more experimentation led me to the solution. I first thought that the underlying reason was a difference in the COM implementation, but it really was the different parameters.

    It turns out that this setting was the culprit:

    application.ScreenUpdating = False

    For some reason, Word won't render borders when screen updating is turned off. I guess I'll have to turn it back on again and accept the performance penalty that comes with it.

    Is there a way to file a bug report, or is this even a known bug?

    Best regards,

    Michael Böckling

    Thursday, January 24, 2013 11:03 AM
  • Hallo Michael

    It's interesting that ScreenUpdating affects this. I would never have thought that saving to PDF would involve reading the screen rendering, rather than simply "transforming" the content. But I suppose, when I stop and really think about it, PDF is a "snapshot", while Word is always "in flux" (recalculating the layout). That's why the option to turn off screen updating is there - to speed things up during macro execution.

    When I look at it that way it makes a certain amount of sense that Word first needs to render everything on-screen before it can generate the PDF file. It has to firmly say to itself "OK, this is where each line breaks, this is where the page breaks, so much space between this line and the next..."

    So, if Word performs a dynamic calculation of when/where to display borders I can see where this could cause problems when taking the "snapshot" of the pages in order to generate the PDF. To quote a well-known figure, "Fascinating".

    Tom could probably escalate a bug report, but based on the above reflections, I think it's not a bug, but a necessity, given how Word functions.

    You can always turn screen updating on just before generating the PDF, force a Repagination (just to be sure things are updated), save to PDF format, then turn screen updating off again. You will get a slight performance hit, but unless these documents are hundreds of pages it shouldn't be a huge problem?
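
    The sequence described above could be sketched roughly as follows. This is only an illustration under assumptions: it uses the word import alias from the code posted earlier, and the ExportWithScreenUpdating helper name is hypothetical. The thread only establishes that ScreenUpdating = False caused the missing borders; whether the repagination step is needed has not been verified.

    ```vbnet
    Imports word = Microsoft.Office.Interop.Word

    Module ScreenUpdatingSketch
        ' Hypothetical helper: re-enables screen updating only for the export.
        Sub ExportWithScreenUpdating(application As word.Application,
                                     document As word.Document,
                                     targetPath As String)
            ' Remember the current setting so it can be restored afterwards.
            Dim previous = application.ScreenUpdating
            application.ScreenUpdating = True
            Try
                ' Force a repagination so the layout is up to date before export.
                document.Repaginate()
                document.ExportAsFixedFormat(targetPath,
                                             word.WdExportFormat.wdExportFormatPDF)
            Finally
                ' Restore the setting even if the export throws.
                application.ScreenUpdating = previous
            End Try
        End Sub
    End Module
    ```

    The Try/Finally block restores the previous setting whether or not the export succeeds.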


    Cindy Meister, VSTO/Word MVP, my blog

    Thursday, January 24, 2013 1:30 PM
    Moderator
  • Hi Cindy,

    thanks for the explanation. I'll do just that; it's probably OK.

    Still, I'd file a bug report, because this behavior is definitely unexpected. If this cannot be fixed due to the way Word works, the devs can still reject the issue, but in my experience it is always good to let people know that there is a problem, because they often are not aware of it.

    Also, documenting this in the API docs for the ScreenUpdating property will help the next poor guy who hits this issue.

    Cheers,

    Michael

    Thursday, January 24, 2013 1:39 PM
  • Hallo Michael

    Documenting - ABSOLUTELY!

    Recently, when I had an analogous peeve where I'd spent hours trying to track down the reason for a behavior (it was a real buggy-bug in the Open XML SDK 2.0, fixed in 2.5, but undocumented) and complained, I was told to report suggestions for documentation here:
      http://social.msdn.microsoft.com/Forums/en-US/libraryfeedback/threads/


    Cindy Meister, VSTO/Word MVP, my blog

    Thursday, January 24, 2013 1:48 PM
    Moderator