Word-generated tagged PDFs do not have correct structure for paragraphs split across page boundaries


  • We're post-processing PDFs generated by Office 2007 and up.  With the "Document structure tags for accessibility" option checked, Word does a pretty good job of describing the document structure in the PDF structure tree.

    However, if a paragraph splits across pages, the structure tree records it as two separate paragraphs with apparently nothing to tie them together.  Adobe's Acrobat plug-in for Word (version 9, the latest I have) instead uses something called a Marked Content Reference to tie the second part of the paragraph to the first (see, page 732).

    Does anyone know if there is a way (either through the UI or programmatically) to generate the correct PDF tagging for split paragraphs in Word?
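
To make the difference concrete, here is a minimal Python sketch of the two outcomes described above. It does not read a real PDF. The `StructElem` and `MarkedContentRef` classes, the page numbers, and the MCID values are all hypothetical stand-ins, assuming only that a paragraph element can hold several marked-content references and that each reference names its own page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class MarkedContentRef:
    """Hypothetical stand-in for a marked-content reference: page + MCID."""
    page: int
    mcid: int


@dataclass
class StructElem:
    """Hypothetical stand-in for a structure element such as a paragraph (P)."""
    tag: str
    kids: List[MarkedContentRef] = field(default_factory=list)

    def pages(self):
        # Distinct pages on which this element has content, in order.
        return sorted({k.page for k in self.kids})


def spans_pages(elem):
    """True if one structure element has content on more than one page."""
    return len(elem.pages()) > 1


def summarize(tree):
    """List each element's tag with the pages its content sits on."""
    return [(e.tag, e.pages()) for e in tree]


# What we see from Word: two unrelated P elements, one per page.
split_tree = [
    StructElem("P", [MarkedContentRef(page=1, mcid=7)]),
    StructElem("P", [MarkedContentRef(page=2, mcid=0)]),
]

# What we would like: a single P whose references cover both pages.
joined_tree = [
    StructElem("P", [MarkedContentRef(page=1, mcid=7),
                     MarkedContentRef(page=2, mcid=0)]),
]

print(summarize(split_tree))   # [('P', [1]), ('P', [2])]
print(summarize(joined_tree))  # [('P', [1, 2])]
```

In the first tree nothing links the two halves, so a post-processor cannot tell a split paragraph from two genuine paragraphs. In the second, `spans_pages` is enough to spot the continuation.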

    Tuesday, July 09, 2013 7:17 PM

All replies