Extract Text from 2007 and 2010 PPT and PPTX files using VSTO

  • Question

  • I’m trying to figure out a programmatic method to extract text from PowerPoint slide files. I have hundreds of files, and each file contains anywhere from 20 to 300 slides.


    The files are in both ppt and pptx formats, and the older ones can easily be converted if an XML-based extraction is required. That said, my knowledge of extracting data from an XML file using VSTO is very limited. I use VSTO 2010 and VB.


    The text I want to extract is not in the notes; that’s easy and I fully understand how to do that. It’s the embedded text in the shapes such as Titles and Content Placeholders.


    What I want to do is get this text into Word documents, without the shapes. I’m thinking about either a Word Add-in or a Template that I can run against each slide deck to produce a document. Conceptually, the idea is to do the reverse of what Word’s PresentIt macro does.


    Any help is much appreciated.

    Thursday, March 24, 2011 10:17 PM


All replies
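
The XML route mentioned in the question could look roughly like the sketch below. It is not VSTO or VB; it is a Python illustration using only the standard library, meant to show the structure that a VB port would have to walk. It assumes the decks are already in pptx format (a pptx file is a ZIP archive with one XML part per slide under ppt/slides/), and it collects the text of every DrawingML paragraph on each slide, which covers title and content placeholders as well as other text-bearing shapes. The function name extract_slide_text is hypothetical. One caveat: slides are sorted by the number in the part name, which usually but not always matches the presentation order.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; slide text lives in <a:t> runs inside <a:p> paragraphs.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"

# Matches slide parts only (not ppt/slides/_rels/... or ppt/notesSlides/...).
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def extract_slide_text(pptx_path):
    """Return [(slide_number, [paragraph_text, ...]), ...] for a pptx file.

    Hypothetical helper: slide_number is taken from the part name
    (slide1.xml, slide2.xml, ...), which is assumed to reflect slide order.
    """
    results = []
    with zipfile.ZipFile(pptx_path) as archive:
        slides = []
        for name in archive.namelist():
            match = SLIDE_RE.match(name)
            if match:
                slides.append((int(match.group(1)), name))

        for number, name in sorted(slides):
            root = ET.fromstring(archive.read(name))
            paragraphs = []
            for para in root.iter(A_NS + "p"):
                # A paragraph may be split into several runs; join them.
                text = "".join(t.text or "" for t in para.iter(A_NS + "t"))
                if text.strip():
                    paragraphs.append(text)
            results.append((number, paragraphs))
    return results
```

Calling extract_slide_text on a deck returns plain strings grouped by slide, with the speaker notes left out because they live in separate parts. Writing those strings into a Word document is a separate step that this sketch does not cover.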