Bulk Convert Doc Docx Approaches - OMPM/Office 2007 Format, Word.Interop.Document.Convert 2010 Format
22 July 2012 7:55
The OMPM overview states that documents are converted to the Office 2010 format:
"Store the scan data, and convert older Office files into the Office 2010 file formats"
However, the OFC tool states that documents are only upgraded to the Office 2007 format, and indeed converted files open in compatibility mode.
The benefit of OMPM as I see it is the analysis and reporting as it helps identify conversion issues.
It seems that with some simple Word automation code though, documents can go straight to 2010.
(It might even be as simple as using one of the SaveAs2 overloads and specifying to save as the current version).
Is this approach any less reliable than OMPM?
23 July 2012 5:53 Moderator
Thanks for posting in the MSDN Forum.
It seems that you forgot to clarify the meaning of OFC here. What does it stand for? As far as I know, Word 2007 and Word 2010 both use the Open XML format, and both use the .docx extension too.
Have a good day,
Tom Xu [MSFT]
MSDN Community Support | Feedback to us
23 July 2012 13:25
Hi Tom, you will need to follow the links to get a little more context for the question I am asking.
OMPM = Office Migration and Planning Manager
OFC = Office File Converter
Word 2010 and 2007 formats are based on Open XML.
My questions are:
Why can't OFC go to the 2010 format?
Is using the Word.Convert method any less reliable than the OFC conversion tool?
1 August 2012 19:23
For Word 2010, the schema for the OOXML DOCX definition was changed. This is why a 2007 DOCX file opens in compatibility mode in Word 2010. I believe this change applied only to the Word OOXML document formats and not to the other Office formats, so it should not be an issue for the other Office applications. OMPM does not apply this change when it converts Word documents; it was not included in the tool.
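One way to see which flavour of DOCX a converter produced is to look inside the package. The following is a minimal sketch, assuming that Word 2010 records its mode as a `compatSetting` element named `compatibilityMode` (value 14) in `word/settings.xml`, and that a file without that setting is treated as Word 2007 mode. The function name `docx_compat_mode` is made up for this illustration, and only the transitional (non-strict) namespace is handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (transitional OOXML).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_compat_mode(source):
    """Return the compatibilityMode value stored in a .docx, or None if absent.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration; not part of OMPM or Word.
    """
    with zipfile.ZipFile(source) as package:
        try:
            settings_xml = package.read("word/settings.xml")
        except KeyError:
            # No settings part at all, so no mode is recorded.
            return None

    root = ET.fromstring(settings_xml)
    for setting in root.iter(f"{{{W_NS}}}compatSetting"):
        if setting.get(f"{{{W_NS}}}name") == "compatibilityMode":
            return int(setting.get(f"{{{W_NS}}}val"))
    return None
```

Under the assumption above, a result of `None` for an OFC-converted file and `14` for a file saved from Word 2010 would match the behaviour described in this thread.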
The method that you propose, using the object model's objects and methods, is a perfectly good approach to a mass update of these files. Of course, I would always recommend taking a good backup of your files before running any change like that, or converting and then saving under a new name, just to be safe.
7 August 2012 9:05
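As a rough sketch of that approach (save under a new name, leave the originals untouched), the following uses Python with the pywin32 package to drive Word. It assumes Windows, Word 2010 or later, and pywin32 are installed. The function names `plan_conversions` and `convert_all` are hypothetical. The `SaveAs2` call relies on its `CompatibilityMode` parameter with `wdWord2010` (14); calling `Document.Convert` before saving is the alternative mentioned in the thread title.

```python
from pathlib import Path

# Word enumeration values, written out because late-bound COM has no constants.
WD_FORMAT_XML_DOCUMENT = 12  # WdSaveFormat.wdFormatXMLDocument (.docx)
WD_WORD_2010 = 14            # WdCompatibilityMode.wdWord2010


def plan_conversions(source_dir, output_dir):
    """Pair each .doc/.docx under source_dir with a new .docx path in output_dir.

    output_dir should live outside source_dir so that outputs are not picked
    up as inputs. Note that a.doc and a.docx would map to the same target.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    pairs = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file() or src.name.startswith("~$"):
            continue  # skip folders and Word's lock files
        if src.suffix.lower() in {".doc", ".docx"}:
            target = output_dir / src.relative_to(source_dir).with_suffix(".docx")
            pairs.append((src, target))
    return pairs


def convert_all(pairs):
    """Open each source in Word and save a Word 2010 mode copy at the target."""
    import win32com.client  # imported lazily so planning works without Word

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    word.DisplayAlerts = 0  # wdAlertsNone
    try:
        for src, target in pairs:
            target.parent.mkdir(parents=True, exist_ok=True)
            doc = word.Documents.Open(
                str(src.resolve()), ConfirmConversions=False, ReadOnly=True
            )
            try:
                doc.SaveAs2(
                    str(target.resolve()),
                    FileFormat=WD_FORMAT_XML_DOCUMENT,
                    CompatibilityMode=WD_WORD_2010,
                )
            finally:
                doc.Close(SaveChanges=0)  # wdDoNotSaveChanges
    finally:
        word.Quit()
```

A run would look like `convert_all(plan_conversions(r"C:\docs", r"C:\docs_2010"))`, where both paths are placeholders. Reusing one Word instance for the whole batch avoids paying the application start-up cost per file.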
Assuming Kieren has read the links he posted, he will understand the Word 2007 DOCX format is different from the Word 2010 DOCX.
The question remains as to why Microsoft have not bothered updating their Office Migration Planning Manager. They took the time to write an article targeted at Word/Office 2010 users.
Since the binary (.doc) file format is an open specification, as is the 2010 XML (.docx) format, most programmers could write their own file converter. Hell, I'll do it if you pay me. But it would probably take several weeks, unless you're already intimately familiar with those formats. Had Microsoft done it, it would have saved hundreds or thousands of developers a lot of time.
Yes, you can do it through COM interop (the Word Object Model); the main issue there is how slow it is. Which is 'very'. It is also not recommended for server-side processing.
- Edited by JosephFox 7 August 2012 9:34 Previously I estimated it would take 'several days' to write one's own file converter. I think that was being optimistic.
28 September 2012 3:47
I automated this using VSTO.
There wasn't too much code to write, and it was still quite fast, but as others have pointed out, it seems quite short-sighted that Office Migration Planning Manager 2010 doesn't convert straight to the 2010 format.