How to convert .doc files to .docx in a SharePoint library programmatically

Question
-
Is it possible to convert .doc files to .docx in a SharePoint document library?
I have hundreds of thousands of .doc files and need to automate converting them to .docx, either with a PowerShell script or programmatically.
Can someone help me with this?
Thanks
Gayatri
- Changed type Hemendra Agrawal Tuesday, March 19, 2013 12:02 PM
Tuesday, March 19, 2013 5:41 AM
Answers
-
Hello Gayatri,
You can convert files from .doc to .docx using either of the following options.
Option 1 - Office File Converter
Convert in bulk using the Office File Converter (OFC) and the Version Extraction Tool. For reference, see http://technet.microsoft.com/en-us/library/cc179019.aspx
Option 2 - PowerShell
For reference, see http://blogs.msdn.com/b/ericwhite/archive/2008/09/19/bulk-convert-doc-to-docx.aspx
Convert DOC to DOCX using PowerShell
I was tasked with taking a large number of .DOC and .RTF files and converting them to .DOCX. The files were then going to be imported into a SharePoint site. So I went out on the web looking for PowerShell scripts to accomplish this. There are plenty to choose from.
All the examples on the web were the same with some minor modifications. Most of them followed this pattern:
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocumentDefault")
# Get the files
$folderpath = "c:\doclocation\*"
$fileType = "*.doc"
# The script block must start on the same line as ForEach-Object
Get-ChildItem -Path $folderpath -Include $fileType | ForEach-Object {
    $opendoc = $word.Documents.Open($_.FullName)
    # Full path without the .doc extension
    $savename = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
    $opendoc.SaveAs([ref]"$savename", [ref]$saveFormat)
    $opendoc.Close()
}
# Clean up
$word.Quit()
After trying out several, I started converting some test documents. All went well until the files were uploaded to SharePoint. The .RTF files were fine, but even though the .DOC files were now .DOCX files, they did not allow all of the .DOCX functionality to be used.
After investigating a little further, it turned out that a conversion from .DOC to .DOCX leaves the files in compatibility mode. The files are smaller, but they don't allow features such as co-authoring.
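One way to confirm that a converted file is still in compatibility mode is to read the document's CompatibilityMode property through the same kind of COM object. The following is a minimal sketch, not part of the original post. The function names Get-CompatibilityModeName and Get-DocxCompatibilityMode are hypothetical, the numeric values are those of Word's WdCompatibilityMode enumeration, and the property is assumed to be available (Word 2010 or later installed on the machine running the script).

```powershell
# Map a WdCompatibilityMode value to a readable name.
function Get-CompatibilityModeName {
    param([int]$Value)
    switch ($Value) {
        11      { 'Word 2003' }
        12      { 'Word 2007' }
        14      { 'Word 2010' }
        15      { 'Word 2013' }
        65535   { 'Current' }
        default { "Unknown ($Value)" }
    }
}

# Report the compatibility mode of every .docx file in a folder.
# Hypothetical helper; requires Word on the local machine.
function Get-DocxCompatibilityMode {
    param([string]$FolderPath)
    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    try {
        Get-ChildItem -Path (Join-Path $FolderPath '*') -Include '*.docx' | ForEach-Object {
            $doc = $word.Documents.Open($_.FullName)
            try {
                # Emit one object per file with its name and mode
                New-Object PSObject -Property @{
                    File = $_.Name
                    Mode = Get-CompatibilityModeName -Value $doc.CompatibilityMode
                }
            }
            finally {
                $doc.Close()
            }
        }
    }
    finally {
        # Always shut Word down, even if a file fails to open
        $word.Quit()
    }
}
```

Calling Get-DocxCompatibilityMode -FolderPath 'c:\doclocation' would list each file with its mode; a file reported as 'Word 2003' after conversion is still in compatibility mode.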
So it was back to the drawing board and the web, where I found a way to turn compatibility mode off. The problem was that it required more steps, including saving and reopening the files. To use this method I had to add a compatibility mode object:
$compatMode = [Enum]::Parse([Microsoft.Office.Interop.Word.WdCompatibilityMode], "wdWord2010")
And then change the code inside the {} from above to:
{
    $opendoc = $word.Documents.Open($_.FullName)
    $savename = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
    $opendoc.SaveAs([ref]"$savename", [ref]$saveFormat)
    $opendoc.Close()
    # $savename has no extension, so add .docx to find the saved file
    $converteddoc = Get-ChildItem "$savename.docx"
    $opendoc = $word.Documents.Open($converteddoc.FullName)
    $opendoc.SetCompatibilityMode($compatMode)
    $opendoc.Save()
    $opendoc.Close()
}
It worked, but I didn't like it. So back to the web again, and this time I stumbled across the real way to do it: use the Convert method. No one else seems to have used this in any of the examples, but it is a much cleaner way to do it than the compatibility mode setting. So this is how I changed my code, and now all the files come into SharePoint as true .DOCX files.
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocumentDefault")
# Get the files
$folderpath = "c:\doclocation\*"
$fileType = "*.doc"
Get-ChildItem -Path $folderpath -Include $fileType | ForEach-Object {
    $opendoc = $word.Documents.Open($_.FullName)
    $savename = ($_.FullName).Substring(0, ($_.FullName).LastIndexOf("."))
    # Convert is a method of the Document object, not the Application object,
    # so it is called on $opendoc; $word.Convert() fails as an unsupported method
    $opendoc.Convert()
    $opendoc.SaveAs([ref]"$savename", [ref]$saveFormat)
    $opendoc.Close()
}
# Clean up
$word.Quit()
- Edited by Piyush Bhatewara Tuesday, March 19, 2013 5:56 AM
- Proposed as answer by Ravi S Kulkarni Wednesday, March 20, 2013 3:30 PM
- Marked as answer by Kelly Chen 2012 Wednesday, March 27, 2013 1:40 AM
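For a batch as large as the one in the question, it may help to wrap the Convert-based approach so that one unreadable file does not stop the run and Word is always closed at the end. The following is a sketch under stated assumptions, not the original author's script. The function names Get-DocxPath and Convert-DocToDocx are hypothetical, the files are assumed to be reachable through a file system path, and the literals 16 and 0 are the numeric values of wdFormatDocumentDefault and wdDoNotSaveChanges, used so the sketch does not depend on the interop assembly being loaded.

```powershell
# Build the target path by swapping the extension for .docx.
function Get-DocxPath {
    param([string]$Path)
    [System.IO.Path]::ChangeExtension($Path, '.docx')
}

# Convert every .doc file in a folder to .docx using Document.Convert.
# Hypothetical helper; requires Word on the local machine.
function Convert-DocToDocx {
    param([string]$FolderPath)
    $wdFormatDocumentDefault = 16   # WdSaveFormat.wdFormatDocumentDefault
    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    try {
        Get-ChildItem -Path (Join-Path $FolderPath '*') -Include '*.doc' | ForEach-Object {
            $source = $_.FullName
            $target = Get-DocxPath -Path $source
            try {
                $doc = $word.Documents.Open($source)
                try {
                    # Convert is called on the document, not on $word
                    $doc.Convert()
                    $doc.SaveAs([ref]$target, [ref]$wdFormatDocumentDefault)
                }
                finally {
                    # 0 = wdDoNotSaveChanges; SaveAs has already written the .docx
                    $doc.Close([ref]0)
                }
                Write-Output "Converted: $target"
            }
            catch {
                # Report the failure and continue with the next file
                Write-Warning "Failed: $source ($($_.Exception.Message))"
            }
        }
    }
    finally {
        $word.Quit()
    }
}
```

A run over the placeholder folder from the post would be Convert-DocToDocx -FolderPath 'c:\doclocation'. The [ref] arguments follow the calling style of the scripts above; whether they are required can depend on the PowerShell and Office versions in use.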
Tuesday, March 19, 2013 5:54 AM
-
Hi Gayatri,
I understand that you want to convert .doc files to .docx programmatically. Word Automation Services, available with SharePoint Server 2010, supports converting Word documents to other formats, including conversion between document formats.
For more detailed information, please refer to
http://msdn.microsoft.com/en-us/library/office/ff181518(v=office.14).aspx
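As a rough illustration of what such a conversion job can look like from PowerShell, here is a minimal sketch. It is not taken from the linked article. The function names Get-DocxUrl and Start-DocLibraryConversion and all three parameters are hypothetical, the library is assumed to live in the root web of the site collection, and the script is assumed to run on a SharePoint Server 2010 machine where Word Automation Services is configured. It uses the ConversionJob and ConversionJobSettings classes from the Microsoft.Office.Word.Server.Conversions namespace.

```powershell
# Derive the output URL for a source document by swapping the extension.
function Get-DocxUrl {
    param([string]$Url)
    $Url -replace '\.doc$', '.docx'
}

# Queue every .doc file in a library for conversion to .docx.
# Hypothetical helper; all parameter values are supplied by the caller.
function Start-DocLibraryConversion {
    param(
        [string]$SiteUrl,         # URL of the site collection
        [string]$LibraryName,     # title of the document library
        [string]$ServiceAppName   # name of the Word Automation Services proxy
    )
    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
    [void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.Office.Word.Server')

    $site = Get-SPSite $SiteUrl
    try {
        $web = $site.RootWeb
        $list = $web.Lists[$LibraryName]

        # SaveFormat.Document requests .docx output
        $settings = New-Object Microsoft.Office.Word.Server.Conversions.ConversionJobSettings
        $settings.OutputFormat = [Microsoft.Office.Word.Server.Conversions.SaveFormat]::Document

        $job = New-Object Microsoft.Office.Word.Server.Conversions.ConversionJob($ServiceAppName, $settings)
        $job.UserToken = $site.UserToken

        foreach ($item in $list.Items) {
            if ($item.File -ne $null -and $item.File.Name -like '*.doc') {
                # SPFile.Url is relative to the web, so prefix the web URL
                $source = $web.Url + '/' + $item.File.Url
                $job.AddFile($source, (Get-DocxUrl -Url $source))
            }
        }
        # Start only queues the job; the conversion itself runs asynchronously
        $job.Start()
    }
    finally {
        $site.Dispose()
    }
}
```

Because the job is queued rather than run immediately, the converted files appear in the library only after the service has processed the job.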
Best Regards.
Kelly Chen
TechNet Community Support
- Marked as answer by Kelly Chen 2012 Wednesday, March 27, 2013 1:40 AM
Wednesday, March 20, 2013 6:17 AM
All replies
-
How do you get the Convert method? I'm stuck: $word.Convert() is reported as not a supported method, and Get-Member doesn't show it.
Tuesday, November 12, 2013 1:31 AM