Code/Property to get original version of MS Word file

  • Question

  • Our organization has moved to Office 2013, and we use MS Word to generate documents that are saved in our DB. Word 2013 has a 'File Block' security feature, and under the settings our security department enforces we are not allowed to use Word 97 and older documents or templates. This causes a problem in some cases, because we do have older Word documents and templates in the DB. We will be converting our documents and templates to Word 2013; however, I was wondering if there is code out there that will determine the original version of an MS Word file.

    There might be one problem: as currently designed, the BLOB/LONG RAW column in the table contains only Word documents, and the file extension was not stored when a document was originally added to the DB, so it is possible that there are already .docx files in there. Since I currently don't have code that checks which version of Word a document was originally created in, I pull the document out of the DB, save it as .DOC and then convert it to .DOCX. I would like to skip the conversion step for a file that is already .DOCX.

    Question:

    Is there a way to check the original version of a file that has no file extension? For example, C:\test\testworddocument.

    I have tried one thing: I changed the file extension of a .DOC and a .DOCX file to .TXT and read the raw contents. I find Word.Document.8 in the .DOC and word/document.xml in the .DOCX, but nothing that states the original version.

    I am writing a console app in C#.

    Any help is much appreciated.

    Thank you.


    William

    Friday, May 27, 2016 2:54 PM

Answers

  • The version information you need is the SaveFormat, not the version of Word that created the file. After all, Word 2007 & later can save to both the .doc (97-2003) format and .docx format; even Word 2003 can save to the docx format.

    The Word 97-2003 SaveFormat is the same for all Word versions from Word 97 onwards: 0 (wdFormatDocument / wdFormatDocument97) for documents, or 1 (wdFormatTemplate / wdFormatTemplate97) for templates. Word 95 & earlier used different formats that you now need a converter to open.

    Reading the first two bytes of the file simply means you don't have to open it to find out whether the SaveFormat is any of the values 12-15. Either they're PK (for docx/m & dotx/m files) or they're not (for anything else). Since you say you don't want to have to open docx/m files, testing the first two bytes is the easiest way.
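
    As a minimal sketch of that two-byte test in C#, the check below works directly on the bytes pulled from the BLOB column, so nothing has to be saved to disk or opened in Word first. The class and method names are hypothetical, and the byte arrays in Main are stand-ins for real column data. Note that any ZIP archive starts with PK, so this only suffices because the column is said to hold nothing but Word documents.

```csharp
using System;

public static class WordBlobSniffer
{
    // Returns true when the data starts with the ASCII letters "PK" (0x50 0x4B),
    // the marker that docx/docm/dotx/dotm files share with every ZIP archive.
    public static bool StartsWithPk(byte[] data)
    {
        return data != null
            && data.Length >= 2
            && data[0] == 0x50   // 'P'
            && data[1] == 0x4B;  // 'K'
    }

    public static void Main()
    {
        // Stand-in bytes for illustration only; real input would be the BLOB contents.
        byte[] zipLike = { 0x50, 0x4B, 0x03, 0x04 };
        byte[] other = { 0xD0, 0xCF, 0x11, 0xE0 };

        Console.WriteLine(StartsWithPk(zipLike)); // True
        Console.WriteLine(StartsWithPk(other));   // False
    }
}
```

    A caller could skip the .DOC-to-.DOCX conversion whenever StartsWithPk returns true and convert everything else.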


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Friday, June 3, 2016 1:18 PM

All replies

  • If you examine the binary code of a Word 2007 or later document, it will always start with the letters PK - as will any ZIP archive, because that's what docx/m & dotx/m files are.

    Cheers
    Paul Edstein
    [MS MVP - Word]

    Saturday, May 28, 2016 12:01 AM
  • Hi MACH9,

    You can make use of ActiveDocument.SaveFormat in your code.

    It returns the integer value of the document's format, which you can use to check the format in your code.

    If ActiveDocument.SaveFormat = wdFormatDocument97 Then
        ' code to run for a Word 97-2003 document
    Else
        ' code to run for any other format
    End If

    The following links may also help you with your issue:

    How to determine the version of Word a Word Doc was saved in

    Determine version of Word that document was created in


    Regards

    Deepak




    Monday, May 30, 2016 3:11 AM
    Moderator
  • Thank you, will try that.

    William

    Monday, May 30, 2016 12:47 PM
  • Thank you.

    William

    Monday, May 30, 2016 12:47 PM
  • Can anyone also confirm that the Word version for 2013 would be the same as the Office version (I would assume yes, but that got me in trouble before):

    Office 95   -  7.0
    Office 97   -  8.0
    Office 2000 -  9.0
    Office XP   - 10.0
    Office 2003 - 11.0
    Office 2007 - 12.0
    Office 2010 - 14.0 (sic!)
    Office 2013 - 15.0
    Office 2016 - 16.0

    Thanks


    William

    Monday, May 30, 2016 2:49 PM
  • A simple Google search for Office versions:

    https://en.wikipedia.org/wiki/Microsoft_Office


    Best regards, George

    Monday, May 30, 2016 3:03 PM
  • Thanks George.

    William

    Monday, May 30, 2016 3:20 PM
  • Hi MACH9,

    Is your issue solved? If so, please mark the suggestion that helped you solve the problem as an Answer.

    If you are still having a problem, please let us know so that we can try to help you further.

    Regards

    Deepak



    Thursday, June 2, 2016 6:22 AM
    Moderator
  • Deepak,

    I tried your code above, however if you look at the return values for that property it is not specific to each version of Office.

    https://msdn.microsoft.com/en-us/library/bb238158(v=office.12).aspx

    Versions of Office:

    https://en.wikipedia.org/wiki/Microsoft_Office

    If ActiveDocument.SaveFormat = wdFormatDocument97 Then
        ' your code that you want to execute
    Else
        ' your code that you want to execute
    End If

    I would be looking for something like this (my code below):

    _Application myWordApp = new Application();
    _Document myDoc = myWordApp.Documents.Open(FileName: filename, ReadOnly: false);

    // Something like one of these (wished-for properties, not real ones):
    string version = myDoc.OriginalVersion;
    // or
    // string version = myDoc.Version;

    // where version would be 15, 14, 12, 11, ...

    myDoc.Close();
    myWordApp.Quit();

    I need to know the exact version of MS Word the file is in.

    Maybe I am missing something.

    Thanks


    William

    Friday, June 3, 2016 10:56 AM
  • Microsoft Word does not store details of the Word version that created the file. In any event, what you need (if you're going to open the file) is the SaveFormat. Valid WdSaveFormat values are:

    Name                             Value  Description
    wdFormatDocument                 0      Microsoft Word format.
    wdFormatDocument97               0      Microsoft Word 97 document format.
    wdFormatTemplate                 1      Word template format.
    wdFormatTemplate97               1      Word 97 template format.
    wdFormatText                     2      Microsoft Windows text format.
    wdFormatTextLineBreaks           3      Windows text format with line breaks preserved.
    wdFormatDOSText                  4      Microsoft DOS text format.
    wdFormatDOSTextLineBreaks        5      Microsoft DOS text with line breaks preserved.
    wdFormatRTF                      6      Rich text format (RTF).
    wdFormatEncodedText              7      Encoded text format.
    wdFormatUnicodeText              7      Unicode text format.
    wdFormatHTML                     8      Standard HTML format.
    wdFormatWebArchive               9      Web archive format.
    wdFormatFilteredHTML             10     Filtered HTML format.
    wdFormatXML                      11     Extensible Markup Language (XML) format.
    wdFormatXMLDocument              12     XML document format.
    wdFormatXMLDocumentMacroEnabled  13     XML document format with macros enabled.
    wdFormatXMLTemplate              14     XML template format.
    wdFormatXMLTemplateMacroEnabled  15     XML template format with macros enabled.
    wdFormatDocumentDefault          16     Word default document file format. For Word 2010, this is the DOCX format.
    wdFormatPDF                      17     PDF format.
    wdFormatXPS                      18     XPS format.

    Values 12-15 are for docx/m & dotx/m files. The approach I suggested doesn't require you to open the document to get the details you need. If the file data begin with the letters PK, it has to be one of the docx/m or dotx/m formats.
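
    For a C# console app, the grouping described here could be sketched as below. It takes a plain integer, such as the one ActiveDocument.SaveFormat returns, so it runs without Word installed. The class name, method names and description strings are hypothetical; only the numeric values come from the list in this post.

```csharp
using System;

public static class SaveFormatHelper
{
    // True for the values 12-15, which this post assigns to docx/m and dotx/m files.
    public static bool IsOpenXml(int saveFormat)
    {
        return saveFormat >= 12 && saveFormat <= 15;
    }

    // Maps the values relevant to the .doc / .docx question to a short label.
    public static string Describe(int saveFormat)
    {
        switch (saveFormat)
        {
            case 0: return "Word 97-2003 document";
            case 1: return "Word 97-2003 template";
            case 12: return "XML document (docx)";
            case 13: return "XML document with macros (docm)";
            case 14: return "XML template (dotx)";
            case 15: return "XML template with macros (dotm)";
            default: return "Other format (value " + saveFormat + ")";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(0));   // Word 97-2003 document
        Console.WriteLine(IsOpenXml(0));  // False
        Console.WriteLine(Describe(12));  // XML document (docx)
        Console.WriteLine(IsOpenXml(12)); // True
    }
}
```

    This only distinguishes file formats, not the Word version that wrote the file, which matches the point made at the top of this post.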


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Friday, June 3, 2016 11:24 AM
  • Paul,

    Thanks for your response but I am a bit perplexed as to why there isn't a property that indicates what version the current document is in.

    My requirement is, I need to know if the Word file is 95, 97, 2000, 2003, 2007, 2010 or 2013.

    A range of values doesn't help me.

    If I read the file as text or into a string array, could I find the exact version number?

    For example, I know Word 97 is Word.Document.8. Would the other versions follow suit in the same manner: Word.Document.9 = 2000, Word.Document.11 = 2003?

    For 2007 and above the identifier is PK or word/document.xml. Would 2010 and above be any different, or would they still be identified as word/document.xml, or as something like word/document.xml.2010 or word/document.xml.2013?

    Thanks


    William

    Friday, June 3, 2016 12:09 PM
  • Paul,

    How does Word 2013 identify a document or template as 6.0, 95, 2000, 2003, 2007 in the file block settings?

    One would assume it has to know which version of Word the file is in, in order to block the file or template.

    Thanks


    William

    Friday, June 3, 2016 12:29 PM
  • Hi MACH9,

    I think the suggestion given by macropod properly answers your question.

    I agree with what he said in his post above: "Microsoft Word does not store details of the Word version that created the file."

    Did you get a solution from his post?

    If you think it solves your issue, please mark it as an Answer.

    Regards

    Deepak




    Monday, June 6, 2016 7:34 AM
    Moderator
  • The suggestion given by John is sufficient at this point, and I will follow his recommendations.

    I will post the code when I develop it.

    Thanks for everyone's help on this matter.

    William

    Monday, June 6, 2016 11:47 AM
  • John,

    I have one last question, since we don't store the file extension for the Word document (will add this to our program), can I read the file without giving it a file extension?

    ex. C:\test\TestFile

    Thanks


    William

    Monday, June 6, 2016 11:53 AM
  • The presence or absence of an extension makes no difference to testing the file's first two bytes.
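
    To illustrate, here is a rough C# sketch that reads the first two bytes of a file whose name has no extension. The class name, method names and temp-file name are hypothetical, and the demo writes its own small stand-in file rather than a real Word document.

```csharp
using System;
using System.IO;

public static class FileSignature
{
    // Reads up to `count` bytes from the start of the file.
    // The path is used as given; no extension is required.
    public static byte[] ReadHeader(string path, int count)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] buffer = new byte[count];
            int total = 0;
            int read;
            // Stream.Read may return fewer bytes than requested, so loop until done.
            while (total < count && (read = stream.Read(buffer, total, count - total)) > 0)
            {
                total += read;
            }
            Array.Resize(ref buffer, total);
            return buffer;
        }
    }

    // True when the file starts with the letters "PK" (0x50 0x4B).
    public static bool StartsWithPk(string path)
    {
        byte[] header = ReadHeader(path, 2);
        return header.Length == 2 && header[0] == 0x50 && header[1] == 0x4B;
    }

    public static void Main()
    {
        // Stand-in file with no extension, created only for this demo.
        string path = Path.Combine(Path.GetTempPath(), "TestFile");
        File.WriteAllBytes(path, new byte[] { 0x50, 0x4B, 0x03, 0x04 });

        Console.WriteLine(StartsWithPk(path)); // True

        File.Delete(path);
    }
}
```

    Only two bytes are read, so the check stays cheap even for large documents.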

    Cheers
    Paul Edstein
    [MS MVP - Word]

    Tuesday, June 7, 2016 2:38 AM
  • Hi Paul,

    Maybe I should have asked this before I actually started building my own file/template conversion tool: does MS have a Word file and template conversion tool?

    I have found this:

    Convert binary Office files by using the Office File Converter (OFC) and Version Extraction Tool (VET)

    https://technet.microsoft.com/en-us/library/cc179019(v=office.14).aspx

    Thanks


    William


    • Edited by MACH9 Tuesday, June 7, 2016 4:21 PM
    Tuesday, June 7, 2016 4:20 PM
  • You can use the conversion tool or Word itself.

    Cheers
    Paul Edstein
    [MS MVP - Word]

    Tuesday, June 7, 2016 10:15 PM
  • Hi MACH9,

    I would recommend that you create a new thread for this issue instead of continuing in a thread that is already closed.

    Please ask the new question in a new thread.

    Thanks for your understanding.

    Regards

    Deepak



    Wednesday, June 8, 2016 4:22 AM
    Moderator
  • Ok, will do.

    Thanks for everyone's help on this.

    William

    Thursday, June 9, 2016 2:23 PM
  • You could also look at the CompatibilityMode property of the file. There is an enum (Microsoft.Office.Interop.Word.WdCompatibilityMode) that lists the different versions of documents.
    Tuesday, June 14, 2016 6:54 PM
  • "You could also look at the CompatibilityMode property of the file. There is an enum (Microsoft.Office.Interop.Word.WdCompatibilityMode) that lists the different versions of documents."

    The CompatibilityMode property of a document does NOT return the Word or document version. It only returns the compatibility setting of a document saved in Word 2010 or later, not the save format, and if the value is wdCurrent it could apply to any Word version. In any event, you need to open the document to read it.

    Cheers
    Paul Edstein
    [MS MVP - Word]

    Tuesday, June 14, 2016 11:18 PM