Can we remove specific custom properties from a Word document using OpenXML?

  • Question

  • I have 10 custom properties in my Word document. From what I understand, these custom properties are saved in a custom XML part in the Word document structure. I need to remove 1 of the 10 properties from inside the custom XML part of the document. Is there a way to do this with OpenXML?

    So far I have been able to retrieve the values from the item2.xml file in the customXml folder of my document package (after renaming the .docx document file to .zip).
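
    The retrieval step described above (treating the .docx as a zip archive and reading the parts under customXml) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the Open XML SDK; the helper names and the tiny demo package are hypothetical stand-ins for a real document.

```python
import io
import re
import zipfile

# Matches customXml/item1.xml, customXml/item2.xml, ... but not itemProps2.xml.
ITEM_PART = re.compile(r"^customXml/item\d+\.xml$")


def list_custom_xml_parts(docx_bytes):
    """Return {part name: raw XML bytes} for every customXml/itemN.xml part."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        for name in package.namelist():
            if ITEM_PART.match(name):
                parts[name] = package.read(name)
    return parts


def build_demo_package():
    """Build a tiny in-memory zip that mimics the folder layout (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")
        package.writestr("customXml/item2.xml", "<schema/>")
        package.writestr("customXml/itemProps2.xml", "<props/>")
    return buffer.getvalue()


demo_parts = list_custom_xml_parts(build_demo_package())
print(sorted(demo_parts))
```

    For a real file, the bytes could come from `open(path, "rb").read()`; no renaming to .zip is needed because the archive is read directly.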

    Whenever I try to update the XML contents, the document properties get corrupted. Any help would be appreciated.

    Monday, January 27, 2014 12:12 PM

Answers

  • Hi Jonathan

    I think the first step would be to discuss in a Word IT Pro (TechNet site) or SharePoint group whether a document with a DIP defined in it is still linked to the SharePoint information. If it is, then what you need to do to remove this information might be more complex than simply editing these two XML files. You might first need to remove the link/reference to the SharePoint source where this information is coming from.

    The folks in those forums may also have an idea how such information should be removed (what tool/steps in Word or SharePoint) that will give some indication how you need to proceed.

    I, personally, have no in-depth knowledge in this area, although once you have some details I might be able to "guess you in the right direction".


    Cindy Meister, VSTO/Word MVP, my blog

    Friday, January 31, 2014 8:40 PM
    Moderator

All replies

  • Hi Jonathan

    Could you give us an example of the XML in the custom XML part, please, pointing out which node you're changing, what you're changing it to ("before and after")?

    Theoretically, yes, it should be possible.

    Are you working with the Open XML SDK, or something else? How are you doing "update the xml contents"?


    Cindy Meister, VSTO/Word MVP, my blog

    Monday, January 27, 2014 5:03 PM
    Moderator
  • Yes, I am using the OpenXML SDK.

    Here is a snippet of item2.xml from my document package (after renaming the .docx document file to .zip):

    <?xml version="1.0" encoding="utf-8"?><ct:contentTypeSchema ct:_="" ma:_="" ma:contentTypeName="Declaratory Judgement" ma:contentTypeID="0x01010003B664B5AF4E24438B8B98F46881831501010300B8C8BE1CBFEC0444945C468BED7E4904" ma:contentTypeVersion="4" ma:contentTypeDescription="Declaratory Judgement" ma:contentTypeScope="" ma:versionID="dafdecc0bbc90e1d83266e5158915392" xmlns:ct="http://schemas.microsoft.com/office/2006/metadata/contentType" xmlns:ma="http://schemas.microsoft.com/office/2006/metadata/properties/metaAttributes">
    <xsd:schema targetNamespace="http://schemas.microsoft.com/office/2006/metadata/properties" ma:root="true" ma:fieldsID="d22c4bce87e9df019a35f68fa52bb789" ns1:_="" ns2:_="" ns3:_="" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:p="http://schemas.microsoft.com/office/2006/metadata/properties" xmlns:ns1="http://schemas.microsoft.com/sharepoint/v3" xmlns:ns2="423a85b0-2999-4e28-a12b-c876db43ce13" xmlns:ns3="http://schemas.microsoft.com/sharepoint/v3/fields">
    <xsd:import namespace="http://schemas.microsoft.com/sharepoint/v3"/>
    <xsd:import namespace="423a85b0-2999-4e28-a12b-c876db43ce13"/>
    <xsd:import namespace="http://schemas.microsoft.com/sharepoint/v3/fields"/>
    <xsd:element name="properties">
    <xsd:complexType>
    <xsd:sequence>
    <xsd:element name="documentManagement">
    <xsd:complexType>
    <xsd:all>
    <xsd:element ref="ns1:StartDate" minOccurs="0"/>
    <xsd:element ref="ns3:_EndDate" minOccurs="0"/>
    <xsd:element ref="ns2:_dlc_DocIdPersistId" minOccurs="0"/>
    <xsd:element ref="ns2:d8a319a301464503853a1f15686d92e7" minOccurs="0"/>
    <xsd:element ref="ns2:_dlc_DocId" minOccurs="0"/>
    <xsd:element ref="ns2:ad45abc10d7a476e92e3cfc1172a5b27" minOccurs="0"/>
    <xsd:element ref="ns2:i2625e16cbd640328f293fa4b65e778b" minOccurs="0"/>
    <xsd:element ref="ns2:gb0f2de9e8dc4d81bbba9c650587c6b3" minOccurs="0"/>
    <xsd:element ref="ns2:_dlc_DocIdUrl" minOccurs="0"/>
    <xsd:element ref="ns2:p2c5fca70b5a4807aff8954b4c419536" minOccurs="0"/>
    <xsd:element ref="ns2:ff9d635419114d09a2b8ef683e30301b" minOccurs="0"/>
    <xsd:element ref="ns2:pefb9745303b441f83b92fdcbd9aff29" minOccurs="0"/>
    <xsd:element ref="ns2:k665fcf9399e41ceadc60abfdcd0788d" minOccurs="0"/>
    <xsd:element ref="ns2:TaxCatchAll" minOccurs="0"/>
    <xsd:element ref="ns2:TaxCatchAllLabel" minOccurs="0"/>
    <xsd:element ref="ns2:g4d4c69ea71d4d62b8ad04defc227425" minOccurs="0"/>
    </xsd:all>
    </xsd:complexType>
    </xsd:element>
    </xsd:sequence>
    </xsd:complexType>
    </xsd:element>
    </xsd:schema>
    <xsd:schema targetNamespace="http://schemas.microsoft.com/sharepoint/v3" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dms="http://schemas.microsoft.com/office/2006/documentManagement/types" xmlns:pc="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
    <xsd:import namespace="http://schemas.microsoft.com/office/2006/documentManagement/types"/>
    <xsd:import namespace="http://schemas.microsoft.com/office/infopath/2007/PartnerControls"/>
    <xsd:element name="StartDate" ma:index="10" nillable="true" ma:displayName="Start Date" ma:default="[today]" ma:format="DateOnly" ma:internalName="StartDate">
    <xsd:simpleType>
    <xsd:restriction base="dms:DateTime"/>
    </xsd:simpleType>
    </xsd:element>
    </xsd:schema>
    <xsd:schema targetNamespace="http://schemas.microsoft.com/sharepoint/v3/fields" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dms="http://schemas.microsoft.com/office/2006/documentManagement/types" xmlns:pc="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
    <xsd:import namespace="http://schemas.microsoft.com/office/2006/documentManagement/types"/>
    <xsd:import namespace="http://schemas.microsoft.com/office/infopath/2007/PartnerControls"/>
    <xsd:element name="_EndDate" ma:index="11" nillable="true" ma:displayName="End Date" ma:default="[today]" ma:format="DateTime" ma:internalName="_EndDate">
    <xsd:simpleType>
    <xsd:restriction base="dms:DateTime"/>
    </xsd:simpleType>
    </xsd:element>
    </xsd:schema>

    The xsd:element nodes with the funny names (listed alongside StartDate and _EndDate) are the properties from SharePoint.
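
    One way to make sense of those names is to read the ma:displayName attribute that the schema attaches to each element definition, as the StartDate and _EndDate definitions above do. The following is a rough sketch in Python (standard library only); the embedded fragment is a trimmed copy of the snippet above, and `display_names` is a hypothetical helper, not part of any SDK.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
MA = "http://schemas.microsoft.com/office/2006/metadata/properties/metaAttributes"

# Trimmed-down copy of the schema part shown above (most attributes omitted).
SCHEMA_FRAGMENT = """\
<ct:contentTypeSchema
    xmlns:ct="http://schemas.microsoft.com/office/2006/metadata/contentType"
    xmlns:ma="http://schemas.microsoft.com/office/2006/metadata/properties/metaAttributes"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:schema targetNamespace="http://schemas.microsoft.com/sharepoint/v3">
    <xsd:element name="StartDate" nillable="true" ma:displayName="Start Date"/>
  </xsd:schema>
  <xsd:schema targetNamespace="http://schemas.microsoft.com/sharepoint/v3/fields">
    <xsd:element name="_EndDate" nillable="true" ma:displayName="End Date"/>
  </xsd:schema>
</ct:contentTypeSchema>
"""


def display_names(schema_xml):
    """Map each defined element name to its ma:displayName, where one is present."""
    root = ET.fromstring(schema_xml)
    names = {}
    for element in root.iter(f"{{{XSD}}}element"):
        name = element.get("name")
        label = element.get(f"{{{MA}}}displayName")
        # Elements that only carry ref="..." have no name/displayName and are skipped.
        if name and label:
            names[name] = label
    return names


print(display_names(SCHEMA_FRAGMENT))
```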

    The values are stored in another file called item4.xml as shown below:

    <?xml version="1.0" encoding="utf-8"?>
    <p:properties xmlns:p="http://schemas.microsoft.com/office/2006/metadata/properties" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:pc="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
      <documentManagement>
        <ad45abc10d7a476e92e3cfc1172a5b27 xmlns="423a85b0-2999-4e28-a12b-c876db43ce13">
          <Terms xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
            <TermInfo xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
              <TermName>12345</TermName>
              <TermId>fdfbf1ae-e0f4-4aba-999c-4e3895d0f31b</TermId>
            </TermInfo>
          </Terms>
        </ad45abc10d7a476e92e3cfc1172a5b27>
        <k665fcf9399e41ceadc60abfdcd0788d xmlns="423a85b0-2999-4e28-a12b-c876db43ce13">
          <Terms xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls"></Terms>
        </k665fcf9399e41ceadc60abfdcd0788d>
        <p2c5fca70b5a4807aff8954b4c419536 xmlns="423a85b0-2999-4e28-a12b-c876db43ce13">
          <Terms xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
            <TermInfo xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls">
              <TermName>Test City</TermName>
              <TermId>e40dcf7a-d011-4875-b24d-3e40f314852c</TermId>
            </TermInfo>
          </Terms>
        </p2c5fca70b5a4807aff8954b4c419536>
        <pefb9745303b441f83b92fdcbd9aff29 xmlns="423a85b0-2999-4e28-a12b-c876db43ce13">
          <Terms xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls"></Terms>
        </pefb9745303b441f83b92fdcbd9aff29>
        <g4d4c69ea71d4d62b8ad04defc227425 xmlns="423a85b0-2999-4e28-a12b-c876db43ce13">
          <Terms xmlns="http://schemas.microsoft.com/office/infopath/2007/PartnerControls"></Terms>
        </g4d4c69ea71d4d62b8ad04defc227425>
        <_EndDate xmlns="http://schemas.microsoft.com/sharepoint/v3/fields">2014-01-21T01:50:31+00:00</_EndDate>
        <TaxCatchAll xmlns="423a85b0-2999-4e28-a12b-c876db43ce13"/>
        <StartDate xmlns="http://schemas.microsoft.com/sharepoint/v3">2014-01-21T01:50:31+00:00</StartDate>
      </documentManagement>
    </p:properties>

    I'm updating item2.xml to remove the xsd:element nodes, and in item4.xml I'm setting the nodes under the documentManagement node to null using xsi:nil="true". This is causing the document properties to get corrupted.
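
    For reference, the two edits to the values part mentioned above (marking a node xsi:nil="true", or dropping it entirely) could look roughly like this in Python with the standard library. It is only a sketch of the XML manipulation: the function names are hypothetical, the sample is a shortened copy of item4.xml, and nothing here checks whether Word or SharePoint accepts the result.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
FIELDS = "http://schemas.microsoft.com/sharepoint/v3/fields"
V3 = "http://schemas.microsoft.com/sharepoint/v3"

# Shortened copy of the values part shown above.
VALUES_XML = """\
<p:properties xmlns:p="http://schemas.microsoft.com/office/2006/metadata/properties"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <documentManagement>
    <_EndDate xmlns="http://schemas.microsoft.com/sharepoint/v3/fields">2014-01-21T01:50:31+00:00</_EndDate>
    <StartDate xmlns="http://schemas.microsoft.com/sharepoint/v3">2014-01-21T01:50:31+00:00</StartDate>
  </documentManagement>
</p:properties>
"""


def find_property(root, namespace, local_name):
    """Return (container, property element); either may be None if missing."""
    container = root.find("documentManagement")  # this node has no namespace
    if container is None:
        return None, None
    return container, container.find(f"{{{namespace}}}{local_name}")


def set_nil(root, namespace, local_name):
    """Empty the property and mark it xsi:nil="true"; return True if found."""
    _, prop = find_property(root, namespace, local_name)
    if prop is None:
        return False
    for child in list(prop):
        prop.remove(child)
    prop.text = None
    prop.set(f"{{{XSI}}}nil", "true")
    return True


def remove_property(root, namespace, local_name):
    """Delete the property element from documentManagement; return True if found."""
    container, prop = find_property(root, namespace, local_name)
    if prop is None:
        return False
    container.remove(prop)
    return True


root = ET.fromstring(VALUES_XML)
set_nil(root, FIELDS, "_EndDate")
remove_property(root, V3, "StartDate")
# Note: ElementTree may rewrite namespace prefixes when serializing (e.g. p: becomes ns0:).
print(ET.tostring(root, encoding="unicode"))
```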


    Tuesday, January 28, 2014 4:05 AM
  • Hi Jonathan

    In my book, SharePoint properties do not equate to "custom properties". I consider the latter to be what appears in the "advanced" interface of the Properties dialog/backstage in the Office UI.

    This appears to be the source of a Document Information Panel?

    I'm missing the error message you get that indicates "the document properties [are] corrupted".

    You also did not show how the XML looks AFTER you've changed it. But more importantly, given the source of these properties, WHY are you doing what you're doing? I can see little purpose to altering the SCHEMA information in item2.xml. That simply tells consuming software how to read information from an XML file (item4.xml?).


    Cindy Meister, VSTO/Word MVP, my blog

    Tuesday, January 28, 2014 8:57 AM
    Moderator
  • Hello Cindy,

    You are correct, these properties are the source of the document information panel.

    Regarding why I am trying to remove these properties:

    I am trying to remove specific SharePoint metadata from the document before sending it to someone.

    The recipient should not be able to update or see the values of these properties, since they are of a sensitive nature.

    I have not been able to update the properties successfully, so I was asking how to remove them and whether this is even possible.
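
    At the package level, producing a copy of the document with one part swapped out might be sketched as follows. This is Python with the standard library rather than the Open XML SDK, `replace_part` is a hypothetical helper, and it only rewrites the archive; it does not deal with any link back to SharePoint or with other parts that may reference the changed one.

```python
import io
import zipfile


def replace_part(docx_bytes, part_name, new_content):
    """Return new package bytes where part_name holds new_content.

    All other entries are copied unchanged, in their original order.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as source:
        names = source.namelist()
        if part_name not in names:
            raise KeyError(part_name)
        output = io.BytesIO()
        with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
            for name in names:
                data = new_content if name == part_name else source.read(name)
                target.writestr(name, data)
    return output.getvalue()


# Demo with a tiny in-memory archive (hypothetical content, not a valid .docx).
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as package:
    package.writestr("word/document.xml", "<document/>")
    package.writestr("customXml/item4.xml", "<p>secret</p>")

updated = replace_part(original.getvalue(), "customXml/item4.xml", b"<p/>")
with zipfile.ZipFile(io.BytesIO(updated)) as package:
    print(package.read("customXml/item4.xml"))
```

    A zip archive cannot be edited in place with this module, which is why the sketch writes a fresh archive instead of modifying the original.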


    Friday, January 31, 2014 1:26 PM
  • Thanks Cindy,

    I will ask the question in the respective forums and hopefully find an answer.

    Thanks for the support.

    Monday, February 3, 2014 4:08 AM