Document.XML parse with XSLT

  • Question

  • Hi,

    I converted a Word 2003 document containing custom XML to a Word 2007 document, and I want to retrieve the data. Because the document was converted by Office 2007, there is no customXml folder in my .docx, so I have to use XSLT.

    My document.xml contains nodes of this type:

    <w:customXml w:uri="INFORMEVALORACION35" w:element="Info">
      <w:p w:rsidR="00212133" w:rsidRPr="00212133" w:rsidRDefault="00DF632A" w:rsidP="00212133">
        <w:pPr>
          <w:jc w:val="right" />
          <w:rPr>
            <w:i />
            <w:sz w:val="16" />
            <w:szCs w:val="16" />
            <w:lang w:val="es-ES_tradnl" />
          </w:rPr>
        </w:pPr>
        <w:customXml w:uri="INFORMEVALORACION35" w:element="id">
          <w:r w:rsidR="00212133">
            <w:rPr>
              <w:i />
              <w:sz w:val="16" />
              <w:szCs w:val="16" />
              <w:lang w:val="es-ES_tradnl" />
            </w:rPr>
            <w:t>a</w:t>
          </w:r>
          <w:r w:rsidR="00212133" w:rsidRPr="00F40FF3">
            <w:rPr>
              <w:i />
              <w:sz w:val="16" />
              <w:szCs w:val="16" />
              <w:lang w:val="es-ES_tradnl" />
            </w:rPr>
            <w:t>idit-mod-035</w:t>
          </w:r>
        </w:customXml>
     

    The problem is that my whole text is "a" + "idit-mod-035" (I don't know why Office splits it into two nodes!).

    So I use this XSLT to retrieve the data:

    <?xml version="1.0" encoding="UTF-8"?>  
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" exclude-result-prefixes="w">  
        <xsl:template match="w:p | w:r | w:tbl | w:tr | w:tc ">  
            <xsl:apply-templates select="w:customXml | w:p | w:r | w:tbl | w:tr | w:tc"/>  
        </xsl:template> 
        <xsl:template match="w:customXml">  
            <xsl:element name="{@w:element}" namespace="{@w:uri}">  
                <xsl:choose> 
                    <xsl:when test="not(descendant::w:customXml)">  
                        <xsl:value-of select="descendant::w:t"/>  
                    </xsl:when> 
                    <xsl:otherwise> 
                        <xsl:apply-templates select="child::node()"/>  
                    </xsl:otherwise> 
                </xsl:choose> 
            </xsl:element> 
        </xsl:template> 
    </xsl:stylesheet> 
     

    It processes every w:customXml node and reads the w:t nodes inside it.

    But it returns only the first <w:t> of each node.

    My output XML:

    <Document xmlns="INFORMEVALORACION35">
      <Info>
        <id>a</id>  <!-- only the first node -->
        <version>11</version>
      </Info>

    But I need the concatenation of the two nodes:

      <Info>
        <id>aidit-mod-035</id>  <!-- nodes 1 and 2 -->
        <version>11</version>
      </Info>

    If you have an idea, thanks in advance for your help.

    Anto.
    Tuesday, November 11, 2008 11:42 AM

Answers

  • OK, thanks for your response.

    I found a solution by modifying my XSLT. I am posting it here in case it helps someone.

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" exclude-result-prefixes="w">
        <xsl:template match="w:p | w:r | w:tbl | w:tr | w:tc">
            <xsl:apply-templates select="w:customXml | w:p | w:r | w:tbl | w:tr | w:tc"/>
        </xsl:template>
        <xsl:template match="w:customXml">
            <xsl:element name="{@w:element}" namespace="{@w:uri}">
                <xsl:choose>
                    <xsl:when test="not(descendant::w:customXml)">
                        <xsl:for-each select="descendant::w:t">
                            <xsl:value-of select="text()"/>
                        </xsl:for-each>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:apply-templates select="child::node()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:element>
        </xsl:template>
    </xsl:stylesheet>
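
    For anyone wondering why the first version returned only "a": in XSLT 1.0, xsl:value-of applied to a node-set outputs the string value of the first node only, in document order, so select="descendant::w:t" never reached the second run. Visiting each w:t in turn avoids that. A shorter variant that should behave the same way is sketched below. It relies on the built-in template rule for text nodes, and it assumes the leaf w:customXml holds no other text that needs to be kept:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
        exclude-result-prefixes="w">

        <!-- Containers: descend only into elements that may hold w:customXml -->
        <xsl:template match="w:p | w:r | w:tbl | w:tr | w:tc">
            <xsl:apply-templates select="w:customXml | w:p | w:r | w:tbl | w:tr | w:tc"/>
        </xsl:template>

        <xsl:template match="w:customXml">
            <xsl:element name="{@w:element}" namespace="{@w:uri}">
                <xsl:choose>
                    <!-- Leaf element: emit the text of every w:t, in document order.
                         The built-in rule for text nodes copies each one to the output. -->
                    <xsl:when test="not(descendant::w:customXml)">
                        <xsl:apply-templates select="descendant::w:t/text()"/>
                    </xsl:when>
                    <!-- Non-leaf element: keep walking down to nested w:customXml -->
                    <xsl:otherwise>
                        <xsl:apply-templates select="child::node()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:element>
        </xsl:template>
    </xsl:stylesheet>
    ```

    With the "id" node shown in the question, the two runs "a" and "idit-mod-035" end up in a single <id> element.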


    Thanks,

    Antonio.

    • Marked as answer by Tonito01 Monday, November 17, 2008 4:02 PM
    • Edited by Tonito01 Monday, November 17, 2008 4:02 PM
    Monday, November 17, 2008 4:02 PM

All replies

  • It seems to me that you probably need a pre-processing step to merge the text content. Word splits text into separate runs for several reasons, for example when some characters are bold and others are italic. Even when there is no visible reason, text can be split into runs for change-tracking purposes. Your best bet is probably to merge the text yourself, ignoring the formatting, before the conversion.
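
    As a rough illustration of such a pre-processing step, here is a minimal XSLT 1.0 sketch, not a tested solution. It copies document.xml unchanged except for leaf w:customXml elements, whose runs are collapsed into one unformatted run. It assumes the leaf w:customXml is inline (it sits inside a w:p and contains w:r children, as in the sample above) and that dropping the run formatting is acceptable:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

        <!-- Identity rule: copy every node and attribute as-is by default -->
        <xsl:template match="@* | node()">
            <xsl:copy>
                <xsl:apply-templates select="@* | node()"/>
            </xsl:copy>
        </xsl:template>

        <!-- Leaf w:customXml: replace all of its content with a single run -->
        <xsl:template match="w:customXml[not(descendant::w:customXml)]">
            <xsl:copy>
                <xsl:apply-templates select="@*"/>
                <w:r>
                    <w:t>
                        <!-- Keep leading/trailing spaces in the merged text -->
                        <xsl:attribute name="xml:space">preserve</xsl:attribute>
                        <!-- Concatenate the text of every w:t in document order -->
                        <xsl:for-each select="descendant::w:t">
                            <xsl:value-of select="."/>
                        </xsl:for-each>
                    </w:t>
                </w:r>
            </xsl:copy>
        </xsl:template>
    </xsl:stylesheet>
    ```

    After this pass each leaf w:customXml contains exactly one w:t, so a stylesheet that reads only the first w:t would return the full text.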
    Monday, November 17, 2008 6:33 AM