Escaping special characters during XSLT conversion

  • Question

  • Hi,

    We're using an XSLT file to convert an XML document into RSS-compliant XML.

    The input XML contains an element whose value is HTML-encoded, for example "&#174;" for characters such as ® or ™. We want to assign the decoded value to the title element of the RSS feed.

    The problem is that we need to display the title without any HTML-encoded characters.

    Can anyone give us some examples or pointers on how to achieve HtmlDecode functionality in XSLT?

    The code snippet of the XSLT file which we are using is as shown below:

    <title>
      <xsl:value-of select="F[@N='1']"/>
    </title>

    Here we are trying to extract the value of the element named F whose attribute N has the value 1.

    The code snippet of the input XML looks like this:

    <F N="1"> Tailor Brand ®</F>

    After the transformation, the source of the output is as shown below:

    <title> Tailor Brand ® </title>

    However, the same logic works properly and renders the ® when we place it under the <description> tag of the RSS XML. That is,

    <description>
    <xsl:value-of select="F[@N='1']"></xsl:value-of>
    </description>

    Can anyone provide some pointers on how to achieve the HTML decoding inside the title tag of the RSS feed as well?

    Thursday, December 14, 2006 1:29 PM

Answers

  • You should not need to worry about this, because &#174; and ® represent the same Unicode character.

    If you are having issues, it's probably because of the way your XSLT output is being "encoded" into bytes and then "decoded" back into Unicode characters by whatever application is displaying the result. If you encode the output into bytes using UTF-8, everything should work fine, since UTF-8 supports the full range of Unicode character values. If you choose a more limited encoding, you might run into cases where the character cannot be represented in that encoding. One common trap is piping output through Console.Write, since the console by default encodes characters using a very limited range. It is always better to write output directly to a file so that it gets encoded correctly. For help on encoding issues, see How to Encode XML Data.
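
    As a minimal sketch of the UTF-8 suggestion above, a stylesheet can request UTF-8 serialization with xsl:output. The match="/" template and the //F path are assumptions for illustration, since the original post does not show the full stylesheet or the elements surrounding F. Note that the encoding attribute only applies when the XSLT processor itself serializes the result to bytes; if the result is handed to a text writer, that writer's encoding is used instead.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Request UTF-8 output, which can represent ® and ™ directly.
           Honored only when the processor serializes the result itself. -->
      <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

      <!-- Assumed entry point: start at the document root and select any
           F element whose N attribute equals 1. The real stylesheet
           probably uses a more specific path. -->
      <xsl:template match="/">
        <title>
          <xsl:value-of select="//F[@N='1']"/>
        </title>
      </xsl:template>

    </xsl:stylesheet>
    ```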

    -Chris.

    Tuesday, December 19, 2006 7:12 PM

All replies

  • <description>
    <xsl:value-of select="F[@N='1']" disable-output-escaping="yes" />
    </description>

    Tuesday, December 19, 2006 8:31 PM