Serialize String containing only whitespace such as a " " character

    Question

  • I'm having difficulty figuring out how to serialize a String property:

    [XmlElement("DescriptionDelimiter1")]
    public String DescriptionDelimiter1
    {
        get { return _DescriptionDelimiter1; }
        set { _DescriptionDelimiter1 = value; }
    }

    Granted, I don't really need the XmlElement attribute, but I keep it in case I want to change the name of the XML element...

    Anyway, what I get in XML is

    <ImportProfiles>
      <ImportProfile Name="HPH Import Profile">
        <ConcatenateDescriptions>true</ConcatenateDescriptions>
        <DescriptionDelimiter1 />
        <DescriptionDelimiter2>\r\n</DescriptionDelimiter2>
        <DescriptionDelimiter3>\t</DescriptionDelimiter3>
      </ImportProfile>
    </ImportProfiles>

    The problem is that when the value is " " (a single space), the serialized result is <DescriptionDelimiter1 />, which deserializes to "", and that is not the value I need. I've looked into using CDATA, but there doesn't seem to be a simple way to implement it. I expect

    <DescriptionDelimiter1> </DescriptionDelimiter1>
    or
    <DescriptionDelimiter1><![CDATA[ ]]></DescriptionDelimiter1>

    Is there some setting I can use when serializing the object to indicate that the process should not trim away whitespace?

    The following is the method used to serialize my object:

    public static String Serialize(DataImportBase dib)
    {
        System.Type[] ArrType = new Type[1];
        ArrType[0] = typeof(System.DBNull);
        XmlSerializer xs = new XmlSerializer(typeof(DataImportBase), ArrType);
        System.IO.MemoryStream aMemoryStream = new System.IO.MemoryStream();
        xs.Serialize(aMemoryStream, dib);
        return System.Text.Encoding.UTF8.GetString(aMemoryStream.ToArray());
    }

    Thanks for any assistance!!!

    Chris

    Friday, March 09, 2007 3:00 AM

All replies

  • I tested with Visual Studio 2005/.NET 2.0 and the following works for me to preserve the white space:

    Foo foo = new Foo();
    foo.Bar = " ";

    XmlSerializer serializer = new XmlSerializer(typeof(Foo));
    StringWriter stringWriter = new StringWriter();
    serializer.Serialize(stringWriter, foo);
    string markup = stringWriter.ToString();

    Console.WriteLine(markup);
    Console.WriteLine();

    XmlReaderSettings readerSettings = new XmlReaderSettings();
    readerSettings.IgnoreWhitespace = false;
    foo = (Foo)serializer.Deserialize(XmlReader.Create(new StringReader(markup), readerSettings));

    Console.WriteLine("Bar.Length: {0}", foo.Bar.Length);

    where the example class Foo is e.g.

    public class Foo
    {
        public Foo() { }

        [XmlElement(ElementName = "Bar")]
        public string Bar
        {
            get
            {
                return _bar;
            }
            set
            {
                _bar = value;
            }
        }

        private string _bar;
    }

     

    Output then is e.g.

    <?xml version="1.0" encoding="utf-16"?>
    <Foo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Bar> </Bar>
    </Foo>

    Bar.Length: 1

    so the single space is first output as the contents of the Bar element and is later deserialized properly.
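
    A note on why the reader matters in the code above: as far as I know, the Deserialize overloads that take a TextReader or a Stream build their own reader internally, and that reader may discard whitespace-only text, whereas an XmlReader created with IgnoreWhitespace = false reports it. The sketch below wraps both paths so they can be compared side by side. The names WhitespaceRoundTrip, ToXml, FromXmlPreserving and FromXmlDefault are made up for illustration and are not part of any library.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Foo
{
    public string Bar;
}

// Hypothetical helper for comparing the two deserialization paths.
public static class WhitespaceRoundTrip
{
    // Serializes a Foo to an XML string.
    public static string ToXml(Foo foo)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Foo));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, foo);
            return writer.ToString();
        }
    }

    // Deserializes through an XmlReader that reports whitespace-only text nodes.
    public static Foo FromXmlPreserving(string markup)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Foo));
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = false;
        using (XmlReader reader = XmlReader.Create(new StringReader(markup), settings))
        {
            return (Foo)serializer.Deserialize(reader);
        }
    }

    // Deserializes through the TextReader overload, which picks its own reader.
    // Inspect the result yourself; it is included only for comparison.
    public static Foo FromXmlDefault(string markup)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Foo));
        using (StringReader reader = new StringReader(markup))
        {
            return (Foo)serializer.Deserialize(reader);
        }
    }
}
```

    Calling FromXmlPreserving(ToXml(foo)) with foo.Bar = " " corresponds to the round trip shown above; comparing it with FromXmlDefault on the same markup shows whether the reader is what loses the space in your environment.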

    Friday, March 09, 2007 1:44 PM
  • OK, I implemented your code and I got the same result as before, so I thought I'd check something else... I have always looked at the XML through SQL Management Studio because the XML is saved to a column in a database. In Debug, I validated that the XML is formatted exactly as you said it would be. The <tag> </tag> exists in the XML string before it is saved to the database. When retrieving the XML, the value will be <tag />.

    Obviously this discovery changes the problem. Thanks for your code. I confirmed that the previous code I posted returns the desired result but that the XML is changed when it is saved in SQL 2005.

    The new question is then how to add "xml:space=preserve" to the element or to the XML Root if needed when performing the serialization. This (I believe) would prevent SQL from altering the XML when it saves it to the database...
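
    To pin down where the space disappears, it can help to read the element text from the XML string just before it is saved and again right after it is retrieved. Here is a minimal sketch of such a check; StoredXmlCheck and ReadElementText are hypothetical names, and the sketch assumes the XML is available as a plain string at both points.

```csharp
using System.Xml;

// Hypothetical diagnostic helper, not part of the original code.
public static class StoredXmlCheck
{
    // Returns the text of the first element with the given name, keeping
    // whitespace-only content. Returns null when no such element exists.
    public static string ReadElementText(string xml, string elementName)
    {
        XmlDocument document = new XmlDocument();
        document.PreserveWhitespace = true; // keep whitespace-only text nodes while loading
        document.LoadXml(xml);

        XmlNodeList matches = document.GetElementsByTagName(elementName);
        return matches.Count > 0 ? matches[0].InnerText : null;
    }
}
```

    If the call returns " " for the string produced by the serializer but "" for the string read back from the column, the change happened in storage rather than in serialization.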

    Thanks!

    Chris

    Friday, March 09, 2007 4:32 PM
  • This is what I came up with:

    /// <summary>
    /// Special consideration is made to preserve String values. Since the XML is intended to
    /// be saved in a SQL database, we have to ensure the XML is not modified when converted
    /// and committed to the table column. What was happening is an XmlElement <tag> </tag> was
    /// being converted to <tag /> where the whitespace " " was lost. This has adverse effects
    /// on the profiles where a single space may be a valid value...
    ///
    /// Adding the following fixes the behavior of SQL Server 2005 by indicating that spaces
    /// are protected and should not be removed from data between tags.
    /// </summary>

    [XmlAttribute("xml:space")]
    public String SpacePreserve = "preserve";

    This creates a root node like the following:
    <DataImportBase xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xml:space="preserve">

    which indicates to the next process that handles the XML not to remove the whitespace between tags. What would be optimal is if I could put this Attribute on specific Elements within the XML...
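
    For anyone who wants to try the root-level attribute in isolation, here is a small self-contained sketch. It is a variation on the declaration above, not the same code: it names the attribute "space" and gives the XML namespace explicitly instead of using the "xml:space" shorthand. ProfileSketch and ProfileSketchDemo are hypothetical stand-ins for DataImportBase and its Serialize method.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for DataImportBase.
public class ProfileSketch
{
    // Emits xml:space="preserve" on the root element. The namespace is the
    // reserved XML namespace, which is always bound to the "xml" prefix.
    [XmlAttribute("space", Namespace = "http://www.w3.org/XML/1998/namespace")]
    public string SpacePreserve = "preserve";

    public string DescriptionDelimiter1 = " ";
}

public static class ProfileSketchDemo
{
    // Serializes the profile to an XML string.
    public static string Serialize(ProfileSketch profile)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ProfileSketch));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, profile);
            return writer.ToString();
        }
    }
}
```

    Whether a downstream consumer such as a database honours the attribute is a separate question; the sketch only shows how the attribute ends up on the root element.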

    Friday, March 09, 2007 5:53 PM
  • I have looked through the classes for XML serialization but I haven't found anything that seems to allow one to set that attribute xml:space declaratively on members.

    One way to change the serialization behaviour would be to write a specialized XmlWriter that takes a list of element names for which it writes out the xml:space attribute. This can be done with a few lines if you use the XmlWrappingWriter from the project http://www.codeplex.com/Wiki/View.aspx?ProjectName=MVPXML. The code for XmlWrappingWriter is also shown in Oleg's blog.

    If you use that then you can set up a specialized XmlWriter as follows:

    public class XmlSpaceWriter : XmlWrappingWriter
    {
        // Names of the elements that should carry xml:space="preserve".
        private XmlQualifiedName[] ElementNames;

        public XmlSpaceWriter(TextWriter output, XmlQualifiedName[] elementNames)
            : base(XmlWriter.Create(output))
        {
            ElementNames = elementNames;
        }

        public override void WriteStartElement(string prefix, string localName, string ns)
        {
            base.WriteStartElement(prefix, localName, ns);
            foreach (XmlQualifiedName name in ElementNames)
            {
                // Skip the attribute when xml:space="preserve" is already in scope.
                if (name.Namespace == ns && name.Name == localName && base.XmlSpace != XmlSpace.Preserve)
                {
                    base.WriteAttributeString("xml", "space", "http://www.w3.org/XML/1998/namespace", "preserve");
                    break;
                }
            }
        }
    }

    and use it like this:

    Foo foo = new Foo();
    foo.Bar = " ";
    foo.Example = " "; // assumes Foo also has a string property named Example

    XmlSerializer serializer = new XmlSerializer(typeof(Foo));
    StringWriter stringWriter = new StringWriter();
    XmlSpaceWriter spaceWriter = new XmlSpaceWriter(stringWriter, new XmlQualifiedName[] {new XmlQualifiedName("Bar")});

    serializer.Serialize(spaceWriter, foo);
    spaceWriter.Close();

    string markup = stringWriter.ToString();

    The result then looks like this:

    <?xml version="1.0" encoding="utf-16"?><Foo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><Bar xml:space="preserve"> </Bar><Example> </Example></Foo>

    The example code above has just one constructor, taking a TextWriter, but the code could easily be extended with further constructors taking e.g. a stream.
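
    As a rough illustration of the stream-based constructor mentioned above, here is a sketch that needs no external library. Note the swap: it derives from the framework's XmlTextWriter rather than from XmlWrappingWriter, purely so that it compiles on its own. XmlSpaceTextWriter, Delimiters and XmlSpaceTextWriterDemo are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Variation on XmlSpaceWriter built on XmlTextWriter (an assumption made here
// to avoid the dependency on XmlWrappingWriter).
public class XmlSpaceTextWriter : XmlTextWriter
{
    private const string XmlNamespace = "http://www.w3.org/XML/1998/namespace";
    private readonly List<XmlQualifiedName> _elementNames;

    public XmlSpaceTextWriter(TextWriter output, IEnumerable<XmlQualifiedName> elementNames)
        : base(output)
    {
        _elementNames = new List<XmlQualifiedName>(elementNames);
    }

    // Additional constructor that writes to a stream with the given encoding.
    public XmlSpaceTextWriter(Stream output, Encoding encoding, IEnumerable<XmlQualifiedName> elementNames)
        : base(output, encoding)
    {
        _elementNames = new List<XmlQualifiedName>(elementNames);
    }

    public override void WriteStartElement(string prefix, string localName, string ns)
    {
        base.WriteStartElement(prefix, localName, ns);
        if (XmlSpace == XmlSpace.Preserve) return; // already in scope, nothing to add

        foreach (XmlQualifiedName name in _elementNames)
        {
            if (name.Name == localName && name.Namespace == (ns ?? string.Empty))
            {
                WriteAttributeString("xml", "space", XmlNamespace, "preserve");
                break;
            }
        }
    }
}

// Hypothetical type with two whitespace-only values.
public class Delimiters
{
    public string Bar = " ";
    public string Example = " ";
}

public static class XmlSpaceTextWriterDemo
{
    // Serializes to a MemoryStream and marks only the Bar element.
    public static string SerializeToStream(Delimiters value)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Delimiters));
        using (MemoryStream stream = new MemoryStream())
        {
            XmlSpaceTextWriter writer = new XmlSpaceTextWriter(
                stream, new UTF8Encoding(false), new[] { new XmlQualifiedName("Bar") });
            serializer.Serialize(writer, value);
            writer.Flush();
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

    Matching on XmlQualifiedName keeps the behaviour of the original class; only the base class and the extra constructor differ.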

    Saturday, March 10, 2007 2:42 PM