Yesterday a friend from DotNetMarche asked me this question: "I need to store serialized objects in a database, and I can choose between binary and XML format. Which one is smaller?"
My first answer was that binary should take less space because it is more compact, but he told me that the DBA had checked and found that the XML entities actually use less space than the binary ones.
This morning I ran some tests with a simple class:
| [Serializable] public class Test {public String Property { get; set; }}
|
I tried this code
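| // Requires: using System; using System.IO; using System.Text;
// using System.Runtime.Serialization.Formatters.Binary; using System.Xml.Serialization;
private static void Test(string testString)
{
    Test t = new Test() { Property = testString };

    // Serialize with BinaryFormatter and print the size in bytes.
    MemoryStream bms = new MemoryStream();
    BinaryFormatter bf = new BinaryFormatter();
    bf.Serialize(bms, t);
    Console.WriteLine("BinaryFormatterSize = {0}", bms.Length);

    // Serialize the same object with XmlSerializer and print the size in bytes.
    MemoryStream xms = new MemoryStream();
    XmlSerializer xs = new XmlSerializer(typeof(Test));
    xs.Serialize(xms, t);
    Console.WriteLine("XmlSerializerSize = {0}", xms.Length);

    // Dump both payloads as UTF-8 text to inspect their content.
    Console.WriteLine(Encoding.UTF8.GetString(bms.ToArray()));
    Console.WriteLine(Encoding.UTF8.GetString(xms.ToArray()));
}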
|
Basically, I create an object of type Test and serialize it with both BinaryFormatter and XmlSerializer, printing the size of the serialized data and then the data itself decoded as UTF-8 text. I invoked the function with Test("abcdefghi this is a longer string to test for a different situation");
The result was:
| BinaryFormatterSize = 236
XmlSerializerSize = 229
? ????? ?? JConsoleApplication4, Version=1.0.0.0, Culture=neutral, Pu
blicKeyToken=null?? ?ConsoleApplication4.Test? ?<Property>k__BackingField??
?? Cabcdefghi this is a longer string to test for a different situation?
<?xml version="1.0"?>
<Test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://ww
w.w3.org/2001/XMLSchema">
<Property>abcdefghi this is a longer string to test for a different situation<
/Property>
</Test>
|
As you can see, the binary output is actually longer than the XML one. The reason is visible in the dump: in binary serialization the .NET runtime writes the full assembly name and the full name of the class, as well as the name of the serialized field (<Property>k__BackingField, the compiler-generated backing field of the auto-property). The XML version is already smaller, and it shrinks further if you change the XML serialization in this way:
| XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
settings.NewLineOnAttributes = true;
XmlSerializerNamespaces blank = new XmlSerializerNamespaces();
blank.Add("", "");
using (XmlWriter writer = XmlWriter.Create(xms, settings))
{
xs.Serialize(writer, t, blank);
}
|
This asks the serializer to omit the XML declaration and the namespace declarations. The result of this test is:
| BinaryFormatterSize = 236
XmlSerializerSize = 110
? ????? ?? JConsoleApplication4, Version=1.0.0.0, Culture=neutral, Pu
blicKeyToken=null?? ?ConsoleApplication4.Test? ?<Property>k__BackingField??
?? Cabcdefghi this is a longer string to test for a different situation?
?<Test>
<Property>abcdefghi this is a longer string to test for a different situation<
/Property>
</Test>
|
Wow! The XML output is now less than half the size of the binary one, at least for a small class like this, where the type metadata outweighs the data itself. This is because XmlSerializer does not need to store the type of the object in the serialized data, since you pass the type to the XmlSerializer constructor. Moreover, the XML format is human readable and can be validated with XSD or transformed with XSLT.
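If you want to reproduce the XML side of the comparison on your own machine, here is a minimal sketch of how it could be packaged. The XmlSizeDemo class and its XmlSize helper are hypothetical names of mine, not part of the original test. It also makes two assumptions that differ from the code above: indentation is turned off, and a UTF8Encoding without a byte order mark is used, so the sizes it prints will not match the numbers shown earlier. It leaves BinaryFormatter out on purpose and only compares the default XmlSerializer output with the trimmed one.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class Test
{
    public string Property { get; set; }
}

public static class XmlSizeDemo
{
    // Hypothetical helper: serializes the object with XmlSerializer and
    // returns the number of bytes written to the stream.
    public static long XmlSize(Test value, bool compact)
    {
        var serializer = new XmlSerializer(typeof(Test));
        using (var stream = new MemoryStream())
        {
            if (!compact)
            {
                // Default output: XML declaration plus xsi/xsd namespaces.
                serializer.Serialize(stream, value);
                return stream.Length;
            }

            var settings = new XmlWriterSettings
            {
                OmitXmlDeclaration = true,
                Indent = false,                    // assumption: no whitespace
                Encoding = new UTF8Encoding(false) // assumption: no BOM
            };

            // An empty prefix/namespace pair suppresses the default namespaces.
            var blank = new XmlSerializerNamespaces();
            blank.Add("", "");

            using (var writer = XmlWriter.Create(stream, settings))
            {
                serializer.Serialize(writer, value, blank);
            }
            // The writer is flushed on dispose and leaves the stream open
            // (CloseOutput is false by default), so Length is still readable.
            return stream.Length;
        }
    }

    public static void Main()
    {
        var t = new Test
        {
            Property = "abcdefghi this is a longer string to test for a different situation"
        };
        Console.WriteLine("Default XML size = {0}", XmlSize(t, false));
        Console.WriteLine("Compact XML size = {0}", XmlSize(t, true));
    }
}
```

The compact size should always come out smaller than the default one, because the only differences are the removed declaration and namespace attributes. How it compares with binary serialization still depends on the shape of your objects, so measure with your real classes before choosing a format.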
Alk.
Tags: Serialization .NET Framework