Awesome
StackXML
Stack based zero*-allocation XML serializer and deserializer powered by C# 9 source generators.
Why
Premature optimisation :)
Setup
- StackXML targets netstandard2.1 which means back to .NET Core 3.0 is supported, but I would recommend .NET 5
- Add the following to your project to reference the serializer and enable the source generator
<ItemGroup>
<ProjectReference Include="..\StackXML\StackXML.csproj" />
<ProjectReference Include="..\StackXML.Generator\StackXML.Generator.csproj" OutputItemType="Analyzer" ReferenceOutputAssembly="false" />
</ItemGroup>
- The common entrypoint for deserializing is
XmlReadBuffer.ReadStatic(ReadOnlySpan<char>)
- The common entrypoint for serializing is
XmlWriteBuffer.SerializeStatic(IXmlSerializable)
- This method returns a string, to avoid this allocation you will need create your own instance of XmlWriteBuffer and ensure it is disposed safely like
SerializeStatic
does. TheToSpan
method returns the char span containing the serialized text
- This method returns a string, to avoid this allocation you will need create your own instance of XmlWriteBuffer and ensure it is disposed safely like
Features
- Fully structured XML serialization and deserialization with 0 allocations, apart from the output data structure when deserializing. Serialization uses a pooled buffer from
ArrayPool<char>.Shared
that is released when the serializer is disposed.XmlReadBuffer
handles deserializationXmlWriteBuffer
handles serializationXmlCls
maps a type to an element- Used for the serializer to know what the element name should be
- Used by the deserializer to map to IXmlSerializable bodies with no explicit name
XmlField
maps to attributesXmlBody
maps to child elementsIXmlSerializable
(not actually an interface, see quirks) represents a type that can be read from or written to XML- Can be manually added as a base, or the source generator will add it automatically to any type that has XML attributes
- Parsing delimited attributes into typed lists
<test list='1,2,3,4,6,7,8,9'>
[XmlField("list")] [XmlSplitStr(',')] public List<int> m_list;
- Using StrReader and StrWriter, see below
- StrReader and StrWriter classes, for reading and writing (comma usually) delimited strings with 0 allocations.
- Can be used in a fully structured way by adding
StrField
attributes to fields on aref partial struct
(not compatible with XmlSplitStr, maybe future consideration)
- Can be used in a fully structured way by adding
- Agnostic logging through LibLog
Quirks
- Invalid data between elements is ignored
<test>anything here is completely missed<testInner/><test/>
- Spaces between attributes is not required by the deserializer
- e.g
<test one='aa'two='bb'>
- e.g
- XmlSerializer must be disposed otherwise the pooled buffer will be leaked.
- XmlSerializer.SerializeStatic gives of an example of how this should be done in a safe way
- Data types can only be classes, not structs.
- All types must inherit from IXmlSerializable (either manually or added by the source generator) which is actually an abstract class and not an interface
- Using structs would be possible but I don't think its worth the box
- Types from another assembly can't be used as a field/body. Needs fixing
- All elements in the data to parse must be defined in the type in one way or another, otherwise an exception will be thrown.
- The deserializer relies on complete parsing and has no way of skipping elements
- Comments within a primitive type body will cause the parser to crash (future consideration...)
<n><!--uh oh-->hi<n>
- Null strings are currently output exactly the same as empty strings... might need changing
- The source generator emits a parameterless constructor on all XML types that initializes
List<T>
bodies to an empty list- Trying to serialize a null list currently crashes the serializer....
- When decoding XML text an extra allocation of the input string is required
- WebUtility.HtmlDecode does not provide an overload taking a span, but the method taking a string turns it into a span anyway.. hmm
- The decode is avoided where possible
- Would be nice to be able to use ValueStringBuilder. See https://github.com/dotnet/runtime/issues/25587
Performance
Very simple benchmark, loading a single element and getting the string value of its attribute attribute
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.17134.1845 (1803/April2018Update/Redstone4)
Intel Core i5-6600K CPU 3.50GHz (Skylake), 1 CPU, 4 logical and 4 physical cores
.NET Core SDK=5.0.100
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT
DefaultJob : .NET Core 5.0.0 (CoreC CoreCLRLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
ReadBuffer | 95.81 ns | 0.983 ns | 0.872 ns | 1.00 | 0.00 | 0.0178 | - | - | 56 B |
XmlReader | 1,866.22 ns | 37.250 ns | 79.383 ns | 19.57 | 0.87 | 3.3216 | - | - | 10424 B |
XDocument | 2,286.97 ns | 45.784 ns | 124.560 ns | 24.48 | 1.16 | 3.4313 | - | - | 10776 B |
XmlDocument | 2,869.48 ns | 44.058 ns | 39.057 ns | 29.96 | 0.60 | 3.9196 | - | - | 12328 B |
XmlSerializer | 10,386.07 ns | 152.481 ns | 142.631 ns | 108.44 | 1.49 | 4.7150 | - | - | 14882 B |
Example data classes
Simple Attribute
<test attribute='value'/>
[XmlCls("test"))]
public partial class Test
{
[XmlField("attribute")]
public string m_attribute;
}
Text body
<test2>
<name><![CDATA[Hello world]]></name>
</test2>
CData can be configured by setting cdataMode
for serializing and deserializing
<test2>
<name>Hello world</name>
</test2>
[XmlCls("test2"))]
public partial class Test2
{
[XmlBody("name")]
public string m_name;
}
Lists
<container>
<listItem name="hey" age='25'/>
<listItem name="how" age='2'/>
<listItem name="are" age='4'/>
<listItem name="you" age='53'/>
</container>
[XmlCls("listItem"))]
public partial class ListItem
{
[XmlField("name")]
public string m_name;
[XmlField("age")]
public int m_age; // could also be byte, uint etc
}
[XmlCls("container")]
public partial class ListContainer
{
[XmlBody()]
public List<ListItem> m_items; // no explicit name, is taken from XmlCls
}
Delimited attributes
<musicTrack id='5' artists='5,6,1,24,535'>
<n><![CDATA[Awesome music]]></n>
<tags>cool</tags>
<tags>awesome</tags>
<tags>fresh</tags>
</musicTrack>
[XmlCls("musicTrack"))]
public partial class MusicTrack
{
[XmlField("id")]
public int m_id;
[XmlBody("n")]
public string m_name;
[XmlField("artists"), XmlSplitStr(',')]
public List<int> m_artists;
[XmlBody("tags")]
public List<string> m_tags;
}