Custom Binary Serialization in C#
Updated 2019-07-11
There are plenty of ways to serialize your data in C#. I'm going to explore one of them, binary serialization, but instead of using the .NET libraries, we're going to roll our own for fun. Afterwards, I'll do some performance testing to see how it stacks up.
In this article I'll be trying to write directly into the memory location of some structs and primitives. By avoiding extra copying, allocation, and call stack depth, I can hopefully serialize my data faster.
Requirements
I use some new C# features and libraries.
- .NET Core 2.1 for Span<T>. I'll also show a hacky work-around you can try for .NET Framework; it's pretty bad.
- A compiler that supports C# 7.3 for the unmanaged constraint.
- (Optional) The NuGet package System.Runtime.CompilerServices.Unsafe.
Writing to a stream
Old busted
The traditional way to write a struct to a stream has been to use the Marshal class with both StructureToPtr and SizeOf to create a new byte[], which gets passed to Stream.Write.
If your objects are large, or if you have a lot of them, your performance and resource usage can be negatively affected.
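For reference, that traditional approach might be sketched like this (the method name WriteLegacy is mine, not from any library):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class LegacyStreamExtensions
{
    // Marshal the struct into unmanaged memory, copy it into a temporary
    // byte[], then write that array to the stream: one allocation and two
    // copies per call.
    public static void WriteLegacy<T>(this Stream stream, T value) where T : struct
    {
        var size = Marshal.SizeOf<T>();
        var buffer = new byte[size];             // extra allocation per call
        var handle = Marshal.AllocHGlobal(size); // unmanaged scratch memory
        try
        {
            Marshal.StructureToPtr(value, handle, false);
            Marshal.Copy(handle, buffer, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(handle);
        }
        stream.Write(buffer, 0, size);
    }
}
```

Every call pays for the byte[] allocation and the double copy, which is exactly the overhead we're about to avoid.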
New hotness
First, some credit to others who have helped me refine this code. Thanks to their suggestions I was able to convert this from the unsafe form to a new safe version.
- Ben Adams [1][2] and reddit user ThadeeusMaximus [3] showed how to avoid an unsafe context by using MemoryMarshal.
- Reddit user ILMTitan [4] demonstrated a safe version of stackalloc.
.NET Core 2.1 added a new overload, Stream.Write(ReadOnlySpan<byte>). With this, we can bypass creating a byte[] and converting our struct: with some trickery, a Span<byte> can be made to point in-place at the existing struct, fooling the stream into treating it as a collection of bytes. The main hurdle is that we don't start off with a byte[], so how can we use any of Span's constructors? There are two options; I'll explain how to use them and the trade-offs involved.
- System.Runtime.CompilerServices.Unsafe

With the AsPointer method, we can take the address of our struct and use the Span<T>(void*, int) constructor. It does require a length, so how do we get it? With the unmanaged constraint, we can use the sizeof operator on our type. The constraint also guarantees T contains no references, which we'd want anyway before treating it as raw bytes.

public static unsafe void Write<T>(this Stream stream, T value) where T : unmanaged
{
    var pointer = Unsafe.AsPointer(ref value);
    var span = new Span<byte>(pointer, sizeof(T));
    stream.Write(span);
}
- MemoryMarshal.CreateSpan and MemoryMarshal.AsBytes

With these, we avoid the unsafe context and some potential memory issues.

public static void Write<T>(this Stream stream, T value) where T : unmanaged
{
    var tSpan = MemoryMarshal.CreateSpan(ref value, 1);
    var span = MemoryMarshal.AsBytes(tSpan);
    stream.Write(span);
}
Reading from a stream
Again, we have two options. Both save one allocation by reading into our final memory location, and avoid a copy operation by not using an intermediate byte[].
- Unsafe:

public static unsafe T Read<T>(this Stream stream) where T : unmanaged
{
    var result = default(T);
    var pointer = Unsafe.AsPointer(ref result);
    var span = new Span<byte>(pointer, sizeof(T));
    stream.Read(span);
    return result;
}
- Safe:

public static T Read<T>(this Stream stream) where T : unmanaged
{
    var result = default(T);
    var tSpan = MemoryMarshal.CreateSpan(ref result, 1);
    var span = MemoryMarshal.AsBytes(tSpan);
    stream.Read(span);
    return result;
}
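To see both halves working together, here's a quick round-trip over a MemoryStream using the safe versions (the extension methods are repeated so the snippet stands alone, and the Vector3 struct is a stand-in for any unmanaged type):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class StreamExtensions
{
    public static void Write<T>(this Stream stream, T value) where T : unmanaged
    {
        var tSpan = MemoryMarshal.CreateSpan(ref value, 1);
        stream.Write(MemoryMarshal.AsBytes(tSpan));
    }

    public static T Read<T>(this Stream stream) where T : unmanaged
    {
        var result = default(T);
        var tSpan = MemoryMarshal.CreateSpan(ref result, 1);
        // Note: a general-purpose reader should loop until the span is
        // filled; a MemoryStream returns all requested bytes in one call.
        stream.Read(MemoryMarshal.AsBytes(tSpan));
        return result;
    }
}

public readonly struct Vector3
{
    public readonly float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public static class Demo
{
    public static void Main()
    {
        using var stream = new MemoryStream();
        stream.Write(new Vector3(1f, 2f, 3f)); // 12 bytes, no intermediate byte[]
        stream.Position = 0;
        var copy = stream.Read<Vector3>();
        Console.WriteLine($"{copy.X} {copy.Y} {copy.Z}"); // 1 2 3
    }
}
```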
Differences between the two methods
Surprisingly, MemoryMarshal actually uses System.Runtime.CompilerServices.Unsafe behind the scenes (corefx source).
The big difference is that the generic constraint there is T : struct and not T : unmanaged like our code. Because of this, they have to make sure that RuntimeHelpers.IsReferenceOrContainsReferences<T>() is false. I found the code for it deep in the JIT interface.
bool getILIntrinsicImplementationForRuntimeHelpers(MethodDesc * ftn,
CORINFO_METHOD_INFO * methInfo)
{
...
if (!methodTable->IsValueType() || methodTable->ContainsPointers())
{
methInfo->ILCode = const_cast<BYTE*>(returnTrue);
}
else
{
methInfo->ILCode = const_cast<BYTE*>(returnFalse);
}
...
}
Type.IsValueType calls IsSubclassOf, a small method that should have no problems executing quickly. ContainsPointers() is a flag check from C++ code and is equally speedy. So the first call to IsReferenceOrContainsReferences is fairly cheap, and the JIT compiler may be able to optimize it out afterwards.
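You can observe the check directly from C#; a small sketch (the struct names here are mine):

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Plain { public int A; public float B; }       // no references
public struct HasRef { public int A; public string Name; }  // contains a reference

public static class Demo
{
    public static void Main()
    {
        // This is exactly what MemoryMarshal checks before reinterpreting
        // a T : struct as raw bytes.
        Console.WriteLine(RuntimeHelpers.IsReferenceOrContainsReferences<Plain>());  // False
        Console.WriteLine(RuntimeHelpers.IsReferenceOrContainsReferences<HasRef>()); // True
        Console.WriteLine(RuntimeHelpers.IsReferenceOrContainsReferences<string>()); // True
    }
}
```

Our T : unmanaged constraint makes this check redundant at compile time, which is why we can skip it.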
We just want the code
What you really came here for: StreamExtensions.cs
Performance
Safe vs Unsafe
I ran a benchmark 4 times. The first run was excluded to warm up the profiler. The median of the other 3 was kept as the result. I checked to make sure there were no weird outliers in the data, everything looks good.
The raw data, if you care for it. (It was collected after I'd generated my graph; the SVG is already minified and I didn't keep a copy of the original. The results are similar, though a bit faster across the board. I'm going to blame corporate anti-virus.)
Person
  ZeroFormatter    Serialize 24.6626 ms     Deserialize 15.042 ms      Binary Size 50.00 B
  SafeFormatter    Serialize 22.2043 ms     Deserialize 23.3115 ms     Binary Size 26.00 B
  UnsafeFormatter  Serialize 23.8769 ms     Deserialize 24.2574 ms     Binary Size 26.00 B
Person[]
  ZeroFormatter    Serialize 14486.7296 ms  Deserialize 17258.767 ms   Binary Size 48.83 KB
  SafeFormatter    Serialize 23333.3715 ms  Deserialize 24200.0206 ms  Binary Size 25.39 KB
  UnsafeFormatter  Serialize 23882.1373 ms  Deserialize 24917.1850 ms  Binary Size 25.39 KB
int
  ZeroFormatter    Serialize 2.9242 ms      Deserialize 1.6953 ms      Binary Size 4.00 B
  SafeFormatter    Serialize 1.7977 ms      Deserialize 1.9589 ms      Binary Size 4.00 B
  UnsafeFormatter  Serialize 1.8955 ms      Deserialize 1.8404 ms      Binary Size 4.00 B
Vector3
  ZeroFormatter    Serialize 4.7405 ms      Deserialize 3.0243 ms      Binary Size 12.00 B
  SafeFormatter    Serialize 1.9061 ms      Deserialize 2.3206 ms      Binary Size 12.00 B
  UnsafeFormatter  Serialize 2.0275 ms      Deserialize 1.9824 ms      Binary Size 12.00 B
string
  ZeroFormatter    Serialize 34284.1008 ms  Deserialize 31301.6707 ms  Binary Size 301.84 KB
  SafeFormatter    Serialize 19599.1895 ms  Deserialize 29779.7524 ms  Binary Size 301.84 KB
  UnsafeFormatter  Serialize 19569.0796 ms  Deserialize 30918.9688 ms  Binary Size 301.84 KB
Vector3[]
  ZeroFormatter    Serialize 260.0052 ms    Deserialize 200.6082 ms    Binary Size 1.18 KB
  SafeFormatter    Serialize 6.1998 ms      Deserialize 18.7903 ms     Binary Size 1.18 KB
  UnsafeFormatter  Serialize 7.4503 ms      Deserialize 20.1956 ms     Binary Size 1.18 KB
The two methods are neck and neck. It looks like the safe version is a teeny bit faster, but it's so small that I'd consider it observational error.
Therefore, I would stick with the safe version. Glad I tested both!
ZeroFormatter benchmark
Adapting the benchmark used by ZeroFormatter, I'll compare this custom serialization method against it in a few scenarios. It's not exactly a fair test, as all of my usages are hand-coded and ZeroFormatter has to use reflection.
The code can all be found on github.
As predicted, our serializer had excellent performance in the struct cases.
We do lose out a few times. I'm guessing that in the case of Person and Person[], the UTF-8 encoding adds a lot of overhead.
Experimenting with UTF-8
If we remove the UTF-8 code, we can use pre-built methods for reading strings from the stream. Note that the memory footprint of strings essentially doubles.
public static string ReadString(this Stream stream)
{
var length = stream.Read<int>();
return string.Create(length, stream, (chars, streamToUse) =>
{
var bytes = MemoryMarshal.AsBytes(chars);
streamToUse.Read(bytes);
});
}
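The matching write side (my own sketch, not from the original code) just dumps the string's raw UTF-16 chars after a length prefix, which is why the footprint doubles compared to UTF-8:

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class StringStreamExtensions
{
    // Mirrors the ReadString above: an int length prefix (char count),
    // followed by the string's chars reinterpreted as bytes (2 bytes each).
    public static void WriteString(this Stream stream, string value)
    {
        var length = value.Length;
        stream.Write(MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref length, 1)));
        stream.Write(MemoryMarshal.AsBytes(value.AsSpan()));
    }
}
```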
The resulting times go down by a lot!
string goes from 19599 ms for serializing and 29779 ms for deserializing, to 2121 ms and 7569 ms respectively.
Person also sees improvement. It's not as drastic, but it does make it faster than ZeroFormatter.
My use-case
This is the data structure storing my DAWG from the third article in that series. I'm going to use it to test our new method against ZeroFormatter as well as .NET's BinaryFormatter. When loaded with a typical dictionary it consumes just over 1.4MB of memory.
private readonly int _terminalNodeIndex;
private readonly int _rootNodeIndex;
private readonly int[] _firstChildEdgeIndex; // length: 38,745
private readonly int[] _edgesToNodeIndex; //length: 85,600
private readonly char[] _edgeCharacter; //length: 85,600
private readonly ushort[] _reachableTerminalNodes; // length: 38,744
private readonly long[] _wordCount; //length: 82,765
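Serializing those arrays follows the same pattern as single structs: a length prefix, then the raw contents reinterpreted as bytes. A sketch of how that might look (the real StreamExtensions.cs may differ in details):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class ArrayStreamExtensions
{
    public static void WriteArray<T>(this Stream stream, T[] values) where T : unmanaged
    {
        var length = values.Length;
        stream.Write(MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref length, 1)));
        // Reinterpret the whole array as bytes; no per-element conversion.
        stream.Write(MemoryMarshal.AsBytes(values.AsSpan()));
    }

    public static T[] ReadArray<T>(this Stream stream) where T : unmanaged
    {
        var length = 0;
        stream.Read(MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref length, 1)));
        var values = new T[length];
        // Read straight into the array's memory. A robust version would loop
        // until the span is filled, since Stream.Read may return fewer bytes.
        stream.Read(MemoryMarshal.AsBytes(values.AsSpan()));
        return values;
    }
}
```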
After some initial warm up to avoid JIT issues, I loaded the object from disk 1000 times with each method and here are the results.
- Custom: 495ms
- BinaryFormatter: 2162ms
- ZeroFormatter: 923ms
I win! This shouldn't come as a surprise. We're using a hand-coded serializer and working with its specialty, unmanaged arrays. The other methods have to rely on reflection, which can be a substantial burden, but they have the advantage of requiring less work to set up.
The size of the files are also slightly different.
- Custom: 1,322,620 bytes
- BinaryFormatter: 1,322,944 bytes
- ZeroFormatter: 1,408,252 bytes
Impressively, BinaryFormatter is very close to the custom format in size. I'm guessing the slight difference is from storing type information. The larger size of the ZeroFormatter file is almost assuredly from not converting the char[] to UTF-8. I'm surprised it doesn't, as its output for string matched our size.
Hack for .NET Framework
Without Span<T>, this method isn't feasible and we have to make do with Write(byte[], int, int). How can we convert an arbitrary type to a byte[]?
Unions aren't quite right: if the type's size is larger than a single byte, the Length property is much too small, and the Stream methods will complain when they perform bounds checking. Unsafe pointers won't do either; casting to a byte[] blows up. Using both, we can do some bad things...
var wrapper = new UnionArrayInt {Typed = value};
Write(wrapper.Typed, wrapper.Bytes);
[StructLayout(LayoutKind.Explicit)]
public struct UnionArrayInt
{
[FieldOffset(0)]
public readonly byte[] Bytes;
[FieldOffset(0)]
public int[] Typed;
}
public void Write<T>(T[] value, byte[] bytes)
where T : unmanaged
{
var oldLength = value.Length;
Stream.WriteInt32(oldLength);
unsafe
{
fixed (T* typedPointer = value)
{
var intPointer = (IntPtr*) typedPointer;
try
{
*(intPointer - 1) = new IntPtr(sizeof(T) * oldLength);
Stream.Write(bytes, 0, bytes.Length);
}
finally
{
*(intPointer - 1) = new IntPtr(oldLength);
}
}
}
}
Unfortunately, you'll have to write a union struct for each type you want to use this way. I wasn't able to create a generic container.
The magic part: *(intPointer - 1). By using pointers to edit the length field of the array, I can trick Stream's methods into accepting it as a fake byte[].
It is important to note that I used IntPtr here not as an actual pointer, but as a stand-in for the word size of the processor. Otherwise, when targeting 32-bit you'd have to use int, while targeting 64-bit would require long.
This relies on the memory layout of arrays, and I don't know what kind of guarantee you can expect that the length field will always be at the same location. This is truly an example of "it works on my PC". Not to mention it requires endianness to match.
Concluding thoughts
In the end it all depends on your requirements. Are you serializing bits and bytes? Do you have a need for speed? If so, this is a great way to do it. Otherwise, existing serialization libraries make it much easier to write maintainable code.
If you missed the link to the code, here it is: StreamExtensions.cs