BlobHandles

"Blob Handles" are a fast & easy way to hash and compare segments of memory in C# / Unity.

They allow you to do two main things:

  1. Use a sequence of bytes as a hash key, like in a Dictionary<BlobHandle, T>.
  2. Quickly compare two handles' slices of memory for equality.

Compatibility

Every release is tested against these versions; (italics in parentheses) indicates the version at the time of writing.

Primarily tested on Windows, but should work on any platform that lets you use pointers.

Blob Strings

This also includes BlobString, a wrapper around BlobHandle that points to an unmanaged representation of a string.

BlobString is designed for use cases that involve reading strings from an unmanaged source (network / disk) & comparing them against some sort of hash set.

Set the BlobString.Encoding property to change the encoding used for strings.
It's recommended to pick an encoding on startup and not change it after you've encoded any strings, unless you dispose all of them before changing the encoding.
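
As a rough illustration of why the encoding should stay fixed, the sketch below encodes the same string under two encodings and compares the resulting bytes. It uses only `System.Text.Encoding` from .NET, not BlobString itself; the class name, helper name, and the sample string are made up for the example.

```csharp
using System;
using System.Text;

public static class EncodingMismatchSketch
{
    // Returns true when both byte sequences have identical length and contents.
    public static bool SameBytes(byte[] a, byte[] b)
    {
        ReadOnlySpan<byte> left = a;
        ReadOnlySpan<byte> right = b;
        return left.SequenceEqual(right);
    }

    public static void Main()
    {
        const string key = "/avatar/move"; // hypothetical key, 12 ASCII characters

        byte[] utf8 = Encoding.UTF8.GetBytes(key);
        byte[] utf8Again = Encoding.UTF8.GetBytes(key);
        byte[] utf16 = Encoding.Unicode.GetBytes(key);

        Console.WriteLine(utf8.Length);                 // 12: one byte per character
        Console.WriteLine(utf16.Length);                // 24: two bytes per character
        Console.WriteLine(SameBytes(utf8, utf8Again));  // True
        Console.WriteLine(SameBytes(utf8, utf16));      // False
    }
}
```

Because lookups compare raw bytes, a key stored under one encoding would not match the same text encoded under another, which is presumably why strings encoded before a change need to be disposed.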

Dictionaries

Dictionaries keyed on BlobHandle are one of the main intended uses.

For dealing with strings, there is BlobStringDictionary<T>. You add regular strings and it takes care of conversion to the unmanaged representation for you.

TryGetValueFromBytes()

Dictionary<BlobHandle, T> and BlobStringDictionary<T> both have a method, TryGetValueFromBytes<T>, that allows using a segment of bytes as the key to a dictionary value search, without having to construct a blob handle yourself.

 // dictionary is populated elsewhere
 Dictionary<BlobHandle, Action> m_Actions = new Dictionary<BlobHandle, Action>();        
 byte[] m_Buffer = new byte[64];
 Socket m_Socket;                                                    

 void ReceiveMethodCall()
 {
     var receivedByteCount = m_Socket.Receive(m_Buffer);

     // equivalent to
     //  m_Actions.TryGetValue(new BlobHandle(m_Buffer, receivedByteCount), out Action action)
     if (m_Actions.TryGetValueFromBytes(m_Buffer, receivedByteCount, out Action action))
         action();
 }

There are a number of other TryGetValueFromBytes() overloads for use with offsets into arrays and with pointers.

HashSets

HashSets of BlobHandle are another use, and also get their own extension method.

ContainsBlob()

HashSet<BlobHandle> gets a method, ContainsBlob(), that allows using a segment of bytes as the key to a .Contains() check, using the same pattern as the dictionary TryGetValueFromBytes<T> method.

 HashSet<BlobHandle> m_HandleSet = new HashSet<BlobHandle>();        
 byte[] m_Buffer = new byte[64];
 Socket m_Socket;                                                    

 void ReceiveMethodCall()
 {
     var receivedByteCount = m_Socket.Receive(m_Buffer);
     if (m_HandleSet.ContainsBlob(m_Buffer, receivedByteCount))
     {
         // do something in response to matching
     }
 }

ContainsBlob() has the same overloads as the dictionary method.

Performance Details

Runtimes

In my testing on Windows x64 with Unity 2019.1.14, BlobHandle is significantly faster under Mono than under IL2CPP.

However, it still performs fine under IL2CPP.

Memory & Constructors

BlobHandle is an immutable struct with only a pointer & a length. This means

BlobString uses a NativeArray to store a copy of the string. This means

Equality Testing

I tested a number of different ways of comparing memory between handles to find out which was fastest, and how that changed with the number of bytes to compare and the compiler used (Mono or IL2CPP).

Consistently the fastest method under both runtimes, easily several times faster than anything else (at least on Windows x64, where I've tested), is a direct wrapper around memcmp. Other platforms may call for a different implementation of the equality check if it proves faster there.
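
The general shape of such an equality check can be sketched in portable, safe C#. This is not the library's implementation: it swaps in `ReadOnlySpan<byte>.SequenceEqual` as a stand-in for the memcmp wrapper, works on array segments rather than pointers, and the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class BlobEqualitySketch
{
    // Compares two array segments byte by byte.
    // SequenceEqual stands in for a memcmp call here (an assumption of this sketch).
    public static bool SegmentsEqual(byte[] a, int aOffset, int aLength,
                                     byte[] b, int bOffset, int bLength)
    {
        // Segments of different lengths can never match,
        // so the byte comparison is skipped entirely.
        if (aLength != bLength)
            return false;

        var left = new ReadOnlySpan<byte>(a, aOffset, aLength);
        var right = new ReadOnlySpan<byte>(b, bOffset, bLength);
        return left.SequenceEqual(right);
    }

    public static void Main()
    {
        byte[] buffer = Encoding.ASCII.GetBytes("xxping"); // payload starts at offset 2
        byte[] ping = Encoding.ASCII.GetBytes("ping");
        byte[] pong = Encoding.ASCII.GetBytes("pong");

        Console.WriteLine(SegmentsEqual(buffer, 2, 4, ping, 0, 4)); // True
        Console.WriteLine(SegmentsEqual(buffer, 2, 4, pong, 0, 4)); // False
        Console.WriteLine(SegmentsEqual(buffer, 0, 6, ping, 0, 4)); // False: lengths differ
    }
}
```

Checking the length first means mismatched sizes are rejected without touching the memory at all.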

GetHashCode()

The included method for getting a blob handle's hash code uses the length of the blob and the value of the last byte in its contents.

Length * 397 ^ Pointer[Length - 1];

This was the fastest method I tested on my data, and it should work well on any data that doesn't have a lot of entries of the same length that also end in the same byte.

You may be able to get better performance with a different method, especially if your data is different.
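
To make the trade-off concrete, here is a small sketch of the same formula applied to plain byte arrays instead of a pointer. The class and method names are hypothetical, and the sketch assumes a length of at least 1.

```csharp
using System;

public static class BlobHashSketch
{
    // Same formula as above. In C#, `*` binds tighter than `^`,
    // so this evaluates as (length * 397) ^ lastByte.
    // Assumes length >= 1; an empty blob would need separate handling.
    public static int Hash(byte[] bytes, int length)
    {
        unchecked
        {
            return length * 397 ^ bytes[length - 1];
        }
    }

    public static void Main()
    {
        byte[] a = { 1, 2, 3, 9 };
        byte[] b = { 7, 7, 7, 9 }; // same length and last byte as a
        byte[] c = { 1, 2, 3, 4 }; // same length, different last byte

        Console.WriteLine(Hash(a, a.Length));                      // 1597
        Console.WriteLine(Hash(a, a.Length) == Hash(b, b.Length)); // True: a collision
        Console.WriteLine(Hash(a, a.Length) == Hash(c, c.Length)); // False
    }
}
```

A collision like the one between `a` and `b` does not break lookups, since a hash-based collection still falls back to the equality check, but many such collisions in one bucket make each lookup slower.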

IL2CPP Options

An attribute is used to disable IL2CPP null checks in some methods of BlobStringDictionary<T>.
The only managed objects inside those methods are readonly members of the class, initialized when the dictionary is, so they should never be null.

Tests

Performance tests for BlobHandles include, but are not limited to:

Performance tests are in a runtime assembly so they can be run in players.