Home

Awesome

FastIndex

Leverage FastFilter (XOR filter) algorithm to create an index, resulting in a solution for generating a Perfect Hash

Modifications

This is a C# solution. The modifications required to obtain the index can be seen by reviewing this commit.

Main benefit

The solution has all the characteristics of XOR Filters, and as expressed by researchers, the index has the benefit of fitting into a compact memory space.

Lookup Performance

No fancy charts: On a 10th gen Intel I5 laptop, with 100 million keys, the average lookup time observed was 31ns, similar to Dictionary data structure lookups with hashing.

Improvements

The main property leveraged to compute the index relies on the design of XOR filters making use of 3 constant size array segments, which lends to the simplicity of computing the index.

Computing the index requires knowing which of the 3 array segments—that make up the XOR filters used to generate a fingerprint—is the primary segment, and saving that off. The presented solution stores the primary segment's index value in a lookaside Byte array. But since it only requires 2 bits to store a value between 0 and 2—representing array segments 0, 1, or 2—an improved solution for storage efficiency would be to multiplex those 2 bits onto the fingerprint. The downside of course, is that due to the fact that XOR filters are an example of a probabilistic lookup strategy, loosing 2 bits from the fingerprint means that you increase the likelihood of a hash collision--but, note that multiplexing means that the computation for the index is nearly identical to that of a contains.

Generation Improvements

A simple solution for decreasing the amount of time to construct the filter is to make use of all of the processing cores available, and run the algorithm in parallel on subsets of the key space, by partitioning the key space into distinct subsets, and creating a separate XOR filter for each distinct subset. Of course, doing this implies that the complexity of managing multiple XOR filters for a given key space goes up.

Another approach to speed up construction that was recently published, is to leverage a GPU.

Since the use case for this generally would be to lookup data that may not change a lot, a simple approach would be to save off the data of the completed XOR filter to a persistent store, and load it back in to main memory when needed.

References

  1. FastFilter
  2. Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters authored by Thomas Mueller Graf, Daniel Lemire

How-To use FastIndex

The best way to learn how to use this class is to load the solution into visual studio and run and examine the tests. The test data used uses a practical real-world scenario around "faked" telephone numbers.

How-To highlights

Benchmarks

dotnet run -c Release --project BenchMark
// * Summary *

BenchmarkDotNet v0.13.12, Windows 10 (10.0.19045.3803/22H2/2022Update)
Intel Core i5-10210U CPU 1.60GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK 7.0.203
  [Host]     : .NET 6.0.25 (6.0.2523.51912), X64 RyuJIT AVX2
  DefaultJob : .NET 6.0.25 (6.0.2523.51912), X64 RyuJIT AVX2


| Method                                | Mean       | Error    | StdDev   |
|-------------------------------------- |-----------:|---------:|---------:|
| Construct_FastFilter_10_Million_Items | 6,364.8 ms | 85.34 ms | 75.65 ms |
| Contains_1_Million_Lookups            |   381.3 ms |  4.96 ms |  4.64 ms |
| Index_1_Million_Lookups               |   436.2 ms |  7.30 ms | 10.23 ms |

// * Hints *
Outliers
  BenchMarkTests.Construct_FastFilter_10_Million_Items: Default -> 1 outlier  was  removed (6.77 s)
  BenchMarkTests.Index_1_Million_Lookups: Default               -> 6 outliers were removed (485.89 ms..539.99 ms)

// * Legends *
  Mean   : Arithmetic mean of all measurements
  Error  : Half of 99.9% confidence interval
  StdDev : Standard deviation of all measurements
  1 ms   : 1 Millisecond (0.001 sec)

// ***** BenchmarkRunner: End *****
Run time: 00:07:27 (447.39 sec), executed benchmarks: 3

Global total time: 00:07:32 (452.21 sec), executed benchmarks: 3