Home

Awesome

Minisketch: a library for BCH-based set reconciliation

<img align="right" src="doc/minisketch-vs.png" />

libminisketch is an optimized standalone MIT-licensed library with C API for constructing and decoding set sketches, which can be used for compact set reconciliation and other applications. It is an implementation of the PinSketch<sup>[1]</sup> algorithm. An explanation of the algorithm can be found here.

Sketches for set reconciliation

Sketches, as produced by this library, can be seen as "set checksums" with two peculiar properties:

This makes them appropriate for a very bandwidth-efficient set reconciliation protocol. If Alice and Bob each have a set of elements, and they suspect that the sets largely but not entirely overlap, they can use the following protocol to let both parties learn all the elements:

This will always succeed when the size of the difference (elements that Alice has but Bob doesn't plus elements that Bob has but Alice doesn't) does not exceed the capacity of the sketch that Alice sent. The interesting part is that this works regardless of the actual set sizes—only the difference matters.

If the elements are large, it may be preferable to compute the sketches over hashes of the set elements. In that case an additional step is added to the protocol, where Bob also sends the hash of every element he does not have to Alice, who responds with the requested elements.

The doc/ directory has additional tips for designing reconciliation protocols using libminisketch.

Evaluation

<img src="doc/plot_capacity.png" width="432" height="324" /> <img src="doc/plot_diff.png" width="432" height="324" />

<img src="doc/plot_size.png" width="432" height="324" /> <img src="doc/plot_bits.png" width="432" height="324" />

The first graph above shows a benchmark of libminisketch against three other set reconciliation algorithms/implementations. The benchmarks were performed using a single core on a system with an Intel Core i7-7820HQ CPU with clock speed locked at 2.4 GHz. The diagram shows the time needed for merging of two sketches and decoding the result. The creation of a sketch on the same machine takes around 5 ns per capacity and per set element. The other implementations are:

For the largest sizes currently of interest to the authors, such as a set of capacity 4096 with 1024 differences, libminisketch is forty-nine times faster than pinsketch and over eight thousand times faster than cpisync. libminisketch is fast enough on realistic set sizes for use on high-traffic network servers where computational resources are limited.

Even where performance is latency-limited, small minisketches can be fast enough to improve performance. On the above i7-7820HQ, a set of 2500 30-bit entries with a difference of 20 elements can be communicated in less time with a minisketch than sending the raw set so long as the communications bandwidth is 1 gigabit per second or less; an eight-element difference can be communicated in better than one-fifth the time on a gigabit link.

The second graph above shows the performance of the same algorithms on the same system, but this time keeping the capacity constant at 128, while varying the number of differences to reconcile between 1 and 128. It shows how cpisync's reconciliation speed is mostly dependent on capacity, while pinsketch/libminisketch are more dependent on number of differences.

The third graph above shows the size overhead of a typical IBLT scheme over the other algorithms (which are near-optimal bandwidth), for various levels of failure probability. IBLT takes tens of times the bandwidth of libminisketch sketches when the set difference size is small and the required failure rate is low.

The fourth graph above shows the effect of the field size on speed in libminisketch. The three lines correspond to:

It shows how CLMUL implementations are faster for certain fields (specifically, field sizes for which an irreducible polynomial of the form x<sup>b</sup> + x + 1 over GF(2) exists, and to a lesser extent, fields which are a multiple of 8 bits). It also shows how (for now) a significant performance drop exists for fields larger than 32 bits on 32-bit platforms. Note that the three lines are not at the same scale (the Raspberry Pi 3B is around 10x slower for 32-bit fields than the Core i7; the POWER9 is around 1.3x slower).

Below we compare the PinSketch algorithm (which libminisketch is an implementation of) with other set reconciliation algorithms:

AlgorithmSketch sizeDecode successDecoding complexityDifference typeSecure sketch
CPISync<sup>[2]</sup>(b+1)cAlwaysO(n<sup>3</sup>)BothYes
PinSketch<sup>[1]</sup>bcAlwaysO(n<sup>2</sup>)Symmetric onlyYes
IBLT<sup>[6][7]</sup>αbc (see graph 3)ProbabilisticO(n)DependsNo

Building

The build system is very rudimentary for now, and improvements are welcome.

The following may work and produce a libminisketch.a file you can link against:

git clone https://github.com/sipa/minisketch
cd minisketch
./autogen.sh && ./configure && make

Usage

In this section Alice and Bob are trying to find the difference between their sets. Alice has the set [3000 ... 3009], while Bob has [3002 ... 3011].

First, Alice creates a sketch:

#include <stdio.h>
#include <assert.h>
#include "../include/minisketch.h"
int main(void) {

  minisketch *sketch_a = minisketch_create(12, 0, 4);

The arguments are:

Then Alice adds her elements to her sketch. Note that adding the same element a second time removes it again, as sketches have set semantics, not multiset semantics.

  for (int i = 3000; i < 3010; ++i) {
    minisketch_add_uint64(sketch_a, i);
  }

The next step is serializing the sketch into a byte array:

  size_t sersize = minisketch_serialized_size(sketch_a);
  assert(sersize == 12 * 4 / 8); // 4 12-bit values is 6 bytes.
  unsigned char *buffer_a = malloc(sersize);
  minisketch_serialize(sketch_a, buffer_a);
  minisketch_destroy(sketch_a);

The contents of the buffer can then be submitted to Bob, who can create his own sketch:

  minisketch *sketch_b = minisketch_create(12, 0, 4); // Bob's own sketch
  for (int i = 3002; i < 3012; ++i) {
    minisketch_add_uint64(sketch_b, i);
  }

After Bob receives Alice's serialized sketch, he can reconcile:

  sketch_a = minisketch_create(12, 0, 4);     // Alice's sketch
  minisketch_deserialize(sketch_a, buffer_a); // Load Alice's sketch
  free(buffer_a);

  // Merge the elements from sketch_a into sketch_b. The result is a sketch_b
  // which contains all elements that occurred in Alice's or Bob's sets, but not
  // in both.
  minisketch_merge(sketch_b, sketch_a);

  uint64_t differences[4];
  ssize_t num_differences = minisketch_decode(sketch_b, 4, differences);
  minisketch_destroy(sketch_a);
  minisketch_destroy(sketch_b);
  if (num_differences < 0) {
    printf("More than 4 differences!\n");
  } else {
    ssize_t i;
    for (i = 0; i < num_differences; ++i) {
      printf("%u is in only one of the two sets\n", (unsigned)differences[i]);
    }
  }
}

In this example Bob would see output such as:

$ gcc -std=c99 -Wall -Wextra -o example ./doc/example.c -Lsrc/ -lminisketch -lstdc++ && ./example
3000 is in only one of the two sets
3011 is in only one of the two sets
3001 is in only one of the two sets
3010 is in only one of the two sets

The order of the output is arbitrary and will differ on different runs of minisketch_decode().

Applications

Communications efficient set reconciliation has been proposed to optimize Bitcoin transaction distribution<sup>[8]</sup>, which would allow Bitcoin nodes to have many more peers while reducing bandwidth usage. It could also be used for Bitcoin block distribution<sup>[9]</sup>, particularly for very low bandwidth links such as satellite. A similar approach (CPISync) is used by PGP SKS keyservers to synchronize their databases efficiently. Secure sketches can also be used as helper data to reliably extract a consistent cryptographic key from fuzzy biometric data while leaking minimal information<sup>[1]</sup>. They can be combined with dcnets to create cryptographic multiparty anonymous communication<sup>[10]</sup>.

Implementation notes

libminisketch is written in C++11, but has a C API for compatibility reasons.

Specific algorithms and optimizations used:

Some improvements that are still TODO:

References