Greplin Bloom Filter

A Bloom filter implementation in Java that optionally supports persistence and counting buckets.

Status:

This is a very early-stage project. It works for our needs, but we haven't verified that it works beyond them. Issue reports and patches are very much appreciated!

Some improvements we'd love to see include:

Pre-requisites:

[Maven](http://maven.apache.org/)

Installation

git clone https://github.com/Greplin/greplin-bloom-filter.git

cd greplin-bloom-filter

mvn install

Implementation details
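
As a rough guide to how an expected item count and a desired false-positive rate translate into filter size, the textbook Bloom filter relations are m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below only evaluates those formulas. The class and method names (`BloomSizing`, `optimalBits`, `optimalHashes`) are hypothetical, and it is an assumption, not something this README states, that the library sizes its filters this way.

```java
/**
 * Textbook Bloom filter sizing math.
 * Hypothetical helper for illustration; not part of greplin-bloom-filter.
 */
final class BloomSizing {
    private BloomSizing() {}

    /** Number of bits: m = ceil(-n * ln(p) / (ln 2)^2). */
    static long optimalBits(long expectedItems, double falsePositiveRate) {
        if (expectedItems <= 0 || falsePositiveRate <= 0.0 || falsePositiveRate >= 1.0) {
            throw new IllegalArgumentException("need expectedItems > 0 and 0 < rate < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-expectedItems * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    /** Number of hash functions: k = round((m / n) * ln 2), never less than 1. */
    static int optimalHashes(long bits, long expectedItems) {
        if (bits <= 0 || expectedItems <= 0) {
            throw new IllegalArgumentException("need bits > 0 and expectedItems > 0");
        }
        return Math.max(1, (int) Math.round((double) bits / expectedItems * Math.log(2.0)));
    }
}
```

For 10000 expected items and a false-positive rate of 0.000001, these formulas give roughly 288,000 single-bit buckets (about 35 KB) and 20 hash functions. A filter with counting buckets needs several bits per bucket, so it is correspondingly larger.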

Usage

// If the file is null, the Bloom filter is kept in memory only and is not persisted to disk.
final File onDiskFile = new File("/tmp/greplin-bloom-filter.bin");
final int expectedItems = 10000;
final double desiredFalsePositiveRate = 0.000001;

final byte[] exampleItemA = "Hello World!".getBytes(Charset.forName("UTF-8"));
final byte[] exampleItemB = "Goodbye Cruel world".getBytes(Charset.forName("UTF-8"));

BloomFilter bloomFilter = new BloomFilter.NewBuilder(onDiskFile, expectedItems, desiredFalsePositiveRate)
    .force(true) // overwrite any existing file at onDiskFile
    .build();

System.out.println(bloomFilter.contains(exampleItemA)); // false
System.out.println(bloomFilter.contains(exampleItemB)); // false

bloomFilter.add(exampleItemA);
bloomFilter.add(exampleItemB);

System.out.println(bloomFilter.contains(exampleItemA)); // true
System.out.println(bloomFilter.contains(exampleItemB)); // true

bloomFilter.remove(exampleItemB);

System.out.println(bloomFilter.contains(exampleItemA)); // true
System.out.println(bloomFilter.contains(exampleItemB)); // false

bloomFilter.close();
bloomFilter = null;


// now, let's reopen the same bloom filter from the on-disk file
bloomFilter = new BloomFilter.OpenBuilder(onDiskFile).build();

System.out.println(bloomFilter.contains(exampleItemA)); // true
System.out.println(bloomFilter.contains(exampleItemB)); // false

bloomFilter.remove(exampleItemA);

System.out.println(bloomFilter.contains(exampleItemA)); // false
System.out.println(bloomFilter.contains(exampleItemB)); // false

bloomFilter.close();
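
The `remove` calls in the usage example depend on counting buckets: a plain one-bit-per-bucket Bloom filter cannot delete an item without risking the loss of others that share a bucket. The following is a minimal in-memory sketch of that idea, not this library's implementation. The class name `CountingBloomSketch`, the one-byte counters, and the double-hashing scheme built on `Arrays.hashCode` and FNV-1a are all assumptions made for illustration.

```java
import java.util.Arrays;

/**
 * Illustrative counting Bloom filter: each bucket holds a small counter
 * instead of a single bit, so items can be removed again.
 * Hypothetical sketch; not the greplin-bloom-filter implementation.
 */
final class CountingBloomSketch {
    private final byte[] counters; // one saturating counter per bucket
    private final int hashCount;   // number of buckets touched per item

    CountingBloomSketch(int buckets, int hashCount) {
        if (buckets <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("buckets and hashCount must be positive");
        }
        this.counters = new byte[buckets];
        this.hashCount = hashCount;
    }

    /** 32-bit FNV-1a hash, used as the second hash for double hashing. */
    private static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    /** Index of the i-th bucket for an item: (h1 + i * h2) mod bucket count. */
    private int bucket(byte[] item, int i) {
        int h1 = Arrays.hashCode(item);
        int h2 = fnv1a(item) | 1; // forced odd so the step is never zero
        return Math.floorMod(h1 + i * h2, counters.length);
    }

    void add(byte[] item) {
        for (int i = 0; i < hashCount; i++) {
            int idx = bucket(item, i);
            if (counters[idx] != Byte.MAX_VALUE) {
                counters[idx]++; // saturate rather than overflow
            }
        }
    }

    boolean contains(byte[] item) {
        for (int i = 0; i < hashCount; i++) {
            if (counters[bucket(item, i)] == 0) {
                return false; // definitely absent
            }
        }
        return true; // present, or a false positive
    }

    void remove(byte[] item) {
        if (!contains(item)) {
            return; // nothing to undo
        }
        for (int i = 0; i < hashCount; i++) {
            int idx = bucket(item, i);
            // Saturated counters stay put because their true count is unknown.
            if (counters[idx] > 0 && counters[idx] != Byte.MAX_VALUE) {
                counters[idx]--;
            }
        }
    }
}
```

Removing an item that was never added but happens to test positive would decrement counters belonging to other items, so a sketch like this should only be asked to remove items that were actually added.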

Authors

Greplin, Inc.