murmur

murmur is a pure Java implementation of all Murmur hashes, namely Murmur1, Murmur2, and Murmur3. The library is a direct Java port of the original C++ source code. Hash generation has been 100% unit tested against the hashes generated by the C++ code. The library should help in building Bloom filters, or in simply computing a hash to sanity-check data, as Murmur3 is much faster than MD5 and SHA computations.

The library is tested on the following JDK versions:

Why murmur?

murmur was developed because I could not find pure Java implementations of the Murmur1 and Murmur2 hashes. Implementations were available for Murmur3, but some of the legacy code that I maintain needed the Murmur1 and Murmur2 hashes, so I ported the original implementations.

You may find the hash inconsistent with the one produced by the Google Guava library. The hash value is the same; it is the endianness (byte order) of the output that makes it look different. Refer to Issue #3 for more details.

To convert the hash into a byte[] or a hex string, you may use the following code:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Convert a given long value to byte-array.
 * 
 * @param x the long value
 * 
 * @return the byte[] array representation of it
 */
public static byte[] longToBytes(long x) {
	ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);

	// The ByteOrder.LITTLE_ENDIAN format matches the Google Guava toString() format
	buffer.order(ByteOrder.LITTLE_ENDIAN);

	buffer.putLong(x);
	return buffer.array();
}

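// Hex digit lookup table used by bytesToHex. It is not shown in the original
// snippet; lowercase digits are assumed here to match Guava's toString() output.
private static final char[] hexArray = "0123456789abcdef".toCharArray();

/**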
 * Convert a byte-array to hex string.
 * 
 * @param bytes the byte-array
 * 
 * @return the hex string
 */
public static String bytesToHex(byte[] bytes) {
	char[] hexChars = new char[bytes.length * 2];
	for (int j = 0; j < bytes.length; j++) {
		int v = bytes[j] & 0xFF;
		hexChars[j * 2] = hexArray[v >>> 4];
		hexChars[j * 2 + 1] = hexArray[v & 0x0F];
	}
	return new String(hexChars);
}
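
As a quick check of the byte-order point above, the sketch below renders a fixed sample value (not a real Murmur hash) in both byte orders. The class name `HashHexDemo`, the `toHex(long, ByteOrder)` wrapper, and the lowercase `HEX` table are assumptions made for this illustration and are not part of the library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class HashHexDemo {

    // Assumed lowercase digit table, chosen to mirror lowercase hex output.
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Render a long as 16 hex characters using the given byte order. */
    static String toHex(long value, ByteOrder order) {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).order(order).putLong(value).array();
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;          // treat the byte as unsigned
            out[i * 2] = HEX[v >>> 4];        // high nibble
            out[i * 2 + 1] = HEX[v & 0x0F];   // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) {
        long sample = 0x0102030405060708L; // arbitrary sample, not a real hash
        System.out.println(toHex(sample, ByteOrder.BIG_ENDIAN));    // 0102030405060708
        System.out.println(toHex(sample, ByteOrder.LITTLE_ENDIAN)); // 0807060504030201
    }
}
```

The two strings contain the same bytes in opposite order, which is why a hash printed from a plain `long` can look different from one printed byte by byte in little-endian order.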

Features

Performance

The MurmurPerformanceTests.java file contains tests that compare the time taken to hash 1 million random type-4 UUIDs using the various Murmur hashes against the time taken by the MD5, SHA-1, SHA-256, and SHA-512 hashes.
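
The shape of such a comparison can be sketched roughly as follows. This is not the contents of MurmurPerformanceTests.java; it is a minimal illustration that times only the JDK digests, and the class name `DigestTimingSketch` and its helper methods are hypothetical. A Murmur timing would follow the same loop with the library's hash call in place of `digest.digest(input)`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class DigestTimingSketch {

    /** Generate `count` random type-4 UUID strings as UTF-8 byte arrays. */
    static byte[][] randomUuids(int count) {
        byte[][] inputs = new byte[count][];
        for (int i = 0; i < count; i++) {
            inputs[i] = UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8);
        }
        return inputs;
    }

    /** Hash every input with the named JDK digest and return the elapsed milliseconds. */
    static long timeDigest(String algorithm, byte[][] inputs) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        long sink = 0; // accumulate a byte of each result so the work is not optimized away
        long start = System.nanoTime();
        for (byte[] input : inputs) {
            sink += digest.digest(input)[0];
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically unreachable; keeps `sink` live
        }
        return elapsedMs;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[][] inputs = randomUuids(1_000_000);
        for (String algorithm : new String[] { "MD5", "SHA-1", "SHA-256", "SHA-512" }) {
            System.out.println(algorithm + ": " + timeDigest(algorithm, inputs) + " ms");
        }
    }
}
```

A single timed pass like this includes JIT warm-up, so the numbers are only indicative; a harness such as JMH would give more reliable measurements.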

The results of a sample run on my dev machines are shown below:

Windows
-------
Intel i7-2660 CPU @ 3.40GHz
16-GB RAM
Windows 7, 64-bit, Service Pack 1
Oracle JDK 1.7.0_51 build 13, 64-bit Server VM

OS X
----
Intel i7-4870HQ CPU @ 2.50GHz
16-GB RAM
macOS Sierra 10.12.1
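Oracle JDK 1.8.0_101 build 13, 64-bit Server VM

| Algorithm    | Time Taken Windows (ms) | Time Taken OSX (ms) |
|--------------|-------------------------|---------------------|
| MD5          | 369                     | 338                 |
| SHA-1        | 482                     | 415                 |
| SHA-256      | 677                     | 642                 |
| SHA-512      | 906                     | 782                 |
| Murmur-1     | 143                     | 101                 |
| Murmur-2     | 135                     | 123                 |
| Murmur-2-64  | 102                     | 92                  |
| Murmur-3     | 168                     | 119                 |
| Murmur-3-128 | 160                     | 261                 |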

Builds

1.0.0

Downloads

The library can be downloaded from Maven Central using:

<dependency>
    <groupId>com.sangupta</groupId>
    <artifactId>murmur</artifactId>
    <version>1.0.0</version>
</dependency>

Versioning

For transparency and insight into our release cycle, and for striving to maintain backward compatibility, murmur will be maintained under the Semantic Versioning guidelines as much as possible.

Releases will be numbered with the following format:

<major>.<minor>.<patch>

And constructed with the following guidelines:

For more information on SemVer, please visit http://semver.org/.
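
To make the `<major>.<minor>.<patch>` format concrete, here is a minimal sketch of parsing and comparing such version strings. The class `VersionSketch` and its methods are hypothetical and not part of murmur; the sketch handles only plain numeric versions, not the pre-release or build-metadata suffixes that SemVer also allows.

```java
public class VersionSketch {

    /** Parse "<major>.<minor>.<patch>" into three non-negative integers. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected <major>.<minor>.<patch>: " + version);
        }
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
            if (numbers[i] < 0) {
                throw new IllegalArgumentException("Negative component in: " + version);
            }
        }
        return numbers;
    }

    /** Compare two versions by major, then minor, then patch. */
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(compare("1.0.0", "1.0.1") < 0);  // true: patch is lower
        System.out.println(compare("1.10.0", "1.9.0") > 0); // true: numeric, not lexical
    }
}
```

Comparing components as integers rather than as strings matters once any component reaches two digits.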

License

murmur - Pure Java implementation of the Murmur Hash algorithms
Copyright (c) 2014-2018, Sandeep Gupta

The project uses various other libraries that are subject to their
own license terms. See the distribution libraries or the project
documentation for more details.

The entire source is licensed under the Apache License, Version 2.0
(the "License"); you may not use this work except in compliance with
the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.