Perceptual hashing generates a hash of an image such that visually similar images produce similar hashes, which allows multiple images to be compared by an index of similarity. You can find out more at The Hacker Factor and phash.org.
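To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the average hash. It is not necessarily what PHasher does internally; the function name `average_hash` is hypothetical, and the input is assumed to be a grid of grayscale values (0-255) that has already been downscaled.

```php
<?php
// Hypothetical helper, not part of PHasher: average-hash a grid of
// grayscale values (0-255) that was already downscaled, e.g. to 8x8.
function average_hash(array $gray): array
{
    // Flatten the 2D grid into a single list of pixel values.
    $pixels = array_merge(...$gray);
    $mean = array_sum($pixels) / count($pixels);

    // Each bit records whether the pixel is brighter than the mean.
    return array_map(fn($p) => $p > $mean ? 1 : 0, $pixels);
}

// A tiny 2x2 "image": two dark pixels and two bright ones.
$bits = average_hash([[10, 200], [30, 220]]);
echo implode('', $bits), "\n";
```

Because each bit depends only on whether a pixel is above or below the mean, small changes in brightness or compression tend to flip few bits, which is what makes the hashes comparable.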

I've extended this basic method into a class that can also compare images which have been rotated or flipped (but only in 90-degree increments). It can also match images that have been color corrected (to a degree) or otherwise altered. This flexibility does lend itself to false positives, though, and it's ridiculously slow: currently it takes about a second to hash and compare images. In production you would probably want to generate a cache of hashes for existing images and then compare the hash of each new image against the cached hashes.
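The caching idea could look roughly like the sketch below. Everything in it is an assumption for illustration: `hash_similarity` is a hypothetical helper rather than a PHasher method, the hashes are short made-up bit arrays, and the cache is a plain PHP array standing in for a database or file.

```php
<?php
// Hypothetical helper: percentage of positions at which two equal-length
// bit arrays agree (100 means identical hashes).
function hash_similarity(array $a, array $b): float
{
    if (count($a) !== count($b) || count($a) === 0) {
        throw new InvalidArgumentException('Hashes must be non-empty and equal in length.');
    }
    $same = 0;
    foreach ($a as $i => $bit) {
        if ((int) $bit === (int) $b[$i]) {
            $same++;
        }
    }
    return 100.0 * $same / count($a);
}

// Hypothetical cache: image name => previously computed hash.
// In practice these would be computed once and stored.
$cache = [
    'cat.jpg' => [1, 0, 1, 1, 0, 0, 1, 0],
    'dog.jpg' => [0, 1, 0, 0, 1, 1, 0, 1],
];

// Hash of the incoming image, computed a single time.
$incoming = [1, 0, 1, 1, 0, 1, 1, 0];

// Scan the cache for the closest match.
$best = null;
$bestScore = -1.0;
foreach ($cache as $name => $hash) {
    $score = hash_similarity($incoming, $hash);
    if ($score > $bestScore) {
        [$best, $bestScore] = [$name, $score];
    }
}
echo "$best: $bestScore%\n";
```

With this arrangement the expensive step (hashing) happens once per image, and each comparison is only a pass over two short arrays.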

PHasher is available under the MIT license.


###Usage###

include_once('phasher.class.php');
$I = PHasher::Instance();

Yes, it's a singleton.

$I->HashImage($res, $rot=0, $mir=0, $size=8);

Build a perceptual hash out of an image. This is currently still buggy.

Returns an array of binary values representing the perceptual hash. An array is used because it is easier to process.

$I->Compare($res1, $res2, $rot=0, $precision=1);

Build perceptual hashes out of two images, compare them, and return the similarity between them as a percentage.

$I->Detect($res1, $res2, $precision=1);

Compare two images through all rotations and return the highest match value.

$I->FastHashImage($res, $size=8);

Faster hashing method without any rotation. Returns the same array as HashImage.

$I->HashAsString($hash, $hex=true);

Convert a hash array returned by one of the HashImage methods into a string of hex (if $hex == true) or binary (if $hex == false).
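As an illustration of the two output formats, here is a rough stand-in written from the description above. `bits_to_string` is a hypothetical function, not PHasher's own code, and the choice to left-pad to a multiple of four bits is an assumption.

```php
<?php
// Hypothetical stand-in for illustration, not PHasher's own code.
function bits_to_string(array $bits, bool $hex = true): string
{
    $binary = implode('', $bits);
    if (!$hex || $binary === '') {
        return $binary;
    }
    // Pad on the left to a multiple of 4 bits, then map each nibble to a hex digit.
    $width = (int) (ceil(strlen($binary) / 4) * 4);
    $padded = str_pad($binary, $width, '0', STR_PAD_LEFT);
    $out = '';
    foreach (str_split($padded, 4) as $nibble) {
        $out .= dechex(bindec($nibble));
    }
    return $out;
}

$hash = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1];
echo bits_to_string($hash), "\n";        // f0a5
echo bits_to_string($hash, false), "\n"; // 1111000010100101
```

The hex form is a quarter of the length of the binary form, which makes it the more convenient one to store or log.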

$I->HashAsTable($hash, $size=8, $cellsize=10);

Convert a hash array into an HTML table, with each cell either white or black depending on its value. This is only used for demos and debugging; it is slow and mostly useless otherwise, so avoid it.

###Example usage###

Find the percentage of similarity between 'image1.jpg' and 'image2.jpg', comparing at each 90-degree rotation.

$I = PHasher::Instance();
$file1 = 'image1.jpg';
$file2 = 'image2.jpg';
$result = $I->Compare($file1, $file2);
$result90 = $I->Compare($file1, $file2, 90);
$result180 = $I->Compare($file1, $file2, 180);
$result270 = $I->Compare($file1, $file2, 270);
$max_match = max($result, $result90, $result180, $result270);

The above is the same as:

$I = PHasher::Instance();
$file1 = 'image1.jpg';
$file2 = 'image2.jpg';
$result = $I->Detect($file1, $file2);
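The "try every rotation and keep the best score" idea can also be pictured on hashes that have already been computed. The sketch below is an assumption-laden illustration, not how PHasher works internally: `rotate_hash_90` and `best_rotation_match` are hypothetical helpers, and the hash is assumed to be a flat, row-by-row `$size` x `$size` grid of bits.

```php
<?php
// Hypothetical helper: rotate a flat $size x $size bit grid 90 degrees clockwise.
function rotate_hash_90(array $bits, int $size): array
{
    $out = [];
    for ($r = 0; $r < $size; $r++) {
        for ($c = 0; $c < $size; $c++) {
            // The cell at (row r, col c) comes from (row size-1-c, col r).
            $out[] = $bits[($size - 1 - $c) * $size + $r];
        }
    }
    return $out;
}

// Hypothetical helper: best agreement (in percent) over 0, 90, 180 and 270 degrees.
function best_rotation_match(array $a, array $b, int $size): float
{
    $best = 0.0;
    for ($turn = 0; $turn < 4; $turn++) {
        // Count the positions where both hashes hold the same bit.
        $same = count(array_filter(array_map(fn($x, $y) => $x === $y, $a, $b)));
        $best = max($best, 100.0 * $same / count($a));
        $b = rotate_hash_90($b, $size);
    }
    return $best;
}

$a = [1, 0,
      0, 0];
$b = [0, 1,
      0, 0]; // $a rotated 90 degrees clockwise
echo best_rotation_match($a, $b, 2), "\n";
```

Taking the maximum over rotations is also why this kind of matching is more prone to false positives: each extra orientation is another chance for an unrelated image to score well.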

###Notes###

Sebastian Lasse of the Redaktor Project has built an implementation of PHasher in JavaScript, using the Dojo library, which is pretty cool.

I'm still trying to speed the algorithm up and improve the hashing, since it does on occasion produce false positives (it was originally meant to catch image spam on forums and imageboards, so it's as aggressive as it can be). If anyone wants to add a discrete cosine transform to this, by all means make a pull request, because I still don't know how to do it.
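For anyone picking that up, a possible starting point is a plain, unoptimized DCT-II. This is only a sketch under stated assumptions: `dct_1d` and `dct_2d` are hypothetical names, the transform is unnormalized, and the input is assumed to be a square grid of grayscale values with at least two rows. It is not part of PHasher.

```php
<?php
// Hypothetical helper: unnormalized one-dimensional DCT-II.
// X[k] = sum over n of x[n] * cos(pi / N * (n + 0.5) * k)
function dct_1d(array $x): array
{
    $n = count($x);
    $out = [];
    for ($k = 0; $k < $n; $k++) {
        $sum = 0.0;
        foreach ($x as $i => $value) {
            $sum += $value * cos(M_PI / $n * ($i + 0.5) * $k);
        }
        $out[$k] = $sum;
    }
    return $out;
}

// Hypothetical helper: 2D DCT of a square grid (2x2 or larger),
// applying dct_1d to every row and then to every column.
function dct_2d(array $grid): array
{
    $rows = array_map('dct_1d', $grid);
    // array_map(null, ...$rows) transposes the grid, so the second pass
    // transforms the former columns; the last call transposes back.
    $cols = array_map('dct_1d', array_map(null, ...$rows));
    return array_map(null, ...$cols);
}

// A flat image has all of its energy in the top-left (lowest-frequency) term.
$d = dct_2d([[1, 1], [1, 1]]);
printf("%.3f\n", $d[0][0]);
```

DCT-based perceptual hashes typically keep only the low-frequency coefficients in the top-left corner of the result and threshold them to get the hash bits; that step is left out here.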