bigann

This mini crate benchmarks the hnsw-rs crate on sampled u8 vectors from the BIGANN benchmark. See the BIGANN and IRISA pages.

The files bigann_base.bvecs and bigann_query.bvecs must be downloaded and installed in some directory (this amounts to 133 GB). Then, depending on how much of the large file bigann_base.bvecs you want to run on (the first 10M, 100M, or the full 1B vectors), download the corresponding ground truth as explained on the BIGANN web page.
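For orientation, here is a minimal sketch of reading the first vectors of a .bvecs file. It assumes the usual layout of the BIGANN/TEXMEX files, where each record is a 4-byte little-endian integer giving the dimension, followed by that many u8 components. The function name read_bvecs is hypothetical and this is not the loader used by this crate.

```rust
use std::io::{self, Read};

/// Reads up to `max_vectors` records from a `.bvecs` stream.
/// Assumed record layout: 4-byte little-endian i32 dimension `d`,
/// followed by `d` unsigned bytes.
/// `read_bvecs` is a hypothetical helper, not part of this crate.
fn read_bvecs<R: Read>(reader: &mut R, max_vectors: usize) -> io::Result<Vec<Vec<u8>>> {
    let mut vectors = Vec::new();
    let mut dim_buf = [0u8; 4];
    while vectors.len() < max_vectors {
        // Running out of input while reading a header ends the loop
        // (a truncated header is treated the same way as a clean end).
        match reader.read_exact(&mut dim_buf) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let dim = i32::from_le_bytes(dim_buf);
        if dim <= 0 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "non-positive dimension",
            ));
        }
        // A record cut short in the middle of its components is an error.
        let mut v = vec![0u8; dim as usize];
        reader.read_exact(&mut v)?;
        vectors.push(v);
    }
    Ok(vectors)
}
```

Wrapping a std::fs::File in a std::io::BufReader and passing it to this function with a small max_vectors is a cheap way to check that a download is intact before starting a long run.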

To run on the first 10M vectors, download the ground truth for that size and extract the file GT_10M/bigann-10M to replace the file public_query_gt100.bin, which is the ground truth for the full data set. Also check in the source the name of the ground truth file that is loaded.

commandline

bigann --dir DataDir -hnsw dumpname or bigann --dir DataDir --nbdata 10 (or 100 or 1000) to specify the number of millions of data points you want to run on.
For more, see the documentation (cargo doc --no-deps as usual).

Results for the first 10 Million data points.

Results on Intel E5-2630 v3 @ 2.4GHz, 16 cores, 2 threads per core

All parameters are explained in the hnsw-rs documentation.

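| knbn | max_nb_conn | ef_cons | ef_search | extend | keep pruned | recall | req/s | last ratio |
|------|-------------|---------|-----------|--------|-------------|--------|-------|------------|
| 10   | 64          | 100     | 128       | no     | no          | 0.995  | 2610  | 1.0002     |
| 100  | 64          | 100     | 128       | no     | no          | 0.983  | 1350  | 1.0006     |
| 10   | 24          | 100     | 128       | no     | no          | 0.970  | 4845  | 1.001      |
| 100  | 24          | 100     | 128       | no     | no          | 0.923  | 2411  | 1.003      |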

Results on a laptop with i7-10875H CPU @ 2.30GHz, 8 cores, 2 threads per core

Time for Hnsw structure construction: user 1370 s, cpu time 21493 s

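| knbn | max_nb_conn | ef_cons | ef_search | extend | keep pruned | recall | req/s | last ratio |
|------|-------------|---------|-----------|--------|-------------|--------|-------|------------|
| 10   | 24          | 100     | 128       | no     | no          | 0.960  | 5900  | 1.001      |
| 100  | 24          | 100     | 128       | no     | no          | 0.907  | 2800  | 1.004      |
| 10   | 24          | 400     | 128       | no     | no          | 0.972  | 4678  | 1.001      |
| 100  | 24          | 400     | 128       | no     | no          | 0.938  | 2338  | 1.003      |
| 10   | 24          | 800     | 128       | no     | no          | 0.975  | 4313  | 1.001      |
| 100  | 24          | 800     | 128       | no     | no          | 0.9428 | 2151  | 1.0025     |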
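
As a reading aid for the recall and last ratio columns, the sketch below shows one plausible way to compute them for a single query. It rests on two assumptions that are not confirmed by this page: a returned neighbour counts as a hit when its distance does not exceed the true k-th distance, and the last ratio is the last returned distance divided by the true k-th distance. The authoritative definitions are in the hnsw-rs documentation, and the function name is hypothetical.

```rust
/// Per-query quality figures, given distances sorted in increasing order.
/// `found`: distances of the neighbours returned by the search.
/// `truth`: exact distances from the ground truth (at least `k` of them).
/// Returns `(recall, last_ratio)`, or `None` when the inputs are unusable.
/// Assumed definitions (not taken from this crate's source):
///   recall     = hits / k, a hit being a found distance <= the true k-th distance
///   last_ratio = last found distance / true k-th distance
fn recall_and_last_ratio(found: &[f32], truth: &[f32], k: usize) -> Option<(f32, f32)> {
    if k == 0 || truth.len() < k || found.is_empty() {
        return None;
    }
    let kth_true = truth[k - 1];
    // Only the first k returned neighbours are considered.
    let hits = found.iter().take(k).filter(|&&d| d <= kth_true).count();
    let recall = hits as f32 / k as f32;
    // The search may return fewer than k neighbours; use the last one available.
    let last = found[found.len().min(k) - 1];
    // An exact search gives 1.0; the value is not finite when kth_true is 0.
    let last_ratio = last / kth_true;
    Some((recall, last_ratio))
}
```

For example, with found distances [1.0, 2.0, 4.0, 5.0], true distances [1.0, 2.0, 3.0, 4.0] and k = 4, this returns a recall of 0.75 and a last ratio of 1.25. Averaging both values over all queries would give figures comparable in spirit to the table columns.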