Home

Awesome

LSMDB

LSMDB is a proof-of-concept LSM-tree built on top of MDB (AKA LMDB). It only interacts with MDB using the standard MDB API, and doesn't require any changes to the MDB code or file format.

LSMDB provides its own interface to the application, which is similar to MDB except for the following differences:

LSMDB has many shortcomings and isn't intended to be production-ready. However, I think it demonstrates a viable approach for a simple, reliable and high-performance LSM-tree in around 800 lines of code (not counting the code for MDB, which itself is quite small).

Improvements I would desire in a production version:

Notes regarding the benchmarks:

2,000,000 values, random order:

$ time ./test_leveldb
test_leveldb.c

real	0m48.039s
user	0m13.577s
sys	0m3.937s
$ time ./test_lsmdb
test_lsmdb.c

real	1m12.958s
user	0m14.183s
sys	0m4.691s
$ time ./test_mdb
test_mdb.c

real	3m2.929s
user	0m10.282s
sys	0m21.105s

As you can see, LSMDB is a big improvement over plain MDB for this workload, although it still can't catch up to LevelDB (even without compression).

Bonus test, MDB with sequential writes:

$ time ./test_mdb
test_mdb.c

real	0m47.488s
user	0m1.847s
sys	0m1.520s

As you can see, in this test MDB's sequential write performance (using MDB_APPEND) is merely on par with LevelDB's random write performance. Given that LSMDB heavily relies on MDB_APPEND for fast compaction, it's no surprise that it can't beat LevelDB.

I'm not sure why sequential writes with MDB_APPEND are as slow as they are, but I'd speculate it's due to MDB's higher transaction overhead. After each transaction MDB has to update the header to mark the active root, and the update can never be combined with other writes. This seems like the biggest downside of MDB's otherwise brilliant design.

The other big potential improvement for LSMDB is if it supported concurrent compaction, then perhaps it could make better use of disk bandwidth.

At any rate, LSMDB's ability to speed up MDB while using MDB as-is may point the way for future write-optimized database engines.

LSMDB is provided under the OpenLDAP Public License, just like MDB.