# flumelog-offset
A flumelog where the offset into the file is the key. Each value is appended to the log with double-ended framing, and the "sequence" is the position in the physical file where the value starts, which means a read can be done in O(1) time!
Also, this is built on top of aligned-block-file so that caching works very well.
## Usage
initialize with a file and a codec, and wrap with flumedb.
``` js
var OffsetLog = require('flumelog-offset')
var codec = require('flumecodec')
var Flume = require('flumedb')

var db = Flume(OffsetLog(filename, {codec: codec.json}))
  .use(...) //also add some flumeviews

db.append({greets: 'hello!'}, function (err, offset) {

})
```
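The append callback receives the offset at which the value was written. Assuming flumedb exposes `get` for reading a value back by its sequence (a sketch, not taken from the snippet above), that offset can be used to read the value straight back:

``` js
db.append({greets: 'hello!'}, function (err, offset) {
  if (err) throw err
  // `offset` is the byte position in the file where the record starts
  db.get(offset, function (err, value) {
    if (err) throw err
    console.log(value) // => { greets: 'hello!' }
  })
})
```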
## Options
``` js
var OffsetLog = require('flumelog-offset')
var log = OffsetLog('/data/log', {
  blockSize: 1024,         // default is 1024*16
  codec: {encode, decode}, // defaults to no codec, expects buffers. for json use flumecodec/json
  flags: 'r',              // default is 'r+' (from aligned-block-file)
  cache: {set, get},       // default is require('hashlru')(1024)
  offsetCodec: {           // default is require('./frame/offset-codecs')[32]
    byteWidth,             // with the default offset-codec, the file can have
    encode,                // a size of 4GB max.
    decodeAsync
  }
})
```
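For reference, a codec is just a pair of functions: `encode` turns a value into a Buffer and `decode` turns a Buffer back into a value. A minimal hand-rolled JSON codec might look like the sketch below (the `jsonCodec` name is illustrative; in practice just use `flumecodec/json`):

``` js
var OffsetLog = require('flumelog-offset')

// hypothetical hand-rolled codec; the {encode, decode} shape is taken
// from the options above
var jsonCodec = {
  encode: function (value) { return Buffer.from(JSON.stringify(value)) },
  decode: function (buffer) { return JSON.parse(buffer.toString()) }
}

var log = OffsetLog('/data/log', {codec: jsonCodec})
```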
## legacy
If you used `flumelog-offset` before version 3 and want to read your old data, use `require('flumelog-offset/legacy')`.
## recovery
If your system crashes while an append is in progress, it's unlikely but possible to end up with a partially written record. flumelog-offset will rewind to the last good state on the next startup.
After running this for several months (in my personal secure-scuttlebutt instance) I eventually got an error, which led to the changes in this version.
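The double-ended framing (see the format section below) is what makes such a rewind possible: scanning forward from the start, each record can be checked for consistency, and the file truncated after the last record that is intact. A minimal sketch of that idea (not the module's actual recovery code, and assuming the default 32-bit offset codec):

``` js
var fs = require('fs')

// hypothetical helper: returns the length of the longest intact prefix
// of the log file, i.e. where a recovery could truncate to
function lastGoodLength (filename) {
  var buf = fs.readFileSync(filename)
  var pos = 0
  while (pos + 4 <= buf.length) {
    var len = buf.readUInt32BE(pos)
    // one record = length (4) + data (len) + length (4) + file_length (4)
    var end = pos + 4 + len + 4 + 4
    if (end > buf.length) break                        // record was cut short
    if (buf.readUInt32BE(pos + 4 + len) !== len) break // framing is corrupt
    pos = end
  }
  return pos
}
```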
## format
Data is stored in an append-only log, where the byte index of the start of a record is the primary key (the `offset`).
```
offset-> <data.length (UInt32BE)>
         <data ...>
         <data.length (UInt32BE)>
         <file_length (UInt32BE or UInt48BE or UInt53BE)>
```
By writing the length of the data both before and after each record, it becomes possible to scan forward and backward (like a doubly linked list).
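For illustration, framing a single value with the layout above could look like this. It is a sketch rather than the module's actual code: the `frame` helper and its `fileLength` argument are illustrative, and the default 32-bit offset codec (4-byte file_length) is assumed.

``` js
// hypothetical frame builder for one record, following the layout above;
// `fileLength` stands in for whatever value the file_length field holds
// (per the diagram, the length of the file)
function frame (data, fileLength) {
  var buf = Buffer.alloc(4 + data.length + 4 + 4)
  buf.writeUInt32BE(data.length, 0)               // data.length (leading)
  data.copy(buf, 4)                               // data
  buf.writeUInt32BE(data.length, 4 + data.length) // data.length (trailing)
  buf.writeUInt32BE(fileLength, 8 + data.length)  // file_length
  return buf
}
```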
It's very handy to be able to scan backwards, as you often want to see the last N items, and this way you don't need an index for that.
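A sketch of what reading backwards looks like with the framing above (again assuming the default 32-bit offset codec, so the file_length field is 4 bytes wide; `readLast` is an illustrative helper, not part of the module):

``` js
var fs = require('fs')

// hypothetical helper: read the raw bytes of the last record by walking
// backwards from the end of the file
function readLast (filename) {
  var buf = fs.readFileSync(filename)
  var lenPos = buf.length - 4 - 4        // skip file_length, land on trailing data.length
  var len = buf.readUInt32BE(lenPos)
  return buf.slice(lenPos - len, lenPos) // the (still encoded) value
}
```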
## future ideas
- secured file (hashes etc)
- encrypted file
- make the end of the record be the primary key. this might make other code nicer...
## License
MIT