TinyInt.jl

TinyInt.jl is a Julia package for fast compression of integer vectors. Elements of a compressed vector can be indexed just like those of a normal vector.

Requirements

Installation

julia> using Pkg
julia> Pkg.add("TinyInt")
julia> Pkg.test("TinyInt")

What is TinyInt.jl?

TinyInt.jl quickly compresses and decompresses vectors of unsigned integers.

using TinyInt

x = rand(1:100, 20000000)
tinyx = pack(x)
Base.summarysize(x)
# 160000000
Base.summarysize(tinyx)
# 19531282
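
As a quick check on those two sizes, the short calculation below derives the compression ratio and the average storage per element. It uses only the numbers reported above; the variable names are illustrative.

```julia
n = 20_000_000              # number of elements in x
raw_bytes = 160_000_000     # Base.summarysize(x) reported above
packed_bytes = 19_531_282   # Base.summarysize(tinyx) reported above

ratio = raw_bytes / packed_bytes          # how many times smaller the packed vector is
bits_per_element = 8 * packed_bytes / n   # average storage per element, in bits

println(round(ratio, digits = 2))             # 8.19
println(round(bits_per_element, digits = 2))  # 7.81
```

Values in 1:100 need at least 7 bits each, so about 7.8 bits per element is close to that floor, compared with 64 bits per element in the original Int64 vector.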

Compression is fast: in the run below, packing 20 million integers takes about 43 ms.

function pack_example(x)
  tinyx = pack(x)
end

x = rand(1:100, 20000000)
@time pack_example(x)
# 0.043057 seconds (156.52 k allocations: 32.373 MiB)

Decompression is faster still: in the run below, unpacking 20 million integers into a preallocated buffer takes about 10 ms.

function unpack_example(tinyx, out)
  unpack!(tinyx, out)
end

tinyx = pack(rand(1:100, 20000000))
out = zeros(UInt32, 20000000)
@time unpack_example(tinyx, out)
# 0.010429 seconds (5 allocations: 784 bytes)

Individual elements can also be accessed by index, at some CPU cost: in the loop below, indexing the compressed vector is roughly 18 times slower than indexing the plain vector.

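function select_example(x)
  # Sum every element through indexing, so each access goes through x[i]
  y = 0
  for i = 1:length(x)
    y += x[i]
  end
  return y  # return the sum so the loop result is actually used
end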

x = rand(1:100, 20000000)
tinyx = pack(x)
@time select_example(x)
# 0.009573 seconds (4 allocations: 160 bytes)
@time select_example(tinyx)
# 0.168778 seconds (4 allocations: 160 bytes)

These timings were measured on a quad-core Intel® Core™ i7-4790 CPU @ 3.60GHz.

Functions

Function             Description
pack(x)              Compresses an integer vector and returns the compressed vector.
unpack(tinyx)        Decompresses a compressed vector and returns the integers as a new vector.
unpack!(tinyx, out)  Decompresses a compressed vector, storing the integers in out.
tinyx[i]             Returns the i-th integer of a compressed vector.
length(tinyx)        Returns the number of elements in a compressed vector.

Low-level Functions

These functions give developers low-level access to the individual chunks of a compressed vector.

Function                    Description
unpack(tinyx, chunk)        Decompresses one chunk of a compressed vector and returns the decompressed chunk.
unpack!(tinyx, out, chunk)  Decompresses one chunk of a compressed vector, storing the integers in out.

Important! For unpack!(tinyx, out, chunk), ensure out is a contiguous array of 128 UInt32 integers. This function performs no bounds checking and is intended for speed-critical applications.
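
To make the chunk layout concrete, here is a minimal sketch of how an element index could map to a chunk. It assumes chunks hold 128 consecutive elements (matching the buffer size above) and are numbered from 1; the numbering is an assumption, not something this page states. The names CHUNK_LEN, nchunks and locate are hypothetical and not part of TinyInt.jl.

```julia
# Elements per chunk, taken from the 128-element buffer requirement above.
const CHUNK_LEN = 128

# Number of chunks needed to hold n elements (the last one may be partly filled).
nchunks(n) = cld(n, CHUNK_LEN)

# Map a 1-based element index to (chunk, position within chunk).
# Assumes chunks are numbered from 1, which this page does not confirm.
locate(i) = ((i - 1) ÷ CHUNK_LEN + 1, (i - 1) % CHUNK_LEN + 1)

println(nchunks(20_000_000))  # 156250
println(locate(1))            # (1, 1)
println(locate(128))          # (1, 128)
println(locate(129))          # (2, 1)
```

Under those assumptions, the 20-million-element vectors used in the timings above would span 156250 full chunks.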

Notes

References

This Julia package uses the Frame-of-Reference compression code from the SIMDComp library.
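
For readers unfamiliar with the technique, the sketch below illustrates the general idea of Frame-of-Reference packing in plain Julia: subtract the minimum from every value, then store each offset in only as many bits as the largest offset needs. It is a simplified scalar illustration and not the code used by TinyInt.jl or SIMDComp; the names for_pack and for_get are hypothetical, and it assumes every offset fits in a UInt64.

```julia
# Illustrative only: not TinyInt.jl's or SIMDComp's implementation.

# Pack xs by subtracting the minimum (the frame of reference) and storing
# each offset in `bits` bits inside a vector of UInt64 words.
function for_pack(xs::Vector{<:Integer})
    ref = minimum(xs)
    maxoffset = UInt64(maximum(xs) - ref)
    # Bits needed for the largest offset (at least 1).
    bits = max(1, 64 - leading_zeros(maxoffset))
    words = zeros(UInt64, cld(length(xs) * bits, 64))
    for (k, v) in enumerate(xs)
        offset = UInt64(v - ref)
        bitpos = (k - 1) * bits   # zero-based position of the first bit
        w = bitpos ÷ 64 + 1       # word holding the first bit
        shift = bitpos % 64
        words[w] |= offset << shift
        # Spill the high bits into the next word when the value straddles two words.
        if shift + bits > 64
            words[w + 1] |= offset >> (64 - shift)
        end
    end
    return (ref = ref, bits = bits, n = length(xs), words = words)
end

# Read element k back without unpacking the rest.
function for_get(p, k::Integer)
    bitpos = (k - 1) * p.bits
    w = bitpos ÷ 64 + 1
    shift = bitpos % 64
    raw = p.words[w] >> shift
    if shift + p.bits > 64
        raw |= p.words[w + 1] << (64 - shift)
    end
    mask = p.bits == 64 ? typemax(UInt64) : (UInt64(1) << p.bits) - UInt64(1)
    return p.ref + oftype(p.ref, raw & mask)
end

packed = for_pack(collect(1:100))
println(packed.bits)          # 7
println(for_get(packed, 42))  # 42
```

Random access in a scheme like this costs a few shifts and masks per element, which is consistent with indexed access being slower than reading a plain array.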