ZfpCompression.jl

Julia bindings for the data compression library zfp v1.0, written by P Lindstrom (@lindstro). From the zfp documentation:

zfp is an open source library for compressed numerical arrays that support high throughput read and write random access. To achieve high compression ratios, zfp generally uses lossy but optionally error-bounded compression. Bit-for-bit lossless compression is also possible through one of zfp’s compression modes.

zfp works best for 2-4D arrays that exhibit spatial correlation, such as continuous fields from physics simulations, images, regularly sampled terrain surfaces, etc. Although zfp also provides a 1D array class that can be used for 1D signals such as audio, or even unstructured floating-point streams, the compression scheme has not been well optimized for this use case, and rate and quality may not be competitive with floating-point compressors designed specifically for 1D streams.

See the documentation or zfp's website for more information.

Requires Julia v1.3 or newer. As of v0.2.2 the package uses zfp v1.0.0; earlier releases are based on zfp v0.5.5.

Example

Figure (OzoneCompression): Compression of ozone (O₃) from the CAMS data set with zfp at various levels of precision. Compression factors are relative to 64-bit floats and include the vertical dimension; only one vertical level is shown.

Usage

Lossless compression

1-D to 4-D arrays with element type Int32, Int64, Float32 or Float64 can be compressed by calling the zfp_compress function.

julia> using ZfpCompression

julia> A = rand(Float32,100,50);

julia> Ac = zfp_compress(A)
16952-element Array{UInt8,1}:
 0xfd
 0xe1
 0x80
 0x8d
    ⋮

which initializes the zfp compression, preallocates the bitstream used for the compressed array and performs the compression. The compressed array is returned as Array{UInt8,1}. By default, the compressed array includes a header with the required information about the type, size and shape of the uncompressed array as well as lossy compression parameters (see below). This header can be deactivated with

julia> Ac = zfp_compress(A,write_header=false)

A compressed array (with header) can be decompressed as

julia> Ad = zfp_decompress(Ac)

Alternatively, header-less compressed arrays can be decompressed into an existing array (with the same element type, size and number of dimensions as the uncompressed array):

julia> Ad = similar(A)
julia> zfp_decompress!(Ad,Ac)

In this lossless example the compression is reversible

julia> A == Ad
true
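
The round trip above can be sketched as a short script. It uses only the functions and the write_header keyword shown in this section; the variable names and the printed compression factor are illustrative, and the actual sizes depend on the data (random numbers, as used here, compress poorly).

```julia
using ZfpCompression

A = rand(Float32, 100, 50)

# Lossless compression with header (no tol, precision or rate given)
Ac = zfp_compress(A)

# With a header, type, size and shape are read from the compressed array
Ad = zfp_decompress(Ac)

# Header-less variant: decompress into a preallocated array of matching type and size
Ac_nohdr = zfp_compress(A, write_header=false)
Ad_nohdr = similar(A)
zfp_decompress!(Ad_nohdr, Ac_nohdr)

# Compression factor: bytes of the original array over bytes of the compressed array
factor = sizeof(A) / sizeof(Ac)
println("compressed bytes (with header):    ", sizeof(Ac))
println("compressed bytes (without header): ", sizeof(Ac_nohdr))
println("compression factor:                ", round(factor, digits=2))
```

Both decompressed arrays should be bitwise identical to A, since no lossy parameter was given.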

Lossy compression

Lossy compression is achieved by passing one of the additional keyword arguments tol::Real, precision::Int or rate::Real to zfp_compress. If none is specified (as in the example above), the compression is lossless (i.e. reversible). The lossy compression parameters are

tol: the maximum absolute error that is tolerated.
precision: the number of bit planes encoded, which bounds a weak relative error.
rate: the fixed number of bits used per value.

Only one of tol, precision or rate should be specified. For further details see the zfp documentation.

If we can tolerate a maximum absolute error of 1e-3, we may do

julia> Ac = zfp_compress(A,tol=1e-3)
9048-element Array{UInt8,1}:
 0xff
 0x2c
 0x01
 0x1a
 0xf3
 0xbc
 0xea
 0xbb
 0xc6
 0xd4
    ⋮

which reduces the size of the compressed array from 16952 to 9048 bytes. In this case the observed maximum absolute error is about 3e-4, within the requested tolerance of 1e-3.

julia> A2 = zfp_decompress(Ac)
julia> maximum(abs.(A2 - A))
0.00030493736f0

For header-less compression, it is essential to provide the same compression parameters also for zfp_decompress!. Otherwise the decompressed array is flawed. E.g.

julia> Ac = zfp_compress(A,tol=1e-3,write_header=false)
julia> A2 = similar(A)
julia> zfp_decompress!(A2,Ac,tol=1e-3)
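
To get a feeling for the trade-off between tolerance and size, one can loop over a few values of tol and record the observed error and compression factor. This is a minimal sketch: the smooth test field, the chosen tolerances and the results vector are assumptions made for illustration, and the numbers will differ for other data.

```julia
using ZfpCompression

# Smooth 2-D field (illustrative); zfp works best on spatially correlated data
A = Float32[sin(x/10) * cos(y/10) for x in 1:100, y in 1:50]

results = []
for tol in (1e-1, 1e-2, 1e-3, 1e-4)
    Ac = zfp_compress(A, tol=tol)          # header is written by default
    Ad = zfp_decompress(Ac)
    maxerr = maximum(abs.(Ad .- A))        # observed maximum absolute error
    factor = sizeof(A) / sizeof(Ac)        # relative to the Float32 input
    push!(results, (tol=tol, maxerr=maxerr, factor=factor))
    println("tol = ", tol, "   max abs error = ", maxerr,
            "   compression factor = ", round(factor, digits=1))
end
```

A looser tolerance is expected to give a larger compression factor, while the observed error should stay below the requested tol.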

OpenMP multi-threading

You can compress in parallel by passing the nthreads keyword argument to zfp_compress, which triggers multi-threading via OpenMP. The underlying C library currently (zfp v0.5.5) provides no parallel decompression. On Linux, zfp_jll is automatically built with OpenMP enabled; on macOS this is not supported by default.

julia> zfp_compress(temp,nthreads=8)

Compressing a 590MB array A with precision=10 is benchmarked (@btime) as

Number of threads   1       2       4       8       16      32
Time                2.45s   1.46s   0.73s   0.38s   0.25s   0.20s
Speed-up            1x      1.7x    3.4x    6.4x    9.8x    12.3x
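
A rough way to check the scaling on your own machine is sketched below. It uses Base's @elapsed rather than @btime to avoid an extra dependency; the array size and thread counts are arbitrary choices for illustration, and the sketch assumes a zfp_jll build with OpenMP enabled (see the note on macOS above). Timings from a single @elapsed call are noisy and only indicative.

```julia
using ZfpCompression

# Hypothetical test array, about 16 MB of Float32
A = rand(Float32, 256, 256, 64)

# Warm-up call so that compilation time is not included in the measurements
zfp_compress(A, precision=10)

timings = Float64[]
for n in (1, 2, 4)
    t = @elapsed zfp_compress(A, precision=10, nthreads=n)
    push!(timings, t)
    println(n, " thread(s): ", round(t, digits=3), " s")
end
```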

Installation

ZfpCompression.jl is registered in the Julia Registry, so simply do

julia> ] add ZfpCompression

and the C library is installed and built automatically.

Citation

If you use this library for your own work, please give us credit by citing the publication this package was developed for:

Klöwer M, M Razinger, JJ Dominguez, PD Düben and TN Palmer, 2021. Compressing atmospheric data into its real information content, Nature Computational Science, 1, 713-724, doi:10.1038/s43588-021-00156-2