Integer + Floating Point Compression Filter

Transpose Benchmark:

- Speed test

Benchmark w/ 16k buffer

Bold = Pareto frontier.<br> E: Encode, D: Decode<br>

    ./tpbench -s# file -B16K   (# = 8,4,2)

| E cycles/byte | D cycles/byte | Transpose 64 bits AVX2 |
|---|---|---|
| .199 | .134 | TurboTranspose Byte |
| .326 | .201 | Blosc byteshuffle |
| .394 | .260 | TurboTranspose Nibble |
| .848 | .478 | Bitshuffle 8 |

| E cycles/byte | D cycles/byte | Transpose 32 bits AVX2 |
|---|---|---|
| .121 | .102 | TurboTranspose Byte |
| .451 | .139 | Blosc byteshuffle |
| .345 | .229 | TurboTranspose Nibble |
| .773 | .476 | Bitshuffle |

| E cycles/byte | D cycles/byte | Transpose 16 bits AVX2 |
|---|---|---|
| .095 | .071 | TurboTranspose Byte |
| .640 | .108 | Blosc byteshuffle |
| .329 | .198 | TurboTranspose Nibble |
| .758 | 1.177 | Bitshuffle 2 |
| .067 | .067 | memcpy |

| E MB/s | D MB/s | 16 bits ARM 2019.11 |
|---|---|---|
| 8192 | 16384 | TurboTranspose Byte |
| 8192 | 8192 | blosc byteshuffle |
| 1638 | 2341 | TurboTranspose Nibble |
| 356 | 287 | blosc bitshuffle |
| 16384 | 16384 | memcpy |

| E MB/s | D MB/s | 32 bits ARM 2019.11 |
|---|---|---|
| 8192 | 8192 | TurboTranspose Byte |
| 8192 | 8192 | blosc byteshuffle |
| 1820 | 2341 | TurboTranspose Nibble |
| 372 | 252 | blosc bitshuffle |

| E MB/s | D MB/s | 64 bits ARM 2019.11 |
|---|---|---|
| 4096 | 8192 | TurboTranspose Byte |
| 5461 | 5461 | blosc byteshuffle |
| 1490 | 1490 | TurboTranspose Nibble |
| 372 | 260 | blosc bitshuffle |

Transpose/Shuffle benchmark w/ large files (100MB).

MB/s: 1,000,000 bytes/second<br>

    ./tpbench -s# file  (# = 8,4,2)

| E MB/s | D MB/s | Transpose 16 bits AVX2 2019.11 |
|---|---|---|
| 9208 | 9795 | TurboTranspose Byte |
| 8382 | 7689 | Blosc byteshuffle |
| 9377 | 9584 | TurboTranspose Nibble |
| 2750 | 2530 | Blosc bitshuffle |
| 13725 | 13900 | memcpy |

| E MB/s | D MB/s | Transpose 32 bits AVX2 2019.11 |
|---|---|---|
| 9718 | 9713 | TurboTranspose Byte |
| 9181 | 9030 | Blosc byteshuffle |
| 8750 | 9472 | TurboTranspose Nibble |
| 2767 | 2942 | Blosc bitshuffle 4 |

| E MB/s | D MB/s | Transpose 64 bits AVX2 2019.11 |
|---|---|---|
| 8998 | 9573 | TurboTranspose Byte |
| 8721 | 8586 | Blosc byteshuffle 2 |
| 8252 | 9222 | TurboTranspose Nibble |
| 2711 | 2053 | Blosc bitshuffle 2 |

| E MB/s | D MB/s | 16 bits ARM 2019.11 |
|---|---|---|
| 872 | 3998 | TurboTranspose Byte |
| 678 | 3852 | blosc byteshuffle |
| 1365 | 2195 | TurboTranspose Nibble |
| 357 | 280 | blosc bitshuffle |
| 3921 | 3913 | memcpy |

| E MB/s | D MB/s | 32 bits ARM 2019.11 |
|---|---|---|
| 1828 | 3768 | TurboTranspose Byte |
| 1769 | 3713 | blosc byteshuffle |
| 1456 | 2299 | TurboTranspose Nibble |
| 374 | 243 | blosc bitshuffle |

| E MB/s | D MB/s | 64 bits ARM 2019.11 |
|---|---|---|
| 1793 | 3572 | TurboTranspose Byte |
| 1784 | 3544 | blosc byteshuffle |
| 1176 | 1267 | TurboTranspose Nibble |
| 331 | 203 | blosc bitshuffle |

- Compression test (transpose/shuffle+lz4)

:new: Download IcApp, a new benchmark for TurboPFor+TurboTranspose<br> for testing almost all integer and floating point file types.<br> Note: the lossy compression benchmark is available with icapp only.

- Speed test (file msg_sweep3d)
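
| C size | ratio % | C MB/s | D MB/s | Name AVX2 |
|---|---|---|---|---|
| 11,348,554 | 18.1 | 2276 | 4425 | TurboTranspose Nibble+lz |
| 22,489,691 | 35.8 | 1670 | 3881 | TurboTranspose Byte+lz |
| 43,471,376 | 69.2 | 348 | 402 | SPDP |
| 44,626,407 | 71.0 | 1065 | 2101 | bitshuffle+lz |
| 62,865,612 | 100.0 | 13300 | 13300 | memcpy |
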
    ./tpbench -s4 -z *.sp
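
| File | File size | lz % | Tp8lz | Tp4lz | BSlz | spdp1 | spdp9 | Tp4lzt | eTp4lzt |
|---|---|---|---|---|---|---|---|---|---|
| msg_bt | 133194716 | 94.3 | 70.4 | 66.4 | 73.9 | 70.0 | 67.4 | 54.7 | 32.4 |
| msg_lu | 97059484 | 100.4 | 77.1 | 70.4 | 75.4 | 76.8 | 74.0 | 61.0 | 42.2 |
| msg_sppm | 139497932 | 11.7 | 11.6 | 12.6 | 15.4 | 14.4 | 13.7 | 9.0 | 5.6 |
| msg_sp | 145052928 | 100.3 | 68.8 | 63.7 | 68.1 | 67.9 | 65.3 | 52.6 | 24.9 |
| msg_sweep3d | 62865612 | 98.7 | 35.8 | 18.1 | 71.0 | 69.6 | 13.7 | 9.8 | 3.8 |
| num_brain | 70920000 | 100.4 | 76.5 | 71.1 | 77.4 | 79.1 | 73.9 | 63.4 | 32.6 |
| num_comet | 53673984 | 92.4 | 79.0 | 77.6 | 82.1 | 84.5 | 84.6 | 70.1 | 41.7 |
| num_control | 79752372 | 99.4 | 89.5 | 90.7 | 88.1 | 98.3 | 98.5 | 81.4 | 51.2 |
| num_plasma | 17544800 | 100.4 | 0.7 | 0.7 | 75.5 | 30.7 | 2.9 | 0.3 | 0.2 |
| obs_error | 31080408 | 89.2 | 73.1 | 70.0 | 76.9 | 78.3 | 49.4 | 20.5 | 12.2 |
| obs_info | 9465264 | 93.6 | 70.2 | 61.9 | 72.9 | 62.4 | 43.8 | 27.3 | 15.1 |
| obs_spitzer | 99090432 | 98.3 | 90.4 | 95.6 | 93.6 | 100.1 | 100.7 | 80.2 | 52.3 |
| obs_temp | 19967136 | 100.4 | 89.5 | 92.4 | 91.0 | 99.4 | 100.1 | 84.0 | 55.8 |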

Tp8 = Byte transpose, Tp4 = Nibble transpose, lz = lz4<br /> eTp4lzt = lossy compression with lzturbo and allowed error = 0.0001 (1e-4)<br /> Slow but best compression: spdp9 and lzt (lzt = lzturbo,39)

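| File | File size | lz % | Tp8lz | Tp4lz | BSlz | spdp1 | spdp9 | Tp4lzt | eTp4lzt |
|---|---|---|---|---|---|---|---|---|---|
| msg_bt | 266389432 | 94.5 | 77.2 | 76.5 | 81.6 | 77.9 | 75.4 | 69.9 | 16.0 |
| msg_lu | 194118968 | 100.4 | 82.7 | 81.0 | 83.7 | 83.3 | 79.6 | 75.5 | 21.0 |
| msg_sppm | 278995864 | 18.9 | 14.5 | 14.9 | 19.5 | 21.5 | 19.8 | 11.2 | 2.8 |
| msg_sp | 290105856 | 100.4 | 79.2 | 77.5 | 80.2 | 78.8 | 77.1 | 71.3 | 12.4 |
| msg_sweep3d | 125731224 | 98.7 | 50.7 | 36.7 | 80.4 | 76.2 | 33.2 | 27.3 | 1.9 |
| num_brain | 141840000 | 100.4 | 82.6 | 81.1 | 84.5 | 87.8 | 83.3 | 77.0 | 16.3 |
| num_comet | 107347968 | 92.8 | 83.3 | 78.8 | 76.3 | 86.5 | 86.0 | 69.8 | 21.2 |
| num_control | 159504744 | 99.6 | 92.2 | 90.9 | 89.4 | 97.6 | 98.9 | 85.5 | 25.8 |
| num_plasma | 35089600 | 75.2 | 0.7 | 0.7 | 84.5 | 77.3 | 3.0 | 0.3 | 0.1 |
| obs_error | 62160816 | 78.7 | 81.0 | 77.5 | 84.4 | 87.9 | 62.3 | 23.4 | 6.3 |
| obs_info | 18930528 | 92.3 | 75.4 | 70.6 | 82.4 | 81.7 | 51.2 | 33.1 | 7.7 |
| obs_spitzer | 198180864 | 95.4 | 93.2 | 93.7 | 86.4 | 100.1 | 102.4 | 78.0 | 26.9 |
| obs_temp | 39934272 | 100.4 | 93.1 | 93.8 | 91.7 | 98.0 | 97.4 | 88.2 | 28.8 |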

eTp4lzt = lossy compression with allowed error = 0.0001 (1e-4)<br />

Compile:

    git clone https://github.com/powturbo/TurboTranspose.git
    cd TurboTranspose

Linux + Windows MinGW:

    make

or

    make AVX2=1

Windows Visual C++:

    nmake /f makefile.vs

or

    nmake AVX2=1 /f makefile.vs


Testing:

Function usage:

Byte transpose:

    void tpenc(unsigned char *in, unsigned n, unsigned char *out, unsigned esize);
    void tpdec(unsigned char *in, unsigned n, unsigned char *out, unsigned esize);

- in: input buffer
- n: number of bytes
- out: output buffer
- esize: element size in bytes (2,4,8,...)
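
To make the effect of a byte transpose concrete, here is a minimal scalar sketch in C. It is not the library's SIMD implementation: the names `byte_transpose_enc` and `byte_transpose_dec` are hypothetical, and copying leftover bytes unchanged when `n` is not a multiple of `esize` is an assumption of this sketch, not documented behaviour of `tpenc`. The parameters mirror the ones listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar reference for a byte transpose (not the library code).
   Byte k of element i is written to out[k * count + i], so bytes of equal
   significance from neighbouring elements become contiguous. */
static void byte_transpose_enc(const unsigned char *in, size_t n,
                               unsigned char *out, size_t esize)
{
    assert(esize > 0);
    size_t count = n / esize;              /* number of whole elements */
    for (size_t i = 0; i < count; i++)
        for (size_t k = 0; k < esize; k++)
            out[k * count + i] = in[i * esize + k];
    /* Assumption of this sketch: leftover bytes are copied unchanged. */
    memcpy(out + count * esize, in + count * esize, n - count * esize);
}

/* Inverse of byte_transpose_enc: gathers the byte planes back into elements. */
static void byte_transpose_dec(const unsigned char *in, size_t n,
                               unsigned char *out, size_t esize)
{
    assert(esize > 0);
    size_t count = n / esize;
    for (size_t i = 0; i < count; i++)
        for (size_t k = 0; k < esize; k++)
            out[i * esize + k] = in[k * count + i];
    memcpy(out + count * esize, in + count * esize, n - count * esize);
}
```

With `esize = 4`, the two elements `11 22 33 44` and `55 66 77 88` become `11 55 22 66 33 77 44 88`. On real numeric data the high-order bytes of neighbouring values are often equal, so the transposed buffer tends to contain long runs that a general-purpose compressor such as lz4 can exploit.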

Nibble transpose:

    void tp4enc(unsigned char *in, unsigned n, unsigned char *out, unsigned esize);
    void tp4dec(unsigned char *in, unsigned n, unsigned char *out, unsigned esize);

- in: input buffer
- n: number of bytes
- out: output buffer
- esize: element size in bytes (2,4,8,...)
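
A nibble transpose applies the same idea at 4-bit granularity. The sketch below illustrates one possible layout only: the names `nibble_transpose_enc` and `nibble_transpose_dec` are hypothetical, the plane order (low nibble first) is an assumption, and the output is not claimed to match what `tp4enc` actually produces. To keep the code short it also assumes an even number of elements, that is, `n` is a multiple of `2 * esize`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar sketch of a nibble transpose (the layout is an
   assumption, not necessarily the one tp4enc uses). Each element yields
   2 * esize nibbles; nibble p of all elements forms plane p, and two
   consecutive elements share one output byte. */
static void nibble_transpose_enc(const unsigned char *in, size_t n,
                                 unsigned char *out, size_t esize)
{
    assert(esize > 0 && n % (2 * esize) == 0);
    size_t half = n / esize / 2;               /* bytes per nibble plane */
    for (size_t p = 0; p < 2 * esize; p++) {
        size_t k = p / 2;                      /* byte position in element */
        unsigned shift = (p & 1) ? 4 : 0;      /* low nibble plane first */
        for (size_t j = 0; j < half; j++) {
            unsigned a = (in[(2 * j) * esize + k] >> shift) & 0x0F;
            unsigned b = (in[(2 * j + 1) * esize + k] >> shift) & 0x0F;
            out[p * half + j] = (unsigned char)(a | (b << 4));
        }
    }
}

/* Inverse of nibble_transpose_enc: scatters each plane back into place. */
static void nibble_transpose_dec(const unsigned char *in, size_t n,
                                 unsigned char *out, size_t esize)
{
    assert(esize > 0 && n % (2 * esize) == 0);
    size_t half = n / esize / 2;
    memset(out, 0, n);                         /* nibbles are OR-ed in below */
    for (size_t p = 0; p < 2 * esize; p++) {
        size_t k = p / 2;
        unsigned shift = (p & 1) ? 4 : 0;
        for (size_t j = 0; j < half; j++) {
            unsigned v = in[p * half + j];
            out[(2 * j) * esize + k] |= (unsigned char)((v & 0x0F) << shift);
            out[(2 * j + 1) * esize + k] |= (unsigned char)((v >> 4) << shift);
        }
    }
}
```

With `esize = 2`, the input bytes `21 43 65 87` are rearranged to `51 62 73 84` under this layout. Splitting at nibble granularity can expose redundancy that a byte transpose leaves mixed inside a byte, which is in line with the better Tp4lz ratios reported for several files above.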

Environment:

OS/Compiler (64 bits):
Multithreading:

References:

Last update: 25 Oct 2019