Reed-Solomon
Reed-Solomon erasure coding in Haskell, with throughput of multiple GB/s per CPU core, implemented in pure Haskell (plus some SIMD C/assembler).
This is a Haskell port of the GolangReedSolomon library released by Klaus Post, which is itself a port of the JavaReedSolomon library released by Backblaze, with some additional optimizations.
For an introduction to erasure coding, see the post on the Backblaze blog.
Package home: https://github.com/NicolasT/reedsolomon
Performance
Performance depends mainly on the number of parity shards. In rough terms, doubling the number of parity shards will double the encoding time.
Here are the throughput numbers for several combinations of data and parity shard counts. For reference, each shard is 1 MB of random data, and 1 CPU core is used for encoding.
<table>
<thead>
<tr> <th>Data shards</th> <th>Parity shards</th> <th>Parity (% of data)</th> <th>SSSE3 MB/s</th> <th>AVX2 MB/s</th> </tr>
</thead>
<tbody>
<tr><td>5</td><td>2</td><td>40%</td><td>3641.66</td><td>3987.24</td></tr>
<tr><td>10</td><td>2</td><td>20%</td><td>3951.01</td><td>4444.44</td></tr>
<tr><td>10</td><td>4</td><td>40%</td><td>1821.16</td><td>1927.90</td></tr>
<tr><td>50</td><td>20</td><td>40%</td><td>398.09</td><td>431.78</td></tr>
</tbody>
</table>
Example of performance on an Intel(R) Core(TM) i7-4600U CPU @ 3.30GHz with 2 physical cores and 4 logical cores (note: /proc/cpuinfo reports only 2.10GHz). The example uses 10 blocks with 16 MB of data each and 4 parity blocks.
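As a rough check of the scaling rule stated at the start of this section, the sketch below recomputes the parity percentage for each row of the throughput table and compares the 10+2 and 10+4 rows. The names `Row`, `overheadPercent` and `slowdown` are illustrative only and are not part of this library's API; the numbers are copied from the table.

```haskell
import Text.Printf (printf)

-- | One row of the throughput table (hypothetical helper type, not library API).
data Row = Row
  { dataShards   :: Int
  , parityShards :: Int
  , ssse3MBs     :: Double  -- throughput with SSSE3, in MB/s
  , avx2MBs      :: Double  -- throughput with AVX2, in MB/s
  }

-- | Rows copied from the throughput table in this section.
rows :: [Row]
rows =
  [ Row 5  2  3641.66 3987.24
  , Row 10 2  3951.01 4444.44
  , Row 10 4  1821.16 1927.90
  , Row 50 20 398.09  431.78
  ]

-- | Parity shards as a percentage of the number of data shards.
overheadPercent :: Row -> Double
overheadPercent r =
  100 * fromIntegral (parityShards r) / fromIntegral (dataShards r)

-- | Ratio of the throughput of row a to that of row b for one table column.
slowdown :: (Row -> Double) -> Row -> Row -> Double
slowdown column a b = column a / column b

main :: IO ()
main = do
  mapM_
    (\r -> printf "%d+%d: %.0f%% parity\n"
             (dataShards r) (parityShards r) (overheadPercent r))
    rows
  let r10p2 = rows !! 1  -- 10 data shards, 2 parity shards
      r10p4 = rows !! 2  -- 10 data shards, 4 parity shards
  printf "SSSE3 slowdown, 10+2 -> 10+4: %.2f\n" (slowdown ssse3MBs r10p2 r10p4)
  printf "AVX2 slowdown, 10+2 -> 10+4: %.2f\n" (slowdown avx2MBs r10p2 r10p4)
```

Going from 2 to 4 parity shards at 10 data shards lowers throughput by a factor of about 2.2 (SSSE3) and 2.3 (AVX2), which is in line with the rule of thumb that doubling the parity shards roughly doubles the encoding time.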
Links
- Backblaze Open Sources Reed-Solomon Erasure Coding Source Code.
- GolangReedSolomon. Compatible Go library by Klaus Post.
- JavaReedSolomon. Compatible Java library by Backblaze.
- go-erasure. A similar library using cgo, slower in my tests.
- Screaming Fast Galois Field Arithmetic. Basis for SSE3 optimizations.
License
This code, like the original GolangReedSolomon and JavaReedSolomon, is published under an MIT license. See the LICENSE file for more information.