msgpack-numpy-rs

This crate does what Python's msgpack-numpy does, but in Rust and considerably faster. It serializes and deserializes NumPy scalars and arrays to and from the MessagePack format, using the same serialized format as the Python package, so the two can interoperate. This makes it possible to process NumPy arrays in a separate Rust service over IPC, or to save Machine Learning results to disk (ideally paired with compression).

Overview

Motivation

There is no consensus on a format for serializing NumPy arrays that is both flexible and efficient. NumPy arrays are unusual in that they are essentially blocks of bytes, yet they also carry numeric types and shapes. Programmers working on Machine Learning problems have found MessagePack attractive: it is compact, has a type system, and is supported in a wide range of languages. The Python package msgpack-numpy provides serialization and deserialization for NumPy arrays, standalone or nested at arbitrary depth inside other structures, so they can be sent over the network or saved to disk in a compact format.

If you are looking for a more production-oriented, performant format, you might consider Apache Arrow, Parquet, or Protocol Buffers. However, these formats are not as flexible as MessagePack when you need to store intermediate Machine Learning results. In practice, MessagePack with NumPy array support can be a good choice for many of these use cases.

This Rust version aims to be a faster alternative to the Python version while using the same serialized format, so the two can interoperate. You can use it as a building block for your own Machine Learning pipeline in Rust, or as a way to communicate between Python and Rust.

Examples

use std::fs::File;
use std::io::Read;
use msgpack_numpy::NDArray;

fn main() {
    let filepath = "tests/data/ndarray_bool.msgpack";
    let mut file = File::open(filepath).unwrap();
    let mut buf = Vec::new();
    file.read_to_end(&mut buf).unwrap();
    let deserialized: NDArray = rmp_serde::from_slice(&buf).unwrap();

    match &deserialized {
        NDArray::Bool(array) => {
            println!("{:?}", array);
        }
        _ => panic!("Expected NDArray::Bool"),
    }

    // returns an Option, None if conversion is not possible
    let arr = deserialized.into_u8_array().unwrap();
    println!("{:?}", arr);
}

Please see more in examples/.

Benchmarks

All benchmarks were run on a single CPU core of an Ubuntu 22.04 instance with an Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz. The Rust version was compiled in release mode. Only in-memory serialization and deserialization of arrays is measured. See benches/ for the benchmark code.

The results below are for the owned NDArray type.

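| Array Type | Array Size | Arrays | Operation | Python (ms) | Rust (ms) | Speedup |
|---|---|---|---|---|---|---|
| f32 | 1000 | 10000 | Serialize | 56.4 | 17.1 | 3.3x |
| f32 | 1000 | 10000 | Deserialize | 26.1 | 18.9 | 1.4x |
| f32 | 100 | 100000 | Serialize | 226.1 | 27.1 | 8.3x |
| f32 | 100 | 100000 | Deserialize | 199.3 | 50.5 | 3.9x |
| f16 | 1000 | 10000 | Serialize | 33.5 | 4.0 | 8.5x |
| f16 | 1000 | 10000 | Deserialize | 21.2 | 5.2 | 4.1x |
| f16 | 100 | 100000 | Serialize | 198.9 | 12.1 | 16.5x |
| f16 | 100 | 100000 | Deserialize | 195.2 | 29.5 | 6.6x |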

The Rust implementation is significantly faster than Python in every case, with the largest speedups for serializing small arrays. The Python version's serialization and deserialization logic runs in C through NumPy, but with small arrays this advantage shrinks because each array is a separate Python object with its own overhead. Notably, the Python version deserializes faster than it serializes, while the Rust version serializes faster than it deserializes. Array sizes in this range are typical for Machine Learning use cases such as feature embeddings, so the Rust version can help when performance matters.
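To make the per-object overhead concrete, the batch timings can be divided by the number of arrays. The sketch below does this for the f32 serialization rows of the table above. It assumes that "Array Size" is the element count per array and "Arrays" is the number of arrays per batch; `micros_per_array` is a helper name introduced here for illustration.

```rust
/// Average time per array in microseconds, given a batch time in milliseconds.
fn micros_per_array(batch_ms: f64, arrays: u32) -> f64 {
    batch_ms * 1000.0 / f64::from(arrays)
}

fn main() {
    // f32 serialization rows copied from the benchmark table:
    // (array size, arrays per batch, Python ms, Rust ms)
    let rows = [
        (1000_u32, 10_000_u32, 56.4, 17.1),
        (100, 100_000, 226.1, 27.1),
    ];
    for (size, arrays, python_ms, rust_ms) in rows {
        println!(
            "size {}: Python {:.2} us/array, Rust {:.2} us/array",
            size,
            micros_per_array(python_ms, arrays),
            micros_per_array(rust_ms, arrays),
        );
    }
}
```

Under that reading, shrinking the arrays tenfold cuts Python's per-array cost by only about 2.5x (5.64 to 2.26 microseconds), while Rust's drops by about 6.3x (1.71 to 0.27 microseconds), which is consistent with a larger fixed cost per array on the Python side.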

Zero-Copy Deserialization (When Alignment Allows)

For the arrays above, the array buffers always seem to be misaligned during deserialization, so the data cannot simply be borrowed from the serialized slice as a typed array. Instead, we pay for an extra allocation. This happens because the MessagePack format does not guarantee alignment.

In most other cases, however, there is a good chance that the buffer is aligned, and the array data can then be borrowed directly. The following benchmarks demonstrate this, using CowNDArray with arrays of shape (1024, 2048), 10 arrays per run.
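The borrow-or-copy decision described here can be sketched with the standard library alone. This is a minimal illustration, not this crate's implementation: `f32s_from_le_bytes` is a hypothetical helper, and it assumes the payload holds little-endian f32 values.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of this crate): view little-endian f32 bytes
/// as `[f32]`, borrowing when the buffer is aligned and copying otherwise.
fn f32s_from_le_bytes<'a>(bytes: &'a [u8]) -> Cow<'a, [f32]> {
    const SIZE: usize = std::mem::size_of::<f32>();
    assert!(bytes.len() % SIZE == 0, "length must be a multiple of 4");

    let aligned = bytes.as_ptr() as usize % std::mem::align_of::<f32>() == 0;
    if aligned && cfg!(target_endian = "little") {
        // SAFETY: the pointer is aligned for f32, the length is a whole number
        // of f32 values, every bit pattern is a valid f32, and the returned
        // slice borrows from `bytes` for the same lifetime.
        let floats =
            unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<f32>(), bytes.len() / SIZE) };
        return Cow::Borrowed(floats); // zero-copy path
    }
    // Misaligned buffer (or big-endian host): decode into a new allocation.
    Cow::Owned(
        bytes
            .chunks_exact(SIZE)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

fn main() {
    let payload: Vec<u8> = [1.0_f32, 2.0].iter().flat_map(|v| v.to_le_bytes()).collect();
    let mut buf = vec![0u8; 16];
    // Offset of the first 4-byte-aligned address inside `buf`.
    let base = (4 - buf.as_ptr() as usize % 4) % 4;
    for shift in [0, 1] {
        let start = base + shift;
        buf[start..start + 8].copy_from_slice(&payload);
        let floats = f32s_from_le_bytes(&buf[start..start + 8]);
        let how = match &floats {
            Cow::Borrowed(_) => "borrowed",
            Cow::Owned(_) => "copied",
        };
        println!("shift {}: {} {:?}", shift, how, floats);
    }
}
```

The same values come back either way. Only the aligned case avoids the allocation, and the borrowed result keeps the input bytes alive for as long as it is used.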

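| Data Type | Operation | Python (ms) | Rust (ms) | Speedup |
|---|---|---|---|---|
| f16 | Serialize | 42.8 | 23.4 | 1.8x |
| f16 | Deserialize (NDArray) | 21.6 | 20.4 | 1.1x |
| f16 | Deserialize (CowNDArray) | - | 10.5 | 2.1x |
| f32 | Serialize | 87.8 | 43.5 | 2.0x |
| f32 | Deserialize (NDArray) | 44.2 | 41.4 | 1.1x |
| f32 | Deserialize (CowNDArray) | - | 34.5 | 1.3x |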

Deserialization time went down! The chance of good alignment is about one half for f16 and one quarter for f32, so the amortized cost of allocation is lower and the benefit of zero-copy deserialization becomes visible. The drawback is that CowNDArray only supports rmp_serde::from_slice (consuming from a slice that is fully in memory), not rmp_serde::from_read (consuming from a reader in a streaming way). The serialized bytes must therefore stay alive for as long as the deserialized value is used, which the compiler checks.
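The one-half and one-quarter figures are what you would expect if the payload can start at an arbitrary byte address. The sketch below assumes that f16 data needs 2-byte alignment and f32 data needs 4-byte alignment, and counts how many start offsets in a 64-byte window are suitably aligned. `aligned_offsets` is a name introduced here for illustration.

```rust
/// Count the start offsets within `buf` whose address is a multiple of `align`.
fn aligned_offsets(buf: &[u8], align: usize) -> usize {
    let base = buf.as_ptr() as usize;
    (0..buf.len())
        .filter(|offset| (base + offset) % align == 0)
        .count()
}

fn main() {
    let buf = vec![0u8; 64];
    // Assumed alignment requirements: 2 bytes for f16, 4 bytes for f32.
    for (name, align) in [("f16", 2), ("f32", 4)] {
        let hits = aligned_offsets(&buf, align);
        println!("{}: {} of {} offsets are aligned", name, hits, buf.len());
    }
}
```

Whatever address the buffer happens to start at, 32 of the 64 offsets are 2-byte aligned and 16 are 4-byte aligned.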

If you really want complete zero-copy deserialization, you should try some other format, like Apache Arrow.

Notes

Scalar Type

There is no good reason to serialize using Scalar, because you end up representing a primitive value with a lot of metadata attached. This type exists for compatibility reasons: it helps deserialize scalars that were already serialized this way.
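To give a rough sense of the overhead, the sketch below hand-encodes a single f32 twice: once as a plain MessagePack float, and once wrapped in a scalar map. The map layout is an assumption based on how the Python package is understood to encode scalars: three entries with binary keys nd, type and data. The exact bytes produced by either implementation may differ. All function names are introduced here for illustration.

```rust
/// Plain MessagePack float32: marker 0xca followed by 4 big-endian bytes.
fn plain_f32(value: f32) -> Vec<u8> {
    let mut out = vec![0xca];
    out.extend_from_slice(&value.to_be_bytes());
    out
}

/// MessagePack bin8: marker 0xc4, a one-byte length, then the payload.
fn bin8(payload: &[u8]) -> Vec<u8> {
    let mut out = vec![0xc4, payload.len() as u8];
    out.extend_from_slice(payload);
    out
}

/// ASSUMED scalar layout (not confirmed by this README):
/// {b"nd": false, b"type": "<f4", b"data": <4 little-endian bytes>}
fn wrapped_f32(value: f32) -> Vec<u8> {
    let mut out = vec![0x83]; // fixmap with 3 entries
    out.extend(bin8(b"nd"));
    out.push(0xc2); // false
    out.extend(bin8(b"type"));
    out.push(0xa3); // fixstr of length 3
    out.extend_from_slice(b"<f4");
    out.extend(bin8(b"data"));
    out.extend(bin8(&value.to_le_bytes()));
    out
}

fn main() {
    println!("plain:   {} bytes", plain_f32(1.5).len());
    println!("wrapped: {} bytes", wrapped_f32(1.5).len());
}
```

Under these assumptions the wrapped form takes 28 bytes against 5 for the plain float, so most of the payload is metadata.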

Dependency on ndarray

This crate uses types from ndarray in its public API. ndarray is re-exported in the crate root so that you do not need to specify it as a direct dependency.

Furthermore, this crate is compatible with multiple versions of ndarray and therefore depends on a range of semver-incompatible versions, currently >=0.15, <0.17. Cargo does not automatically choose a single version of ndarray if you depend, directly or indirectly, on anything but that exact range. For example, as of August 2024, this crate may get 0.16.1 as its own, separate dependency even if you pin ndarray to 0.15.6 in your own project. This can come as a surprise, and it produces compilation errors like:

     = note: `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` and `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` have similar names, but are actually distinct types
note: `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` is defined in crate `ndarray`
    --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ndarray-0.15.6/src/lib.rs:1268:1
     |
1268 | pub struct ArrayBase<S, D>
     | ^^^^^^^^^^^^^^^^^^^^^^^^^^
note: `ArrayBase<CowRepr<'_, f32>, Dim<IxDynImpl>>` is defined in crate `ndarray`
    --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ndarray-0.16.1/src/lib.rs:1280:1
     |
1280 | pub struct ArrayBase<S, D>
     | ^^^^^^^^^^^^^^^^^^^^^^^^^^
     = note: perhaps two different versions of crate `ndarray` are being used?

It can therefore be necessary to manually unify these dependencies. For example, if you specify the following dependencies

msgpack-numpy = "0.1.3"
ndarray = "0.15.6"

this will currently depend on both version 0.15.6 and 0.16.1 of ndarray by default even though 0.15.6 is within the range >=0.15, <0.17. To fix this, you can run

cargo update --package ndarray:0.16.1 --precise 0.15.6

to achieve a single dependency on version 0.15.6 of ndarray. Check your lock file to verify that this worked.

License

This project is licensed under the MIT license.