PureHDF

A pure C# library without native dependencies that makes it easy to read and write HDF5 files (groups, datasets, attributes, ...).

This library runs on all platforms (ARM, x86, x64) and operating systems (Linux, Windows, macOS, Raspbian, etc.) that are supported by the .NET ecosystem, without special configuration.

The implementation follows the HDF5 File Format Specification (HDF5 1.10).

Please read the docs for samples and API documentation.

Version 1

The minimum supported target framework is .NET Standard 2.0.

Version 1 of PureHDF supports all .NET versions starting with .NET Framework 4.7.2 and continues to receive bug fixes. Features will be backported upon request if feasible.

Version 2

The minimum supported target framework is .NET 6.0.

To keep the code base clean, version 2 of PureHDF targets active .NET versions only, which are .NET 6 and .NET 8 as of now (August 2024).

Installation

dotnet add package PureHDF

Quick Start

Reading

// root group
var file = H5File.OpenRead("path/to/file.h5");

// sub group
var group = file.Group("path/to/group");

// attribute
var attribute = group.Attribute("my-attribute");
var attributeData = attribute.Read<int>();

// dataset
var dataset = group.Dataset("my-dataset");
var datasetData = dataset.Read<double>();

See the docs to learn more about data types, multidimensional arrays, chunks, compression, slicing and more.

Writing

The first step is to create a new H5File instance:

var file = new H5File();

An H5File derives from the H5Group type because it represents the root group. H5Group implements the IDictionary interface: the keys represent the links in an HDF5 file, and each value determines the type of its link, which is either another H5Group or an H5Dataset.

You can create an empty group like this:

var group = new H5Group();

If the group should contain datasets, add them using the dictionary collection initializer, just as you would with a normal dictionary:

var group = new H5Group()
{
    ["numerical-dataset"] = new double[] { 2.0, 3.1, 4.2 },
    ["string-dataset"] = new string[] { "One", "Two", "Three" }
};
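
Because H5Group implements the IDictionary interface, the same group can presumably also be built step by step through the indexer, which is what the collection initializer compiles down to. The following is a minimal sketch under that assumption; the `using PureHDF;` namespace import is assumed, and the link names are only examples.

```csharp
using PureHDF;

// Start with an empty group and fill it through the indexer.
// Each assignment creates one link; plain data becomes a dataset.
var group = new H5Group();
group["numerical-dataset"] = new double[] { 2.0, 3.1, 4.2 };
group["string-dataset"] = new string[] { "One", "Two", "Three" };

// A link whose value is an H5Group becomes a subgroup of the root group.
var file = new H5File();
file["my-group"] = group;
```

This form is handy when the set of datasets is only known at run time, for example when the names come from a loop.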

Datasets and attributes can both be created either by instantiating their specific class (H5Dataset, H5Attribute) or by just providing some kind of data. This data can be nearly anything: arrays, scalars, numerical values, strings, anonymous types, enums, complex objects, structs, bool values, etc. However, whenever you want to provide more details like the dimensionality of the attribute or dataset, the chunk layout or the filters to be applied to a dataset, you need to instantiate the appropriate class.
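
As a rough illustration of the paragraph above, the sketch below mixes several of the listed kinds of data in one file definition. The `using PureHDF;` import and the single-argument `H5Dataset` constructor are assumptions that are not shown elsewhere in this README, so check the docs for the exact signatures. All link names are made up for the example.

```csharp
using PureHDF;

var file = new H5File()
{
    // Plain values are wrapped into datasets automatically.
    ["scalar"] = 42,
    ["text"] = "hello",
    ["flag"] = true,
    ["array"] = new double[] { 2.0, 3.1, 4.2 },

    // An array of anonymous-type instances (one record per element).
    ["records"] = new[]
    {
        new { Id = 1, Value = 2.5 },
        new { Id = 2, Value = 3.5 }
    },

    // Assumption: H5Dataset can be constructed from the data alone.
    // Extra details (dimensions, chunks, filters) would be passed here.
    ["explicit"] = new H5Dataset(new int[] { 1, 2, 3 })
};
```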

But first, let's see how to add attributes. Attributes cannot be added directly using the dictionary collection initializer because that is reserved for datasets and groups. However, every H5Group has an Attributes property which accepts our attributes:

var group = new H5Group()
{
    Attributes = new()
    {
        ["numerical-attribute"] = new double[] { 2.0, 3.1, 4.2 },
        ["string-attribute"] = new string[] { "One", "Two", "Three" }
    }
};

The full example with the root group, a subgroup, two datasets and two attributes looks like this:

var file = new H5File()
{
    ["my-group"] = new H5Group()
    {
        ["numerical-dataset"] = new double[] { 2.0, 3.1, 4.2 },
        ["string-dataset"] = new string[] { "One", "Two", "Three" },
        Attributes = new()
        {
            ["numerical-attribute"] = new double[] { 2.0, 3.1, 4.2 },
            ["string-attribute"] = new string[] { "One", "Two", "Three" }
        }
    }
};

The last step is to write the defined file to the drive:

file.Write("path/to/file.h5");
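
To check that a written file is usable, it can be opened again with the reading API from the Quick Start. The sketch below is a hedged round trip, not an official sample: the temporary file path, the link names, and the assumption that the opened file is disposable (`using var`) are mine. The generic arguments to `Read` mirror the Reading section above; whether they should name the element type or the full array type may depend on the PureHDF version, so scalar values are used here and the results are held in `var`.

```csharp
using System;
using System.IO;
using PureHDF;

// Hypothetical output location for this example.
var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.h5");

// Define a file with one subgroup, one dataset and one attribute.
var file = new H5File()
{
    ["my-group"] = new H5Group()
    {
        ["scalar-dataset"] = 4.2,
        Attributes = new()
        {
            ["scalar-attribute"] = 7
        }
    }
};

file.Write(path);

// Open the file again and navigate to the written objects.
// Assumption: the object returned by OpenRead implements IDisposable.
using var readBack = H5File.OpenRead(path);
var group = readBack.Group("my-group");

var datasetData = group.Dataset("scalar-dataset").Read<double>();
var attributeData = group.Attribute("scalar-attribute").Read<int>();
```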

See the docs to learn more about data types, multidimensional arrays, chunks, compression, slicing and more.

Development

The tests of PureHDF are executed against .NET 6 and .NET 8, so both runtimes are required. Please note that, for a currently unknown reason, the writing tests cannot be run in parallel with other tests: some unrelated temp files are reported as in use, although they should not be, and thus cannot be accessed by the unit tests.

If you are using Visual Studio Code as your IDE, you can simply execute one of the predefined test tasks by selecting Tasks: Run Task from the Command Palette (Ctrl+Shift+P). The following test tasks are predefined:

The HSDS tests require a Python installation to be present on the system with the venv package available.

Comparison Table

Overwhelmed by the number of different HDF5 libraries? Here is a comparison table:

Note: The following table considers only projects listed on NuGet.org.

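| Name | Arch | Platform | Kind | Mode | Version | License | Maintainer | Comment |
|---|---|---|---|---|---|---|---|---|
| v1.10 | | | | | | | | |
| PureHDF | all | all | managed | rw | 1.10.* | MIT | Apollo3zehn | |
| HDF5-CSharp | x86,x64 | Win,Lin,Mac | HL | rw | 1.10.6 | MIT | LiorBanai | |
| SciSharp.Keras.HDF5 | x86,x64 | Win,Lin,Mac | HL | rw | 1.10.5 | MIT | SciSharp | fork of HDF-CSharp |
| ILNumerics.IO.HDF5 | x64 | Win,Lin | HL | rw | ? | proprietary | IL_Numerics_GmbH | probably 1.10 |
| LiteHDF | x86,x64 | Win,Lin,Mac | HL | ro | 1.10.5 | MIT | silkfire | |
| hdflib | x86,x64 | Windows | HL | wo | 1.10.6 | MIT | bdebree | |
| Mbc.Hdf5Utils | x86,x64 | Win,Lin,Mac | HL | rw | 1.10.6 | Apache-2.0 | bqstony | |
| HDF.PInvoke | x86,x64 | Windows | bindings | rw | 1.8,1.10.6 | HDF5 | hdf,gheber | |
| HDF.PInvoke.1.10 | x86,x64 | Win,Lin,Mac | bindings | rw | 1.10.6 | HDF5 | hdf,Apollo3zehn | |
| HDF.PInvoke.NETStandard | x86,x64 | Win,Lin,Mac | bindings | rw | 1.10.5 | HDF5 | surban | |
| v1.8 | | | | | | | | |
| HDF5DotNet.x64 | x64 | Windows | HL | rw | 1.8 | HDF5 | thieum | |
| HDF5DotNet.x86 | x86 | Windows | HL | rw | 1.8 | HDF5 | thieum | |
| sharpHDF | x64 | Windows | HL | rw | 1.8 | MIT | bengecko | |
| HDF.PInvoke | x86,x64 | Windows | bindings | rw | 1.8,1.10.6 | HDF5 | hdf,gheber | |
| hdf5-v120-complete | x86,x64 | Windows | native | rw | 1.8 | HDF5 | daniel.gracia | |
| hdf5-v120 | x86,x64 | Windows | native | rw | 1.8 | HDF5 | keen | |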

Abbreviations:

| Term | .NET API | Native dependencies |
|---|---|---|
| managed | high-level | none |
| HL | high-level | C-library |
| bindings | low-level | C-library |
| native | none | C-library |