Paintera Conversion Helper
Script to assist conversion of N5 datasets to Paintera-friendly formats, as specified here.
Installation
Releases can be downloaded for Ubuntu and macOS from the GitHub Releases page.
Usage
This conversion tool currently supports any number of datasets (raw or label) with a single (global) block size, and will output to a single N5 group in a paintera-compatible format.
By default, Spark runs locally:
paintera-convert to-paintera [...]
paintera-convert can also convert Paintera label sources to scalar label datasets:
paintera-convert extract-to-scalar [...]
extract-to-scalar will extract the highest-resolution scale level of a Paintera dataset as a scalar uint64 dataset. This is useful for using Paintera painted labels (and assignments) in downstream processing, e.g. classifier training. Optionally, the fragment-segment-assignment can be considered and additional assignments can be added. See extract-to-scalar --help for more details.
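To illustrate what considering the fragment-segment assignment means for the scalar output, here is a minimal Python sketch. It assumes the assignment is a mapping from fragment ID to segment ID and that fragments without an entry keep their own ID. The function name apply_assignment and the toy data are hypothetical and are not part of paintera-convert.

```python
from typing import Dict, List


def apply_assignment(fragments: List[int], assignment: Dict[int, int]) -> List[int]:
    """Relabel fragment IDs with segment IDs (hypothetical helper)."""
    # Assumption: a fragment without an assignment entry keeps its own ID.
    return [assignment.get(fragment, fragment) for fragment in fragments]


# Toy flattened label volume of fragment IDs.
fragments = [1, 1, 2, 3, 3, 4]
# Hypothetical assignment: fragments 1 and 2 belong to segment 10.
assignment = {1: 10, 2: 10}

segments = apply_assignment(fragments, assignment)
print(segments)
```

In this sketch, fragments 1 and 2 end up with the same label, while fragments 3 and 4 are left unchanged because they have no assignment entry.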
Installation
paintera-conversion-helper is available on conda in the hanslovsky channel:
conda install -c conda-forge -c hanslovsky paintera-conversion-helper
If necessary, you can install openjdk and maven from conda-forge:
conda install -c conda-forge maven openjdk
Alternatively, paintera-conversion-helper can be installed from PyPI through pip:
pip install paintera-conversion-helper
Compile
To compile the conversion helper into a jar, run:
mvn -Denforcer.skip=true clean package
To run locally, build a fat jar including Spark:
mvn -Denforcer.skip=true -PfatWithSpark clean package
To run on the Janelia cluster, build a fat jar without Spark:
mvn -Denforcer.skip=true -Pfat clean package
Usage Example
To convert the raw and neuron_ids datasets of sample A of the CREMI challenge into Paintera format with mipmaps on Linux, assuming that you downloaded the data into $HOME/Downloads, run:
paintera-convert to-paintera \
--scale 2,2,1 2,2,1 2,2,1 2 2 \
--reverse-array-attributes \
--output-container=paintera-converted.n5 \
--container=sample_A_20160501.hdf \
-d volumes/raw \
--target-dataset=volumes/raw2 \
--dataset-scale 3,3,1 3,3,1 2 2 \
--dataset-resolution 4,4,40.0 \
-d volumes/labels/neuron_ids
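The --scale values in the command above are relative downsampling factors per level, and a single integer u stands for u,u,u. The following Python sketch shows how such a list could be expanded and accumulated into factors relative to the full-resolution data. It assumes that each level is downsampled from the previous one, so the cumulative factor is the running product. The function names parse_scale_spec and cumulative_factors are hypothetical and do not mirror the tool's internals.

```python
from typing import List, Tuple

Factor = Tuple[int, int, int]


def parse_scale_spec(spec: str) -> Factor:
    """Parse 'x,y,z' or a single integer 'u' (read as u,u,u)."""
    parts = [int(p) for p in spec.split(",")]
    if len(parts) == 1:
        return (parts[0], parts[0], parts[0])
    if len(parts) != 3:
        raise ValueError(f"expected 'x,y,z' or 'u', got {spec!r}")
    return (parts[0], parts[1], parts[2])


def cumulative_factors(specs: List[str]) -> List[Factor]:
    """Multiply relative factors level by level (assumed semantics)."""
    total: Factor = (1, 1, 1)
    result: List[Factor] = []
    for spec in specs:
        rel = parse_scale_spec(spec)
        total = (total[0] * rel[0], total[1] * rel[1], total[2] * rel[2])
        result.append(total)
    return result


# The value passed to --scale in the example command.
levels = cumulative_factors("2,2,1 2,2,1 2,2,1 2 2".split())
for index, factor in enumerate(levels, start=1):
    print(f"level {index}: {factor}")
```

Under these assumptions, the five scale entries describe five downsampled levels, with the first three levels leaving z untouched.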
Usage Help
$ paintera-convert to-paintera --help
Usage: paintera-convert to-paintera [[--block-size=X,Y,Z|U] [--scale=X,Y,Z|U...] [--scale=X,Y,Z|U...]...
[--downsample-block-sizes=X,Y,Z|U...] [--downsample-block-sizes=X,Y,Z|U...]...
[--reverse-array-attributes] [--resolution=X,Y,Z|U] [--offset=X,Y,Z|U] [-m=N...]
[-m=N...]... [--label-block-lookup-n5-block-size=N]
[--winner-takes-all-downsampling]] ([--container=CONTAINER] [] (-d=DATASET
[--target-dataset=TARGET_DATASET] [[--type=TYPE] ])...)... [--overwrite-existing]
[--help] --output-container=OUTPUT_CONTAINER [--spark-master=<sparkMaster>]
Options:
--block-size=X,Y,Z|U Use --container-block-size and --dataset-block-size for container and dataset specific
block sizes, respectively.
--scale=X,Y,Z|U... Relative downsampling factors for each level in the format x,y,z, where x,y,z are
integers. Single integers u are interpreted as u,u,u.
Use --container-scale and --dataset-scale for container and dataset specific scales,
respectively.
--downsample-block-sizes=X,Y,Z|U...
Use --container-downsample-block-sizes and --dataset-downsample-block-sizes for container
and dataset specific block sizes, respectively.
--reverse-array-attributes
Reverse array attributes like resolution and offset, i.e. [x, y, z] -> [z, y, x].
Use --container-reverse-array-attributes and --dataset-reverse-array-attributes for
container and dataset specific setting, respectively.
--resolution=X,Y,Z|U Specify resolution (overrides attributes of input datasets, if any).
Use --container-resolution and --dataset-resolution for container and dataset specific
resolution, respectively.
--offset=X,Y,Z|U Specify offset (overrides attributes of input datasets, if any).
Use --container-offset and --dataset-offset for container and dataset specific resolution,
respectively.
-m, --max-num-entries=N... Limit number of entries for non-scalar label types by N. If N is negative, do not limit
number of entries. If fewer values than the number of down-sampling layers are
provided, the missing values are copied from the last available entry. If none are
provided, default to -1 for all levels.
Use --container-max-num-entries and --dataset-max-num-entries for container and dataset
specific settings, respectively.
--label-block-lookup-n5-block-size=N
Set the block size for the N5 container for the label-block-lookup.
Use --container-label-block-lookup-n5-block-size and
--dataset-label-block-lookup-n5-block-size for container and dataset specific settings,
respectively.
--winner-takes-all-downsampling
Use scalar label type with winner-takes-all downsampling.
Use --container-winner-takes-all-downsampling and --dataset-winner-takes-all-downsampling
for container and dataset specific settings, respectively.
--overwrite-existing
--output-container=OUTPUT_CONTAINER
--spark-master=<sparkMaster>
--container=CONTAINER
-d, --dataset=DATASET
--target-dataset=TARGET_DATASET
--type=TYPE
--help
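The fill rule described for --max-num-entries in the help text above can be sketched in Python as follows. The function name fill_max_num_entries is hypothetical. Dropping surplus values when more are given than there are levels is an assumption of this sketch, not documented behavior.

```python
from typing import List, Sequence


def fill_max_num_entries(values: Sequence[int], num_levels: int) -> List[int]:
    """Expand -m values to one entry per down-sampling level (hypothetical helper)."""
    if num_levels <= 0:
        return []
    if not values:
        # No values provided: default to -1 (no limit) for all levels.
        return [-1] * num_levels
    # Assumption: surplus values beyond num_levels are ignored.
    filled = list(values[:num_levels])
    # Missing values are copied from the last available entry.
    filled += [filled[-1]] * (num_levels - len(filled))
    return filled


print(fill_max_num_entries([], 3))
print(fill_max_num_entries([5, 3], 4))
```

With two values and four levels, the last provided value is repeated for the remaining levels.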
Janelia cluster
Clone the repository with submodules:
git clone --recursive https://github.com/saalfeldlab/paintera-conversion-helper.git
If you have already cloned the repository without submodules, run this to fetch them:
git submodule update --init --recursive
Then, run the following script to build the package:
./build-for-cluster.py
For submitting a job to the Janelia cluster you can use the following script:
startup-scripts/spark-janelia/convert.py <number of cluster nodes> <other parameters>
The first parameter is the number of cluster nodes to use (for example, 5), and the remaining parameters are the same as for the paintera-convert command.