


scs4onnx (Simple Constant value Shrink for ONNX) is a very simple tool that reduces the overall size of an ONNX model by aggregating duplicate constant values as much as possible.



<p align="center"> <img src="https://user-images.githubusercontent.com/33194443/170154820-6189b931-a8d9-4680-a880-400bfb61b73b.png" /> </p>

Key concept
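
The idea of aggregating duplicate constant values can be illustrated with a minimal sketch. This is not the tool's actual implementation: it assumes the constants are available as a plain dictionary of NumPy arrays, and the function name `share_duplicate_constants` and the sample constant names are hypothetical.

```python
import hashlib

import numpy as np


def share_duplicate_constants(constants):
    """Map each constant name to one canonical name per distinct value.

    Two constants count as duplicates only if dtype, shape and raw bytes match.
    """
    canonical = {}  # fingerprint -> first name seen with that value
    remap = {}      # constant name -> name of the constant to share
    for name, value in constants.items():
        arr = np.asarray(value)
        digest = hashlib.sha256(arr.tobytes()).hexdigest()
        fingerprint = (arr.dtype.str, arr.shape, digest)
        remap[name] = canonical.setdefault(fingerprint, name)
    return remap


# Hypothetical constants: two identical masks and one unrelated tensor.
constants = {
    "mask_a": np.zeros((2, 3), dtype=np.float32),
    "mask_b": np.zeros((2, 3), dtype=np.float32),
    "scale": np.full((2, 3), 0.5, dtype=np.float32),
}
remap = share_duplicate_constants(constants)
unique_names = sorted(set(remap.values()))
```

Here `mask_b` is remapped to `mask_a`, so only two of the three tensors would need to be stored.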

1. Setup

1-1. HostPC

### option
$ echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc \
&& source ~/.bashrc

### run
$ pip install -U onnx \
&& python3 -m pip install -U onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com \
&& pip install -U scs4onnx

1-2. Docker


2. CLI Usage

$ scs4onnx -h

  scs4onnx [-h]
  [-m {shrink,npy}]
  input_onnx_file_path output_onnx_file_path

positional arguments:
  input_onnx_file_path
    Input onnx file path.

  output_onnx_file_path
    Output onnx file path.

optional arguments:
  -h, --help
    show this help message and exit

  -m {shrink,npy}, --mode {shrink,npy}
    Constant Value Compression Mode.
    shrink: Share constant values inside the model as much as possible.
            The model size is slightly larger because
            some shared constant values remain inside the model,
            but performance is maximized.
    npy:    Outputs constant values used repeatedly in the model to an
            external file .npy. Instead of the smallest model body size,
            the file loading overhead is greater.
    Default: shrink

  --forced_extraction_op_names
    Extracts the constant value of the specified OP name to .npy
    regardless of the mode specified.
    Cannot be used with --forced_extraction_constant_names at the same time.
    e.g. --forced_extraction_op_names aaa bbb ccc

  --forced_extraction_constant_names
    Extracts the constant value of the specified Constant name to .npy
    regardless of the mode specified.
    Cannot be used with --forced_extraction_op_names at the same time.
    e.g. --forced_extraction_constant_names aaa bbb ccc

  -d, --disable_auto_downcast
    Disables automatic downcast processing from Float64 to Float32 and INT64
    to INT32. Try enabling it and re-running it if you encounter type-related
    errors.

  -n, --non_verbose
    Do not show all information logs. Only error logs are displayed.
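
When the CLI is driven from another script, the options above can be assembled programmatically. The following is a minimal sketch that uses only the flags listed in the help text; the helper name `build_scs4onnx_command` and its keyword names are hypothetical, and the check for the two mutually exclusive options is done by the helper itself, independently of whatever the tool does internally.

```python
def build_scs4onnx_command(
    input_path,
    output_path,
    mode="shrink",
    op_names=None,
    constant_names=None,
    disable_auto_downcast=False,
    non_verbose=False,
):
    """Return the scs4onnx command line as a list of arguments."""
    if mode not in ("shrink", "npy"):
        raise ValueError("mode must be 'shrink' or 'npy'")
    # The help text states that the two forced-extraction options
    # cannot be used at the same time.
    if op_names and constant_names:
        raise ValueError("specify OP names or Constant names, not both")

    cmd = ["scs4onnx", input_path, output_path, "--mode", mode]
    if op_names:
        cmd += ["--forced_extraction_op_names", *op_names]
    if constant_names:
        cmd += ["--forced_extraction_constant_names", *constant_names]
    if disable_auto_downcast:
        cmd.append("--disable_auto_downcast")
    if non_verbose:
        cmd.append("--non_verbose")
    return cmd


command = build_scs4onnx_command(
    "input.onnx", "output.onnx", mode="npy", op_names=["aaa", "bbb"]
)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where scs4onnx is installed.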

3. In-script Usage

$ python
>>> from scs4onnx import shrinking
>>> help(shrinking)

Help on function shrinking in module scs4onnx.onnx_shrink_constant:

shrinking(
  input_onnx_file_path: Union[str, NoneType] = '',
  output_onnx_file_path: Union[str, NoneType] = '',
  onnx_graph: Union[onnx.onnx_ml_pb2.ModelProto, NoneType] = None,
  mode: Union[str, NoneType] = 'shrink',
  forced_extraction_op_names: List[str] = [],
  forced_extraction_constant_names: List[str] = [],
  disable_auto_downcast: Union[bool, NoneType] = False,
  non_verbose: Union[bool, NoneType] = False
) -> Tuple[onnx.onnx_ml_pb2.ModelProto, str]

    Parameters
    ----------
    input_onnx_file_path: Optional[str]
        Input onnx file path.
        Either input_onnx_file_path or onnx_graph must be specified.

    output_onnx_file_path: Optional[str]
        Output onnx file path.
        If output_onnx_file_path is not specified, no .onnx file is output.

    onnx_graph: Optional[onnx.ModelProto]
        Either input_onnx_file_path or onnx_graph must be specified.
        onnx_graph If specified, ignore input_onnx_file_path and process onnx_graph.

    mode: Optional[str]
        Constant Value Compression Mode.
        'shrink': Share constant values inside the model as much as possible.
            The model size is slightly larger because some shared constant values remain
            inside the model, but performance is maximized.
        'npy': Outputs constant values used repeatedly in the model to an external file .npy.
            Instead of the smallest model body size, the file loading overhead is greater.
        Default: shrink

    forced_extraction_op_names: List[str]
        Extracts the constant value of the specified OP name to .npy
        regardless of the mode specified.
        Cannot be used with --forced_extraction_constant_names at the same time.
        e.g. ['aaa','bbb','ccc']

    forced_extraction_constant_names: List[str]
        Extracts the constant value of the specified Constant name to .npy
        regardless of the mode specified.
        Cannot be used with --forced_extraction_op_names at the same time.
        e.g. ['aaa','bbb','ccc']

    disable_auto_downcast: Optional[bool]
        Disables automatic downcast processing from Float64 to Float32 and INT64 to INT32.
        Try enabling it and re-running it if you encounter type-related errors.
        Default: False

    non_verbose: Optional[bool]
        Do not show all information logs. Only error logs are displayed.
        Default: False

    Returns
    -------
    shrunken_graph: onnx.ModelProto
        Shrunken onnx ModelProto

    npy_file_paths: List[str]
        List of paths to externally output .npy files.
        An empty list is always returned when in 'shrink' mode.
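
The automatic downcast described for `disable_auto_downcast` can be pictured with a small NumPy sketch. It is an illustration only: the helper name `downcast` is hypothetical, and keeping INT64 values that do not fit into INT32 unchanged is an assumption made here, since the help text does not say how the tool treats them.

```python
import numpy as np


def downcast(arr):
    """Downcast float64 to float32 and int64 to int32 where it is safe."""
    if arr.dtype == np.float64:
        return arr.astype(np.float32)
    if arr.dtype == np.int64:
        limits = np.iinfo(np.int32)
        # Assumption: values outside the int32 range are left as int64.
        if arr.size == 0 or (arr.min() >= limits.min and arr.max() <= limits.max):
            return arr.astype(np.int32)
    return arr


weights = downcast(np.array([1.5, 2.5], dtype=np.float64))
indices = downcast(np.array([0, 1, 2], dtype=np.int64))
too_large = downcast(np.array([2**40], dtype=np.int64))
```

Halving the element width is one reason a downcast reduces model size; passing `disable_auto_downcast=True` skips this step if it causes type-related errors.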

3. CLI Execution

$ scs4onnx input.onnx output.onnx --mode shrink
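
To check how much a run saved, the sizes of the input and output files can be compared. The sketch below is a generic helper, not part of scs4onnx; the name `size_reduction` is hypothetical, and two temporary files stand in for `input.onnx` and `output.onnx`.

```python
import os
import tempfile


def size_reduction(original_path, shrunken_path):
    """Return (bytes saved, fraction saved) between two files."""
    before = os.path.getsize(original_path)
    after = os.path.getsize(shrunken_path)
    saved = before - after
    ratio = saved / before if before else 0.0
    return saved, ratio


# Stand-in files: 1000 bytes before, 250 bytes after.
tmp_dir = tempfile.mkdtemp()
original = os.path.join(tmp_dir, "input.onnx")
shrunken = os.path.join(tmp_dir, "output.onnx")
with open(original, "wb") as f:
    f.write(b"\0" * 1000)
with open(shrunken, "wb") as f:
    f.write(b"\0" * 250)

saved, ratio = size_reduction(original, shrunken)
```

In npy mode the externally written .npy files would also have to be counted to get the total size on disk.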


4. In-script Execution

4-1. When an onnx file is used as input

If output_onnx_file_path is not specified, no .onnx file is output.

from scs4onnx import shrinking

shrunk_graph, npy_file_paths = shrinking(
  input_onnx_file_path='input.onnx', output_onnx_file_path='output.onnx')


4-2. When entering the onnx.ModelProto

If onnx_graph is specified, input_onnx_file_path is ignored and onnx_graph is processed instead.

import onnx
from scs4onnx import shrinking

shrunk_graph, npy_file_paths = shrinking(
  onnx_graph=onnx.load('input.onnx'))

5. Sample

5-1. shrink mode sample

5-2. npy mode sample

5-3. .npy file view

$ python
>>> import numpy as np
>>> param = np.load('gmflow_sintel_480x640_shrunken_exported_1646.npy')
>>> param.shape
(8, 1200, 1200)
>>> param
array([[[   0.,    0.,    0., ...,    0.,    0.,    0.],
        [   0.,    0.,    0., ...,    0.,    0.,    0.],
        [   0.,    0.,    0., ...,    0.,    0.,    0.],
        [-100., -100., -100., ...,    0.,    0.,    0.],
        [-100., -100., -100., ...,    0.,    0.,    0.],
        [-100., -100., -100., ...,    0.,    0.,    0.]]], dtype=float32)
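
For large extracted constants it can be convenient to inspect the file without reading it fully into memory. The following sketch uses NumPy's memory-mapped loading; the file `example_constant.npy` is a hypothetical stand-in created on the spot, not a file produced by scs4onnx.

```python
import os
import tempfile

import numpy as np

# Create a small stand-in for an externally output .npy file.
tmp_dir = tempfile.mkdtemp()
npy_path = os.path.join(tmp_dir, "example_constant.npy")
np.save(npy_path, np.zeros((8, 12, 12), dtype=np.float32))

# mmap_mode="r" maps the file read-only; data is read from disk on access.
param = np.load(npy_path, mmap_mode="r")
summary = {
    "shape": param.shape,
    "dtype": str(param.dtype),
    "nbytes": param.nbytes,
    "all_zero": bool(np.all(param == 0)),
}
```

The same pattern applies to each path in `npy_file_paths` returned by `shrinking` in 'npy' mode.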

6. Sample ONNX models

  1. gmflow_sintel_480x640.onnx - Optical flow calculation - LICENSE Apache License 2.0
  2. hitnet_sf_finalpass_720x960.onnx - Stereo depth estimation - LICENSE Apache License 2.0

7. Reference

  1. https://docs.nvidia.com/deeplearning/tensorrt/onnx-graphsurgeon/docs/index.html
  2. https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon
  3. https://github.com/PINTO0309/sne4onnx
  4. https://github.com/PINTO0309/snd4onnx
  5. https://github.com/PINTO0309/snc4onnx
  6. https://github.com/PINTO0309/sog4onnx
  7. https://github.com/PINTO0309/PINTO_model_zoo

8. Issues
