Introduction

This repository contains the specification for the SOZip (Seek-Optimized Zip) profile of the ZIP file format.

What is SOZip?

A Seek-Optimized ZIP file (SOZip) is a ZIP file that contains one or several Deflate-compressed files that are organized and annotated such that a SOZip-aware reader can perform very fast random access (seek) within a compressed file.

SOZip makes it possible to access large compressed files directly from a .zip file without prior decompression. It is not a new file format, but a profile of the existing ZIP format, defined in a fully backward-compatible way. ZIP readers that are not SOZip-aware can read a SOZip-enabled file normally and ignore the extended features that support efficient seeking.
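The general idea of seekable Deflate data can be illustrated with a short Python sketch. It is not taken from the specification text and does not reproduce the SOZip on-disk layout: the chunk size, the function names `compress_with_index` and `read_chunk`, and the in-memory offset list are all assumptions made for illustration. The sketch relies on a standard zlib feature, a full flush at each chunk boundary, so that each chunk can be decompressed on its own while the stream as a whole stays readable by an ordinary Deflate reader.

```python
import zlib

CHUNK_SIZE = 32 * 1024  # illustrative chunk size (assumption, not from the spec)


def compress_with_index(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Raw-Deflate `data`, doing a full flush at every chunk boundary.

    Returns (compressed, offsets), where offsets[i] is the byte position in
    `compressed` at which chunk i starts. The offset list plays the role of
    a seek index in this sketch.
    """
    # wbits=-15 selects a raw Deflate stream (no zlib/gzip header).
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    out = bytearray()
    offsets = []
    for start in range(0, len(data), chunk_size):
        offsets.append(len(out))
        out += comp.compress(data[start:start + chunk_size])
        # Z_FULL_FLUSH byte-aligns the output and resets the compressor's
        # history, so the next chunk never refers back to earlier data.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), offsets


def read_chunk(compressed: bytes, offsets, index: int) -> bytes:
    """Decompress only chunk `index`, without touching earlier chunks."""
    start = offsets[index]
    end = offsets[index + 1] if index + 1 < len(offsets) else len(compressed)
    decomp = zlib.decompressobj(-15)  # fresh raw-Deflate decoder
    return decomp.decompress(compressed[start:end])


if __name__ == "__main__":
    data = bytes(range(256)) * 1000
    compressed, offsets = compress_with_index(data)
    # Random access: fetch chunk 3 directly.
    assert read_chunk(compressed, offsets, 3) == data[3 * CHUNK_SIZE:4 * CHUNK_SIZE]
    # A reader that ignores the index still decodes the whole stream.
    assert zlib.decompress(compressed, -15) == data
```

The last assertion mirrors the backward-compatibility property described above: a reader that knows nothing about the chunk offsets decompresses the stream from start to end as usual.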

Software implementations

See Annex A: Software implementations for more details.

Examples of SOZip files

Examples of SOZip-enabled files can be found in the sozip-examples repository.

Other ZIP-related specification

This GitHub organization also hosts the KeyValuePairs extra-field specification, which makes it possible to encode arbitrary key-value pairs of metadata associated with a file within a ZIP, for example to store the Content-Type of a file.

Benchmarking

Measured with the GDAL sozip branch on a laptop with an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz (6 cores / 12 virtual CPUs).

| Timing | Action |
| --- | --- |
| 6.1 s | Multithreaded (12 vCPUs) generation of 489 MB SOZip-enabled file from a 1.6 GB uncompressed GeoPackage file with<br>sozip nz-building-outlines.gpkg.zip nz-building-outlines.gpkg |
| 36 s | Single-threaded compression of same file to 480 MB with regular zip utility with<br>zip nz-building-outlines-regular.gpkg.zip nz-building-outlines.gpkg |

| Timing | Action |
| --- | --- |
| 1.2 s | from SOZip-compressed GeoPackage file with<br>bench_ogr_batch nz-building-outlines.gpkg.zip |
| 0.7 s | from uncompressed GeoPackage file with<br>bench_ogr_batch nz-building-outlines.gpkg |

| Timing | Action |
| --- | --- |
| 1.2 s | from SOZip-compressed GeoPackage file with<br>ogr2ogr out.gpkg nz-building-outlines.gpkg.zip -spat 1740000 5910000 1750000 5920000 |
| 1.1 s | from uncompressed GeoPackage file with<br>ogr2ogr out.gpkg nz-building-outlines.gpkg -spat 1740000 5910000 1750000 5920000 |

| Timing | Action |
| --- | --- |
| 45 ms | from SOZip-compressed GeoPackage file with<br>ogr2ogr out.gpkg nz-building-outlines.gpkg.zip -fid 1000000 |
| 44 ms | from uncompressed GeoPackage file with<br>ogr2ogr out.gpkg nz-building-outlines.gpkg -fid 1000000 |

How to contribute?

We welcome contributions to this specification as issues, pull requests or discussions.

If you use SOZip, plan to use it for your data delivery, or are considering writing a SOZip implementation, let us know!

Datasets available as SOZip

Social media

Find me on Mastodon

Credits

The SOZip specification and its GDAL implementation have been developed by Spatialys, with support from Safe Software.

<!--- # Adopters (Put here a list of organizations, in particular data producers, that have adopted SOZip) -->