UNWARCIT: WARC (and WACZ) Unzipping Library
Background
This library provides a command-line interface for unzipping WARC and WACZ files.
It builds on the warcio library to read and validate WARC files and on the py-wacz library to validate WACZ files.
Both libraries are provided by Webrecorder.
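For orientation, a WACZ file is a ZIP package that keeps its WARC data under an `archive/` directory. The following is a minimal sketch, using only the Python standard library, of how those WARC members could be located. It is not unwarcit's actual implementation: the function name `list_warcs_in_wacz` and the in-memory sample archive are assumptions made purely for illustration.

```python
import io
import zipfile


def list_warcs_in_wacz(wacz_file):
    """Return the names of WARC members stored under archive/ in a WACZ.

    `wacz_file` may be a path or a binary file-like object.
    This helper is hypothetical and is not part of unwarcit.
    """
    with zipfile.ZipFile(wacz_file) as zf:
        return [
            name
            for name in zf.namelist()
            if name.startswith("archive/")
            and name.endswith((".warc", ".warc.gz"))
        ]


# Build a tiny in-memory stand-in for a WACZ file (empty members, illustration only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("archive/data.warc.gz", b"")
    zf.writestr("pages/pages.jsonl", b"")
    zf.writestr("datapackage.json", b"{}")

print(list_warcs_in_wacz(buffer))
```

A real tool would go on to parse each WARC member, for example with warcio, and write the captured payloads to disk.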
Setup
Clone the repository, then install from its root directory by running: python3 setup.py install
You can now run the tool like so:
unwarcit metro_capture2.wacz data.warc --output myfolder
You can pass a single file or several files, WARC or WACZ in any combination, separated by spaces after the unwarcit command:
unwarcit warcfile1.warc warcfile2.warc waczfile.wacz
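If you want to drive the tool from a script, the invocations above can be assembled programmatically. This is a hedged sketch: `build_unwarcit_command` is a hypothetical helper, not something unwarcit ships, and it relies only on the positional file arguments and the `--output` option shown in this README.

```python
import shlex


def build_unwarcit_command(files, output=None):
    """Build an argv list for the unwarcit CLI (hypothetical helper).

    `files` is a non-empty sequence of WARC/WACZ paths; `output` is an
    optional folder passed through the --output option.
    """
    if not files:
        raise ValueError("at least one WARC or WACZ file is required")
    cmd = ["unwarcit", *(str(f) for f in files)]
    if output is not None:
        cmd += ["--output", str(output)]
    return cmd


cmd = build_unwarcit_command(["metro_capture2.wacz", "data.warc"], output="myfolder")
print(shlex.join(cmd))
# To actually run it (requires unwarcit to be installed and on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when file names contain spaces.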
Configuration Options
<details>
<summary><b>Unwarcit currently accepts the following parameters:</b></summary>

--help      Show help [str]
--version   Show version number [int]
--output    The folder to output the results to [str]

</details>