format_parser

format_parser is a Ruby library for prying open video, image, document, and audio files. It includes a number of parser modules that try to recover metadata useful for post-processing and layout while reading the absolute minimum amount of data possible.

format_parser is inspired by imagesize, fastimage and dimensions, borrowing from them where appropriate.
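To illustrate what "reading the absolute minimum amount of data" can look like, here is a minimal sketch that recovers the dimensions of a PNG from its first 24 bytes. It is not format_parser's actual implementation; the method name `sketch_png_dimensions` and the fake header are assumptions made for this example. It relies only on the PNG layout: an 8-byte signature followed by the IHDR chunk, whose width and height are big-endian 32-bit integers.

```ruby
require 'stringio'

# The fixed 8-byte signature that opens every PNG file.
PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b.freeze

# Hypothetical helper (not part of format_parser).
# Returns [width, height], or nil when the header does not look like a PNG.
# Only the first 24 bytes of the stream are read.
def sketch_png_dimensions(io)
  io.seek(0)
  header = io.read(24)
  return nil if header.nil? || header.bytesize < 24
  return nil unless header.byteslice(0, 8) == PNG_SIGNATURE
  return nil unless header.byteslice(12, 4) == 'IHDR'

  # Width and height sit at byte offsets 16 and 20 as big-endian uint32 values.
  header.byteslice(16, 8).unpack('N2')
end

# A fake 24-byte PNG prefix: signature, IHDR length (13), chunk type, width, height.
fake_png = PNG_SIGNATURE + [13].pack('N') + 'IHDR' + [320, 240].pack('N2')

p sketch_png_dimensions(StringIO.new(fake_png))   #=> [320, 240]
p sketch_png_dimensions(StringIO.new('not a png')) #=> nil
```

The same idea carries over to remote files: the less of the stream a parser touches, the less has to be fetched.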

Currently supported filetypes:

...with more on the way!

Basic usage

Pass an IO object that responds to read, seek and size to FormatParser.parse and the first confirmed match will be returned.

match = FormatParser.parse(File.open("myimage.jpg", "rb"))
match.nature            #=> :image
match.format            #=> :jpg
match.display_width_px  #=> 320
match.display_height_px #=> 240
match.orientation       #=> :top_left
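Because the only requirement stated above is that the object responds to read, seek and size, any duck-typed IO should qualify, not just a File. The following sketch checks that contract before handing an object over; `missing_io_methods` is a hypothetical helper written for this example and is not part of the gem.

```ruby
require 'stringio'

# The three methods the text above says FormatParser.parse expects.
REQUIRED_IO_METHODS = %i[read seek size].freeze

# Hypothetical helper: lists the required methods that an object lacks.
def missing_io_methods(io)
  REQUIRED_IO_METHODS.reject { |name| io.respond_to?(name) }
end

p missing_io_methods(StringIO.new('data')) #=> []
p missing_io_methods('a plain string')     #=> [:read, :seek]
```

An in-memory StringIO passes the check, which makes it a convenient stand-in for a file in tests.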

You can also use parse_http, passing a URL, or parse_file_at, passing a path:

match = FormatParser.parse_http('https://upload.wikimedia.org/wikipedia/commons/b/b4/Mardin_1350660_1350692_33_images.jpg')
match.nature        #=> :image
match.format        #=> :jpg

If you would rather receive all potential results from the gem, pass results: :all:

array_of_results = FormatParser.parse(File.open("myimage.jpg", "rb"), results: :all)

You can also optimize the metadata extraction by giving the gem hints about the expected natures and formats:

FormatParser.parse(File.open("myimage", "rb"), natures: [:video, :image], formats: [:jpg, :png, :mp4], results: :all)
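One way to picture how hints and the results option interact is a small dispatcher that filters a parser registry and then either stops at the first match or collects every match. This is a sketch under stated assumptions, not the gem's internals: `SketchParser`, `SKETCH_PARSERS`, `magic_parser` and `sketch_parse` are hypothetical names, and the "parsers" only compare well-known magic bytes.

```ruby
require 'stringio'

# Hypothetical registry entry: the nature and format a parser handles, plus a callable.
SketchParser = Struct.new(:nature, :format, :callable)

# Builds a toy parser that matches when the stream starts with the given magic bytes.
def magic_parser(nature, format, magic)
  check = ->(io) { io.read(magic.bytesize) == magic ? { nature: nature, format: format } : nil }
  SketchParser.new(nature, format, check)
end

SKETCH_PARSERS = [
  magic_parser(:image, :png, "\x89PNG".b),
  magic_parser(:image, :jpg, "\xFF\xD8".b),
  magic_parser(:document, :pdf, '%PDF-')
].freeze

# Hypothetical dispatcher: hints narrow the candidates, results picks first or all.
def sketch_parse(io, natures: nil, formats: nil, results: :first)
  candidates = SKETCH_PARSERS.select do |parser|
    (natures.nil? || natures.include?(parser.nature)) &&
      (formats.nil? || formats.include?(parser.format))
  end

  # Lazy evaluation means later parsers never run once a first match is found.
  matches = candidates.lazy.filter_map do |parser|
    io.seek(0) # every parser starts from the beginning of the stream
    parser.callable.call(io)
  end

  results == :all ? matches.to_a : matches.first
end

jpeg = StringIO.new("\xFF\xD8\xFF\xE0".b)
sketch_parse(jpeg)                        # first match: a Hash describing a JPEG
sketch_parse(jpeg, results: :all)         # an Array holding every match
sketch_parse(jpeg, formats: [:png, :pdf]) # nil, because the hints exclude :jpg
```

Narrowing the candidate list up front is what makes hints worthwhile in this sketch: parsers that cannot match are never invoked, so they never read from the stream.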

The return values of all parsers have built-in JSON serialization:

img_info = FormatParser.parse(File.open("myimage.jpg", "rb"))
JSON.pretty_generate(img_info) #=> ...

To convert the result to a Hash or a structure suitable for JSON serialization, call as_json:

img_info = FormatParser.parse(File.open("myimage.jpg", "rb"))
img_info.as_json

# it's also possible to convert all keys to strings
img_info.as_json(stringify_keys: true)
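The relationship between as_json, stringify_keys and JSON generation can be sketched with a stand-in object. `SketchResult` below is hypothetical and is not a format_parser class; it only assumes the behaviour described above, namely that as_json returns a Hash and that stringify_keys turns Symbol keys into Strings.

```ruby
require 'json'

# Hypothetical stand-in for a parser result; not a format_parser class.
SketchResult = Struct.new(:nature, :format, :display_width_px, :display_height_px) do
  # Returns a plain Hash, with String keys when stringify_keys is true.
  def as_json(stringify_keys: false)
    hash = to_h
    stringify_keys ? hash.transform_keys(&:to_s) : hash
  end

  # JSON.generate and JSON.pretty_generate call to_json on objects they do not know.
  def to_json(*args)
    as_json.to_json(*args)
  end
end

result = SketchResult.new(:image, :jpg, 320, 240)

p result.as_json.keys
#=> [:nature, :format, :display_width_px, :display_height_px]
p result.as_json(stringify_keys: true).keys
#=> ["nature", "format", "display_width_px", "display_height_px"]

puts JSON.pretty_generate(result)
```

String keys are mostly useful when the Hash is compared against data that has already made a round trip through JSON, where Symbol keys no longer exist.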

Creating your own parsers

See the section on writing parsers in CONTRIBUTING.md.

Design rationale

We need to recover metadata from various file types while satisfying the following constraints:

Deliberate design choices

Therefore we adopt the following approaches:

Acknowledgements

We are incredibly grateful to Remco van't Veer for exifr and to Krists Ozols for id3tag, both of which we use for crucial tasks.

Fixture Sources

Unless specified otherwise in this section, the fixture files are MIT licensed and come from the FastImage and Dimensions projects.

AAC

AIFF

ARW

CR2

CR3

DOCX

DPX

ERF

FDX

FLAC

JPEG

JPEG (EXIF orientation)

KEY

M3U

MOV

MP3

MP4

MPEG

NEF

OGG

PDF

PNG

RW2

TIFF

WAV

WEBP

ZIP

Copyright

Copyright (c) 2020 WeTransfer.

format_parser is distributed under the conditions of the Hippocratic License.