Incredibly Flexible Data Storage (IFDS) File Format

The Incredibly Flexible Data Storage (IFDS) file format enables the rapid creation of highly scalable and flexible custom file formats. With IFDS, you can create your own customized binary file format with ease.

Implementations of the IFDS file format specification internally handle all of the difficult bookkeeping bits that occur when inventing a brand new file format, which lets software developers focus on more important things like high-level design and application development. See the use cases and examples below, which use the PHP reference implementation (MIT or LGPL, your choice) of the specification, to understand what is possible with IFDS!

Features

Performance

On an Intel Core i7 6th generation CPU, the following measurements were taken using the IFDS PHP reference implementation with the default PFC in-memory layer (can be replicated via the test suite's -perftests option):

The IFDS PHP reference implementation is actually fairly inefficient, both because it is implemented in PHP userland and because a number of operations are intentionally not fully optimized so that the code is easier to read and understand. PHP itself doesn't perform all that well when modifying large binary data blobs as it is missing multiple inline string modifier functions.

For an apples to oranges comparison, SQLite via PHP PDO on the same hardware can insert approximately 138,000 rows/sec (using transactions and commits) containing the same data into an in-memory SQLite database. SQLite is approximately 2 times faster than the PHP IFDS reference implementation for bulk insertions. However, if you need a database, then you should probably use an existing database.

Use Cases

Here is a short list of ideas for using IFDS:

The possibilities are endless.

Limitations

Use Case: JPEG-PNG-SVG

This use case is a bit contrived but demonstrates combining three different existing file formats into a single, unified file without having to make major changes to an existing file format.

The JPEG file format:

The PNG file format:

The SVG file format:

There is no single image file format that works for all images. This is especially true for images that combine photos/gradients and line art.

So let's make a combined image file format using IFDS and just four objects to store up to three images:

Using the PHP IFDS reference implementation:

<?php
	require_once "support/paging_file_cache.php";
	require_once "support/ifds.php";
	require_once "test_suite/cli.php";

	// Delete any previous file.
	@unlink("a.jps");

	// Create the file.
	$pfc = new PagingFileCache();
	$pfc->Open("a.jps");

	$ifds = new IFDS();
	$result = $ifds->Create($pfc, 1, 0, 0, "JPEG-PNG-SVG");
	if (!$result["success"])  CLI::DisplayError($result);

	// Create 'jpg' object.
	$result = $ifds->CreateRawData("jpg");
	if (!$result["success"])  CLI::DisplayError($result);

	$jpgobj = $result["obj"];

	$data = file_get_contents("a.jpg");

	$result = $ifds->WriteData($jpgobj, $data);
	if (!$result["success"])  CLI::DisplayError($result);

	// Create 'png' object.
	$result = $ifds->CreateRawData("png");
	if (!$result["success"])  CLI::DisplayError($result);

	$pngobj = $result["obj"];

	$data = file_get_contents("a.png");

	$result = $ifds->WriteData($pngobj, $data);
	if (!$result["success"])  CLI::DisplayError($result);

	// Create 'svg' object.
	$result = $ifds->CreateRawData("svg");
	if (!$result["success"])  CLI::DisplayError($result);

	$svgobj = $result["obj"];

	$data = file_get_contents("a.svg");

	$result = $ifds->WriteData($svgobj, $data);
	if (!$result["success"])  CLI::DisplayError($result);

	// Create 'metadata' object.
	$result = $ifds->CreateKeyValueMap("metadata");
	if (!$result["success"])  CLI::DisplayError($result);

	$metadataobj = $result["obj"];

	$metadata = array(
		"width" => "500",
		"height" => "250",
		"desc" => "An amazing, multi-layered image with crisp text and line art!"
	);

	$result = $ifds->SetKeyValueMap($metadataobj, $metadata);
	if (!$result["success"])  CLI::DisplayError($result);

	// Generally a good idea to explicitly call Close() so that all objects get flushed to disk.
	$ifds->Close();
?>

The idea here is that the JPEG image stores the photo portion, the PNG image stores the line art, and the SVG stores a vector-scalable version of the line art. To display the image, the JPEG would be read and the PNG or SVG would then be layered on top. The overall file size is only slightly larger than the combined size of the JPEG + PNG + SVG and the result would be a cleaner, crisper image when displayed on high density devices.
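
A viewer for this format would have to decide which overlay to draw on top of the JPEG. The following is only a rough sketch of one possible policy in plain PHP. It does not touch the IFDS API. ChooseLayers() is a hypothetical helper, and the rule of preferring the SVG when the device pixel ratio is above 1.0 is an assumption, not something the format requires.

```php
<?php
	// Hypothetical helper.  $available lists the object names found in the file
	// ('jpg', 'png', 'svg').  Returns the layers to draw, bottom layer first.
	function ChooseLayers(array $available, float $pixelratio): array
	{
		$layers = array();

		// The photo layer goes on the bottom when it exists.
		if (in_array("jpg", $available))  $layers[] = "jpg";

		// Assumed policy:  Prefer the vector overlay on high density displays and the raster overlay otherwise.
		$order = ($pixelratio > 1.0 ? array("svg", "png") : array("png", "svg"));
		foreach ($order as $name)
		{
			if (in_array($name, $available))
			{
				$layers[] = $name;

				break;
			}
		}

		return $layers;
	}

	$highdensity = ChooseLayers(array("jpg", "png", "svg"), 2.0);
	$nosvg = ChooseLayers(array("jpg", "png"), 2.0);
	$standard = ChooseLayers(array("jpg", "png", "svg"), 1.0);
```

Since the file stores "up to" three images, the helper falls back to whichever overlay is present rather than assuming all three objects exist.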

Unfortunately, no existing software out there will currently read this brand new file format and display the contents (e.g. your favorite web browser or image editor won't read this file). It could be argued that the PNG format could handle this internally OR a simpler, combined format could be created. However, this is only meant as a very rudimentary example that barely scratches the surface of what can be accomplished with IFDS.

Use Case: Redesign Text Files

Today's text files and text editors are stuck in the 1970s, when storage, RAM, and CPU cycles were extremely limited and every single bit and byte actually mattered. Those extreme limitations generally no longer apply. However, "modern" text editors still act like they do. Editing files that are just a few MB in size noticeably slows down most text editors, and the slowdown gets much worse with files in the 50MB+ range. What if someone needs to edit a multi-TB text file...over a network?

There are several major problems with current text files and text editors:

IFDS could be used to solve all of those problems.

IFDS TEXT file format

Root text object (Fixed array, points at Super text chunk objects that can store millions to billions of lines each)
  Each array entry:
    4 byte num lines
    4 byte Super chunk object ID

Super text chunk object (Fixed array, max 65536 entries before splitting the Super chunk)
  Each array entry:
    2 byte num lines
    4 byte Chunk object ID (generally limited to 1 DATA chunk)

Chunk object (Raw data)
  Encoding options:  1 = Raw data, 16 = Deflate compression

Metadata
  newline = Sequence of bytes that denote a line ending (e.g. \r\n, \x00)
  charset = String containing the character set encoding used (e.g. utf-8)
  mimetype = String containing the primary MIME type of the file data (e.g. text/plain, text/html, application/json, text/x-cpp)
  language = String containing the IANA language code used in the file contents (e.g. en-us)
  author = String containing the author of the file (e.g. Bob's Document Farm)
  signature = Object ID of digital signature object (design and implementation is left as an exercise)

With this general structure, the average text editor could handle editing files up to 280TB in size before any notable problems would arise. That is roughly a 5.6 million-fold improvement over today's average text editor, which begins to have notable problems at around 50MB.
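
The two-level index is what keeps seeking cheap: finding a line only requires walking the line counts in the Root text object and then in a single Super text chunk. Here is a minimal sketch of that lookup in plain PHP. It does not use the IFDS_Text class. FindEntryForLine(), PackRootEntry(), and PackSuperEntry() are hypothetical names, big-endian byte order for the packed entries is an assumption, and the sample line counts and object IDs are made up. The last line checks the 5.6 million figure using decimal units.

```php
<?php
	// Hypothetical helpers.  Byte order is assumed to be big-endian.
	// Root entry:  4 byte num lines + 4 byte Super chunk object ID.
	function PackRootEntry(int $lines, int $id): string
	{
		return pack("NN", $lines, $id);
	}

	// Super chunk entry:  2 byte num lines + 4 byte Chunk object ID.
	function PackSuperEntry(int $lines, int $id): string
	{
		return pack("nN", $lines, $id);
	}

	// Finds the entry containing a zero-based line number.
	// Returns the entry position, its object ID, and the line offset within it, or false.
	function FindEntryForLine(array $entries, int $line)
	{
		$first = 0;
		foreach ($entries as $pos => $entry)
		{
			if ($line < $first + $entry["lines"])  return array("pos" => $pos, "id" => $entry["id"], "offset" => $line - $first);

			$first += $entry["lines"];
		}

		return false;
	}

	// Made-up Root text object with two Super text chunks.
	$root = array(
		array("lines" => 100000, "id" => 7),
		array("lines" => 50000, "id" => 9)
	);

	// Made-up Super text chunk (object ID 9) with three chunks.
	$super9 = array(
		array("lines" => 20000, "id" => 21),
		array("lines" => 20000, "id" => 22),
		array("lines" => 10000, "id" => 23)
	);

	// Locate zero-based line 125,000:  Root entry first, then the chunk inside it.
	$top = FindEntryForLine($root, 125000);
	$chunk = FindEntryForLine($super9, $top["offset"]);

	// 280TB divided by 50MB in decimal units.
	$ratio = intdiv(280 * 1000 ** 4, 50 * 1000 ** 2);
```

Only the two small index arrays are scanned. The chunk data itself is not read until the target chunk is known, which is why the file size has little impact on seek cost.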

Example implementation usage:

<?php
	require_once "support/paging_file_cache.php";
	require_once "support/ifds.php";
	require_once "support_extra/ifds_text.php";
	require_once "test_suite/cli.php";

	// Delete any previous file.
	@unlink("ifds.iphp");

	// Create the file.
	$pfc = new PagingFileCache();
	$pfc->Open("ifds.iphp");

	$ifdstext = new IFDS_Text();
	$result = $ifdstext->Create($pfc, array("compress" => true, "trail" => false, "mimetype" => "application/x-php"));
	if (!$result["success"])  CLI::DisplayError($result);

	// Write the data.
	$data = file_get_contents("support/ifds.php");

	$result = $ifdstext->WriteLines($data, 0, 0);
	if (!$result["success"])  CLI::DisplayError($result);

	// Generally a good idea to explicitly call Close() so that all objects get flushed to disk.
	$ifdstext->Close();
?>

Now every text editor, every CLI tool (grep, sed, git, etc.), and every library just needs to be updated to support the IFDS TEXT file format. Not difficult and most definitely won't cause anyone to get upset at all.

Again, this simple example barely scratches the surface of IFDS.

Use Case: Replace Configuration Files

Configuration files (e.g. INI, conf) are special, common cases of text files. They have a myriad of problems:

IFDS could be used to solve all of those problems.

IFDS CONF file format

Sections (Key-ID map)
  The keys are section names.  The object IDs link to Section objects.

Section object (Key-Value map)
  The keys are option names.
  The empty string key's value is a string that specifies the Context for the section.
  For other keys, the values are as follows:
    1 byte Status/Type (Bit 7:  0 = Use default value(s)/Disabled section, 1 = Use option value(s); Bit 6:  Multiple values; Bits 0-6:  Option Type)
    Remaining bytes are the option's value(s) (Big-endian storage; When using multiple values, string/binary data/section name/unknown is preceded by 4 byte size, other types preceded by 1 byte size)

Metadata
  app = String containing the application this configuration is intended to be used with (e.g. Frank's App)
  ver = String containing the version of the application this configuration is intended to be used with (e.g. 1.0.3)
  charset = String containing the character set encoding used for Strings (e.g. utf-8)
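
The Status/Type byte and the size-prefixed value list of a Section object can be sketched in plain PHP as follows. This is an illustration of the layout as described, not the IFDS_Conf implementation. The constant and function names are hypothetical, and treating bits 0-5 as the type number with bit 6 as the multiple values flag is one reading of the description.

```php
<?php
	// Hypothetical constants for illustration only.
	define("CONF_USE_VALUE", 0x80);   // Bit 7:  Use option value(s).
	define("CONF_MULTIPLE", 0x40);    // Bit 6:  Multiple values.
	define("CONF_TYPE_MASK", 0x3F);   // Bits 0-5:  Option type number (assumed split).

	function EncodeStatusType(bool $use, bool $multiple, int $type): string
	{
		return chr(($use ? CONF_USE_VALUE : 0) | ($multiple ? CONF_MULTIPLE : 0) | ($type & CONF_TYPE_MASK));
	}

	function DecodeStatusType(string $data): array
	{
		$byte = ord($data[0]);

		return array(
			"use" => (bool)($byte & CONF_USE_VALUE),
			"multiple" => (bool)($byte & CONF_MULTIPLE),
			"type" => $byte & CONF_TYPE_MASK
		);
	}

	// Multiple string values:  Each value is preceded by a 4 byte big-endian size.
	function EncodeStringValues(array $vals): string
	{
		$result = "";
		foreach ($vals as $val)  $result .= pack("N", strlen($val)) . $val;

		return $result;
	}

	// An enabled String option (type 4) with two values.
	$data = EncodeStatusType(true, true, 4) . EncodeStringValues(array("a", "bc"));
	$info = DecodeStatusType($data);
```

Clearing bit 7 while leaving the stored values in place is what lets a section fall back to the application defaults without losing what the user previously entered.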

IFDS CONF-DEF file format

Contexts (Key-ID map)
  A mapping of all Context objects.  All keys are strings.

Context object (Key-ID map)
  A mapping of all options for the context.
  The empty string maps to a Doc object (documentation object) for this context.
  All other keys map to Option objects.

Options list (Linked list)

Option object (Linked list node, Raw data)
  1 byte Option Info (Bit 0 = Deprecated)
  1 byte Option Type (0 = Boolean, 1 = Integer, 2 = IEEE Float, 3 = IEEE Double, 4 = String, 5 = Binary data, 6 = Section names; Bit 6 = Multiple values; Bit 7 is reserved)
  4 byte Doc object ID (0 = No documentation)
  4 byte Option Values object ID (0 = Freeform)
  2 byte size of MIME type string + MIME type string for the data that immediately follows (e.g. 'int/bytes', 'int/bits', 'application/json', 'image/jpeg'), allows a configuration tool to adapt what users see/enter
  Remaining bytes are the application's default value(s) for the option (Big-endian storage; When using multiple values, string/binary data/section name/unknown is preceded by 4 byte size, other types preceded by 1 byte size)

Option Values object (Key-ID map)
  The keys are the raw internal allowed values.  The object IDs link to Doc objects.  For Integer types, keys may be an allowed range of values and are stored as a string ("1-4").

Docs list (Linked list)

Doc object (Linked list node, Key-Value map)
  The keys are lowercase IANA language keys (e.g. 'en-us').  Each value is the string containing the documentation to display in that language.
  If the string is a URI/URL, the configuration tool can display the URI/URL as-is and/or grab the content from that URI/URL and display it.

Metadata
  app = String containing the application this configuration definition is intended to be used with (e.g. Frank's App)
  ver = String containing the version of the application this configuration definition is intended to be used with (e.g. 1.0.3)
  charset = String containing the character set encoding used for Strings (e.g. utf-8)
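
To make the Option object layout more concrete, here is a rough sketch that packs and unpacks the raw data fields in plain PHP. It is not the IFDS_ConfDef class. PackOption() and UnpackOption() are hypothetical names, big-endian byte order for the header fields is an assumption carried over from the "Big-endian storage" note, and the 'text/plain' MIME type and 'localhost' default are made-up sample values.

```php
<?php
	// Hypothetical helper.  Header layout (12 bytes, assumed big-endian):
	//   1 byte Option Info, 1 byte Option Type, 4 byte Doc object ID,
	//   4 byte Option Values object ID, 2 byte MIME type size.
	function PackOption(bool $deprecated, int $type, bool $multiple, int $docid, int $valuesid, string $mimetype, string $defaults): string
	{
		$info = ($deprecated ? 0x01 : 0x00);
		$typebyte = ($type & 0x3F) | ($multiple ? 0x40 : 0x00);

		return pack("CCNNn", $info, $typebyte, $docid, $valuesid, strlen($mimetype)) . $mimetype . $defaults;
	}

	function UnpackOption(string $data): array
	{
		$header = unpack("Cinfo/Ctype/Ndocid/Nvaluesid/nmimesize", $data);

		return array(
			"deprecated" => (bool)($header["info"] & 0x01),
			"type" => $header["type"] & 0x3F,
			"multiple" => (bool)($header["type"] & 0x40),
			"docid" => $header["docid"],
			"valuesid" => $header["valuesid"],
			"mimetype" => substr($data, 12, $header["mimesize"]),
			"defaults" => substr($data, 12 + $header["mimesize"])
		);
	}

	// A single-value String option (type 4) with no documentation and freeform values.
	$packed = PackOption(false, 4, false, 0, 0, "text/plain", "localhost");
	$option = UnpackOption($packed);
```

Because the MIME type is size-prefixed, a configuration tool can skip straight to the default value(s) without knowing anything about the MIME type itself.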

Example implementation usage:

<?php
	require_once "support/paging_file_cache.php";
	require_once "support/ifds.php";
	require_once "support_extra/ifds_conf.php";
	require_once "test_suite/cli.php";

	// Delete any previous file.
	@unlink("ifds.iini");

	// Create the file.
	$pfc = new PagingFileCache();
	$pfc->Open("ifds.iini");

	$ifdsconf = new IFDS_Conf();
	$result = $ifdsconf->Create($pfc, array("app" => "PHP", "ver" => phpversion()));
	if (!$result["success"])  CLI::DisplayError($result);

	// Create a new configuration section.
	$result = $ifdsconf->CreateSection("PHP", "ini");
	if (!$result["success"])  CLI::DisplayError($result);

	$iniobj = $result["obj"];
	$iniopts = $result["options"];

	// Copy all PHP variables to the section.
	$phpini = ini_get_all();
	foreach ($phpini as $key => $info)
	{
		if (isset($info["global_value"]))  $iniopts[$key] = array("use" => true, "type" => IFDS_Conf::OPTION_TYPE_STRING, "vals" => array($info["global_value"]));
	}

	$result = $ifdsconf->UpdateSection($iniobj, $iniopts);
	if (!$result["success"])  CLI::DisplayError($result);

	// Generally a good idea to explicitly call Close() so that all objects get flushed to disk.
	$ifdsconf->Close();
?>

See the test suite for example usage of the IFDS_ConfDef class, which implements the IFDS CONF-DEF format for defining a configuration file that a generic tool could use to modify an IFDS CONF file.

Now every OS just needs to be updated to support the IFDS CONF and IFDS CONF-DEF file formats with both CLI and GUI tools to make it easy and painless to manage application configurations. Not difficult and most definitely won't cause anyone to get upset at all.

Also, once again, this simple example barely scratches the surface of IFDS.

Documentation