Home

Awesome

Roadroller: Flattens Your JavaScript Demo

Roadroller is a heavyweight JavaScript packer for large demos. It was originally designed for js13kGames, but it remains usable for demos as small as 4KB. Depending on the input it can provide up to 15% additional compression compared to best ZIP/gzip recompressors. Try it online!

Roadroller is considered "heavyweight" unlike typical JS packers such as JSCrush or RegPack, because it is quite resource intensive and requires both a considerable amount of memory and a non-negligible run time. The default should work for most devices, but you can configure both aspects as you need.

Quick Start

In addition to the online demo, Roadroller is available as an NPM package:

$ npx roadroller input.js -o output.js

You can also use Roadroller as a library to integrate with your build pipeline.

import { Packer } from 'roadroller';

const inputs = [
    {
        data: 'console.log("Hello, world!");',
        type: 'js',
        action: 'eval',
    },
];

const options = {
    // see the Usage for available options.
};

const packer = new Packer(inputs, options);
await packer.optimize(); // takes less than 10 seconds by default

const { firstLine, secondLine } = packer.makeDecoder();
console.log(firstLine + secondLine);

Roadroller as a library or a CLI command requires Node.js 14 or later. Node.js 16 is strongly recommended because Roadroller is substantially faster in 16 than in 14.

Usage

By default Roadroller receives your JS code and returns a compressed JS code that should be further compressed with ZIP, gzip or PNG bootstrap (or more accurately, DEFLATE). Ideally your JS code should be already minified, probably using Terser or Closure Compiler; Roadroller only does a minimal whitespace and comment suppression.

The resulting code will look like this: (the newline is mostly for the explanation and can be removed)

eval(Function("[M='Zos~ZyF_sTdvfgJ^bIq_wJWLGSIz}Chb?rMch}...'"
,...']charCodeAtUinyxp',"for(;e<12345;c[e++]=p-128)/* omitted */;return o")([],[],12345678,/* omitted */))

The first line is a compressed data. It can contain control characters like  (U+001C) that might not render in certain environments. Nevertheless you should make sure that they are all copied in verbatim.

The second line is a compressor tuned for this particular input. By default the decompressed data immediately goes through eval, but you can configure what to do with that.

The first line is very incompressible unlike the second line, so ideally you should compress two lines separately. This is best done by using ADVZIP from AdvanceCOMP or ECT. The first line and second line may form a single statement as above so they should not be separated; you can only put whitespace between them.

<!-- ### Multiple Inputs You can also give multiple inputs to Roadroller; for example you can put shaders and map data. The executed code will receive all decompressed inputs. This is more efficient than putting them into strings because each input can be separately modelled. -->

Input Configuration

Each input can be further configured by input type and action. In the CLI you put corresponding options before the file path.

Input type (CLI -t|--type TYPE, API type in the input object) determines the preprocessing step to improve the compression. <!--Dropping a file to the input window also tries to detect the correct input type.-->

<!--* **GLSL** (`glsl`) assumes a valid GLSL code. Automatically removes all redundant whitespace and comments.--> <!--* **HTML** (`html`) assumes a valid HTML. (TODO)--> <!--* **Binary** (`binary`) does nothing. You can choose base64 (`binary:base64`) or hex (`binary:hex`) for the input encoding.-->

Input action (CLI -a|--action ACTION, API action in the input object) determines what to do with the decompressed data. <!--All action except for the evaluate produces a value to the variable named `_` by default, which is either a value itself for a single input and an array of values for multiple inputs.-->

<!--* *(JS, text only)* **JSON decode** (`json`) parses and returns a JSON value with `JSON.parse`.--> <!--* *(No binary)* **String** (`string`) returns a string.--> <!--* **Array** (`array`) returns an array of bytes.--> <!--* **Typed array** (`u8array`) returns a `Uint8Array` value.--> <!--* **Base64** (`base64`) returns a base64-encoded string. Handy for data URIs.--> <!-- **Input name** (CLI `-n|--name NAME`, API `name`) is required for accessing each input from the decompressed code. This is required if the input produces an output value. **Extract inputs** (CLI `-x|--extract`, not available in API) can be used for the JavaScript input with the evaluate action. This will try to extract long embedded strings and determine the best type for each input. This assumes that the compressed code doesn't make use of the output variable elsewhere; you can change the variable name from the configuration. -->

Output Configuration

Number of contexts (CLI -S|--selectors xCOUNT) relates to the complexity of modelling. The larger number of contexts will compress better, but at the expense of linear increase in both the time and memory usage. The default is 12, which targets at most 1 second of latency permitted for typical 30 KB input.

Maximum memory usage (CLI -M|--max-memory MEGABYTES, API maxMemoryMB in the options object) configures the maximum memory to be used for decompression. Increasing or decreasing memory usage mostly affects the compression ratio and not the run time. The default is 150 MB and a larger value is not recommended for various reasons:

The actual memory usage can be as low as a half of the specified due to the internal architecture; -v will print the actual memory usage to stderr.

Allowing the decoder to pollute the global scope (CLI -D|--dirty, API allowFreeVars in the options object) is unsafe especially when the Roadroller output should coexist with other code or there are elements with single letter id attributes and turned off by default. But if you can control your environment (typical for demos), you can turn this on for a smaller decoder.

Optimize parameters (CLI -O|--optimize LEVEL, API Packer.optimize) searches for better modelling parameters. If parameters are already given the optimizer will try to improve upon that, and the optimizer prints best parameters at the end which can be reused for faster iteration. Parameters are solely related to the compression ratio so you can try this as many as you can afford. Each level does the following:

Advanced Configuration

Number of context bits (CLI -Zco|--context-bits BITS, API contextBits in the options object) sets the size of individual model as opposed to the total memory use (-M), which is a product of the number of context and the size of each model. This explicit option is most useful for the fair benchmarking, since some parameters like -Zpr or -Zmc affect the memory use and therefore this parameter.

<!--**Optimize for uncompressed size** (CLI `--uncompressed-only`) assumes the absence of the outer comperssion algorithm like DEFLATE. This is *bad* for the compression since the compressor has to work strictly within the limits of JS source code including escape sequences. This should be the last resort where you can't even use the PNG-based self extraction and everything has to be in a single file.-->

Following parameters can be automatically optimized and normally you don't have to touch them unless you want to reproduce a particular set of parameters. As such, the default optimization (-O1) is disabled if any of these arguments are given in the CLI.

Chosen contexts (CLI -S|--selectors SELECTOR,SELECTOR,..., API sparseSelectors in the options object) determine which byte contexts are used for each model. <i>K</i>th bit of the number (where K > 0) is set if the context contains the <i>K</i>th-to-last byte: 5 = 101<sub>(2)</sub> for example would correspond to the context of the last byte and third-to-last byte, also called a sparse context (0,2). There is no particular limit for the number, but Roadroller only considers up to 9th order for the optimization process.

Precision (CLI -Zpr|--precision BITS, API precision in the options object) is the number of fractional bits used in the internal fixed point representation. This is shared between the entropy coder and context models and can't be decoupled. The default of 16 should be enough, you can also try to decrease it.

Learning rate (CLI -Zlr|--learning-rate RATE, API recipLearningRate in the options object) adjusts how fast would the context mixer adapt, where smaller is faster. The default is 500 which should be fine for long enough inputs. If your demo is smaller than 10 KB you can also try smaller numbers.

Model max count (CLI -Zmc|--model-max-count COUNT, API modelMaxCount in the options object) adjusts how fast would individual contexts adapt, where smaller is faster. The model adapts fastest when a particular context is first seen, but that process becomes slower as the context is seen multiple times. This parameter limits how slowest the adaptation process can be. The default of 5 is specifically tuned for JS code inputs.

Model base divisor (CLI -Zmd|--model-base-divisor DIVISOR, API modelRecipBaseCount in the options object) adjusts how fast should individual contexts adapt initially, where larger is faster. The optimal value typically ranges from 10 to 100 for JS code inputs.

Dynamic model flags (CLI -Zdy|--dynamic-models FLAGS, API dynamicModels in the options object) are used to enable or disable specific dynamic models, where each bit is turned on if the model is in use. There is currently one supported model:

Number of abbreviations (CLI -Zab|--num-abbreviations NUM, API numAbbreviations in the options object) affects the preprocessing for JS code inputs. Common identifiers and reserved words can be abbreviated to single otherwise unused bytes during the preprocessing; this lessens the burden of context modelling which can only look at the limited number of past bytes. If this parameter is less than the number of allowable abbreviations some identifiers will be left as is, which can sometimes improve the compression.

Tips and Tricks

See also the wiki for more information.

Compatibility

Roadroller itself and resulting packed codes are ECMAScript 2015 (ES6) compatible and should run in every modern Web browser and JS implementation. Implementations are assumed to be reasonably fast but otherwise it can run in slower interpreters. MSIE is not directly supported but it works fine (slowly) after simple transpiling.

Roadroller and packed codes extensively use Math.exp and Math.log that are implementation-approximated, so there is a small but real possibility that they behave differently in different implementations. This is known to be a non-issue for browser JS engines as well as V8 and Node.js as they use the same math library (fdlibm) for those functions, but you have been warned.

Internals

Roadroller is mostly possible due to the progress in data compression algorithms as recent as 2010s:

The minimal JS code for this algorithm was initially adapted from a golf.horse submission by Hasegawa Sayuri (public domain). The main difference is that Roadroller implements hashed contexts and thus order 3+ context models.

License

The Roadroller compressor proper is licensed under the MIT license. In addition to this, any decoder code produced by Roadroller, that is, everything in the second line is put in the public domain.