<img src="logo.png" alt="logo" height="128" />
JsonHilo.js
Minimal lossless JSON parse event streaming, akin to SAX.
Handcrafted by <a href="https://djedr.github.io">Darius J Chuck</a>.
<a href="https://donate.stripe.com/00gdUicLv4UueeQcMM" target="_blank"><img src="https://raw.githubusercontent.com/djedr/djedr.github.io/master/gfx/donate-but.svg" style="height: 32px" alt="Donate directly via Stripe" title="Donate directly via Stripe"></a> or <a href='https://ko-fi.com/djedr' target='_blank'><img width="120" style='border:0px;width:120px;' src='https://storage.ko-fi.com/cdn/kofi1.png?v=3' alt='Buy Me a Coffee at ko-fi.com' /></a> <a href="https://buycoffee.to/jevko" target="_blank"><img width="120" src="https://buycoffee.to/btn/buycoffeeto-btn-primary.svg" style="width: 120px" alt="Postaw mi kawę na buycoffee.to"></a>
Fast, modular, and dependency-free.
Provides two interfaces: a high-level one and a low-level one.
Written in runtime-independent JavaScript.
Works in Deno, Node.js, and the browser.
Status
Stable.
Passes standards-compliance tests and performs well in benchmarks.
Battle-tested.
Installation
Node.js
An npm package is available:
npm i @xtao-org/jsonhilo
Deno and the browser
Import modules directly from deno.land/x:
import {JsonHigh} from 'https://deno.land/x/jsonhilo@v0.3.7/mod.js'
Or from a CDN such as jsDelivr:
import {JsonHigh} from 'https://cdn.jsdelivr.net/gh/xtao-org/jsonhilo@v0.3.7/mod.js'
Quickstart
See a basic example in demo/basic.js, pasted below:
import {JsonHigh} from '@xtao-org/jsonhilo'
const stream = JsonHigh({
  openArray: () => console.log('<array>'),
  openObject: () => console.log('<object>'),
  closeArray: () => console.log('</array>'),
  closeObject: () => console.log('</object>'),
  key: (key) => console.log(`<key>${key}</key>`),
  value: (value) => console.log(`<value type="${typeof value}">${value}</value>`),
})
stream.chunk('{"tuple": [null, true, false, 1.2e-3, "[demo]"]}')
This uses the simplified high-level interface built on top of the more powerful low-level core.
Features
- Simple and minimal
- Dependency-free
- Runtime-independent
- Lossless
- Modular
- Fast
- Streaming-friendly
- Optionally standards-compliant
- Unicode-compatible
Runtime-independent
The library logic is written in modern JavaScript and relies upon some of its features, standard modules in particular.
Beyond that it does not use any runtime-specific features and should work in any modern JavaScript environment. It was tested in Deno, Node.js, and the browser.
That said, the primary target runtime is Deno, and tests depend on it.
Lossless
Unlike any other known streaming JSON parser, JsonHilo provides a low-level interface for lossless parsing, i.e. it is possible to recover the exact input, including whitespace and string escape sequences, from parser events.
This feature can be used to implement accurate translators from JSON to other representations (see Rationale), syntax highlighters (demo below), JSON scanners that search for substrings in strings on-the-fly, without first loading them into memory, and more.
<img src="highlight.gif" alt="Highlight demo" height="320" />
Pictured above is the syntax highlighting demo: demo/highlight.js
Modular
The library is highly modular with a fully independent core, around which various adapters and extensions are built, including an easy-to-use high-level interface.
JsonLow
The core module is JsonLow.js. It has no dependencies, so it can be used on its own. It is very minimal and optimized for maximum performance and accuracy, as well as minimum memory footprint. It provides the most fine-grained control over the parsing process. The events generated by the parser carry enough information to losslessly recreate the input exactly, including whitespace.
See JsonLow.d.ts for type information and demo/highlight.js for usage example.
JsonHigh
JsonHigh.js is the high-level module which provides a more convenient interface. It is composed of auxiliary modules and adapters built around the core. It is optimized for convenience and provides similar functionality and granularity to other streaming parsers, such as Clarinet or creationix/jsonparse.
See JsonHigh.d.ts for type information and Quickstart for usage example.
Parameters
JsonHigh is called with an object which contains named event handlers that are invoked during parsing. All handlers are optional and described below.
Return value
JsonHigh returns a stream object with two methods:
- chunk, which accepts a JSON chunk to parse. It returns the stream object for chaining.
- end, with no arguments, which signals that the current JSON document is finished. If there is no error, it calls the corresponding end event handler, passing its return value to the caller.
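As a minimal sketch of how the two methods fit together, the helper below feeds chunks through a stream and returns whatever the end handler returns. `parseChunks` is a hypothetical name, not part of the library; it takes the JsonHigh factory as a parameter so that the block stays self-contained.

```javascript
// Hypothetical helper (not part of JsonHilo): feeds chunks to a stream and
// returns whatever stream.end() returns.
// `JsonHigh` is passed in, e.g. after
//   import {JsonHigh} from '@xtao-org/jsonhilo'
function parseChunks(JsonHigh, chunks) {
  const values = []
  const stream = JsonHigh({
    // collect every primitive value in document order
    value: (value) => values.push(value),
    // the return value of this handler is passed to the caller of stream.end()
    end: () => values,
  })
  // chunk() returns the stream itself, so calls could also be chained
  for (const chunk of chunks) stream.chunk(chunk)
  // may be an error instead of `values` if the document is incomplete
  // (see Error handling)
  return stream.end()
}

// Usage, assuming JsonHigh is imported as above:
// parseChunks(JsonHigh, ['[1, 2', ', 3]'])
```

Because the result of `end()` can also be an error, callers would typically check it with the `isError` helper described under Cleanup.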
Events
There are 4 event handlers without arguments which indicate start and end of structures:
- openArray: an array started ([)
- closeArray: an array ended (])
- openObject: an object started ({)
- closeObject: an object ended (})
And 2 event handlers with one argument which capture primitives:
- key: an object's key ended. The argument of the handler contains the key as a JavaScript string.
- value: a primitive JSON value ended. The argument of the event contains the corresponding JavaScript value: true, false, null, a number, or a string.
Finally, there is the argumentless end event handler which is called by the end method of the stream to confirm that the parsed JSON document is complete and valid.
Note that an event handler won't be called if there is an error in the parsed JSON, see error handling.
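To show how these events combine, here is a sketch of a handler set that rebuilds plain JavaScript values from them. `makeTreeBuilder` is a hypothetical helper, not part of JsonHilo, and it assumes that the key event for an object entry arrives before the events of that entry's value.

```javascript
// Hypothetical helper (not part of JsonHilo): returns handlers that rebuild
// plain JavaScript values from JsonHigh events.
function makeTreeBuilder() {
  const stack = [] // arrays/objects that are currently open
  const keys = []  // pending key per open container (unused for arrays)
  let root         // last completed top-level value

  // attach a finished value to its parent, or make it the root
  const emit = (value) => {
    if (stack.length === 0) { root = value; return }
    const parent = stack[stack.length - 1]
    if (Array.isArray(parent)) parent.push(value)
    else parent[keys[keys.length - 1]] = value
  }
  const open = (container) => { stack.push(container); keys.push(undefined) }
  const close = () => { keys.pop(); emit(stack.pop()) }

  return {
    openArray: () => open([]),
    openObject: () => open({}),
    closeArray: close,
    closeObject: close,
    key: (key) => { keys[keys.length - 1] = key },
    value: emit,
    // returned to the caller of stream.end()
    end: () => root,
  }
}

// Usage, assuming JsonHigh is imported from '@xtao-org/jsonhilo':
// const tree = JsonHigh(makeTreeBuilder()).chunk('{"a": [1, 2]}').end()
```

This sketch does not treat special keys such as `__proto__` in any particular way. For a maintained solution that emits fully parsed objects and arrays, see JsonStrum under See also.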
Error handling
If there is an error when parsing a chunk, an Error is thrown, containing a serialized JSON object with details in the error message.
If there is an error at the end, that error is returned to the caller. The user-provided end event handler is not called, so it should not contain any cleanup code.
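Since the details of a chunk error are serialized into the message, they can be recovered with `JSON.parse`. The sketch below assumes that the whole message is valid JSON and falls back to the raw text otherwise; `errorDetails` is a hypothetical helper, not part of JsonHilo, and the shape of the details object is not specified here.

```javascript
// Hypothetical helper (not part of JsonHilo): extracts details from an error
// thrown by stream.chunk(), assuming its message is serialized JSON.
function errorDetails(error) {
  try {
    return JSON.parse(error.message)
  } catch {
    // the message was not valid JSON after all; keep it as plain text
    return {message: String(error.message)}
  }
}

// Usage, assuming `stream` was created with JsonHigh and the input is rejected:
// try { stream.chunk('[1, 2}') }
// catch (e) { console.error(errorDetails(e)) }
```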
Cleanup
To run cleanup code at the end of parsing a document regardless of whether there was an error or not, don't put that code in the end handler. Instead put it after .end(), like so:
// ...
stream.end()
cleanup()
If you want to also handle an error, you can use the isError helper:
import {isError} from '@xtao-org/jsonhilo'
// ...
const ret = stream.end()
if (isError(ret)) { handle(ret) } // handle error
cleanup()
If your error handler can throw, you can use try-catch-finally:
import {isError} from '@xtao-org/jsonhilo'
// ...
const ret = stream.end()
try { if (isError(ret)) { handle(ret) } }
catch (e) { /* optional */ }
finally { cleanup() }
Fast
Achieving optimal performance without sacrificing simplicity and correctness was a design goal of JsonHilo. This goal was realized and for applications without extreme performance requirements JsonHilo should be more than fast enough.
It may be worth noting however that using pure JavaScript for extremely performance-sensitive applications is ill-advised and that nothing can replace individual case-by-case benchmarks.
It is difficult to find a parser that can be sensibly compared with JsonHilo. The one that comes the closest and is fairly widely known is Clarinet. It is the only low-level streaming JSON parser featured on JSON.org and the fastest one I could find.
xtao-org/jsonhilo-benchmarks contains simple benchmarks used to compare the performance of JsonHilo with Clarinet and jq (a fast and versatile command-line JSON processor).
According to these benchmarks, for validating JSON (just parsing without any further processing) JsonHilo is the fastest, before jq, which is in turn faster than Clarinet. Overall for comparable tasks the low-level JsonHilo interface is up to 2x faster than Clarinet, whereas the high-level interface is on par.
Again, these results need to be taken with a grain of salt, and there is no replacement for individual benchmarks. Use whatever suits your case best. In most cases, relative performance should not be the only factor to take into account.
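For measurements on your own data, a rough timing harness can be sketched as follows. `averageMs` is a hypothetical helper, not part of JsonHilo or of xtao-org/jsonhilo-benchmarks, and it assumes a global `performance.now()`, as found in Deno, browsers, and recent Node.js versions.

```javascript
// Hypothetical helper (not part of JsonHilo or its benchmark suite): returns
// the average time in milliseconds that `parse(input)` takes over `runs` runs.
function averageMs(parse, input, runs = 10) {
  parse(input) // warm-up run, not measured
  const start = performance.now()
  for (let i = 0; i < runs; i += 1) parse(input)
  return (performance.now() - start) / runs
}

// Usage, assuming JsonHigh is imported from '@xtao-org/jsonhilo'
// and `myData` is your own representative data:
// const input = JSON.stringify(myData)
// averageMs((text) => JsonHigh({}).chunk(text).end(), input)
// averageMs((text) => JSON.parse(text), input)
```

Averages like this are noisy, so it is worth repeating the measurement with inputs that resemble your real workload.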
Factors which make a fair comparison between JsonHilo and Clarinet problematic are mentioned below.
Differences between JsonHilo and Clarinet
The major differences that make the comparison of the two problematic are:
- Clarinet is not fully ECMA-404-compliant, as measured by JSON Parsing Test Suite by Nicolas Seriot -- it accepts certain invalid JSON and rejects certain valid JSON. JsonHilo is designed to parse the JSON grammar correctly and so can pass the ECMA-404-compliance test suite. JsonHilo is overall safer to use with unknown inputs -- it can very well be used as a validator.
- JsonHilo fundamentally operates on individual Unicode code points as opposed to strings, chunks, or characters. Performance-wise this may be an advantage or a disadvantage, depending on how the input is structured (it may need conversion).
- Even though low-level processing with JsonHilo may be overall significantly faster than Clarinet, the fact that the former does not use regular expressions to parse the input while the latter does may lead to a narrower performance gap between the two.
- JsonHilo is overall simpler in terms of code complexity, making it easier to adjust or audit. The code is also significantly smaller in size than Clarinet, even taking into account the optional high-level interfaces laid on top of the tiny core.
- JsonHilo's core is more low-level and amenable to extension.
Streaming-friendly
By default the parser is streaming-friendly by accepting the following:
- Multiple consecutive top-level JSON values -- it can read line-delimited JSON and concatenated JSON, e.g. JSON Lines, ndjson. Whitespace-separated primitives are also supported.
- Trailing commas -- a single trailing comma in an array or an object generates no errors.
- Zero-length or whitespace-only input -- generates no errors.
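When a stream carries several top-level values, the handlers need some way to tell where one value ends and the next begins. One possible approach, sketched below, is to track nesting depth. `makeTopLevelCounter` and `onDocument` are hypothetical names, not part of JsonHilo.

```javascript
// Hypothetical helper (not part of JsonHilo): handlers that call `onDocument`
// with a running count each time a top-level JSON value is complete.
function makeTopLevelCounter(onDocument) {
  let depth = 0 // number of currently open arrays/objects
  let count = 0 // number of completed top-level values
  const done = () => { if (depth === 0) onDocument(++count) }
  const open = () => { depth += 1 }
  const close = () => { depth -= 1; done() }
  return {
    openArray: open,
    openObject: open,
    closeArray: close,
    closeObject: close,
    // a primitive outside any array/object is itself a top-level value
    value: done,
  }
}

// Usage, assuming JsonHigh is imported from '@xtao-org/jsonhilo':
// const stream = JsonHigh(makeTopLevelCounter((n) => console.log('value #' + n)))
// stream.chunk('{"a": 1}\n[2, 3]\n4\n')
```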
Standards-compliant
The streaming-friendly features can be suppressed by Ecma404.js, an adapter module which provides full ECMA-404/RFC 8259 compliance.
This is confirmed by passing the JSON Parsing Test Suite by Nicolas Seriot, available under test/JSONTestSuite.
Tests can be run with Deno as follows:
deno test --allow-read
Unicode-compatible
The core logic operates on Unicode code points -- in line with spec -- rather than code units or characters.
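The distinction matters in JavaScript because strings are indexed by UTF-16 code units, so a single code point outside the Basic Multilingual Plane occupies two units. The following is a general language-level illustration, not the library's internal code; `toCodePoints` is a hypothetical helper.

```javascript
// Splits a string into Unicode code points (numbers), not UTF-16 code units.
function toCodePoints(text) {
  // string iteration yields whole code points, so surrogate pairs stay together
  return Array.from(text, (ch) => ch.codePointAt(0))
}

const sample = 'a\u{1F600}' // 'a' followed by U+1F600
console.log(sample.length)               // 3 (UTF-16 code units)
console.log(toCodePoints(sample).length) // 2 (code points)
```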
Rationale
Initially written to enable fast lossless translation between JSON and Jevko, as no suitable JSON parser in JavaScript exists.
I decided to release this as a separate library, because I was tinkering with Deno and found that there was no streaming JSON parser available at all for Deno.
See also
JsonStrum -- a high-level wrapper over JsonHilo which emits fully parsed objects and arrays.
License
Released under the MIT license.
Support this project
<p>I prefer to share my creations for free. However living and creating without money is not possible for me. So I ask companies and people, who want and can, for support. Every symbolic cup of coffee counts!</p> <div class="flexi" style="flex-wrap: wrap;"> <a href="https://donate.stripe.com/00gdUicLv4UueeQcMM" target="_blank"><img src="https://raw.githubusercontent.com/djedr/djedr.github.io/master/gfx/donate-but.svg" style="height: 64px" alt="Donate directly via Stripe" title="Donate directly via Stripe"></a> or <a href='https://ko-fi.com/djedr' target='_blank'><img width='240' style='border:0px;width:240px;' src='https://storage.ko-fi.com/cdn/kofi1.png?v=3' alt='Buy Me a Coffee at ko-fi.com' /></a> <a href="https://buycoffee.to/jevko" target="_blank"><img width='240' src="https://buycoffee.to/btn/buycoffeeto-btn-primary.svg" style="width: 240px" alt="Postaw mi kawę na buycoffee.to"></a> </div>
Paid support and online assistance
If you prefer, you can get paid help and support, including direct online assistance, related to JsonHilo.js through Githelp.
At the moment this is a limited opportunity to try an early version of Githelp.
<img src="tao-json.png" alt="tao-json-logo" height="128" />
A stand-alone part of the TAO-JSON project.
© 2024 xtao.org