RL_JSON

This package adds a command [json] to the interpreter, and defines a new Tcl_Obj type to store the parsed JSON document. The [json] command directly manipulates values whose string representation is valid JSON, in a similar way to how the [dict] command directly manipulates values whose string representation is a valid dictionary. It is similar to [dict] in performance.

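Also provided is a command [json template] which generates JSON documents by interpolating values into a template from a supplied dictionary or variables in the current call frame. The templates are valid JSON documents containing string values which match the regex "^~[SNBJTL]:.+$". The second character determines what the resulting type of the substituted value will be:

Token | Type
--- | ---
~S: | A string
~N: | A number
~B: | A boolean
~J: | A JSON fragment, inserted as-is
~T: | A JSON template (substitutions are performed on the inserted fragment)
~L: | A literal: the resulting string is everything from the fourth character onwards, which allows literal strings that would otherwise be interpreted as one of the substitutions above

None of the first three characters for a template may be escaped.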

The value inserted is determined by the characters following the substitution type prefix. When interpolating values from a dictionary they name keys in the dictionary which hold the values to interpolate. When interpolating from variables in the current scope, they name scalar or array variables which hold the values to interpolate. In either case if the named key or variable doesn't exist, a JSON null is interpolated in its place.
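The two value sources described above can be sketched as follows. This is a minimal illustration, assuming the package is loaded with `package require rl_json` and that the `json` command is reachable through the `::rl_json` namespace (adjust the `namespace path` line if your build exposes it differently). The template, the keys `name`, `age` and `nick`, and the proc `from_vars` are made up for the example.

```tcl
package require rl_json
# Assumption: the json command lives in the ::rl_json namespace.
namespace path [list {*}[namespace path] ::rl_json]

# A template with three substitutions; "nick" is deliberately never supplied.
set tmpl {{"name": "~S:name", "age": "~N:age", "nick": "~S:nick"}}

# Source 1: a dictionary. Keys missing from the dictionary become null.
set from_dict [json template $tmpl {name Alice age 30}]

# Source 2: variables in the current call frame (here, the proc's locals).
proc from_vars {tmpl} {
    set name Bob
    set age  41
    json template $tmpl
}
set from_scope [from_vars $tmpl]

puts [json pretty $from_dict]
puts [json pretty $from_scope]
```

In both results the "nick" member should come out as a JSON null, since neither the dictionary nor the call frame defines it.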

Quick Reference

Paths

The commands [json get], [json extract], [json set], [json unset] and [json exists] accept a path specification that names some subset of the supplied json_val. The rules are similar to the equivalent concept in the [dict] command, except that the paths used by [json] allow indexing into JSON arrays by an integer key (or a string matching the regex "^end(-[0-9]+)?$"). If a path to [json set] includes a key within an object that doesn't exist, it and all later elements of the path are created as nested keys into (new) objects. If a path element into an array is outside the current bounds of the array, it resolves to a JSON null (for [json get], [json extract] and [json exists]), appends or prepends null elements to resolve the path (for [json set]), or does nothing (for [json unset]).

json get {
    {
        "foo": [
            { "name": "first" },
            { "name": "second" },
            { "name": "third" }
        ]
    }
} foo end-1 name

Returns "second"
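The path rules for the modifying subcommands can be sketched in the same way. This is an illustrative example rather than a definitive reference: it assumes the `json` command is reachable through the `::rl_json` namespace, and the document contents and variable names are invented. The behaviour noted in the comments follows the rules stated in the paragraph above.

```tcl
package require rl_json
# Assumption: the json command lives in the ::rl_json namespace.
namespace path [list {*}[namespace path] ::rl_json]

set doc {{"foo": [{"name": "first"}, {"name": "second"}]}}

# Array elements are addressed by integer index, or by end / end-N.
set first [json get $doc foo 0 name]
set last  [json get $doc foo end name]

# [json set] creates missing object keys along the path as nested objects.
# The value argument must itself be valid JSON, hence the quoted string.
json set doc meta author name {"Ruby"}
set author [json get $doc meta author name]

# Setting beyond the end of an array pads the gap with nulls.
set arr {[1]}
json set arr 3 {"x"}
set gap_type [json type $arr 1]
set tail     [json get $arr 3]

# Unsetting an index outside the array bounds leaves the value alone.
json unset arr 10
set len [json length $arr]

puts [json pretty $doc]
puts [json pretty $arr]
```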

Properly Interpreting JSON from Other Systems

Rl_json operates on characters, not bytes, and so considerations of encoding are strictly out of scope. However, interoperating with other systems properly in a way that conforms to the standards is a bit tricky, and requires support for encodings Tcl currently doesn't natively support, like utf-32be. To ease this burden and take care of things like replacing broken encoding sequences, the [json decode] subcommand is provided. Using it in an application would look something like:

proc readjson file {
    set h [open $file rb]    ;# Note that the file is opened in binary mode
    try {
        json decode [read $h]
    } finally {
        close $h
    }
}

If the encoding is known via some out-of-band channel (like headers in an HTTP response), it can be supplied to override the BOM-based detection. The supported encodings are those listed in the JSON standards: utf-8 (the default), utf-16le, utf-16be, utf-32le and utf-32be.
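Supplying the encoding explicitly might look like the sketch below. It assumes that [json decode] takes the encoding name as an optional second argument and that the `json` command is reachable through the `::rl_json` namespace. The proc `readjson_as` and the temporary-file demo are hypothetical, written only for this illustration.

```tcl
package require rl_json
# Assumption: the json command lives in the ::rl_json namespace.
namespace path [list {*}[namespace path] ::rl_json]

# Hypothetical helper: read a file as raw bytes and decode it as JSON text.
# An empty encoding argument leaves detection to [json decode].
proc readjson_as {file {encoding ""}} {
    set h [open $file rb]
    try {
        set bytes [read $h]
    } finally {
        close $h
    }
    if {$encoding eq ""} {
        return [json decode $bytes]
    }
    return [json decode $bytes $encoding]
}

# Demo: write a small UTF-8 document to a temporary file and read it back.
set h [file tempfile path]
fconfigure $h -translation binary
puts -nonewline $h [encoding convertto utf-8 "{\"greeting\": \"h\u00e9llo\"}"]
close $h

set doc [readjson_as $path utf-8]
puts [json get $doc greeting]
file delete $path
```

In an HTTP client the second argument would typically come from the charset parameter of the Content-Type header, after mapping it to one of the encoding names listed above.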

Examples

Creating a document from a template

Produce a JSON value from a template:

json template {
    {
        "thing1": "~S:val1",
        "thing2": ["a", "~N:val2", "~S:val2", "~B:val2", "~S:val3", "~L:~S:val1"],
        "subdoc1": "~J:subdoc",
        "subdoc2": "~T:subdoc"
    }
} {
    val1   hello
    val2   1e6
    subdoc {
        { "thing3": "~S:val1" }
    }
}

Result:

{"thing1":"hello","thing2":["a",1000000.0,"1e6",true,null,"~S:val1"],"subdoc1":{"thing3":"~S:val1"},"subdoc2":{"thing3":"hello"}}

Construct a JSON array from a SQL result set

# Given:
# sqlite> select * from languages;
# 'Tcl',1,'http://core.tcl-lang.org/'
# 'Node.js',1,'https://nodejs.org/'
# 'Python',1,'https://www.python.org/'
# 'INTERCAL',0,'http://www.catb.org/~esr/intercal/'
# 'Unlambda',0,NULL

set langs {[]}
sqlite3 db languages.sqlite3
db eval {
    select
        rowid,
        name,
        active,
        url
    from
        languages
} {
    if {$url eq ""} {unset url}

    json set langs end+1 [json template {
        {
            "id":       "~N:rowid",
            "name":     "~S:name",
            "details": {
                "active":   "~B:active",  // Template values can be nested anywhere
                "url":      "~S:url"      /* Both types of comments are
                                             allowed but stripped at parse-time */
            }
        }
    }]
}

puts [json pretty $langs]

Result:

[
    {
        "id":      1,
        "name":    "Tcl",
        "details": {
            "active": true,
            "url":    "http://core.tcl-lang.org/"
        }
    },
    {
        "id":      2,
        "name":    "Node.js",
        "details": {
            "active": true,
            "url":    "https://nodejs.org/"
        }
    },
    {
        "id":      3,
        "name":    "Python",
        "details": {
            "active": true,
            "url":    "https://www.python.org/"
        }
    },
    {
        "id":      4,
        "name":    "INTERCAL",
        "details": {
            "active": false,
            "url":    "http://www.catb.org/~esr/intercal/"
        }
    },
    {
        "id":      5,
        "name":    "Unlambda",
        "details": {
            "active": false,
            "url":    null
        }
    }
]

Performance

Good performance was a requirement for rl_json, because it is used to handle large volumes of data flowing to and from various JSON-based REST APIs. Of the options I've tried, it's generally the fastest for working with JSON values in Tcl, with the next closest being yajltcl. These benchmarks report median times in microseconds and produce quite stable results between runs. Benchmarking was done on a MacBook Air running 64-bit Ubuntu 14.04 with an Intel i5 3427U CPU, using Tcl 8.6.3 built with -O3 optimization.

Parsing

This benchmark compares the relative performance of extracting the field containing the string "obj" from the JSON doc:

{
	"foo": "bar",
	"baz": ["str", 123, 123.4, true, false, null, {"inner": "obj"}]
}

The compared methods are:

Name | Notes | Code
--- | --- | ---
old_json_parse | Pure Tcl parser | dict get [lindex [dict get [json_old parse [string trim $json]] baz] end] inner
rl_json_parse | | dict get [lindex [dict get [json parse [string trim $json]] baz] end] inner
rl_json_get | Using the built-in accessor method | json get [string trim $json] baz end inner
yajltcl | | dict get [lindex [dict get [yajl::json2dict [string trim $json]] baz] end] inner
rl_json_get_native | | json get $json baz end inner

The use of [string trim $json] is to defeat the caching of the parsed representation, forcing it to reparse the string each time since we're measuring the parse performance here. The exception is the rl_json_get_native test which demonstrates the performance of the cached case.

-- parse-1.1: "Parse a small JSON doc and extract a field" --------------------
                   | This run
    old_json_parse |  241.595
     rl_json_parse |    5.540
       rl_json_get |    4.950
           yajltcl |    8.800
rl_json_get_native |    0.800
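The difference between the reparsing and cached cases can be observed with Tcl's built-in [time] command. The following is only a rough sketch, not the benchmark harness used for the numbers above. It assumes the `json` command is reachable through the `::rl_json` namespace, and the iteration count of 10000 is arbitrary. Absolute timings will differ between machines.

```tcl
package require rl_json
# Assumption: the json command lives in the ::rl_json namespace.
namespace path [list {*}[namespace path] ::rl_json]

# The surrounding whitespace matters here: [string trim] then has something
# to remove and yields a fresh string value with no parsed form attached.
set json {
    {
        "foo": "bar",
        "baz": ["str", 123, 123.4, true, false, null, {"inner": "obj"}]
    }
}

# Both access styles return the same field.
set uncached [json get [string trim $json] baz end inner]
set cached   [json get $json baz end inner]

# Reparse on every iteration versus reuse of the cached representation.
puts "reparse each time: [time {json get [string trim $json] baz end inner} 10000]"
puts "cached parse:      [time {json get $json baz end inner} 10000]"
```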

Validating

If the requirement is only to validate a JSON value, the [json valid] command is a lightweight version of the parsing engine that skips allocating values from the document and returns only whether parsing succeeded or failed and, optionally, a description of the failure. Validating a document takes about a third of the time needed to parse it, so the performance win is substantial. On a relatively modern CPU validation takes about 11 cycles per byte, or around 200MB of JSON per second on a 2.3 GHz Intel i7.
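A validation check might be written as in the sketch below. It assumes the failure description is requested with a `-details varname` option and that the `json` command is reachable through the `::rl_json` namespace. Check the reference documentation for your version before relying on the option name or on the exact contents of the details value, which is simply printed as-is here.

```tcl
package require rl_json
# Assumption: the json command lives in the ::rl_json namespace.
namespace path [list {*}[namespace path] ::rl_json]

set good {{"a": [1, 2, 3]}}
set bad  {{"a": [1, 2, 3}}   ;# the array is never closed

# Plain yes/no validation.
set ok_good [json valid $good]

# Assumption: -details names a variable that receives the failure description.
set ok_bad [json valid -details why $bad]
if {!$ok_bad} {
    puts "rejected: $why"
}
```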

Generating

This benchmark compares the relative performance of various ways of dynamically generating a JSON document. Although all the methods produce the same string, only the "template" and "template_dict" variants handle nulls in the general case - the others manually test for null only for the one field that is known to be null, so the performance of these variants would be worse in a real-world scenario where all fields would need to be tested for null.

The JSON doc generated in each case is the one produced by the following JSON template (where a(not_defined) does not exist and results in a null value in the produced document):

{
	"foo": "~S:bar",
	"baz": [
		"~S:a(x)",
		"~N:a(y)",
		123.4,
		"~B:a(on)",
		"~B:a(off)",
		"~S:a(not_defined)",
		"~L:~S:not a subst",
		"~T:a(subdoc)",
		"~T:a(subdoc2)"
	]
}

The produced JSON doc is:

{"foo":"Bar","baz":["str\"foo\nbar",123,123.4,true,false,null,"~S:not a subst",{"inner":"Bar"},{"inner2":"Bar"}]}

The code for these variants is too long to include in this table; refer to bench/new.bench for the details.

Name | Notes
--- | ---
old_json_fmt | Pure Tcl implementation, builds JSON from type-annotated Tcl values
rl_json_new | rl_json's [json new], API compatible with the pure Tcl version used in old_json_fmt
template | rl_json's [json template]
yajltcl | yajltcl's type-annotated Tcl value approach
template_dict | As for template, but using a dict containing the values to substitute
yajltcl_dict | As for yajltcl, but extracting the values from the same dict used by template_dict

-- new-1.1: "Various ways of dynamically assembling a JSON doc" ---------------
                 | This run
    old_json_fmt |   49.450
     rl_json_new |   10.240
        template |    4.520
         yajltcl |    7.700
   template_dict |    2.500
    yajltcl_dict |    7.530

Deprecations

Version 0.10.0 deprecates various subcommands and features, which will be removed in a near future version:

Under the Hood

Older versions used the yajl C library to parse the JSON string and properly quote generated strings when serializing JSON values, but currently a custom-built parser and string quoter are used, removing the libyajl dependency. JSON values are parsed to an internal format using Tcl_Objs and stored as the internal representation for a new type of Tcl_Obj. Subsequent manipulation of that value uses the internal representation directly.

License

Copyright 2015-2023 Ruby Lane. Licensed under the same terms as the Tcl core.