jqjq

jqjq is a jq implementation in jq.

It started as research into how to write format decoders in jq for fq, which ended up involving some AST fiddling and then escalated from a joke into a fun and educational project. It's also a nice way to show off jq as a very expressive, capable and neat language!

It can currently run with jq, gojq, jaq and jqjq.

This jqplay demo snippet can be used to play around with it.

Use via jqjq wrapper

$ ./jqjq -n 'def f: 1,8; [f,f] | map(.+105) | implode'
"jqjq"

$ ./jqjq '.+. | map(.+105) | implode' <<< '[1,8]'
"jqjq"

# eval example above using jqjq in jqjq. will take some time.
# eval the concatenation of jqjq.jq as a string and the example
$ ./jqjq "eval($(jq -Rs . jqjq.jq)+.)" <<< '"eval(\"def f: 1,8; [f,f] | map(.+105) | implode\")"'
"jqjq"

# jqjq has a REPL
$ ./jqjq --repl
> 1,2,3 | .*2
2
4
6
> "jqjq" | explode | map(.-32) | implode
"JQJQ"
> "jqjq" | [eval("explode[] | .-32")] | implode
"JQJQ"
> ^D

# run with gojq or jaq
$ ./jqjq --jq gojq -n 1+2
3
$ ./jqjq --jq jaq -n 1+2
3

# run 01mf02's adaptation of itchyny's bf.jq running fib.bf
$ ./jqjq -n "\"$(cat fib.bf)\" | $(cat bf.jq)"
"1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233"

Use with jq

$ jq -n -L . 'include "jqjq"; eval("def f: 1,8; [f,f] | map(.+105) | implode")'
"jqjq"

$ jq -L . 'include "jqjq"; eval("(.+.) | map(.+105) | implode")' <<< '[1,8]'
"jqjq"

# can be used as path expression (only gojq for now because of jq issue)
$ gojq -cn -L . 'include "jqjq"; {a:0, b:1} | eval(".a, .b") += 1'
{"a":1,"b":2}

Run tests

# run jqjq's tests with jq and jqjq (via jq)
$ make test
# run jqjq's tests with jqjq (via jaq)
$ JQ=jaq make test-jqjq

Note that the tests are meant to be used with jq 1.7.1.

Progress

jq's test suite

$ ./jqjq --run-tests < ../jq/tests/jq.test | grep passed
312 of 469 tests passed

Note that expected test values are based on stedolan's jq. If you run with a different jq implementation like gojq some tests might fail because of different error messages, support for arbitrary precision integers etc.

Design overview

jqjq has the common lex, parse, eval design.

Lex

The lexer gets a string and chews off parts from left to right, producing an array of tokens [{<name>: ...}, ...]. Each chew is done by testing regexes in priority order to make sure longer prefixes are matched first, e.g. += is matched before +. On a match a lambda is evaluated, usually {<token-name>: .}, but in some cases, such as quoted strings, it is a bit more complicated.
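
The priority ordering can be illustrated with a minimal sketch written in jq itself. The rule table, the token names (number, ident, update_add, plus) and the lex_sketch shell wrapper are hypothetical and not taken from jqjq.jq; the sketch also assumes a jq built with regex support.

```shell
# Hypothetical mini-lexer, not the actual jqjq rule table.
lex_sketch() {
  jq -cn --arg src "$1" '
    # Rules are tried in array order, so "+=" is listed before "+".
    def rules: [
      {re: "^\\s+",                   name: null},         # whitespace, dropped
      {re: "^[0-9]+",                 name: "number"},
      {re: "^[a-zA-Z_][a-zA-Z0-9_]*", name: "ident"},
      {re: "^\\+=",                   name: "update_add"},
      {re: "^\\+",                    name: "plus"}
    ];
    def lex:
      if . == "" then empty
      else
        . as $s
        # first(...) keeps only the highest-priority rule that matches
        | ( first(rules[] as $r | $s | match($r.re) | {name: $r.name, text: .string})
            // error("unexpected input: " + $s) ) as $m
        # emit a token (unless the rule is unnamed), then lex the rest
        | (if $m.name then {($m.name): $m.text} else empty end),
          ($s[($m.text | length):] | lex)
      end;
    [$src | lex]
  '
}

lex_sketch 'a += 1'
# [{"ident":"a"},{"update_add":"+="},{"number":"1"}]
```

If the two + rules were swapped, the input += would lex as a plus token followed by an error on =, which is why longer prefixes are tried first.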

The lexer also has a stack to keep track of the balance of seen (, ) and \( so that it knows how to chop up a string with interpolation into tokens. E.g. is ) a right parenthesis or the continuation of a string, as in "abc \(123) def"?
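
A rough sketch of the stack idea, again in jq. It does not produce tokens; it only classifies every ) in the source as either a plain right parenthesis or the end of an interpolation. The function name classify_parens and the context names string, interp and paren are made up for this illustration and are an assumption about one way to do it, not how jqjq.jq is written.

```shell
# Hypothetical sketch: classify each ")" using a stack of open contexts.
classify_parens() {
  jq -cn --arg src "$1" '
    $src
    | reduce (split("")[]) as $c (
        {stack: [], esc: false, out: []};
        .stack[-1] as $top
        | if $top == "string" then
            # inside a string literal only backslash escapes and the closing quote matter
            if .esc then
              .esc = false
              | if $c == "(" then .stack += ["interp"] else . end
            elif $c == "\\" then .esc = true
            elif $c == "\"" then .stack |= .[:-1]
            else . end
          elif $c == "\"" then .stack += ["string"]
          elif $c == "(" then .stack += ["paren"]
          elif $c == ")" then
            # the top of the stack tells which kind of ")" this is
            .out += [if $top == "interp" then "interpolation_end" else "rparen" end]
            | .stack |= .[:-1]
          else . end
      )
    | .out
  '
}

classify_parens '(1) | "abc \(123) def"'
# ["rparen","interpolation_end"]
```

A ) that appears inside a string without a preceding \( is ignored, because the top of the stack is still the string context.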

You can use ./jqjq --lex '...' to lex and see the tokens.

Parse

The parser takes an array of tokens and uses a left-to-right (LR) parser with backtracking, combined with precedence climbing for infix operators so that left-recursive rules (e.g. E -> E + E) do not end up in an infinite loop. Backtracking is done by outputting empty for a non-match and using // to try the next rule, e.g. a // b // ... // error, where a and b are functions that each try to match a rule. When a rule matches, it returns an array with the pair [<tokens left>, <ast>]. <ast> uses the same AST design as gojq.
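
The combination of backtracking via // and precedence climbing can be sketched for a tiny arithmetic grammar. Everything here is an assumption made for illustration: the token shapes ({number: ...}, {op: ...}, {lparen: ...}, {rparen: ...}), the AST shape (which is not gojq's), the precedence table and the names parse_sketch, parse_expr, p_number, p_paren, p_term and climb.

```shell
# Hypothetical mini-parser: rules output [<tokens left>, <ast>] or empty.
parse_sketch() {
  jq -cn --argjson tokens "$1" '
    def parse_expr($min):
      def prec_of($op): {"+": 1, "-": 1, "*": 2, "/": 2} | .[$op // ""];
      # a rule outputs empty on a non-match so that // can try the next one
      def p_number:
        if .[0].number then [.[1:], {number: (.[0].number | tonumber)}] else empty end;
      def p_paren:
        if .[0].lparen then
          (.[1:] | parse_expr(0)) as [$rest, $ast]
          | if $rest[0].rparen then [$rest[1:], $ast] else empty end
        else empty end;
      def p_term: p_number // p_paren // error("unexpected token: \(.[0])");
      # precedence climbing: input is [<tokens left>, <left-hand ast>]
      def climb:
        . as [$tokens, $lhs]
        | $tokens[0].op as $op
        | prec_of($op) as $p
        | if $p != null and $p >= $min then
            ($tokens[1:] | parse_expr($p + 1)) as [$rest, $rhs]
            | [$rest, {op: $op, left: $lhs, right: $rhs}]
            | climb
          else . end;
      p_term | climb;
    $tokens | parse_expr(0)
  '
}

# tokens for: 1 + 2 * 3
parse_sketch '[{"number":"1"},{"op":"+"},{"number":"2"},{"op":"*"},{"number":"3"}]'
```

The result is a pair whose first element is the empty list of remaining tokens and whose second element is a + node with the * node as its right child. Note that // also suppresses errors raised on its left-hand side, so an error deep inside one alternative can surface as a non-match of the enclosing rule.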

You can use ./jqjq --parse '...' to lex and parse and see the AST.

Eval

Eval traverses the AST, evaluates each AST node and keeps track of the current path and environment.

Path is used in jq to keep track of where you currently are in the input. The tracking works as long as you only operate on the input and don't create and output new values. E.g. path(.a[1], .b) outputs ["a",1] and ["b"]. This is also used to implement assignment and some other operators.
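
Both points can be tried directly with jq. The second function is only a sketch of how an update such as += 1 can be expressed in terms of paths using the jq builtins path, getpath and setpath; it is not claimed to be how jq or jqjq implement assignment. The function names path_sketch and update_sketch are made up.

```shell
# path(...) outputs the path to each output of a path expression
path_sketch() {
  jq -cn 'path(.a[1], .b)'
}

# Sketch: an update over .a and .b written as a fold over their paths
update_sketch() {
  jq -cn '{a: 0, b: 1} | reduce path(.a, .b) as $p (.; setpath($p; getpath($p) + 1))'
}

path_sketch
# ["a",1]
# ["b"]
update_sketch
# {"a":1,"b":2}
```

An expression that creates a new value, such as path(.a + 1), is not a valid path expression and makes jq raise an error instead.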

Environment is an object with the current functions and bindings. Functions have the key name <name>/<arity> and the value is a function AST. Bindings use the key name $<name>/0 and the value is {value: <value>} where <value> is a normal jq value.
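
The key scheme can be sketched with a few helper functions. The helpers define, bind and lookup_binding, the env_sketch wrapper and the placeholder function AST {number: 1} are hypothetical; only the key formats <name>/<arity> and $<name>/0 and the {value: ...} wrapper come from the description above.

```shell
# Hypothetical helpers around an environment object
env_sketch() {
  jq -cn '
    def define($name; $arity; $ast): . + {("\($name)/\($arity)"): $ast};
    def bind($name; $v): . + {("$" + $name + "/0"): {value: $v}};
    def lookup_binding($name): .["$" + $name + "/0"].value;
    {}
    | define("f"; 0; {number: 1})   # placeholder AST for: def f: 1;
    | bind("x"; 42)                 # binding for: 42 as $x
    | ., lookup_binding("x")
  '
}

env_sketch
```

The first output is the environment object with the keys f/0 and $x/0, the second is the looked-up value 42. Wrapping bound values in {value: ...} makes it possible to tell a binding whose value is null apart from a missing binding.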

When evaluating the AST, the eval function gets the current AST node, path and environment and outputs zero, one or more arrays with the pair [<path>, <value>]. Path can be [null] if the evaluation produced a "new" value, in which case path tracking is not possible.
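
A minimal sketch of such an evaluator, for a made-up AST with only five node types (identity, index, literal, binding, pipe and comma are illustrative names, not the gojq-style AST that jqjq uses). The input and output of the hypothetical eval_ast function are [<path>, <value>] pairs, and the environment uses the key scheme described above.

```shell
# Hypothetical mini-evaluator over [<path>, <value>] pairs
eval_sketch() {
  jq -cn '
    def eval_ast($ast; $env):
      . as [$path, $value]
      | if $ast.type == "identity" then .
        elif $ast.type == "index" then
          # extend the path unless tracking was already lost
          [ (if $path == [null] then [null] else $path + [$ast.key] end),
            ($value | .[$ast.key]) ]
        elif $ast.type == "literal" then [[null], $ast.value]
        elif $ast.type == "binding" then
          [[null], ($env | .["$" + $ast.name + "/0"].value)]
        elif $ast.type == "pipe" then
          eval_ast($ast.left; $env) | eval_ast($ast.right; $env)
        elif $ast.type == "comma" then
          eval_ast($ast.left; $env), eval_ast($ast.right; $env)
        else error("unknown node type") end;

    # hand-written AST for: .a.b, $x, 3
    { type: "comma",
      left:  {type: "pipe", left: {type: "index", key: "a"}, right: {type: "index", key: "b"}},
      right: {type: "comma", left: {type: "binding", name: "x"}, right: {type: "literal", value: 3}}
    } as $ast
    | {"$x/0": {value: 42}} as $env
    | [[], {a: {b: 1}}]
    | eval_ast($ast; $env)
  '
}

eval_sketch
# [["a","b"],1]
# [[null],42]
# [[null],3]
```

Only the first output has a usable path, because .a.b operates purely on the input, while $x and 3 are new values.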

Problems, TODOs, issues and unknowns

Useful references

Tools and tricks

Thanks to

License

Copyright (c) 2022 Mattias Wadman

jqjq is distributed under the terms of the MIT License.

See the LICENSE file for license details.