Literate jq+shell Programming with jqmd

jqmd is a tool for writing well-documented, complex manipulations of YAML or JSON data structures using bash scripting and jq. It allows you to mix both kinds of code -- plus snippets of YAML or JSON data! -- within one or more markdown documents, making it easier to write scripts that do complex things like generate docker-compose configurations or manipulate serialized Wordpress options.

jqmd is implemented as an extension of mdsh, which means you can extend it to process additional kinds of code blocks by defining functions inside your shell @mdsh blocks. But you do not need to install mdsh, and you can use jqmd --compile to make distributable scripts that don't require jqmd or mdsh.

Contents

<!-- toc --> <!-- tocstop -->

Installation

If you have basher on your system, you can install jqmd with basher install bashup/jqmd; otherwise, just download the jqmd executable, chmod +x it, and put it in a directory on your PATH.
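The manual route can be sketched as a small bash helper, assuming you have already downloaded the jqmd executable to a local file. The function name `install_local_jqmd` and the default destination `$HOME/bin` are illustrative choices, not part of jqmd:

```shell
#!/usr/bin/env bash
# Sketch: install an already-downloaded jqmd executable into a PATH directory.
# install_local_jqmd and its arguments are hypothetical names for illustration.
install_local_jqmd() {
    local src=$1 dest_dir=${2:-$HOME/bin}
    [[ -f $src ]] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return
    cp "$src" "$dest_dir/jqmd" || return
    chmod +x "$dest_dir/jqmd" || return
    # Warn if the destination directory is not on PATH yet
    case ":$PATH:" in
        *":$dest_dir:"*) ;;
        *) echo "note: $dest_dir is not on your PATH" >&2 ;;
    esac
}
```

For example, `install_local_jqmd ./jqmd ~/bin` copies the file to `~/bin/jqmd` and marks it executable.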

Usage

Running jqmd some-document.md args... will read and interpret unindented, triple-backquote fenced code blocks from some-document.md, according to the language listed on the block. The recognized tags are shell, jq, jq defs, jq imports, yaml, yml, and json.

(As with mdsh, you can extend the above list by defining appropriate hook functions in shell @mdsh blocks; see the section below on "Supporting Additional Languages" for more info.)

Once all blocks have been executed or added to the filter pipeline, jq is run on standard input with the built-up filter pipeline, if any. (If the filtering pipeline is empty, jq is not run.) Filter pipeline elements are automatically separated with |, so you should not include a | at the beginning or end of your jq blocks or APPLY / FILTER code.
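To see why a leading or trailing | is unwanted, here is a conceptual model of the pipeline in plain bash. The `sketch_*` names are hypothetical stand-ins for illustration only; jqmd's actual bookkeeping may look quite different:

```shell
#!/usr/bin/env bash
# Conceptual model only: these helpers are hypothetical, not jqmd's real code.
sketch_filters=()

# Remember one pipeline element
sketch_add_filter() { sketch_filters+=("$1"); }

# Join the accumulated elements with "|" and leave the result in REPLY
sketch_pipeline() {
    local f
    REPLY=
    for f in "${sketch_filters[@]}"; do
        REPLY+=${REPLY:+|}$f
    done
}

sketch_add_filter '.services'
sketch_add_filter 'keys'
sketch_pipeline
echo "$REPLY"   # .services|keys
```

In this model, adding `| keys` instead of `keys` would produce `.services|| keys`, which jq rejects as a syntax error.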

As with mdsh, you can optionally make a markdown file directly executable by giving it a shebang line such as #!/usr/bin/env jqmd, or use a shelldown header to make it executable, sourceable, and pretty. :) A sample shelldown header for jqmd might look like:

    #!/usr/bin/env bash
    : '
    <!-- ex: set ft=markdown : '; eval "$(jqmd --eval "$BASH_SOURCE")" # -->

    # My Awesome Script

    ...markdown and code start here...

Also as with mdsh, you can run jqmd --compile to output a bash version of your script, with no external dependencies (other than jq and maybe yaml2json or PyYAML). jqmd --compile and jqmd --eval both inject the necessary jqmd runtime functions into the script so that it will work on systems without jqmd installed. (Note that unless your script uses the YAML or yaml2json functions at runtime, your script's users will not need it installed.)

(If you'd like more information on compiling, sourcing, and shelldown headers, feel free to have a look at the mdsh docs!)

Data Merging

In a jqmd program, one is often incrementally defining some sort of data structure (such as a docker-compose project specification, or a set of Wordpress options). While jq expressions can be used directly to manipulate such a data structure, it is often more natural to express it as a series of JSON or YAML blocks that are combined in some way. For this reason, jqmd defines a data structure merging function to apply such data blocks to an existing data structure. This merging function is exposed to jqmd programs as jqmd::data($data), and is used by default to merge JSON and YAML data. The merge algorithm is as follows:

For most programs, this algorithm is sufficient for incremental data structure creation. If you have different needs, however, you can define a jqmd_data function of your own: JSON and YAML data are wrapped with a call to jqmd_data, and the default jqmd_data just calls jqmd::data.

If you want to override the data merging for all data as of the start of the filter chain, define a jqmd_data function in a DEFINE call or a jq defs block. Or, you can override it for just a few filters or blocks by defining it in an APPLY or FILTER call or jq block. Afterwards, you can restore the original data merging algorithm like this:

    FILTER 'def jqmd_data($data): jqmd::data($data) ; .'

Reusable Blocks

Normally, code or data blocks are executed immediately, at the point they appear in the document. But for more complex scripts or libraries, this is a bit limiting. So jqmd allows you to turn blocks into shell functions, so they can be called more than once (or not at all), possibly with parameters. For example, the following markdown:

```jq @func setElement key="$1" @val="$2"
.[$key] = $val
```

```yaml @func mksite SITE WP_HOME
services:
  \($SITE):
    environment:
      WP_HOME: \($WP_HOME)
```

...expands into the following two shell functions:

    function setElement() {
    	APPLY $'.[$key] = $val\n' \
    		key="$1" @val="$2"
    }

    function mksite() {
    	APPLY $'jqmd_data({"services":{"\\($SITE)":{"environment":{"WP_HOME":"\\($WP_HOME)"}}}})\n' \
    		SITE WP_HOME
    }

Everything after the @func name part of the block opener becomes arguments to APPLY, which maps shell variables or other values to jq variables with the specified names. An @ before an argument name means, "this variable or value is already JSON-encoded", and the absence of an = means "create a jq variable with the same name and value as this shell or environment variable". (Note: values after = should be quoted as shown above if they contain variables or shell parameters like $1.)

So, our example setElement function takes two positional arguments and sets a key (given as a string) to a value (given as JSON data). So e.g. setElement foo 42 would be equivalent to the jq expression .foo = 42.
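For comparison, the same string-versus-JSON distinction can be reproduced with jq alone, using its standard `--arg` (string) and `--argjson` (JSON) options. This is only an analogue under the assumption that a plain name behaves like a string binding and an @ name like a JSON binding; `set_element_plain` is a hypothetical name, and nothing here shows how jqmd passes the values internally:

```shell
#!/usr/bin/env bash
# Plain-jq analogue of `setElement foo 42` (illustration only; jqmd's own
# implementation may differ). set_element_plain is a hypothetical name.
set_element_plain() {
    # $1 is bound as a string ($key); $2 is parsed as JSON ($val)
    jq -c --arg key "$1" --argjson val "$2" '.[$key] = $val'
}

if command -v jq >/dev/null 2>&1; then
    echo '{}' | set_element_plain foo 42          # {"foo":42}
    echo '{}' | set_element_plain bar '"text"'    # {"bar":"text"}
fi
```

Note that the second call has to pass `'"text"'`, quotes included, because the value is expected to be JSON already.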

The second example function, mksite, sets the WP_HOME for a docker-compose service named $SITE with the current contents of $SITE and $WP_HOME. (Unlike normal docker-compose string interpolation -- which can only use one value for an environment variable -- this function can be called several times with different SITE and WP_HOME values to build up configuration for multiple containers.)

These are just a few examples of what you can do with reusable @func blocks. @func can only be used with json, yaml, or jq blocks. jq and json blocks can refer directly to parameter variables, while yaml blocks can only use string interpolation (\($var)) to insert string keys or values. jq blocks are applied as-is, while json and yaml blocks are wrapped in a call to jqmd_data() (as described in Data Merging, above).

Named Constants

Data blocks can also be tagged as "named constants": a code block starting with e.g. ```yaml !const foo will have its contents defined as a zero-argument jq function named foo.

That is, the following two code blocks do the exact same thing:

```jq defs
def pi: 3.14159;
```
```json !const pi
3.14159
```
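What "zero-argument jq function" means in practice can be checked with jq by itself. The sketch below inlines the definition rather than going through jqmd, and the wrapper name `use_pi` is made up for the example:

```shell
#!/usr/bin/env bash
# A zero-argument jq function behaves like a constant that can be used
# anywhere in a filter. use_pi is a hypothetical wrapper for illustration.
use_pi() {
    jq -n -c 'def pi: 3.14159; [pi, (pi | floor)]'
}

if command -v jq >/dev/null 2>&1; then
    use_pi   # [3.14159,3]
fi
```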

Programming Models

jqmd supports developing three types of programs: filters, scripts, and extensions. The main differences are that:

Filters

Filters are programs that build up a single giant jq pipeline, and then act as a filter, typically taking JSON input from stdin and sending the result to stdout. If your markdown document defines at least one filter, and doesn't use RUN_JQ or CLEAR_FILTERS to reset the pipeline, it's a filter. jqmd will automatically run jq to do the filtering from stdin to stdout, after the entire markdown document has been processed. If you don't want jq to read from stdin, you can use JQ_OPTS -n within your script to start the filter pipeline without any file input. (Similarly, you can use JQ_OPTS -- somefile to force jq to read input from a specific file instead of stdin.)

Scripts

If your program isn't a filter, it's probably a script. Scripts can run jq with shared imports, functions, and arguments, using the RUN_JQ function. (They must not add anything to the filter pipeline after the last RUN_JQ or CLEAR_FILTERS call, though, or jqmd will think the program's a filter!)

You'll generally use this approach if your script needs to run jq multiple times with different inputs and filters. Each time a script uses the CLEAR_FILTERS or RUN_JQ functions, the filter pipeline is reset to empty and can then be built up again to run different operations.

(Note: unlike the filter pipeline, jq options, arguments, imports, and definitions are cumulative. They can only be added to as the program executes, and cannot be reset. Thus, they are shared across all invocations of RUN_JQ. So anything specific to a given run of jq should be specified as a filter, or passed as an explicit command-line argument to RUN_JQ.)

Extensions

jqmd itself can be extended by other shell scripts, to make more-specialized tools or custom interpreters. Sourcing jqmd from a bash script will define all its functions, but not actually run a program. In this way, you can use all of the available functions described below (plus any of mdsh's underlying API) in a shell script, rather than a markdown file. (You can also use or redefine jqmd and mdsh's internal functions, but those not documented here or in the mdsh documentation are subject to change without notice!)

If you are sourcing jqmd (whether it's to write an extension or reuse its functions), you should also read the mdsh docs, since jqmd is an extension of mdsh.

Available Functions

Within shell blocks, many functions are available for your use. When passing jq code to them, it's best to use single quotes to avoid unwanted interpretation of $ variables or other quoting issues, e.g.:

    DEFINE '
    def recursive_add($other): . as $original |
        reduce paths(type=="array") as $path (
            (. // {}) * $other; setpath( $path; ($original | getpath($path)) + ($other | getpath($path)) )
        );
    '
    DEFINE 'def jqmd_data($arg): recursive_add($arg);'
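The quoting advice is ordinary bash behavior and can be seen without jqmd at all. In this small sketch, the variable names are arbitrary:

```shell
#!/usr/bin/env bash
key='shell value'

single='.[$key] = $val'    # single quotes: $key and $val reach jq untouched
double=".[$key] = \$val"   # double quotes: the shell substitutes $key first

printf '%s\n' "$single"    # .[$key] = $val
printf '%s\n' "$double"    # .[shell value] = $val
```

The double-quoted version silently replaces the intended jq variable with whatever the shell variable of the same name happens to hold, which is why single quotes are the safer default.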

Adding jq Code and Data

Notice that JSON and YAML blocks are always filtered through a jqmd_data() function, which by default does data merging, but you can always redefine the function to do something different, even as part of a FILTER or jq block. (Just remember that while filters can begin with function definitions, they must each end with an expression, even if it's only a lone . (the identity filter).)

Also note that data passed to the JSON and YAML functions can contain jq interpolation expressions, which means that you must not pass untrusted data to them. If you need to process a user-supplied JSON string, the simplest way is to use JSON "( %s | fromjson)" "$untrusted_json". Alternately, you can call ARGJSON someJQvarname "$untrusted_json" to create the jq variable $someJQvarname, and then use it with e.g. JSON '$someJQvarname'. (Note the single quotes!)

(If your user-supplied data is in YAML form, you can use the same approaches, but must convert it to JSON first.)

JSON Escaping and Data Structures

These functions don't do anything to jq or the filter pipeline; they simply escape, quote, or otherwise format values into JSON, returning the result(s) via REPLY. You can then use them to build up FILTER strings, or pipe them to jq as input.
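The REPLY convention itself is easy to illustrate in plain bash. The helper below is a hypothetical stand-in written for this example, not one of jqmd's functions, and it only covers a handful of escapes:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of jqmd: JSON-quote a string into REPLY.
# Handles backslash, double quote, newline, tab, and carriage return only;
# other control characters are left as-is, so treat this as a sketch.
sketch_json_string() {
    local s=$1
    s=${s//\\/\\\\}       # backslash first, so later escapes are not doubled
    s=${s//\"/\\\"}
    s=${s//$'\n'/\\n}
    s=${s//$'\t'/\\t}
    s=${s//$'\r'/\\r}
    REPLY="\"$s\""
}

sketch_json_string 'say "hi"'
printf '%s\n' "$REPLY"     # "say \"hi\""
```

Returning the value in REPLY rather than printing it lets the caller use the result without a command substitution, so no subshell is needed.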

Adding jq Options and Arguments

(Note: the added options will reset to empty again after RUN_JQ, CALL_JQ, or CLEAR_FILTERS.)

Controlling jq Execution

Note: piping into RUN_JQ or CALL_JQ, or invoking them in a subshell or shell substitution, will not reset the current filter pipeline. To capture jq's output, use CALL_JQ instead of shell substitution. To pipe input into jq, pass it as an argument after -- to RUN_JQ or CALL_JQ, e.g.:

    $ echo '"something"' | RUN_JQ .       # WRONG: CLEAR_FILTERS won't run
    $ RUN_JQ . -- <(echo '"something"')   # RIGHT: use process substitution instead of piping

    $ foo bar "$(RUN_JQ)"        # WRONG: CLEAR_FILTERS won't run
    $ CALL_JQ; foo bar "$REPLY"  # RIGHT
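The underlying reason is a general bash rule: variable changes made inside a pipeline element or command substitution happen in a subshell and are discarded. The following sketch mimics the shape of the problem with hypothetical stand-ins (`pipeline`, `sketch_run`) rather than jqmd's real state:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins that mimic the shape of the problem, not jqmd's internals.
pipeline='.foo'

sketch_run() {
    REPLY="ran: $pipeline"
    echo "$REPLY"
    pipeline=''          # reset, as RUN_JQ is described as doing
}

out=$(sketch_run)        # runs in a subshell: the reset is lost
after_subst=$pipeline
echo "after substitution: '$after_subst'"   # after substitution: '.foo'

sketch_run >/dev/null    # runs in the current shell: the reset sticks
echo "after direct call: '$pipeline'"       # after direct call: ''
```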

Command-line Arguments

You can pass additional arguments to jqmd, after the path to the markdown file. These additional arguments are available as $1, $2, etc. within any top-level shell code in the markdown file.

Supporting Additional Languages

By default, jqmd only interprets unindented, triple-backquoted markdown blocks tagged as shell, jq, jq defs, jq imports, yaml, yml, or json. Unindented triple-backquoted blocks with any other tags are interpreted as data and assigned to shell variables, as described in the mdsh docs on data blocks.

As with mdsh, however, you can define interpreters for other block types by defining mdsh-lang-X or mdsh-compile-X functions in shell @mdsh blocks, via a wrapper script, or as exported functions in your bash environment. (You can also override these functions to change jqmd's default interpretation of jq, YAML, or JSON blocks.)
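As a rough sketch of what such a hook might look like: the example below assumes that an mdsh-lang-X function receives the block's contents on standard input (check the mdsh docs before relying on this), and `upper` is a made-up language tag used only for illustration. In a real document, the function definition would live in a shell @mdsh block:

```shell
#!/usr/bin/env bash
# Assumption: the hook receives the block's contents on standard input.
# "upper" is a made-up language tag used only for this illustration.
mdsh-lang-upper() {
    tr '[:lower:]' '[:upper:]'
}

# Simulate feeding the body of an `upper` block to the hook:
printf 'hello from a custom block\n' | mdsh-lang-upper   # HELLO FROM A CUSTOM BLOCK
```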

For more information on how to do this, see the mdsh docs on processing non-shell languages, or consult the mdsh docs in general for more info on what you can do with jqmd.