Mazeppa

Supercompilation [1] is a program transformation technique that symbolically evaluates a given program, with run-time values as unknowns. In doing so, it discovers execution patterns of the original program and synthesizes them into standalone functions; the result of supercompilation is a more efficient residual program. In terms of transformational power, supercompilation subsumes both deforestation [2] and partial evaluation [3], and even exhibits certain capabilities of theorem proving.

Mazeppa is a modern supercompiler intended to be a compilation target for call-by-value functional languages. Having diligently compared and revised prior supercompilers, Mazeppa

  1. Provides the full set of primitive data types for efficient computation.
  2. Supports manual control of function unfolding.
  3. Is fully transparent about what decisions it makes during transformation.
  4. Is designed with efficiency in mind from the very beginning.

Installation

First, install the OCaml system on your machine:

$ bash -c "sh <(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh)"
$ opam init --auto-setup

Then clone the repository and install Mazeppa:

$ git clone https://github.com/mazeppa-dev/mazeppa.git
$ cd mazeppa
$ ./scripts/install.sh

Type mazeppa --help to confirm the installation.

<details> <summary>Building with Flambda</summary>

Flambda is a powerful program inliner and specializer for OCaml. If you build Mazeppa with a Flambda-enabled OCaml compiler, you may see much better performance. To set it up:

$ opam switch create 5.2.0+flambda ocaml-variants.5.2.0+options ocaml-option-flambda
$ eval $(opam env --switch=5.2.0+flambda)

(You may use a different version instead of 5.2.0 if you wish.)

To check if Flambda was successfully enabled, run:

$ ocamlopt -config | grep flambda
</details>

Hacking

You can play with Mazeppa without actually installing it. Having OCaml installed and the repository cloned (as above), run the following command from the root directory:

$ ./scripts/play.sh

(Graphviz is required: sudo apt install graphviz.)

This will launch Mazeppa with --inspect on playground/main.mz and visualize the process graph in target/graph.svg. The latter can be viewed in VS Code with the Svg Preview extension.

./scripts/play.sh will automatically recompile the OCaml sources if anything has changed.

Supercompilation by example

The best way to understand how supercompilation works is by example. Consider the following function that takes a list and computes the sum of its squared elements:

[examples/sum-squares/main.mz]

main(xs) := sum(mapSq(xs));

sum(xs) := match xs {
    Nil() -> 0i32,
    Cons(x, xs) -> +(x, sum(xs))
};

mapSq(xs) := match xs {
    Nil() -> Nil(),
    Cons(x, xs) -> Cons(*(x, x), mapSq(xs))
};

This program is written in the idiomatic, listful functional style. Every function does only one thing, and does it well. However, there is a serious problem here: mapSq essentially constructs a list that will be immediately deconstructed by sum, meaning that 1) we need to allocate extra memory for the intermediate list, and 2) we need two passes of computation instead of one. The solution to this problem is called deforestation [2], which is a special case of supercompilation.

Let us see what Mazeppa does with this program:

$ mkdir sum-squares
$ cd sum-squares
# Copy-paste the program above.
$ nano main.mz
$ mazeppa run --inspect

The --inspect flag tells Mazeppa to give a detailed report on the transformation process. The sum-squares/target/ directory will contain the following files:

target
├── graph.dot
├── nodes.json
├── output.mz
└── program.json

output.mz will contain the following code:

[examples/sum-squares/target/output.mz]

main(xs) := f0(xs);

f0(x0) := match x0 {
    Cons(x1, x2) -> +(*(x1, x1), f0(x2)),
    Nil() -> 0i32
};

The supercompiler has successfully merged sum and mapSq into a single function, f0! Unlike the original program, f0 works in a single pass, without having to allocate any extra memory.

How did the supercompiler get to this point? Let us see the generated process graph:

<div align="center"> <img src="media/sum-squares.svg" width="500px" /> </div>

For reference, nodes.json contains the following data in JSON:

[
  [ "n0", "main(xs)" ],
  [ "n1", "sum(mapSq(xs))" ],
  [ "n2", "sum(.g1(xs))" ],
  [ "n3", "xs" ],
  [ "n4", "sum(Cons(*(.v0, .v0), mapSq(.v1)))" ],
  [ "n5", ".g0(Cons(*(.v0, .v0), mapSq(.v1)))" ],
  [ "n6", "+(*(.v0, .v0), sum(mapSq(.v1)))" ],
  [ "n7", "+(*(.v0, .v0), sum(.g1(.v1)))" ],
  [ "n8", "*(.v0, .v0)" ],
  [ "n9", ".v0" ],
  [ "n10", ".v0" ],
  [ "n11", "sum(.g1(.v1))" ],
  [ "n12", ".v1" ],
  [ "n13", "+(.v3, .v4)" ],
  [ "n14", ".v3" ],
  [ "n15", ".v4" ],
  [ "n16", "sum(Nil())" ],
  [ "n17", ".g0(Nil())" ],
  [ "n18", "0i32" ]
]

(We will not need to inspect program.json for this tiny example, but feel free to look at it: it is not too complicated.)

The supercompiler starts with node n0 containing main(xs). After two steps of function unfolding, we reach node n2 containing sum(.g1(xs)), where .g1 is the IR function that corresponds to our mapSq. At this point, we have no choice but to analyze the call .g1(xs) by conjecturing what values xs might take at run-time. Since our mapSq only accepts the constructors Nil and Cons, it is sufficient to consider only the cases xs=Cons(.v0, .v1) and xs=Nil().

Node n4 is what happens after we substitute Cons(.v0, .v1) for xs, where .v0 and .v1 are fresh variables. After three more function unfoldings, we arrive at n7. This is the first time we have to split the call +(*(.v0, .v0), sum(.g1(.v1))) into .v3=*(.v0, .v0) (n8) and .v4=sum(.g1(.v1)) (n11) and proceed supercompiling +(.v3, .v4) (n13); the reason for doing so is that a previous node (n2) is structurally embedded in n7, so supercompilation might otherwise continue forever. Now, what happens with sum(.g1(.v1)) (n11)? We have seen it earlier! Recall that n2 contains sum(.g1(xs)), which is just a renaming of sum(.g1(.v1)); so we decide to fold n11 into n2, because it makes no sense to supercompile the same thing twice. The other branches of n7, namely n13 and n8, are decomposed, meaning that we simply proceed transforming their arguments.

As for the other branch of n2, sum(Nil()) (n16), it is enough to merely unfold this call to 0i32 (n18).

After the process graph is completed, residualization converts it to a final program. During this stage, dynamic execution patterns become functions -- node n2 now becomes the function f0, inasmuch as some other node (n11) points to it. In any residual program, there will be exactly as many functions as there are nodes with incoming dashed lines, plus main. (In our example, n2 is the only such node, so the residual program consists of f0 and main.)

In summary, supercompilation consists of 1) unfolding function definitions, 2) analyzing calls that pattern-match an unknown variable, 3) breaking down computation into smaller parts, 4) folding repeated computations, and 5) decomposing calls that cannot be unfolded, such as +(.v3, .v4) (n13) in our example. The whole supercompilation process is guaranteed to terminate: whenever some computation grows dangerously large, we break it down into subproblems and solve them in isolation.

There are plenty of other interesting examples of deforestation in the examples/ folder, including tree-like data structures. In fact, we have reimplemented all samples from the previous work on higher-order call-by-value supercompilation by Peter A. Jonsson and Johan Nordlander [4] [5]; in all cases, Mazeppa has performed similarly or better.

Specializing the power function

Now consider another example, this time involving partial evaluation:

[examples/power-sq/main.mz]

main(a) := powerSq(a, 7u8);

powerSq(a, x) := match =(x, 0u8) {
    T() -> 1i32,
    F() -> match =(%(x, 2u8), 0u8) {
        T() -> square(powerSq(a, /(x, 2u8))),
        F() -> *(a, powerSq(a, -(x, 1u8)))
    }
};

square(a) := *(a, a);

powerSq implements the famous exponentiation-by-squaring algorithm. The original program is inefficient: it recursively examines the x parameter of powerSq, although it is perfectly known at compile-time. Running Mazeppa on main(a) will yield the following residual program:

[examples/power-sq/target/output.mz]

main(a) := *(a, let x0 := *(a, *(a, a)); *(x0, x0));

The whole powerSq function has been eliminated, thus achieving the effect of partial evaluation. (If we consider powerSq to be an interpreter for a program x and input data a, then this is the first Futamura projection: specializing an interpreter to obtain an efficient target program.) Also, notice how the supercompiler has shared the computation *(a, *(a, a)), which is used twice, so that it is not recomputed anew. The residual program indeed reflects exponentiation by squaring.
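
To see that the residual term indeed computes a^7: x0 binds *(a, *(a, a)) = a^3, *(x0, x0) is then a^6, and the outer multiplication yields a^7. Below is a quick sanity check (a hypothetical snippet of ours, not part of examples/) that instantiates a = 2i32; running it with mazeppa eval should print 128i32, since 2^7 = 128:

// Hypothetical test: the residual term instantiated at a = 2i32.
main() := *(2i32, let x0 := *(2i32, *(2i32, 2i32)); *(x0, x0));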

Synthesizing the KMP algorithm

Let us go beyond deforestation and partial evaluation. Consider a function matches(p, s) of two strings, which returns T() if s contains p and F() otherwise. The naive implementation in Mazeppa would be the following, where p is specialized to "aab":

[examples/kmp-test/main.mz]

main(s) := matches(Cons('a', Cons('a', Cons('b', Nil()))), s);

matches(p, s) := go(p, s, p, s);

go(pp, ss, op, os) := match pp {
    Nil() -> T(),
    Cons(p, pp) -> goFirst(p, pp, ss, op, os)
};

goFirst(p, pp, ss, op, os) := match ss {
    Nil() -> F(),
    Cons(s, ss) -> match =(p, s) {
        T() -> go(pp, ss, op, os),
        F() -> failover(op, os)
    }
};

failover(op, os) := match os {
    Nil() -> F(),
    Cons(_s, ss) -> matches(op, ss)
};

(Here we represent strings as lists of characters for simplicity, but do not worry, Mazeppa provides built-in strings as well.)

The algorithm is correct but inefficient. Consider what happens when "aa" is successfully matched, but 'b' is not. The algorithm will start matching "aab" once again from the second character of s, although it can already be said that the second character of s is 'a'. The deterministic finite automaton built by the Knuth-Morris-Pratt (KMP) algorithm [6] is an alternative way to solve this problem.

By running Mazeppa on the above sample, we can obtain an efficient string matching algorithm for p="aab" that reflects KMP in action:

[examples/kmp-test/target/output.mz]

main(s) := f0(s);

f0(x0) := match x0 {
    Cons(x1, x2) -> match =(97u8, x1) {
        F() -> f1(x2),
        T() -> f2(x2)
    },
    Nil() -> F()
};

f1(x0) := f0(x0);

f2(x0) := match x0 {
    Cons(x1, x2) -> match =(97u8, x1) {
        F() -> f1(x2),
        T() -> f4(x2)
    },
    Nil() -> F()
};

f3(x0) := f2(x0);

f4(x0) := match x0 {
    Cons(x1, x2) -> match =(98u8, x1) {
        F() -> match =(97u8, x1) {
            F() -> f1(x2),
            T() -> f4(x2)
        },
        T() -> T()
    },
    Nil() -> F()
};

The naive algorithm that we wrote has been automatically transformed into a well-known efficient version! While the naive algorithm has complexity O(|p| * |s|), the specialized one is O(|s|).

Synthesizing KMP is a standard example that showcases the power of supercompilation with respect to other techniques (e.g., see [7] and [8]). Obtaining KMP by partial evaluation is not possible without changing the original definition of matches [9] [10].

Metasystem transition

Valentin Turchin, the inventor of supercompilation, describes the concept of metasystem transition in the following way [11] [12] [13]:

Consider a system S of any kind. Suppose that there is a way to make some number of copies from it, possibly with variations. Suppose that these systems are united into a new system S' which has the systems of the S type as its subsystems, and includes also an additional mechanism which controls the behavior and production of the S-subsystems. Then we call S' a metasystem with respect to S, and the creation of S' a metasystem transition. As a result of consecutive metasystem transitions a multilevel structure of control arises, which allows complicated forms of behavior.

Thus, supercompilation can be readily seen as a metasystem transition: there is an object program in Mazeppa, and there is the Mazeppa supercompiler which controls and supervises execution of the object program. However, we can go further and perform any number of metasystem transitions within the realm of the object program itself, as the next example demonstrates.

We will be using the code from examples/lambda-calculus/. Below is a standard normalization-by-evaluation procedure for obtaining beta normal forms of untyped lambda calculus terms:

normalize(lvl, env, t) := quote(lvl, eval(env, t));

normalizeAt(lvl, env, t) := normalize(+(lvl, 1u64), Cons(vvar(lvl), env), t);

vvar(lvl) := Neutral(NVar(lvl));

eval(env, t) := match t {
    Var(idx) -> indexEnv(env, idx),
    Lam(body) -> Closure(env, body),
    Appl(m, n) ->
        let mVal := eval(env, m);
        let nVal := eval(env, n);
        match mVal {
            Closure(env, body) -> eval(Cons(nVal, env), body),
            Neutral(nt) -> Neutral(NAppl(nt, nVal))
        }
};

quote(lvl, v) := match v {
    Closure(env, body) -> Lam(normalizeAt(lvl, env, body)),
    Neutral(nt) -> quoteNeutral(lvl, nt)
};

quoteNeutral(lvl, nt) := match nt {
    NVar(var) -> Var(-(-(lvl, var), 1u64)),
    NAppl(nt, nVal) -> Appl(quoteNeutral(lvl, nt), quote(lvl, nVal))
};

indexEnv(env, idx) := match env {
    Nil() -> Panic(++("the variable is unbound: ", string(idx))),
    Cons(value, xs) -> match =(idx, 0u64) {
        T() -> value,
        F() -> indexEnv(xs, -(idx, 1u64))
    }
};

(eval/quote are sometimes called reflect/reify.)

This is essentially a big-step machine for efficient capture-avoiding substitution: instead of reconstructing terms on each beta reduction, we maintain an environment of values. eval projects a term to the "semantic domain", while quote does the opposite; normalize is simply the composition of quote and eval. To avoid bothering with fresh name generation, we put De Bruijn indices in the Var constructor and De Bruijn levels in NVar; the latter is converted into the former in quoteNeutral.
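
For instance (an illustrative definition of ours, not from the example), the K combinator λx. λy. x is encoded with a De Bruijn index that points one binder up:

// K = λx. λy. x; Var(1u64) refers to the outer binder x.
k() := Lam(Lam(Var(1u64)));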

Now let us compute something with this machine:

main() := normalize(0u64, Nil(), example());

example() := Appl(Appl(mul(), two()), three());

two() := Lam(Lam(Appl(Var(1u64), Appl(Var(1u64), Var(0u64)))));

three() := Lam(Lam(Appl(Var(1u64), Appl(Var(1u64), Appl(Var(1u64),
    Var(0u64))))));

mul() := Lam(Lam(Lam(Lam(Appl(
    Appl(Var(3u64), Appl(Var(2u64), Var(1u64))),
    Var(0u64))))));

The body of main computes the normal form of the lambda term example() that multiplies Church numerals two() and three().

By supercompiling main(), we obtain the Church numeral of 6:

[examples/lambda-calculus/target/output.mz]

main() := Lam(Lam(Appl(Var(1u64), Appl(Var(1u64), Appl(Var(1u64), Appl(Var(1u64)
    , Appl(Var(1u64), Appl(Var(1u64), Var(0u64)))))))));

The lambda calculus interpreter has been completely annihilated!

In this example, we have just seen a two-level metasystem stairway (in Turchin's terminology [14]): on level 0, we have the Mazeppa supercompiler transforming the object program, while on level 1, we have the object program normalizing lambda calculus terms. There can be an arbitrary number of interpretation levels, and Mazeppa can be used to collapse them all. This general behaviour of supercompilation was explored by Turchin himself in [1] (section 7), where he was able to supercompile two interpretable programs, one Fortran-like and one in Lisp, to obtain a speedup factor of 40 in both cases.

The lambda normalizer also shows us how to incarnate higher-order functions into a first-order language. In Mazeppa, we cannot treat functions as values, but it does not mean that we cannot simulate them! By performing a metasystem transition, we can efficiently implement higher-order functions in a first-order language. Along with defunctionalization and closure conversion, this technique can be used for compilation of higher-order languages into efficient first-order code.

Related examples: imperative virtual machine, self-interpreter.

Restricting supercompilation

In retrospect, the major problem that has prevented widespread adoption of supercompilation is its unpredictability -- the dark side of its power. To get a sense of what this means, consider how we can solve any SAT problem "on the fly":

[examples/sat-solver/main.mz]

main(a, b, c, d, e, f, g) := solve(formula(a, b, c, d, e, f, g));

formula(a, b, c, d, e, f, g) :=
    and(or(Var(a), or(Not(b), or(Not(c), F()))),
    and(or(Not(f), or(Var(e), or(Not(g), F()))),
    and(or(Var(e), or(Not(g), or(Var(f), F()))),
    and(or(Not(g), or(Var(c), or(Var(d), F()))),
    and(or(Var(a), or(Not(b), or(Not(c), F()))),
    and(or(Not(f), or(Not(e), or(Var(g), F()))),
    and(or(Var(a), or(Var(a), or(Var(c), F()))),
    and(or(Not(g), or(Not(d), or(Not(b), F()))),
    T()))))))));

or(x, rest) := match x {
    Var(x) -> If(x, T(), rest),
    Not(x) -> If(x, rest, T())
};

and(clause, rest) := match clause {
    If(x, m, n) -> If(x, and(m, rest), and(n, rest)),
    T() -> rest,
    F() -> F()
};

solve(formula) := match formula {
    If(x, m, n) -> analyze(x, m, n),
    T() -> T(),
    F() -> F()
};

analyze(x, m, n) := match x {
    T() -> solve(m),
    F() -> solve(n)
};

There are two things wrong with this perfectly correct code: 1) the supercompiler will expand the formula in exponential space, and 2) the supercompiler will try to solve the expanded formula in exponential time. Sometimes, we just do not want to evaluate everything at compile-time.

However, despair not: we provide a solution for this problem. Let us first consider how to postpone solving the formula until run-time. It turns out that the only thing we need to do is to annotate the function formula with @extract as follows:

@extract
formula(a, b, c, d, e, f, g) :=
    // Everything is the same.

When Mazeppa sees solve(formula(a, b, c, d, e, f, g)), it extracts the call to formula into a fresh variable .v0 and proceeds supercompiling the extracted call and solve(.v0) in isolation. The latter call will just reproduce the original SAT solver.

But supercompiling the call to formula will still result in an exponential blowup. Let us examine why this happens. Our original formula consists of calls to or and and; while or is obviously not dangerous, the function and propagates the rest parameter to both branches of If (the first match case) -- this is the exact place where the blowup occurs. So let us mark and with @extract as well:

@extract
and(clause, rest) := match clause {
    // Everything is the same.
};

That is it! When and is to be transformed, Mazeppa will extract the call out of its surrounding context and supercompile it in isolation. By adding two annotations at appropriate places, we have solved both the problem of code blowup and exponential running time of supercompilation. In general, whenever Mazeppa sees ctx[f(t1, ..., tN)], where f is marked @extract and ctx[.] is a non-empty surrounding context with . in a redex position, it will plug a fresh variable v into ctx and proceed transforming the following nodes separately: f(t1, ..., tN) and ctx[v].
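
Schematically, the extraction step on our example looks like this (node contents shown in Mazeppa syntax; .v0 is a fresh variable):

// Before extraction: the annotated call sits in a redex position.
solve(formula(a, b, c, d, e, f, g))
// After extraction: two nodes are transformed in isolation.
formula(a, b, c, d, e, f, g)    // the extracted call
solve(.v0)                      // the context with .v0 plugged in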

Finally, note that @extract is only a low-level mechanism; a compiler front-end must provide additional machinery to tell Mazeppa which functions to extract. This can be done in two ways:

Both methods can be combined to achieve a desired effect.

Lazy evaluation

Mazeppa employs an interesting design choice: functions are eager, but constructors are lazy. The following example, where magic(1u32, 1u32) generates Fibonacci numbers, was adapted from Haskell:

[examples/lazy-fibonacci/main.mz]

main() := getIt(magic(1u32, 1u32), 3u64);

magic(m, n) := match =(m, 0u32) {
    T() -> Nil(),
    F() -> Cons(m, magic(n, +(m, n)))
};

getIt(xs, n) := match xs {
    Nil() -> Panic("undefined"),
    Cons(x, xs) -> match =(n, 1u64) {
        T() -> x,
        F() -> getIt(xs, -(n, 1u64))
    }
};

If constructors were eager, magic(1u32, 1u32) would never terminate. However, Cons does not evaluate its arguments! Since getIt only consumes a finite portion of the infinite list, the program terminates and prints 2u32:

$ mazeppa eval
2u32

Lazy constructors enable effortless deforestation, as discussed below.

Mazeppa as a library

After running ./scripts/install.sh, Mazeppa is available as an OCaml library!

Set up a new Dune project as follows:

$ dune init project my_compiler

Add mazeppa as a third-party library into your bin/dune:

(executable
 (public_name my_compiler)
 (name main)
 (libraries my_compiler mazeppa))

Paste the following code into bin/main.ml (this is examples/sum-squares/main.mz encoded in OCaml):

open Mazeppa

let input : Raw_program.t =
    let sym = Symbol.of_string in
    let open Raw_term in
    let open Checked_oint in
    [ [], sym "main", [ sym "xs" ], call ("sum", [ call ("mapSq", [ var "xs" ]) ])
    ; ( []
      , sym "sum"
      , [ sym "xs" ]
      , Match
          ( var "xs"
          , [ (sym "Nil", []), int (I32 (I32.of_int_exn 0))
            ; ( (sym "Cons", [ sym "x"; sym "xs" ])
              , call ("+", [ var "x"; call ("sum", [ var "xs" ]) ]) )
            ] ) )
    ; ( []
      , sym "mapSq"
      , [ sym "xs" ]
      , Match
          ( var "xs"
          , [ (sym "Nil", []), call ("Nil", [])
            ; ( (sym "Cons", [ sym "x"; sym "xs" ])
              , call
                  ( "Cons"
                  , [ call ("*", [ var "x"; var "x" ]); call ("mapSq", [ var "xs" ]) ] ) )
            ] ) )
    ]
;;

let () =
    try
      let output = Mazeppa.supercompile input in
      Printf.printf "%s\n" (Raw_program.show output)
    with
    | Mazeppa.Panic msg ->
      Printf.eprintf "Something went wrong: %s\n" msg;
      exit 1
;;

Run dune exec my_compiler to see the desired residual program:

[([], "main", ["xs"], (Raw_term.Call ("f0", [(Raw_term.Var "xs")])));
  ([], "f0", ["x0"],
   (Raw_term.Match ((Raw_term.Var "x0"),
      [(("Cons", ["x1"; "x2"]),
        (Raw_term.Call ("+",
           [(Raw_term.Call ("*", [(Raw_term.Var "x1"); (Raw_term.Var "x1")]));
             (Raw_term.Call ("f0", [(Raw_term.Var "x2")]))]
           )));
        (("Nil", []), (Raw_term.Const (Const.Int (Checked_oint.I32 0))))]
      )))
  ]

You can call Mazeppa as many times as you want, including in parallel. Note that we expose a limited interface to the supercompiler; in particular, there is no way to inspect what it does in the process (i.e., --inspect).

Besides supercompilation, we also provide a built-in evaluator:

val eval : Raw_program.t -> Raw_term.t

It can only be called on programs whose main functions do not accept parameters. Unlike supercompile, it produces an evaluated term of type Raw_term.t and can possibly diverge.
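
For example, here is a program that eval accepts (a tiny snippet of ours; its main takes no parameters and reduces to 5i32):

main() := +(2i32, 3i32);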

See other API functions and their documentation in lib/mazeppa.mli.

Translation to C

Suppose that main.mz contains a slightly modified version of the lazy Fibonacci example:

main(n) := getIt(magic(1u32, 1u32), n);

magic(m, n) := match =(m, 0u32) {
    T() -> Nil(),
    F() -> Cons(m, magic(n, +(m, n)))
};

getIt(xs, n) := match xs {
    Nil() -> Panic("undefined"),
    Cons(x, xs) -> match =(n, 1u64) {
        T() -> x,
        F() -> getIt(xs, -(n, 1u64))
    }
};

The following command translates it to C11 with GNU extensions (i.e., -std=gnu11):

$ cat main.mz | mazeppa translate --language C --entry fib
<details> <summary>Show the output</summary>
#include "mazeppa.h"

MZ_ENUM_USER_TAGS(op_Cons, op_Nil);

static mz_Value op_main(mz_ArgsPtr args);

static mz_Value op_magic(mz_ArgsPtr args);

static mz_Value op_getIt(mz_ArgsPtr args);

static mz_Value thunk_0(mz_EnvPtr env) {
    mz_Value var_m = (env)[0];
    return var_m;
}

static mz_Value thunk_1(mz_EnvPtr env) {
    mz_Value var_m = (env)[0];
    mz_Value var_n = (env)[1];
    return ({
    struct mz_value args[2];
    (args)[0] = var_n;
    (args)[1] = MZ_OP2(var_m, add, var_n);
    op_magic(args);
    });
}

static mz_Value op_main(mz_ArgsPtr args) {
    mz_Value var_n = (args)[0];
    return ({
    struct mz_value args[2];
    (args)[0] = ({
    struct mz_value args[2];
    (args)[0] = MZ_INT(U, 32, UINT32_C(1));
    (args)[1] = MZ_INT(U, 32, UINT32_C(1));
    op_magic(args);
    });
    (args)[1] = var_n;
    op_getIt(args);
    });
}

static mz_Value op_magic(mz_ArgsPtr args) {
    mz_Value var_m = (args)[0];
    mz_Value var_n = (args)[1];
    return ({
    struct mz_value tmp = MZ_OP2(var_m, equal, MZ_INT(U, 32, UINT32_C(0)));
    switch((tmp).tag) {
    case op_T: {
    tmp = MZ_EMPTY_DATA(op_Nil);
    break;
    }
    case op_F: {
    tmp = MZ_DATA(op_Cons, 2, MZ_THUNK(thunk_0, 1, var_m), MZ_THUNK(thunk_1, 2, var_m, var_n));
    break;
    }
    default: MZ_UNEXPECTED_TAG((tmp).tag);
    }
    tmp;
    });
}

static mz_Value op_getIt(mz_ArgsPtr args) {
    mz_Value var_xs = (args)[0];
    mz_Value var_n = (args)[1];
    return ({
    struct mz_value tmp = var_xs;
    switch((tmp).tag) {
    case op_Nil: {
    tmp = mz_panic(MZ_STRING("undefined"));
    break;
    }
    case op_Cons: {
    mz_Value var_x = ((tmp).payload)[0];
    mz_Value var_xs$ = ((tmp).payload)[1];
    tmp = ({
    struct mz_value tmp = MZ_OP2(var_n, equal, MZ_INT(U, 64, UINT64_C(1)));
    switch((tmp).tag) {
    case op_T: {
    tmp = mz_force(var_x);
    break;
    }
    case op_F: {
    tmp = ({
    struct mz_value args[2];
    (args)[0] = mz_force(var_xs$);
    (args)[1] = MZ_OP2(var_n, sub, MZ_INT(U, 64, UINT64_C(1)));
    op_getIt(args);
    });
    break;
    }
    default: MZ_UNEXPECTED_TAG((tmp).tag);
    }
    tmp;
    });
    break;
    }
    default: MZ_UNEXPECTED_TAG((tmp).tag);
    }
    tmp;
    });
}

extern mz_Value fib(mz_Value var_n) {
    return MZ_CALL_MAIN(var_n);
}
</details>

The translate command requires both --language, which is the target language for translation, and --entry, which is the name of an external symbol that will correspond to your main function. The input Mazeppa program comes from stdin (cat main.mz in our example); the output GNU11 program is written to stdout.

Let us advance further and compile the output program to an object file. First, copy c/deps/sds.c, c/deps/sds.h, and c/deps/sdsalloc.h to your current directory; second, install Boehm GC on your computer:

$ sudo apt install libgc-dev -y

then execute the following command:

$ cat main.mz \
    | mazeppa translate --language C --entry fib --dump-header-to . \
    | gcc -c -o program.o -std=gnu11 -xc -

The --dump-header-to option writes the content of mazeppa.h to a specified location; this is needed for the output program to compile. The gcc command accepts the output program from stdin and produces program.o.

Now what is left is to actually invoke the generated fib function. Create main.c with the following content:

#include "mazeppa.h"

mz_Value fib(mz_Value n);

int main(void) {
    // Always initialize Boehm GC before invoking Mazeppa code.
    GC_INIT();
    mz_Value v = fib(MZ_INT(U, 64, 10));
    printf("fib(10) = %" PRIu32 "\n", MZ_GET(U32, v));
}

This "driver" program just invokes fib with a Mazeppa integer (MZ_INT) and prints the result. You can use any functionality from mazeppa.h, provided that it is not prefixed with mz_priv_ or MZ_PRIV_.

To bring all the pieces together:

$ gcc main.c program.o sds.c -lgc -std=gnu11

./a.out prints fib(10) = 55 and exits, as expected.

Usage considerations

Technical decisions

Mazeppa employs several interesting design choices (ranked by importance):

While most of the above is not particularly novel, we believe that the combination of these features makes Mazeppa a more practical alternative to its predecessors.

Further research

A note about built-in panics

Some built-in operations raise a panic on certain conditions. Consider the following program:

main(x) := ignore(+(x, 100u8));

ignore(_x) := 5u8;

It will be supercompiled as follows:

main(x) := 5u8;

These two programs are not semantically equivalent! Whereas the former one raises a panic if x is 200u8, the latter does not. In general, Mazeppa can remove some built-in panics from the original program, or make them happen at some later point.

Is this an appropriate behaviour?

We believe it is. Take the semantics of Rust as an example: when code is compiled in debug mode, integer overflow/underflow raises a panic, but in release mode, it does not (modular arithmetic is used instead). The reason is that checked arithmetic introduces an overhead, so application testing is expected to detect those panics. The Rust compiler essentially changes the semantics of code depending on the compilation mode; as a result, some release programs may execute longer and terminate more often.

In practice, this scheme usually works well. While it is theoretically possible to preserve even built-in panics at the cost of additional analyses, we have decided to keep the current behaviour.

Language definition

Lexical structure

A symbol <SYMBOL> is a sequence of letters (a, ..., z and A, ..., Z) and digits (0, ..., 9), followed by an optional question mark (?), followed by an optional sequence of ' characters. The underscore character (_) may be the first character of a symbol, which may informally indicate that the value or function being defined is not used; otherwise, the first character must be a letter. The following sequences of characters are also permitted as symbols: ~, #, +, -, *, /, %, |, &, ^, <<, >>, =, !=, >, >=, <, <=, ++. The following are reserved words that may not be used as symbols: match, let.
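
A few illustrative symbols derived from the rule above (our own examples):

// Ordinary symbols: foo, fooBar42, empty?, go', _unused
// Operator symbols: +, *, <<, >=, !=, ++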

There are four classes of unsigned integer constants:

Notes:

A string constant <STRING> is a sequence, between double quotes ("), of zero or more printable characters (we refer to printable characters as those numbered 33-126 in the ASCII character set), spaces, or string escape sequences:

| Escape sequence | Meaning |
|-----------------|---------|
| \f   | Form feed (ASCII 12) |
| \n   | Line feed (ASCII 10) |
| \r   | Carriage return (ASCII 13) |
| \t   | Horizontal tab (ASCII 9) |
| \v   | Vertical tab (ASCII 11) |
| \xhh | ASCII code in hexadecimal |
| \"   | " |
| \\   | \ |

where h is either 0, ..., 9 or a, ..., f or A, ..., F.

A character constant <CHAR> is either a sole character enclosed in single quotes (') or a character escape sequence enclosed in single quotes. The character escape sequence is the same as for strings, except that \" is replaced by \'.
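
A few illustrative constants (our own examples):

"escapes: \t \n \x41 \" \\"    // \x41 denotes 'A' (ASCII 65)
'a'    '\n'    '\''    '\x41'  // character constants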

There are no other constants in Mazeppa.

A comment <COMMENT> is any sequence of characters after //, which is terminated by a newline character. (We only allow single-line comments for simplicity.)

Surface syntax

The entry point <program> is defined by the following rules:

where <def-attr-list> is a whitespace-separated sequence of function attributes (the same attribute can occur multiple times). Right now, the only allowed function attribute is @extract.

<term> is defined as follows:

The rest of the auxiliary rules are:

<const>:

<match-case>:

<pattern>:

In Mazeppa, primitive operations employ the same syntax as that of ordinary function calls. To distinguish between the two, we define <op1> and <op2> to be the following sets of symbols:

Furthermore, <op2> has the following subclasses:

Restrictions

If a program, function, or term conforms to these restrictions, we call it well-formed.

Desugaring

| Original form | Desugared form     | Notes                          |
|---------------|--------------------|--------------------------------|
| // ... rest   | rest               | rest is in <program> or <term> |
| let p := t; u | match t { p -> u } | p is in <pattern>              |
| c             | ASCII(c)           | c is in <CHAR>                 |

where ASCII(c) is an appropriate u8 integer constant, according to the ASCII table; for example, ASCII('a') is 97u8.
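
For instance, here is the let rule in action (a sketch of ours; someList is a hypothetical function returning a list):

// The let form...
let Cons(x, _xs) := someList(); x
// ...desugars to:
match someList() { Cons(x, _xs) -> x }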

Evaluation

Suppose that t is a well-formed term closed under environment env (defined below) and program is a well-formed program. Then the evaluation of t is governed by the following big-step environment machine:

Notation:

(Note that eval is a partial function, so evaluation of t can "get stuck" without a superimposed type system.)

In what follows, 1) signed integers are represented in two's complement notation, 2) panic denotes Panic(s), where s is some (possibly varying) implementation-defined string constant.

evalOp1 takes care of the unary operators for primitive types (x is in <INT>, s is in <STRING>):

Likewise, evalOp2 takes care of the binary operators for primitive types:

The definition of eval is now complete.

Release procedure

  1. Update the version field in dune-project and bin/main.ml.
  2. Type dune build to generate mazeppa.opam.
  3. Update CHANGELOG.md.
  4. Release the project in GitHub Releases.

FAQ

Is Mazeppa production-ready?

Not yet: we need to battle-test Mazeppa on some actual programming language. Our long-term goal is to find suitable heuristics to profitably supercompile any source file under 10'000 LoC (in Mazeppa).

How can I execute programs in Mazeppa?

For debugging and other purposes, we provide a built-in definitional interpreter that can execute Mazeppa programs. You can launch it by typing mazeppa eval (make sure that your main function does not accept parameters). For the purpose of real execution, we recommend translating Mazeppa to C and then compiling C to machine code, as shown above.

How can I perform I/O in Mazeppa?

Since Mazeppa is a purely functional language, the only way to implement I/O is as in Haskell [23]: a pure program performs the computation, while a dirty runtime performs the side effects issued by the program. There are no plans to introduce direct I/O into Mazeppa: it would only make everything more complicated.

Will Mazeppa have a type system?

No, we do not think that a type system is necessary at this point. It is the responsibility of a front-end compiler to ensure that programs do not "go wrong".

Can I use Mazeppa for theorem proving?

The more predictable we make supercompilation, the less capable it is of theorem proving. For those interested in program analysis rather than optimization, we suggest looking into distillation [24].

Where can I learn more about supercompilation?

For the English audience, the following paper presents a decent introduction into supercompilation:

However, the following papers in Russian describe a supercompilation model that is closer to the majority of existing supercompilers, including Mazeppa:

Mazeppa itself is inspired by this excellent paper (in English):

Finally, the international META workshops are great collections of articles about supercompilation and adjacent fields:

Can supercompilation be even more powerful?

Several approaches can lead to superlinear speedup of non-esoteric programs by supercompilation:

None of the above is planned to be implemented in Mazeppa, because 1) we think that writing asymptotically good programs is the responsibility of the programmer, not the optimizer, and 2) predictability of supercompilation is of greater importance to us. However, for those who are interested in this topic, the references may be helpful.

How do I contribute?

Just fork the repository, work in your own branch, and submit a pull request. Prefer rebasing when introducing changes to keep the commit history as clean as possible.

References

Footnotes

  1. Valentin F. Turchin. 1986. The concept of a supercompiler. ACM Trans. Program. Lang. Syst. 8, 3 (July 1986), 292–325. https://doi.org/10.1145/5956.5957

  2. Philip Wadler. 1988. Deforestation: transforming programs to eliminate trees. Theor. Comput. Sci. 73, 2 (June 22, 1990), 231–248. https://doi.org/10.1016/0304-3975(90)90147-A

  3. Futamura, Y. (1983). Partial computation of programs. In: Goto, E., Furukawa, K., Nakajima, R., Nakata, I., Yonezawa, A. (eds) RIMS Symposia on Software Science and Engineering. Lecture Notes in Computer Science, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-11980-9_13

  4. Peter A. Jonsson and Johan Nordlander. 2009. Positive supercompilation for a higher order call-by-value language. SIGPLAN Not. 44, 1 (January 2009), 277–288. https://doi.org/10.1145/1594834.1480916

  5. Jonsson, Peter & Nordlander, Johan. (2010). Strengthening supercompilation for call-by-value languages.

  6. D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323–350 (1977).

  7. Glück, R., Klimov, A.V. (1993). Occam's razor in metacomputation: the notion of a perfect process tree. In: Cousot, P., Falaschi, M., Filé, G., Rauzy, A. (eds) Static Analysis. WSA 1993. Lecture Notes in Computer Science, vol 724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57264-3_34

  8. Sørensen MH, Glück R, Jones ND. A positive supercompiler. Journal of Functional Programming. 1996;6(6):811-838. doi:10.1017/S0956796800002008

  9. Consel, Charles, and Olivier Danvy. "Partial evaluation of pattern matching in strings." Information Processing Letters 30.2 (1989): 79-86.

  10. Jones, Neil & Gomard, Carsten & Sestoft, Peter. (1993). Partial Evaluation and Automatic Program Generation.

  11. Turchin, V.F. (1996). Metacomputation: Metasystem transitions plus supercompilation. In: Danvy, O., Glück, R., Thiemann, P. (eds) Partial Evaluation. Lecture Notes in Computer Science, vol 1110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61580-6_24

  12. Turchin, Valentin F. "Program transformation with metasystem transitions." Journal of Functional Programming 3.3 (1993): 283-313.

  13. Turchin, Valentin F.. “A dialogue on Metasystem transition.” World Futures 45 (1995): 5-57.

  14. Turchin, V., and A. Nemytykh. Metavariables: Their implementation and use in Program Transformation. CCNY Technical Report CSc TR-95-012, 1995.

  15. Maximilian Bolingbroke and Simon Peyton Jones. 2010. Supercompilation by evaluation. In Proceedings of the third ACM Haskell symposium on Haskell (Haskell '10). Association for Computing Machinery, New York, NY, USA, 135–146. https://doi.org/10.1145/1863523.1863540

  16. Friedman, Daniel P. and David S. Wise. “CONS Should Not Evaluate its Arguments.” International Colloquium on Automata, Languages and Programming (1976).

  17. Mitchell, Neil. “Rethinking supercompilation.” ACM SIGPLAN International Conference on Functional Programming (2010).

  18. Sørensen, M.H.B. (1998). Convergence of program transformers in the metric space of trees. In: Jeuring, J. (eds) Mathematics of Program Construction. MPC 1998. Lecture Notes in Computer Science, vol 1422. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054297

  19. Leuschel, Michael. "Homeomorphic embedding for online termination of symbolic methods." The essence of computation: complexity, analysis, transformation (2002): 379-403.

  20. Jonsson, Peter & Nordlander, Johan. (2011). Taming code explosion in supercompilation. PEPM'11 - Proceedings of the 20th ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation. 33-42. 10.1145/1929501.1929507.

  21. Tejiščák, Matúš. Erasure in dependently typed programming. Diss. University of St Andrews, 2020.

  22. Glück, Robert, Andrei Klimov, and Antonina Nepeivoda. "Nonlinear Configurations for Superlinear Speedup by Supercompilation." Fifth International Valentin Turchin Workshop on Metacomputation. 2016.

  23. Peyton Jones, Simon. (2002). Tackling the Awkward Squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell.

  24. G. W. Hamilton. 2006. Poitín: Distilling Theorems From Conjectures. Electron. Notes Theor. Comput. Sci. 151, 1 (March, 2006), 143–160. https://doi.org/10.1016/j.entcs.2005.11.028

  25. Klyuchnikov, Ilya, and Dimitur Krustev. "Supercompilation: Ideas and methods." The Monad. Reader Issue 23 (2014): 17.

  26. Klimov, Andrei & Romanenko, Sergei. (2018). Supercompilation: main principles and basic concepts. Keldysh Institute Preprints. 1-36. 10.20948/prepr-2018-111.

  27. Romanenko, Sergei. (2018). Supercompilation: homeomorphic embedding, call-by-name, partial evaluation. Keldysh Institute Preprints. 1-32. 10.20948/prepr-2018-209.

  28. Robert Glück and Morten Heine Sørensen. 1996. A Roadmap to Metacomputation by Supercompilation. In Selected Papers from the International Seminar on Partial Evaluation. Springer-Verlag, Berlin, Heidelberg, 137–160. https://dl.acm.org/doi/10.5555/647372.724040

  29. Secher, J.P. (2001). Driving in the Jungle. In: Danvy, O., Filinski, A. (eds) Programs as Data Objects. PADO 2001. Lecture Notes in Computer Science, vol 2053. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44978-7_12

  30. Hoffmann, B., Plump, D. (1988). Jungle evaluation for efficient term rewriting. In: Grabowski, J., Lescanne, P., Wechler, W. (eds) Algebraic and Logic Programming. ALP 1988. Lecture Notes in Computer Science, vol 343. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-50667-5_71

  31. Hamilton, Geoff. (2007). Distillation: Extracting the essence of programs. Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation. 61-70. 10.1145/1244381.1244391.

  32. Hamilton, G.W. (2010). Extracting the Essence of Distillation. In: Pnueli, A., Virbitskaite, I., Voronkov, A. (eds) Perspectives of Systems Informatics. PSI 2009. Lecture Notes in Computer Science, vol 5947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11486-1_13

  33. Hamilton, Geoff & Mendel-Gleason, Gavin. (2010). A Graph-Based Definition of Distillation.

  34. Hamilton, Geoff & Jones, Neil. (2012). Distillation with labelled transition systems. Conference Record of the Annual ACM Symposium on Principles of Programming Languages. 15-24. 10.1145/2103746.2103753.

  35. Hamilton, Geoff. "The Next 700 Program Transformers." International Symposium on Logic-Based Program Synthesis and Transformation. Cham: Springer International Publishing, 2021.

  36. Klyuchnikov, Ilya, and Sergei Romanenko. "Towards higher-level supercompilation." Second International Workshop on Metacomputation in Russia. Vol. 2. No. 4.2. 2010.