Awesome

Rhine

Rhine is a Clojure-inspired Lisp on LLVM JIT featuring variable-length untyped arrays, first-class functions, closures, and macros. While Clojure hides the lower-level details by running atop the JVM, Rhine aims to expose how common Lisp constructs map to hardware.

Building

First, opam switch 4.02.1 to make sure that you're running a custom-built ocaml (for camlp4). First, run brew install libffi. Then, run opam install ocamlfind menhir core textutils ctypes, open a new shell to refresh env, and invoke make.

Troubleshooting the build

There are a number of reasons for the build failing:

Silly things like git submdoule --init failing can be fixed easily; just anchor the submodule to a valid commit and send a PR.
opam problems. If you run into one of these after following the instructions presented above, open an issue. An upstream update probably screwed something up.
Silly build issues usually arise from the build not being perfectly parallel: due to races, the -j8 picks the dependee too earlier. Either go into llvm-build/ and keep hitting make -j8 until it succeeds, or drop the -j8 together, waiting for a longer time for a predictable result.
Running into more serious build issues usually means that llvm upstream has changed in a trivial way; you can attempt to fix this yourself, or open an issue.

How it works

An untyped system means that all values are boxed/unboxed from a value_t structure at runtime:

%value_t = type {
	 i32,                                ; type of data
	 i64,                                ; integer
	 i1,                                 ; bool
	 i8*,                                ; string
	 %value_t**,                         ; array/fenv
	 i64,                                ; array/string length
	 double,                             ; double
	 %value_t* (i32, %value_t**, ...)*,  ; function
	 i8                                  ; char
}

The overhead of boxing/unboxing is paid by all dynamic languages, although multiple optimizations (including speculative optimization) can reduce the overhead. Rhine currently only implements the basic optimizations bundled with LLVM.

Rhine does automatic type conversions, so (+ 3 4.2) will do the right thing. To implement this, IR is generated to inspect the types of the operands (zeroth member of value_t), and a br (branch) is used to take the right codepath. A possible optimization is to generate a branchless codepath for all-integer arguments.

LLVM provides the array type and vector type. They cannot be used since they are fixed-length; i.e. the length must be known at compile-time. The problem is that a construct like (cons 8 coll) generates a runtime length which is equal to the (length coll) + 1. So, we malloc, getelementptr, and store by hand. It has the type specified by the fourth member of value_t.

To implement first-class functions, note that all functions must have the same type; i.e. the type of the function pointer (the seventh member of value_t). How else would you implement:

(defn map
  [f coll]
  (if coll
    (cons (f (first coll))
          (map f (rest coll)))
    []))

(defn map2
  [f c1 c2]
  (if (and c1 c2)
    (cons (f (first c1) (first c2))
          (map2 f (rest c1) (rest c2)))
    []))

Here, f takes one argument in map, but takes two arguments in map2. A function pointer type embracing variable arguments is implemented using the varargs framework of LLVM. Note that va_arg doesn't work on x86, so Rhine extracts the values by hand. The first argument gives the number of arguments, and is used to implement varargs in the Rhine language. The second argument is the closure environment (which has the same type as an array).

Closures are simple to implement with this framework in place. First, when a function is declared, parse out all the unbound variable names (not present in arguments or let), sort the names, put it in a hashtable for later reference, and codegen the code required to bind the names from the env argument in order. At the callsite, look up this hashtable, and pack all the corresponding environment variables into the env argument. So, stuff like this will work:

(defn quux [] (let [a aenv] (println a) (println env)))
(let [env 12 aenv 17] (quux))

But there's a problem because we have first-class functions. What happens to this?

(defn t [y] (+ x y))
(defn f [x] t)
(let [g (f 3)] (println (g 4)))

Here, the callsite for t is not in f, but in the anonymous function at the end. But the anonymous function doesn't have the environment variable x that t requires, f does. So, we have to augment function pointers with the environment (resuing the fourth argument of value_t).

It's important to realize that macros require that we go back-and-forth between LLVM values and the OCaml codegen engine. How else would you evaluate something like:

(defmacro baz [x]
  `[1 2 ~x])
(baz (+ 2 2))

Note that macro arguments must be passed unevaluated at the callsite. The maco-expand stage now needs to codegen [1 2 <something>], where that <something> itself needs to be codegen'ed by evaluating the argument: the result from the compiler must be returned as an AST object to OCaml. This requires some involved construction of OCaml objects from C.

Another subtle point to note is that macros must be lifted out of the program and macro-expanded at the beginning of the program. This is because we can't suddenly codegen segments required for macroexpansion in the middle of codegen'ing another function.

Todo

Lambdas. Requires lifting them out and codegen'ing them first.
Garbage collection. LLVM provides several garbage collection intrinsincs, but the main challenge is to make sure that the C bindings don't leak memory.
Custom optimizations.
Copy-on-write for persistent data structures, and persistent variables with setq. How do you access a variable's history?
FFI to C. It's simply a question of defining a good interface, because we already use malloc and memcpy internally in LLVM IR.
Self-hosting compiler. Necessary to co-develop the language and compiler.
Optional typesystem. Use the :- syntax to provide type safety and optimizations!
Concurrency primitives. See Clojure's core.async.
Support for programs spanning multiple files.
Polished error reporting with file:line annotations.

Notes

LLVM codegen statements all have side-effects. In most places, order in which variables are codegen'ed are unimportant, and lets can be moved around; but in conditionals, statements must be codegen'ed exactly in order: this means let statements can't be permuted, leading to imperative-style OCaml code.
Since LLVM IR is strongly typed, it is possible to inspect the types of llvales from OCaml at generation-time. However, there is no way for the codegen-stage to inspect the values of variables while the program is running. This has the consequence that a loop hinged upon an llvalue must be implemented in LLVM IR; hence, for functions that require iterating on variable-length arrays, we end up writing tedious LLVM IR generation code instead of an equivalent OCaml code.
Debugging errors from LLVM bindings is hard. Using ocamldebug does not work, since the callstack leading up to an LLVM call in C++ is not owned by OCaml. The alternative is to use lldb, but matching up the C++ call with a line in the OCaml code is non-trivial.
Implementing complex functions as builtins (by directly generating IR) is perhaps more efficient than implementing them in Rhine, but the gap is very small due to optimization passes applied on the Rhine-generated IR. The marginal benefit outweighs the cost of correctly weaving complex IR.