Home

Awesome

GrassVM

GrassVM is a programmable virtual CPU for the Grass programming language.

GrassVM supports an extended version of the ELVM instruction set and architecture written by Shinichiro Hamaji. Using GrassVM, you can enjoy writting Grass programs in assembly programming style.

GrassVM is based on LambdaVM and LambdaCraft. The implementation details for these projects are also available at the repos for lambda-8cc and LambdaLisp.

Lisp in Grass

GrassVM is now integrated into ELVM as the ELVM Grass backend. Using ELVM, you can compile C to Grass. An example is lisp.c included in the ELVM repository. The compiled Grass program lisp.w is available in my Gist, together with a Makefile to build lisp.w from source.

Instead of writing C, you can hand-assemble Grass programs using the contents of this repo.

Grass Programming Tips

I've documented tips on writing raw Grass programs in grass-tips.md.

Usage

Requirements

Build Instructions

This will build rot13.w (without the newline pretty printing):

make

This will build the file out/rot13.w. This will take several minutes.

Note that running make will also install grass and plant under the location specified by stack. The Makefile assumes ~/.local/bin as the installation location, specified with the configuration STACK_BIN_PATH=~/.local/bin. If make fails finding grass and plant, try changing this configuration.

out/rot13.w can be run as:

$ make bin/grass_ml
$ echo 'Hello, world!' | bin/grass_ml out/rot13.w
Uryyb, jbeyq!
$ echo 'Uryyb, jbeyq!' | bin/grass_ml out/rot13.w
Hello, world!
$ bin/grass_ml out/rot13.w  # Also runs as a REPL
Hello, world!
Uryyb, jbeyq!

Building the GrassVM Core in the ELVM Grass Backend

To build the variable GRASS_VM in wcore.h from the ELVM Grass backend, run:

make out/main.w

How it Works

Here I will explain how rot13.w is built using GrassVM.

Compilation Chain

GrassVM's compilation chain is based on @susisu's Grassy toolkit. The toolkit includes grass and plant. grass is a Grass interpreter. plant is a Grass transpiler that transpiles an OCaml-style language to Grass. For example programs for plant, see Grasspiler written also by @susisu.

Having Lots of Fun

LambdaVM is first compiled to the OCaml-like plant syntax using LambdaCraft. The output is a one-liner lambda term:

let main _ = (fun x -> fun y -> ((fun z -> ((fun a -> ((fun b -> (a (fun c -> (b b c))
)) (fun b -> (a (fun c -> (b b c)))))) (fun a -> fun b -> fun c -> fun d -> ((fun e ->
((fun f -> (f (fun g -> fun h -> fun i -> fun i -> fun k -> k) (fun g -> fun h -> g)))
e)) b (fun e -> (d c (Succ c))) (fun e -> (b (fun f -> fun g -> (a g c (fun h -> fun c
-> (a g c (fun j -> fun c -> (d (fun l -> (l h j)) c)))))))) (fun e -> e))) z ((fun a 
-> fun b -> ((fun a -> fun b -> ((fun a -> fun b -> (a (a b))) ((fun a -> fun b -> ((
...

This is generated by running rot13.cl in Common Lisp. This program generates the lambda term encodings for the assembly listing and applies it to GrassVM, creating a standalone lambda term that can be run in a Grass interpreter (after transpiling it with plant). This file is saved as out/rot13.ml. For the reason why main takes an unused argument, please see the language specs.

out/rot13.ml is then transpiled to Grass using plant.

Source Codes

The main source code for the virtual machine is lambdavm.cl. The I/O wrapper is written in grassvm.cl. The assembly listing is written in rot13.cl. For details on the assembly, please see the LambdaVM repo.

ELVM Grass Backend Implementation Details

rot13.w was compiled using plant. Here, the entire source code including the assembly listing written in rot13.cl was compiled using plant.

In the ELVM Grass backend, instead of using plant, the assembly listing is directly written down as a raw Grass program. This is done as follows:

Here the GrassVM core is pre-compiled using plant, and included as wcore.h in the ELVM Grass backend. The GrassVM core can be build by running make out/main.w.

The main difference between rot13.cl and the ELVM Grass backend is how the assembly listing is generated. This is largely helped by the memory and program builders.

This section descirbes the modifications that were made to build the ELVM Grass backend, along with other further implementations details.

The Memory and Program Builder

GrassVM is based on LambdaVM. LambdaVM accepts a memory list and program list, each expressed as a list of 24-bit integers and a list of lists of instructions. rot13.cl builds these lists as built-in lambda terms compiled by LambdaCraft and plant.

To build these lists in the ELVM Grass backend, the Grass source code for building these lists must be generated in the C source code of ELVM. To make this easy, GrassVM uses a stack machine which I named as the memory builder and program builder.

The memory and program builders are defined as follows in main.cl as MemlistBuilder and ProglistBuilder.

Memory Builder

MemlistBuilder is defined as follows:

(defrec-lazy MemlistBuilder (cont curlist option)
  (if option
    ;; t -> construct int
    (lambda (b1 b2 b3 b4 b5 b6 b7 b8 b9 b10
             b11 b12 b13 b14 b15 b16 b17 b18 b19 b20
             b21 b22 b23 b24)
      (MemlistBuilder
        cont
        (cons
          (list b1 b2 b3 b4 b5 b6 b7 b8 b9 b10
            b11 b12 b13 b14 b15 b16 b17 b18 b19 b20
            b21 b22 b23 b24)
          curlist)))
    ;; nil -> return int
    (cont curlist)))

MemlistBuilder is a state machine that behaves as follows:

Since MemlistBuilder pushes its incoming item on top of the stack as (cons next curlist), the integers must be pushed to MemlistBuilder in reverse order when creating the list. The ELVM Grass backend therefore reverses the memory list first when emitting the memory initialization clauses.

Program Builder

ProglistBuilder is defined as follows:

(defun-lazy iscons4-not-nil (expr)
  (expr (lambda (a b c d) (lambda (x) t)) nil))

(defrec-lazy ProglistBuilder (cont curlist curtag input)
  (if (iscons4-not-nil input)
    ;; The input is a cons4 -> build tag
    (ProglistBuilder cont curlist (cons input curtag))
    ;; The input is nil
    (if (isnil curtag)
      ;; curtag is closed -> apply the program list to the VM and return it
      (cont curlist)
      ;; curtag is open -> close tag and stack it on curlist
      (ProglistBuilder cont (cons curtag curlist) nil))))

ProglistBuilder is a state machine that behaves as follows:

Similar to MemlistBuilder, ProglistBuilder also pushes its incoming instruction and tag on top of the stack as (cons next curlist). Therefore, the instructions must be pushed in reverse order when creating the list. The ELVM Grass backend therefore reverses the instruction list first when emitting the program list.

Self Application Specification

The memory and program builders are combined with GrassVM in main.cl as follows:

(def-lazy GrassVMCore
  (MemlistBuilder
    (lambda (memlist)
      (ProglistBuilder
        (lambda (proglist)
          (lambda (x) (GrassVM memlist proglist))) nil nil))
    nil))

GrassVMCore first exposes MemlistBuilder to build the memory initialization clauses. When the memory initialization finishes, it then exposes ProglistBuilder to build the program initialization clauses. When ProglistBuilder finishes running, it returns the object (lambda (x) (GrassVM memlist proglist)), encapsulating a standalone GrassVM applied with memlist and proglist thus making it ready to run.

The reason why GrassVM is contained in a lambda (lambda (x) ...) is because of the self-application specification of the Grass language. In Grass, it is specified that when the interpreter encounters the final v definition clause in the program, it takes the object at the top of the stack and calls it with the object itself as the argument. After finishing building the memory and program, the object (lambda (x) (GrassVM memlist proglist)) is left at the top of the stack. The interpreter therefore applies this object to itself to start the main execution phase. This exposes the closure (GrassVM memlist proglist) to the execution region, thus starting the execution of the program loaded to GrassVM.

ELVM Implementation

Grass is a stack-based language. When a new value is created via function application, the result is pushed to the top of the environment stack. Due to the designs of the memory and program builder, after each integer or instruction is pushed to the builder functions, the new closure is pushed to the top of the stack. This makes writing raw Grass code for building the memory and programs easy, since the VM always stays at the top of the stack before pushing another integer or instruction.

Since all of the necessary Grass primitive function references are already contained inside the VM, the ELVM implementation can forget about the total depth of the stack, so it suffices to only track the relative position of the VM within the environment stack.

I/O Wrapper

GrassVM first creates a binary tree that stores all of the 256 characters from 0 to 255. This structure is used to convert to and from Grass's primitive character object to a list of booleans. Using this structure, it defines the functions putchar and getchar which is later referenced inside LambdaVM. The definitions for these structures are written in grassvm.cl.

Evaluation Strategy

The original LambdaVM assumes a lazy evaluation strategy. On the other hand, Grass is an eagerly evaluated language as specified in the language specs (in the "Intuitive explanation" section).

Several modifications were made to let LambdaVM run in an eager evaluation strategy. Most of the difficult changes were made by simply changing the definitions of if and typematch-nil-cons defined in macros.cl. The way Lisp abstracts these control flows with macros helped a lot when rewriting this program for a different evaluation strategy.

Refining the Interpreter

GrassVM is tested using the interpreter grass.ml, originally written by @ytomino and modified by @youz and @woodrush. Here are the modifications made from the original implementation:

Miscellaneous

Writing Assembly in Raw Grass

In the previous section, the assembly was transpiled to lambda terms using plant. You can also choose to directly write GrassVM assembly in raw Grass terms. This is done in asm.w, which is connected to out/main.w which can be generated by make out/main.w. The standalone Grass program can be built by out/asm-standalone.w. This can be run by make run-asm.