Awesome

Overview

Why should C++ people have all the meta-programming fun that comes from abusing the heck out of their language?

Variadic macros accidentally made the C99 pre-processor Turing complete too. It's time for C people to get in on the fun too!

Quick intro

Have a look at Tests/List.c to get an idea of how it works. Compile up this file

gcc --std=c99 -E Tests/List.c

and you will see the C99 pre-processor faithfully replaces each eval(...) line with it's value to match the line below. Note, in particular, the use of partial application in statements like

eval(map,(add,P1),L2(P1,P2))

(i.e., the equivalent to the Haskell map (+ 1) [1,2]).

See Tests/Goldbach.c for a more complex example.

More details

Let's take a quick look at the implementation of zipWith in List.h. From the Tests/List.c file we see

 eval(zipWith,(Tupple2),L3(P2,P3,P4),L4(P1,P2,P3,P4))
={(0x2,0x1),(0x3,0x4),(0x4,0x5)}

where the first line will evaluate to the second (minus the =) when passed through the pre-processor.

Wonky syntax

The fact that the pre-processor has very limited syntax and only really understands strings puts some limitations on the syntax. What emerges is a lisp like thing with , being the separator instead of .

The above could thus be thought of as

(zipWith (Tupple2) L3(P2 P3 P4) L4(P1 P2 P3 P4))

where

PN ~ the Nth positive integer (i.e., P2 is 2)

LN ~ a macro to construct a N-component list (i.e., L3(P2,P3,P4) is [2,3,4]

TuppleN ~ the N-component tuple constructor (i.e., Tupple2 would be (,))

Implementation

In Haskell, a zipWith implementation would look something like this

zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys
zipWith f _      _      = []

Or, in more gruesome detail, if we had to drop the function matching syntactic sugar and fully spelling out all the pattern match cases

zipWith f xs ys = case xs ys of
  (x:xs) (y:ys) -> f x y : zipWith f xs ys
  (x:xs) []     -> []
  []     (y:ys) -> []
  []     []     -> []

Looking into the List.h file, we hopefully recognize an implementation that pretty much mirrors this later version

#define _list_zipWith(f,xs,ys,...)                         reduce_caseReduce2(list_zipWith,xs,ys,f,__VA_ARGS__)
#define _list_zipWith_list_Cons_list_Cons(x,xs,y,ys,f,...) reduce_construct((list_Cons,reduce_apply(f,x,y),(list_zipWith,f,xs,ys)),__VA_ARGS__)
#define _list_zipWith_list_Cons_list_Nil( x,xs     ,f,...) reduce_construct((list_Nil),__VA_ARGS__)
#define _list_zipWith_list_Nil_list_Cons(      y,ys,f,...) reduce_construct((list_Nil),__VA_ARGS__)
#define _list_zipWith_list_Nil_list_Nil(            f,...) reduce_construct((list_Nil),__VA_ARGS__)

The first line define _list_zipWith(f,xs,ys,...). It is a zipWith binding in the list namespace that takes three arguments (note that, this being a lazy system, all definitions are functions).

The body associated with this binding is a call to the reduce_caseReduce2 macro (the leading reduce is a namespace -- see Reduce.h for full details and more forms). This macro takes the base name of the next macro to call, two arguments to evaluate to weak head normal form (WHNF) whose types are append to the base name, and any other arguments/bindings to pass along.

The rest is then just bindings for each of the cases. Each of them construct a list type by calling the macro reduce_construct (again the leading reduce is just the namesapce -- see Reduce.h for full details). This constructs the type an continues with whatever caused us to be evaluated.

In the first case it is a list_Cons constructor (i.e., : in Haskell), where the first argument (the head of the list) is f applied to x and y and the second argument (the tail of the list) is a recursive call to zipWith. In the remainder cases, as with the Haskell code, it is just the list_Nil constructor.

Debugging

The evaluation should not go astray if the types are correct (i.e., functions cover all the cases for the types they support and are only applied to arguments of those types). Noticeably absent though is type annotations and a type checker to ensure this is the case.

What's the nest best thing? Break the code into small bits and tests to ensure those small bits do what they are suppose to do. Another easy option is to translate the problematic code to Haskell. Haskell has a type checker, so it will tell you where things went wrong.

Decoding bad output

All that aside, there is some basic information to be obtained from the bad output. Take, for example, the incorrect expression

eval(zipWith,(Tuple2),L3(P2,P3,P4),P4)

which calls zipWith with a Integer third argument instead of a List. It evaluates to a long ugly string that begins with

_list_zipWith_list_Cons_integer_Integer

prefixed with a bunch of _recurse_NN_ prefixes. What this means is pattern matching went astray because in the definition of zipWith above

reduce_caseReduce2(list_zipWith,xs,ys,f,__VA_ARGS__)

there was no list_zipWith_list_Cons_integer_Integer branch. The ys argument must be of List type because it is only matched against the List constructors. An invalid call was made to zipWith with an incorrect third argument that was not a List.

Single stepping

Where? Well that gets messy. The evaluation can be stopped at an arbitrary step by defining RECURSE_DEBUG. For example,

gcc --std=c99 -E -D "RECURSE_DEBUG=(0,0,1,c)" Tests/List.c

stops evaluation at recursion step 0x1c (i.e., 28). With a bit of shell scripting to repeatedly call it for incrementing values of RECURSE_DEBUG it is possible to simulate single stepping the evaluation.

For example, assume the problematic code is in the file Debug.c.

#include "List.h"
#include "Integer.h"

tag: eval(zipWith,(Tuple2),L3(P2,P3,P4),P3)

where we've added tag: as some arbitrary bit of text filter the output to just the line we are interested in. Then the following code will dump a trace of the first 256 evaluation steps

for i1 in {0..9} a b c d e f; do
  for i0 in in {0..9} a b c d e f; do
    gcc --std=c99 -E -D "RECURSE_DEBUG=(0,0,$i1,$i0)" Example.c | sed -e '/^tag:/!d' >> Example.c-singlestep
  done
done

Of course the evaluation is lazy, and single stepping lazy programs gives notoriously convoluted code path.

What to do? Did we mention how it is a good idea to break the program into small bits and verify that these small bits work as advertised before attempting building bigger components from faulty blocks?

Implementation

The code is well documented. See Recurse.h for details about how Variadic macros accidentally made the the C99 pre-processor turning complete. Then see Reduce.h for details on how a somewhat Haskell-like syntax is achieved on top of this.