Awesome

This repository is abandoned due to the overwhelming complexity of metaprogramming with Boost/Preprocessor. See Metalang99 and Datatype99 -- the successors.

poica

The goal of this project is to implement the features of modern programming languages in plain C11 via its macro system, thereby improving static reasoning and achieving a better way to organise our code.

Installation

The recommended way is using Git submodules. Enter your project and execute:

git submodule add -b master https://github.com/hirrolot/poica
git submodule update --init --recursive

If you use CMake, you can then connect poica as follows:

include_directories(poica/include poica/preprocessor/include poica/vmd/include)

And #include <poica.h> inside your source files to export its public API. Building isn't required, because poica is a header-only library.

If you use GCC, -ftrack-macro-expansion=0 would reduce compilation time and memory consumption.

Algebraic data types

Usually in C we use unions to tell a compiler that we're going to interpret a single memory region in different ways. To decide how to interpret a union, we endow it with a tag and get a tagged union.

However, there'll be quite lot of duplication in code:

typedef struct {
    enum {
        OUR_TAGGED_UNION_STATE_1,
        OUR_TAGGED_UNION_STATE_2,
        OUR_TAGGED_UNION_STATE_3,
    } state;

    union {
        int state_1;
        const char *state_2;
        double state_3;
    } data;
} OurTaggedUnion;

What's even worse is that this approach is unsafe, meaning that we can construct invalid OurTaggedUnion (i), or, for example, (ii) access data.state_1 when the actual state is OUR_TAGGED_UNION_STATE_3:

// (i)
OurTaggedUnion res1 = { .state = OUR_TAGGED_UNION_STATE_2, .data.state_1 = 123 };

// (ii)
OurTaggedUnion res2 = { .state = OUR_TAGGED_UNION_STATE_3, .data.state_3 = .99 };
some_procedure(res2.data.state_1);

poica solves these two problems by introducing [algebraic data types] (discussed in the next section). That's how it's accomplished with poica:

choice(
    OurTaggedUnion,
    variant(State1, int)
    variant(State2, const char *)
    variant(State3, double)
);

// (i) Compilation failed!
OurTaggedUnion res1 = State2(123);

OurTaggedUnion res2 = State3(.99);
some_procedure(/* Impossible to pass state_1! */);

Sum types

For example, a binary tree like this:

Can be conveniently represented as a sum type and further manipulated using pattern matching. In the code below we first construct this binary tree, and then print all its elements to stdout:

[examples/binary_tree.c]

#include <poica.h>

#include <stdio.h>

choice(
    Tree,
    variant(Empty)
    variant(Leaf, int)
    variantMany(Node,
        field(left, struct Tree *)
        field(number, int)
        field(right, struct Tree *)
    )
);

void print_tree(const Tree *tree) {
    match(*tree) {
        of(Empty) {
            return;
        }
        of(Leaf, number) {
            printf("%d\n", *number);
        }
        ofMany(Node, (left, number, right)) {
            print_tree(*left);
            printf("%d\n", *number);
            print_tree(*right);
        }
    }
}

#define TREE(tree)                obj(tree, Tree)
#define NODE(left, number, right) TREE(Node(left, number, right))
#define LEAF(number)              TREE(Leaf(number))

int main(void) {
    const Tree *tree = NODE(NODE(LEAF(81), 456, NODE(LEAF(90), 7, LEAF(111))), 57, LEAF(123));

    print_tree(tree);
}

<details> <summary>Output</summary>

</details>

Product types

If we have structures in C, why do we need product types? Well, because product types provide type introspection (discussed in the next section). A product type is represented like this:

record(
    UserAccount,
    field(name, const char *)
    field(balance, double)
    field(age, unsigned char)
);

And it can be further manipulated like an ordinary structure:

UserAccount user = {"Gandalf", 14565.322, 715};
user.name = "Mithrandir";
user.age++;
user.balance *= 2;

Type introspection

Type introspection is supported in the sense that you can query the type properties of ADTs at compile-time and then handle them somehow in your hand-written macros.

Sum types

[examples/introspection/choice.c]

#include <poica.h>

#include <stdio.h>

#include <boost/preprocessor.hpp>

#define MY_CHOICE                                                           \
    Something,                                                              \
    variant(A)                                                              \
    variant(B, int)                                                         \
    variantMany(C, field(c1, double) field(c2, char))

choice(MY_CHOICE);
#define Something_INTROSPECT POICA_CHOICE_INTROSPECT(MY_CHOICE)

int main(void) {
    puts(BOOST_PP_STRINGIZE(Something_INTROSPECT));
}

<details> <summary>Output</summary>

((POICA_VARIANT_KIND_EMPTY)(A))
((POICA_VARIANT_KIND_SINGLE)(B)(int))
((POICA_VARIANT_KIND_MANY)(C)( ((c1)(double)) ((c2)(char)) ))

</details>

Product types

[examples/introspection/record.c]

#include <poica.h>

#include <stdio.h>

#include <boost/preprocessor.hpp>

#define MY_RECORD                                                           \
    Something,                                                              \
    field(a, int)                                                           \
    field(b, const char *)                                                  \
    field(c, double)

record(MY_RECORD);
#define Something_INTROSPECT POICA_RECORD_INTROSPECT(MY_RECORD)

int main(void) {
    puts(BOOST_PP_STRINGIZE(Something_INTROSPECT));
}

<details> <summary>Output</summary>

((a)(int)) ((b)(const char *)) ((c)(double))

</details>

Metainformation about types is actually a sequence in the terms of Boost/Preprocessor. So the BOOST_PP_SEQ_* macros can be used further, as well as Boost/VMD and the intrinsics from poica.

Safe, consistent error handling

ADTs provide a safe, consistent approach to error handling. A procedure that can fail returns a sum type, designating either a successful or a failure value, like this:

typedef enum RecvMsgErrKind {
    BAD_CONN,
    NO_SUCH_USER,
    ...
} RecvMsgErrKind;

typedef const char *Msg;

DefRes(Msg, RecvMsgErrKind);

P(Res, Msg, RecvMsgErrKind) recv_msg(...) { ... }

And then P(Res, Msg, RecvMsgErrKind) can be matched to decide what to do in the case of P(Ok, Msg, RecvMsgErrKind) and P(Err, Msg, RecvMsgErrKind):

P(Res, Msg, RecvMsgErrKind) res = recv_msg(...);
match(res) {
    of(P(Ok, Msg, RecvMsgErrKind), msg) { ... }
    of(P(Err, Msg, RecvMsgErrKind), err_kind) { ... }
}

But why this is better than int error codes? Because of:

Readability. Such identifiers as Ok and Err are more for humans, and therefore, it's much harder to confuse them with each other. In contrast to this, the usual approach in C to determine an error is by using magic ranges (for example, <0 or -1).
Consistency. No need to invent different strategies to handle different kinds of errors (i.e. using exceptions for less likely errors, int codes for a normal control flow, ...); ADTs address the problem of error handling generally.
Exhaustiveness checking (case analysis). A smart compiler and static analysis tools ensure that all the variants of Res are handled in match, so we can't forget to handle an error and make a possibly serious bug by leaving an application work as there's no error, when there is.

ADTs even have advantages over exceptions: they do not perform transformations with a program stack, since they are just values with no implicit logic that can hurt performance.

See examples/error_handling.c as an example of error handling using ADTs.

Built-in ADTs

ADT	Description	Example
`Maybe`	An optional value	`examples/maybe.c`
`Either`	Either this value or that	`examples/either.c`
`Pair`	A pair of elements	`examples/pair.c`
`Res`	Either a successful or a failure value	`examples/error_handling.c`

The last one has been presented in the previous section. All these generic types share the common API:

// Generate a definition of an ADT.
DefX(T1, ..., Tn);

// Generate a type name.
P(X, T1, ..., Tn) = ...;

The utility functions can be found in the specification.

OOP

Interfaces

[examples/interfaces.c]

#include <poica.h>

#include <math.h>
#include <stdio.h>

interface(
    Shape,
    double (*area)(const void *self);
);

record(
    Square,
    field(width, double)
    field(height, double)
);

record(
    Triangle,
    field(a, double)
    field(b, double)
    field(c, double)
);

impl(
    (Shape) for (Square),
    (double)(area)(const void *self)(
        const Square *square = (const Square *)self;
        return square->width * square->height;
    )
);

impl(
    (Shape) for (Triangle),
    (double)(area)(const void *self)(
        const Triangle *triangle = (const Triangle *)self;
        double a = triangle->a, b = triangle->b, c = triangle->c;

        double p = (a + b + c) / 2;
        return sqrt(p * (p - a) * (p - b) * (p - c));
    )
);

int main(void) {
    const Square square = { .width = 6, .height = 3.4 };
    const Triangle triangle = { .a = 4, .b = 13, .c = 15};

    printf("%f\n", iMethods(Shape, Square).area(&square));
    printf("%f\n", iMethods(Shape, Triangle).area(&triangle));
}

<details> <summary>Output</summary>

20.400000
24.000000

</details>

Dynamic dispatch

[examples/dyn_dispatch.c]

#include <poica.h>

#include <stdio.h>

interface(
    Animal,
    void (*noise)(void *self);
);

record(Dog, field(counter, int));
record(Cat, field(counter, int));

impl(
    (Animal) for (Dog),
    (void)(noise)(void *self)(
        Dog *dog = (Dog *)self;
        dog->counter++;
        printf("Woof! Counter: %d\n", dog->counter);
    )
);

impl(
    (Animal) for (Cat),
    (void)(noise)(void *self)(
        Cat *cat = (Cat *)self;
        cat->counter++;
        printf("Meow! Counter: %d\n", cat->counter);
    )
);

int main(void) {
    Dog dog = {.counter = 0};
    Cat cat = {.counter = 0};

    AnimalMut animal;

    animal = P(newIObj, AnimalMut, Dog)(&dog);

    vCall(animal, noise);
    vCall(animal, noise);
    vCall(animal, noise);

    animal = P(newIObj, AnimalMut, Cat)(&cat);

    vCall(animal, noise);
    vCall(animal, noise);
}

<details> <summary>Output</summary>

Woof! Counter: 1
Woof! Counter: 2
Woof! Counter: 3
Meow! Counter: 1
Meow! Counter: 2

</details>

Type-generic programming

This problem is often addressed via void * in C. However, it has two big disadvantages:

A compiler is unable to perform type-specific optimisations;
void * types could be confused with each other;
Not self-documenting.

poica uses a technique called monomorphisation, which means that it'll instantiate your generic types with concrete substitutions after preprocessing, eliminating all the disadvantages of void *.

Generic types

Below is a trivial implementation of a generic linked list:

[examples/generic_linked_list.c]

#include <poica.h>

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DeclLinkedList(type)                                                   \
    typedef struct P(LinkedList, type) {                                       \
        type *data;                                                            \
        struct P(LinkedList, type) * next;                                     \
    } P(LinkedList, type);                                                     \
                                                                               \
    static P(LinkedList, type) * P(listNew, type)(type item);                  \
    static void P(listFree, type)(P(LinkedList, type) * list);                 \
                                                                               \
    POICA_FORCE_SEMICOLON

#define DefLinkedList(type)                                                    \
    static P(LinkedList, type) * P(listNew, type)(type item) {                 \
        P(LinkedList, type) *list = malloc(sizeof(*list));                     \
        assert(list);                                                          \
                                                                               \
        list->data = malloc(sizeof(type));                                     \
        assert(list->data);                                                    \
        memcpy(list->data, &item, sizeof(type));                               \
        list->next = NULL;                                                     \
                                                                               \
        return list;                                                           \
    }                                                                          \
                                                                               \
    static void P(listFree, type)(P(LinkedList, type) * list) {                \
        P(LinkedList, type) *node = list;                                      \
                                                                               \
        do {                                                                   \
            free(node->data);                                                  \
            P(LinkedList, type) *next_node = node->next;                       \
            free(node);                                                        \
            node = next_node;                                                  \
        } while (node);                                                        \
    }                                                                          \
                                                                               \
    POICA_FORCE_SEMICOLON

DeclLinkedList(int);
DefLinkedList(int);

int main(void) {
    P(LinkedList, int) *list = P(listNew, int)(123);
    list->next = P(listNew, int)(456);
    list->next->next = P(listNew, int)(789);

    P(listFree, int)(list);
}

There's nothing much to say, except that P (which stands for polymorphic) expands to a unique function or type identifier, e.g. performs type substitution.

Options

There are several options, implemented via macro definitions (turned off by default):

POICA_USE_PREFIX -- removes all the public unprefixed camelCaseed and PascalCaseed identifiers (match, DefRes, ...) from the current translation unit. The prefixed versions (poicaMatch, PoicaDefRes, ...) are defined unconditionally.
POICA_ENABLE_ASSERTIONS -- enables some consistency checks on input data to macros. Can increase compilation time!

FAQ

Q: What "poica" means?

A: "poica" is a Quenya word, which means clean, pure. It reflects its API.