minicoro

Minicoro is a single-file library for using asymmetric coroutines in C. The API is inspired by Lua coroutines but designed with C use in mind.

The project is being developed mainly to be a coroutine backend for the Nelua programming language.

The library's assembly implementation is inspired by Lua Coco by Mike Pall.

Features

Supported Platforms

Most platforms are supported through different methods:

| Platform | Assembly Method | Fallback Method |
|---|---|---|
| Android | ARM/ARM64 | N/A |
| iOS | ARM/ARM64 | N/A |
| Windows | x86_64 | Windows fibers |
| Linux | x86_64/i686 | ucontext |
| Mac OS X | x86_64/ARM/ARM64 | ucontext |
| WebAssembly | N/A | Emscripten fibers / Binaryen asyncify |
| Raspberry Pi | ARM | ucontext |
| RISC-V | rv64/rv32 | ucontext |

The assembly method is used by default if supported by the compiler and CPU; otherwise the ucontext or fiber method is used as a fallback.

The assembly method is very efficient: it takes just a few cycles to create, resume, yield or destroy a coroutine.

Caveats

Introduction

A coroutine represents an independent "green" thread of execution. Unlike threads in multithread systems, however, a coroutine only suspends its execution by explicitly calling a yield function.

You create a coroutine by calling mco_create. Besides the output pointer that receives the coroutine handle, its only input is a mco_desc structure with a description for the coroutine. The mco_create function only creates a new coroutine and stores a handle to it; it does not start the coroutine.

You execute a coroutine by calling mco_resume. On the first resume, the coroutine starts its execution by calling its body function. After the coroutine starts running, it runs until it terminates or yields.

A coroutine yields by calling mco_yield. When a coroutine yields, the corresponding resume returns immediately, even if the yield happens inside nested function calls (that is, not in the main function). The next time you resume the same coroutine, it continues its execution from the point where it yielded.

To associate a persistent value with the coroutine, you can optionally set user_data on its creation and later retrieve it with mco_get_user_data.

To pass values between resume and yield, you can optionally use the mco_push and mco_pop APIs. They are intended to pass temporary values using a LIFO-style buffer. The storage system can also be used to send and receive initial values on coroutine creation or before it finishes.

Usage

To use minicoro, do the following in one .c file:

#define MINICORO_IMPL
#include "minicoro.h"

You can do #include "minicoro.h" in other parts of the program just like any other header.

Minimal Example

The following simple example demonstrates how to use the library:

#define MINICORO_IMPL
#include "minicoro.h"
#include <stdio.h>
#include <assert.h>

// Coroutine entry function.
void coro_entry(mco_coro* co) {
  printf("coroutine 1\n");
  mco_yield(co);
  printf("coroutine 2\n");
}

int main() {
  // First initialize a `desc` object through `mco_desc_init`.
  mco_desc desc = mco_desc_init(coro_entry, 0);
  // Configure `desc` fields when needed (e.g. customize user_data or allocation functions).
  desc.user_data = NULL;
  // Call `mco_create` with the output coroutine pointer and `desc` pointer.
  mco_coro* co;
  mco_result res = mco_create(&co, &desc);
  assert(res == MCO_SUCCESS);
  // The coroutine should be now in suspended state.
  assert(mco_status(co) == MCO_SUSPENDED);
  // Call `mco_resume` to start for the first time, switching to its context.
  res = mco_resume(co); // Should print "coroutine 1".
  assert(res == MCO_SUCCESS);
  // We get back from coroutine context in suspended state (because it's unfinished).
  assert(mco_status(co) == MCO_SUSPENDED);
  // Call `mco_resume` to resume for a second time.
  res = mco_resume(co); // Should print "coroutine 2".
  assert(res == MCO_SUCCESS);
  // The coroutine finished and should be now dead.
  assert(mco_status(co) == MCO_DEAD);
  // Call `mco_destroy` to destroy the coroutine.
  res = mco_destroy(co);
  assert(res == MCO_SUCCESS);
  return 0;
}

NOTE: If you don't want to use the minicoro allocator system, allocate the coroutine object yourself using mco_desc.coro_size and call mco_init; to destroy it later, call mco_uninit and then deallocate it.

Yielding from anywhere

You can yield the currently running coroutine from anywhere without having to pass mco_coro pointers around. To do this, just use mco_yield(mco_running()).

Passing data between yield and resume

The library has a storage interface to assist in passing data between yield and resume. Its usage is straightforward: use mco_push to send data before a mco_resume or mco_yield, then use mco_pop after a mco_resume or mco_yield to receive the data. Take care not to mismatch a push and a pop, otherwise these functions will return an error.

Error handling

The library returns error codes from most of its API in case of misuse or system error. The user is encouraged to handle them properly.

Virtual memory backed allocator

The new compile-time option MCO_USE_VMEM_ALLOCATOR enables a virtual memory backed allocator.

Every stackful coroutine usually has to reserve memory for its full stack, which typically makes the total memory usage very high when allocating thousands of coroutines. For example, an application with 100 thousand coroutines with stacks of 56KB would consume as much as 5GB of memory, even though your application may not really need the full stack for every coroutine.

Some developers prefer stackless coroutines over stackful coroutines because of this problem. The stackless memory footprint is low, so they are often considered more lightweight. However, stackless coroutines have many other limitations; for example, you cannot run unconstrained code inside them.

One remedy to this problem is to make stackful coroutines growable, so that physical memory is only used on demand when it's really needed. There is a nice way to do this by relying on virtual memory allocation when supported by the operating system.

The virtual memory backed allocator will reserve virtual memory in the OS for each coroutine stack, but will not trigger real physical memory usage yet. While the application's virtual memory usage will be high, the physical memory usage will be low and will actually grow on demand (usually in 4KB chunks on Linux).

The virtual memory backed allocator also raises the default stack size to about 2MB, typically the stack size of extra threads on Linux, so you have more space in your coroutines and the risk of stack overflow is low.

As an example, allocating 100 thousand coroutines with nearly 2MB of reserved stack space each with the virtual memory allocator uses 783MB of physical memory, that is about 8KB per coroutine. However, the virtual memory usage will be at 98GB.
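
The per-coroutine figures in this section can be checked with a little arithmetic. The sketch below assumes binary units (1MB = 1024KB, 1GB = 1024MB); the function names are hypothetical.

```c
/* Total size in GB of `count` stacks of `stack_kb` KB each. */
static double total_stack_gb(double count, double stack_kb) {
  return count * stack_kb / (1024.0 * 1024.0);
}

/* Average size in KB per coroutine when `count` coroutines use `total_mb` MB. */
static double per_coroutine_kb(double count, double total_mb) {
  return total_mb * 1024.0 / count;
}
```

With these, total_stack_gb(100000, 56) comes out at roughly 5.3, matching the figure of about 5GB for fully reserved 56KB stacks, and per_coroutine_kb(100000, 783) comes out at roughly 8.0, matching the 8KB per coroutine quoted for the virtual memory allocator.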

It is recommended to enable this option only if you plan to spawn thousands of coroutines while wanting to have a low memory footprint. Not all environments have an OS with virtual memory support, therefore this option is disabled by default.

This option may add an order of magnitude of overhead to mco_create()/mco_destroy(), because they will request the OS to manage virtual memory page tables. If this is a problem for you, please write a custom allocator for your own needs.

Library customization

The following can be defined to change the library behavior:

Benchmarks

The coroutine library was benchmarked on x86_64 by counting CPU cycles for a context switch (triggered in resume or yield) and for initialization.

| CPU Arch | OS | Method | Context switch | Initialize | Uninitialize |
|---|---|---|---|---|---|
| x86_64 | Linux | assembly | 9 cycles | 31 cycles | 14 cycles |
| x86_64 | Linux | ucontext | 352 cycles | 383 cycles | 14 cycles |
| x86_64 | Windows | fibers | 69 cycles | 10564 cycles | 11167 cycles |
| x86_64 | Windows | assembly | 33 cycles | 74 cycles | 14 cycles |

NOTE: Tested on Intel Core i7-8750H CPU @ 2.20GHz with pre-allocated coroutines.

Cheatsheet

Here is a list of all library functions for quick reference:

/* Structure used to initialize a coroutine. */
typedef struct mco_desc {
  void (*func)(mco_coro* co); /* Entry point function for the coroutine. */
  void* user_data;            /* Coroutine user data, can be retrieved with `mco_get_user_data`. */
  /* Custom allocation interface. */
  void* (*alloc_cb)(size_t size, void* allocator_data); /* Custom allocation function. */
  void  (*dealloc_cb)(void* ptr, size_t size, void* allocator_data);     /* Custom deallocation function. */
  void* allocator_data;       /* User data pointer passed to `alloc_cb`/`dealloc_cb` allocation functions. */
  size_t storage_size;        /* Coroutine storage size, to be used with the storage APIs. */
  /* These must be initialized only through `mco_desc_init`. */
  size_t coro_size;           /* Coroutine structure size. */
  size_t stack_size;          /* Coroutine stack size. */
} mco_desc;

/* Coroutine functions. */
mco_desc mco_desc_init(void (*func)(mco_coro* co), size_t stack_size);  /* Initialize description of a coroutine. When stack size is 0 then MCO_DEFAULT_STACK_SIZE is used. */
mco_result mco_init(mco_coro* co, mco_desc* desc);                      /* Initialize the coroutine. */
mco_result mco_uninit(mco_coro* co);                                    /* Uninitialize the coroutine, may fail if it's not dead or suspended. */
mco_result mco_create(mco_coro** out_co, mco_desc* desc);               /* Allocates and initializes a new coroutine. */
mco_result mco_destroy(mco_coro* co);                                   /* Uninitialize and deallocate the coroutine, may fail if it's not dead or suspended. */
mco_result mco_resume(mco_coro* co);                                    /* Starts or continues the execution of the coroutine. */
mco_result mco_yield(mco_coro* co);                                     /* Suspends the execution of a coroutine. */
mco_state mco_status(mco_coro* co);                                     /* Returns the status of the coroutine. */
void* mco_get_user_data(mco_coro* co);                                  /* Get coroutine user data supplied on coroutine creation. */

/* Storage interface functions, used to pass values between yield and resume. */
mco_result mco_push(mco_coro* co, const void* src, size_t len); /* Push bytes to the coroutine storage. Use to send values between yield and resume. */
mco_result mco_pop(mco_coro* co, void* dest, size_t len);       /* Pop bytes from the coroutine storage. Use to get values between yield and resume. */
mco_result mco_peek(mco_coro* co, void* dest, size_t len);      /* Like `mco_pop` but it does not consume the storage. */
size_t mco_get_bytes_stored(mco_coro* co);                      /* Get the available bytes that can be retrieved with a `mco_pop`. */
size_t mco_get_storage_size(mco_coro* co);                      /* Get the total storage size. */

/* Misc functions. */
mco_coro* mco_running(void);                        /* Returns the running coroutine for the current thread. */
const char* mco_result_description(mco_result res); /* Get the description of a result. */

Complete Example

The following is a more complete example, generating Fibonacci numbers:

#define MINICORO_IMPL
#include "minicoro.h"
#include <stdio.h>
#include <stdlib.h>

static void fail(const char* message, mco_result res) {
  printf("%s: %s\n", message, mco_result_description(res));
  exit(-1);
}

static void fibonacci_coro(mco_coro* co) {
  unsigned long m = 1;
  unsigned long n = 1;

  /* Retrieve max value. */
  unsigned long max;
  mco_result res = mco_pop(co, &max, sizeof(max));
  if(res != MCO_SUCCESS)
    fail("Failed to retrieve coroutine storage", res);

  while(1) {
    /* Yield the next Fibonacci number. */
    mco_push(co, &m, sizeof(m));
    res = mco_yield(co);
    if(res != MCO_SUCCESS)
      fail("Failed to yield coroutine", res);

    unsigned long tmp = m + n;
    m = n;
    n = tmp;
    if(m >= max)
      break;
  }

  /* Yield the last Fibonacci number. */
  mco_push(co, &m, sizeof(m));
}

int main() {
  /* Create the coroutine. */
  mco_coro* co;
  mco_desc desc = mco_desc_init(fibonacci_coro, 0);
  mco_result res = mco_create(&co, &desc);
  if(res != MCO_SUCCESS)
    fail("Failed to create coroutine", res);

  /* Set storage. */
  unsigned long max = 1000000000;
  mco_push(co, &max, sizeof(max));

  int counter = 1;
  while(mco_status(co) == MCO_SUSPENDED) {
    /* Resume the coroutine. */
    res = mco_resume(co);
    if(res != MCO_SUCCESS)
      fail("Failed to resume coroutine", res);

    /* Retrieve storage set in last coroutine yield. */
    unsigned long ret = 0;
    res = mco_pop(co, &ret, sizeof(ret));
    if(res != MCO_SUCCESS)
      fail("Failed to retrieve coroutine storage", res);
    printf("fib %d = %lu\n", counter, ret);
    counter = counter + 1;
  }

  /* Destroy the coroutine. */
  res = mco_destroy(co);
  if(res != MCO_SUCCESS)
    fail("Failed to destroy coroutine", res);
  return 0;
}

Updates

Donation

I'm a full-time open source developer. A donation of any amount is appreciated and encourages me to keep supporting this and other open source projects.

Become a Patron

License

Your choice of either Public Domain or MIT No Attribution, see LICENSE file.