API Reference · Benchmarks · Stdlib Benchmarks

Saturn — Parallelism-Safe Data Structures for Multicore OCaml

This repository is a collection of concurrent-safe data structures for OCaml 5. It aims to provide an industrial-strength, well-tested (and, in the future, possibly model-checked and verified), well-documented, and maintained library of concurrent-safe data structures. We want to make it easier for Multicore OCaml users to find the right data structures for their uses.

You can learn more about the motivation behind Saturn through the implementation of a lock-free stack here.

Saturn is published on opam and is distributed under the ISC license.

Saturn — Parallelism-Safe Data Structures for Multicore OCaml
Contents
Installation
- Getting OCaml 5.2.0
- Getting Saturn
Provided data structures
About the Unsafe Data Structures
Usage
- Data Structures with Domain Roles
- Composability
Testing
Benchmarks
Contributing

Installation

Getting OCaml 5.2.0

To use Saturn, you need OCaml 5.2.0 or later. Saturn also builds on OCaml 4.14, but only for compatibility purposes, since parallelism-safe data structures are not needed without OCaml 5. Versions of OCaml 5 prior to 5.2 are not supported because of bugs in the Atomic module that affect some data structures.

To install OCaml 5.2.0 yourself, first make sure you have opam 2.1 or later. You can run this command to check:

opam --version

Then use opam to install OCaml 5.2.0:

opam switch create 5.2.0

If you want a later version, you can run the following line to get a list of all available compiler versions:

opam switch list-available

Getting Saturn

saturn can be installed from opam:

opam install saturn

Provided data structures

Treiber Lock-free Stack

Module: Stack
Description: A classic multi-producer, multi-consumer, lock-free stack, known for robustness and flexibility.
Recommendation: It's a recommended starting point when a LIFO structure is needed.
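
To make the idea concrete, here is a minimal sketch of a Treiber-style stack written with only the standard library's Atomic module. It illustrates the algorithm and is not Saturn's implementation; the names create, push, and pop_opt are chosen for this sketch.

```ocaml
(* The whole stack is an immutable list held in one atomic reference. *)
type 'a t = 'a list Atomic.t

let create () : 'a t = Atomic.make []

(* Read the current list, build the new one, and retry if another domain
   replaced the head in the meantime. *)
let rec push (t : 'a t) x =
  let before = Atomic.get t in
  if not (Atomic.compare_and_set t before (x :: before)) then begin
    Domain.cpu_relax ();
    push t x
  end

(* Same retry loop for removal; returns None on an empty stack. *)
let rec pop_opt (t : 'a t) =
  match Atomic.get t with
  | [] -> None
  | x :: rest as before ->
      if Atomic.compare_and_set t before rest then Some x
      else begin
        Domain.cpu_relax ();
        pop_opt t
      end

(* Two domains push concurrently; no element may be lost. *)
let () =
  let s = create () in
  let worker base () = for i = 1 to 1_000 do push s (base + i) done in
  let d1 = Domain.spawn (worker 0) and d2 = Domain.spawn (worker 10_000) in
  Domain.join d1;
  Domain.join d2;
  let rec drain n = match pop_opt s with None -> n | Some _ -> drain (n + 1) in
  assert (drain 0 = 2_000)
```

A failed compare-and-set means another domain made progress, which is why this retry loop is lock-free.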

Lock-free Bounded Stack

Module: Bounded_stack
Description: A stack based on the Treiber stack algorithm, with a limited capacity and a length function. This ensures that the stack is memory-bounded.
Recommendation: Tracking a capacity adds overhead to every operation. Use the unbounded stack if you need neither the capacity nor the length function.

Michael-Scott Lock-free Queue

Module: Queue
Description: A multi-producer, multi-consumer lock-free queue that is both robust and flexible.
Recommendation: This structure is ideal when a FIFO setup is required.
Sources: Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms

Lock-free Bounded Queue

Module: Bounded_queue
Description: A queue based on the Michael-Scott queue algorithm, with a limited capacity and a length function. This ensures that the queue is memory-bounded.
Recommendation: Tracking a capacity adds overhead to every operation. Use the unbounded queue if you need neither the capacity nor the length function.

Lock-free Chase-Lev Work-Stealing Deque

Module: Work_stealing_deque
Description: Single-producer, multi-consumer dynamic-size deque (double-ended queue).
Recommendation: Designed for high-throughput scheduling using per-core work distribution. Note that pop and steal operations follow different ordering (LIFO and FIFO) with distinct linearization constraints. It is a role-oriented data structure: most functions can't be used by all domains.
Sources:
- Dynamic Circular Work-Stealing Deque
- Correct and Efficient Work-Stealing for Weak Memory Models
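
The difference between the two orderings can be shown with a purely sequential model. The sketch below is not thread-safe and is not how Saturn implements the deque; the type model and the function names are hypothetical and only mirror the roles described above (the owner pops, thieves steal).

```ocaml
(* Sequential model only: head of the list = most recently pushed item. *)
type 'a model = { mutable items : 'a list }

let create () = { items = [] }

(* Owner side: add a new item. *)
let push m x = m.items <- x :: m.items

(* Owner side: take the most recently pushed item (LIFO). *)
let pop_opt m =
  match m.items with
  | [] -> None
  | x :: rest ->
      m.items <- rest;
      Some x

(* Thief side: take the oldest item (FIFO). *)
let steal_opt m =
  match List.rev m.items with
  | [] -> None
  | oldest :: rest_rev ->
      m.items <- List.rev rest_rev;
      Some oldest

let () =
  let m = create () in
  List.iter (push m) [ "a"; "b"; "c" ];
  assert (pop_opt m = Some "c");   (* owner gets the newest task *)
  assert (steal_opt m = Some "a")  (* thief gets the oldest task *)
```

Taking from opposite ends keeps the owner and the thieves away from each other except when the deque is nearly empty.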

Lock-free Single Producer Single Consumer Queue

Module: Single_prod_single_cons_queue
Description: A single-producer, single-consumer fixed-size queue. This specific configuration enables strong optimizations but also makes the data structure unsafe if used improperly, i.e., with more than one producer or one consumer at any time.
Recommendation: It's concurrent-safe as long as only one thread acts as producer and one as consumer at any time.
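
The following sketch suggests why the single-producer, single-consumer restriction allows a simpler algorithm: each index is written by exactly one side, so no compare-and-set loop is needed. It uses only the standard library, is not Saturn's implementation, and all names in it are hypothetical.

```ocaml
type 'a spsc = {
  buf : 'a option array;
  head : int Atomic.t; (* written only by the consumer *)
  tail : int Atomic.t; (* written only by the producer *)
}

let create capacity =
  { buf = Array.make capacity None; head = Atomic.make 0; tail = Atomic.make 0 }

(* Producer only: returns false when the queue is full. *)
let try_push q x =
  let tail = Atomic.get q.tail in
  if tail - Atomic.get q.head >= Array.length q.buf then false
  else begin
    q.buf.(tail mod Array.length q.buf) <- Some x;
    Atomic.set q.tail (tail + 1); (* publish the slot to the consumer *)
    true
  end

(* Consumer only: returns None when the queue is empty. *)
let pop_opt q =
  let head = Atomic.get q.head in
  if head = Atomic.get q.tail then None
  else begin
    let i = head mod Array.length q.buf in
    let x = q.buf.(i) in
    q.buf.(i) <- None;
    Atomic.set q.head (head + 1); (* hand the slot back to the producer *)
    x
  end

(* One producer domain, one consumer (the main domain). *)
let () =
  let q = create 8 in
  let producer =
    Domain.spawn (fun () ->
        for i = 1 to 100 do
          while not (try_push q i) do Domain.cpu_relax () done
        done)
  in
  let sum = ref 0 and received = ref 0 in
  while !received < 100 do
    match pop_opt q with
    | Some x -> sum := !sum + x; incr received
    | None -> Domain.cpu_relax ()
  done;
  Domain.join producer;
  assert (!sum = 5050)
```

If a second producer called try_push concurrently, both could read the same tail and write the same slot, which is exactly the misuse the description warns about.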

Lock-free Multiple Producers Single Consumer Queue

Module: Single_consumer_queue
Description: A multi-producer, single-consumer concurrent-safe queue with a closing mechanism to prevent further pushes.
Recommendation: Designed for scheduler run queues. It is not concurrent-safe if used by multiple consumers simultaneously.

Lock-free Skip List

Module: Skiplist
Description: A skiplist is a probabilistic data structure that has an average logarithmic complexity for search and insertion operations. Like Stdlib.Map, it is an ordered collection.
Recommendation: The skiplist is not resizable. It continues to work once its capacity limit is reached, but performance degrades because the structure is no longer deep enough to maintain logarithmic performance.
Sources: See Chapter 14 in The Art of Multiprocessor Programming
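
The link between depth and capacity can be sketched with the textbook coin-flip construction, assuming tower heights are capped at a fixed maximum. The function names and the max_height parameter below are hypothetical and are not taken from Saturn's API.

```ocaml
(* Draw a tower height: 1 with probability 1/2, 2 with probability 1/4,
   and so on, but never more than [max_height]. *)
let random_height ~max_height =
  let rec go h =
    if h < max_height && Random.bool () then go (h + 1) else h
  in
  go 1

(* With fair coin flips, a maximum height of h keeps searches logarithmic
   for roughly 2^h elements; beyond that, the top level gets crowded. *)
let comfortable_capacity ~max_height = 1 lsl max_height

let () =
  let h = random_height ~max_height:10 in
  assert (h >= 1 && h <= 10);
  assert (comfortable_capacity ~max_height:10 = 1024)
```

Because the cap is fixed when the structure is created, inserting far more elements than this estimate still works but lengthens every search.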

Lock-free Hash Table

Module: Htbl
Description: A resizable lock-free hash table with a snapshot mechanism.
Recommendation: Contains useful high-level operations designed to work as building blocks of non-blocking algorithms.

Lock-free Bag

Module: Bag
Description: A resizable lock-free bag based on the hash table. The pop functions return a random value contained in the bag.

About the Unsafe Data Structures

Some data structures are available in two versions: a normal version and a more optimized but unsafe version. The unsafe version utilizes Obj.magic in a way that may be unsafe with flambda2 optimizations.

The reason for providing the unsafe version is that certain optimizations require features that are currently not available in OCaml, such as arrays of atomics or atomic fields in records. We recommend using the normal version of a data structure unless its performance is not sufficient for your use case. In that case, you can try the unsafe version.

Currently, the following data structures have an unsafe version:

- Single_prod_single_cons_queue_unsafe: a single-producer, single-consumer lock-free queue
- Queue_unsafe: a Michael-Scott lock-free queue
- Bounded_queue_unsafe: a lock-free bounded queue based on the Michael-Scott queue algorithm
- Htbl_unsafe: a lock-free hash table

Usage

This section describes how to use the provided data structures and, more precisely, what not to do with them. Two main points are discussed:

- Some data structures restrict which operations can be performed by a single domain or by a set of domains.
- The currently provided data structures are non-composable.

Data Structures with Domain Roles

Some provided data structures are designed to work with specific domain configurations. These restrictions optimize their implementation, but failing to respect them may compromise safety properties. These limitations are clearly indicated in the documentation and often reflected in the name of the data structure itself. For instance, a single-consumer queue must have only one domain performing pop operations at any given time.

To learn more about it, see this document.

Composability

Composability refers to the ability to combine functions while preserving their properties. For Saturn data structures, the expected properties include atomic consistency (or linearizability) and progress guarantees, such as lock-freedom. Unfortunately, Saturn's data structures are not composable.

To learn more about it, see this document.
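
A small sketch of the problem, using two minimal list-in-an-atomic stacks built from the standard library rather than Saturn's modules: each operation is atomic on its own, but moving an element from one stack to the other is not. The observe callback is a hypothetical hook that stands in for another domain running between the two steps.

```ocaml
let rec push s x =
  let old = Atomic.get s in
  if not (Atomic.compare_and_set s old (x :: old)) then push s x

let rec pop_opt s =
  match Atomic.get s with
  | [] -> None
  | x :: rest as old ->
      if Atomic.compare_and_set s old rest then Some x else pop_opt s

(* pop_opt and push are each atomic, but their composition is not:
   between the two calls the element is in neither stack. *)
let move ~observe src dst =
  match pop_opt src with
  | None -> ()
  | Some x ->
      observe ();
      push dst x

let () =
  let a = Atomic.make [ 42 ] and b = Atomic.make [] in
  let total () = List.length (Atomic.get a) + List.length (Atomic.get b) in
  let seen_mid_move = ref (-1) in
  move ~observe:(fun () -> seen_mid_move := total ()) a b;
  assert (!seen_mid_move = 0); (* 42 was visible in neither stack *)
  assert (total () = 1)
```

An observer that expects the element to always be in one of the two stacks can therefore see a state that no sequential execution of move would produce.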

Testing

One of the many difficulties of implementing parallelism-safe data structures is that, in addition to providing the same safety properties as sequential ones, they may also have to satisfy some liveness properties as well as additional safety properties specific to concurrent programming, like deadlock-freedom.

In addition to the expected safety properties, the main properties we want to test for are:

- linearizability,
- lock-freedom for all the lock-free data structures,
- no potentially harmful data races.

Here is a list of the tools we use to ensure them:

- safety: unit tests and qcheck tests check semantics and expected behaviors with one or more domains.
- safety and liveness: STM tests check linearizability for two domains (see the multicoretests library).
- liveness: dscheck checks the non-blocking property for as many domains as wanted (two domains most of the time). See dscheck.
- safety: tsan checks for the absence of data races.

See test/README.md for more details.
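
As a rough illustration of what a linearizability check looks for, the sketch below runs one operation in each of two domains and verifies that the observed results match some sequential ordering. It is a hand-written toy on a standard-library atomic counter, not an STM test; the real tests generate random command sequences and compare them against a sequential model.

```ocaml
(* Each domain atomically increments the counter and reports the value
   it saw before its own increment. *)
let run_once () =
  let c = Atomic.make 0 in
  let d1 = Domain.spawn (fun () -> Atomic.fetch_and_add c 1)
  and d2 = Domain.spawn (fun () -> Atomic.fetch_and_add c 1) in
  let r1 = Domain.join d1 in
  let r2 = Domain.join d2 in
  (r1, r2)

(* In any sequential ordering of the two calls, one domain sees 0 and the
   other sees 1. Any other pair of results is not linearizable. *)
let linearizable results = results = (0, 1) || results = (1, 0)

let () =
  for _ = 1 to 50 do
    assert (linearizable (run_once ()))
  done
```

A non-atomic read-then-write increment would occasionally produce (0, 0), which this check rejects.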

Benchmarks

There are a number of benchmarks in the bench directory. You can run them with make bench. See bench/README.md for more details.

Contributing

Contributions are appreciated! If you intend to add a new data structure, please read this first.
