Home

Awesome

A single-producer, single-consumer lock-free queue for C++

This mini-repository has my very own implementation of a lock-free queue (that I designed from scratch) for C++.

It only supports a two-thread use case (one consuming, and one producing). The threads can't switch roles, though you could use this queue completely from a single thread if you wish (but that would sort of defeat the purpose!).

Note: If you need a general-purpose multi-producer, multi-consumer lock free queue, I have one of those too.

This repository also includes a circular-buffer SPSC queue which supports blocking on enqueue as well as dequeue.

Features

Use

Simply drop the readerwriterqueue.h (or readerwritercircularbuffer.h) and atomicops.h files into your source code and include them :-) A modern compiler is required (MSVC2010+, GCC 4.7+, ICC 13+, or any C++11 compliant compiler should work).

Note: If you're using GCC, you really do need GCC 4.7 or above -- 4.6 has a bug that prevents the atomic fence primitives from working correctly.

Example:

using namespace moodycamel;

ReaderWriterQueue<int> q(100);       // Reserve space for at least 100 elements up front

q.enqueue(17);                       // Will allocate memory if the queue is full
bool succeeded = q.try_enqueue(18);  // Will only succeed if the queue has an empty slot (never allocates)
assert(succeeded);

int number;
succeeded = q.try_dequeue(number);  // Returns false if the queue was empty

assert(succeeded && number == 17);

// You can also peek at the front item of the queue (consumer only)
int* front = q.peek();
assert(*front == 18);
succeeded = q.try_dequeue(number);
assert(succeeded && number == 18);
front = q.peek(); 
assert(front == nullptr);           // Returns nullptr if the queue was empty

The blocking version has the exact same API, with the addition of wait_dequeue and wait_dequeue_timed methods:

BlockingReaderWriterQueue<int> q;

std::thread reader([&]() {
    int item;
#if 1
    for (int i = 0; i != 100; ++i) {
        // Fully-blocking:
        q.wait_dequeue(item);
    }
#else
    for (int i = 0; i != 100; ) {
        // Blocking with timeout
        if (q.wait_dequeue_timed(item, std::chrono::milliseconds(5)))
            ++i;
    }
#endif
});
std::thread writer([&]() {
    for (int i = 0; i != 100; ++i) {
        q.enqueue(i);
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
});
writer.join();
reader.join();

assert(q.size_approx() == 0);

Note that wait_dequeue will block indefinitely while the queue is empty; this means care must be taken to only call wait_dequeue if you're sure another element will come along eventually, or if the queue has a static lifetime. This is because destroying the queue while a thread is waiting on it will invoke undefined behaviour.

The blocking circular buffer has a fixed number of slots, but is otherwise quite similar to use:

BlockingReaderWriterCircularBuffer<int> q(1024);  // pass initial capacity

q.try_enqueue(1);
int number;
q.try_dequeue(number);
assert(number == 1);

q.wait_enqueue(123);
q.wait_dequeue(number);
assert(number == 123);

q.wait_dequeue_timed(number, std::chrono::milliseconds(10));

CMake

Using targets in your project

Using this project as a part of an existing CMake project is easy.

In your CMakeLists.txt:

include(FetchContent)

FetchContent_Declare(
  readerwriterqueue
  GIT_REPOSITORY    https://github.com/cameron314/readerwriterqueue
  GIT_TAG           master
)

FetchContent_MakeAvailable(readerwriterqueue)

add_library(my_target main.cpp)
target_link_libraries(my_target PUBLIC readerwriterqueue)

In main.cpp:

#include <readerwriterqueue.h>

int main()
{
    moodycamel::ReaderWriterQueue<int> q(100);
}

Installing into system directories

As an alternative to including the source files in your project directly, you can use CMake to install the library in your system's include directory:

mkdir build
cd build
cmake ..
make install

Then, you can include it from your source code:

#include <readerwriterqueue/readerwriterqueue.h>

Disclaimers

The queue should only be used on platforms where aligned integer and pointer access is atomic; fortunately, that includes all modern processors (e.g. x86/x86-64, ARM, and PowerPC). Not for use with a DEC Alpha processor (which has very weak memory ordering) :-)

Note that it's only been tested on x86(-64); if someone has access to other processors I'd love to run some tests on anything that's not x86-based.

More info

See the LICENSE.md file for the license (simplified BSD).

My blog post introduces the context that led to this code, and may be of interest if you're curious about lock-free programming.