Awesome
Scientist
A C++ library for carefully refactoring critical paths. This is a port of Github's Ruby Scientist library.
Usage
Currently it follows closely the original API (and the .NET version).
Here is how a simple experiment is defined in C++:
#include <scientist.hh>
int res = Scientist<int>::Science("do-stuff", [](ExperimentInterface<int>& e)
{
e.Use([]() { return do_stuff_legacy();});
e.Try([]() { return do_stuff();});
});
Scientist has most of the features of the Ruby library:
- It decides whether or not to run the
Try
function - Randomizes the order of
Try
andUse
functions - Measures the durations of both functions
- Compares the results
- Swallows exceptions from
Try
function - Publishes all observations
Installation
Currently there is only a single header that is required. Copy scientist.hh
somewhere and point the compiler at the location.
Experiments
Experiments are described using the following interface:
template <class T, class U = T>
class ExperimentInterface
{
public:
virtual void BeforeRun(Setup setup) = 0;
virtual void Use(Operation<T> control) = 0;
virtual void Try(Operation<T> candidate) = 0;
virtual void Ignore(Predicate ignore) = 0;
virtual void RunIf(Predicate runIf) = 0;
virtual void Publish(Publisher<U> publisher) = 0;
virtual void PublishAsync(Publisher<U> publisher) = 0;
virtual void Compare(Compare<T> compare) = 0;
virtual void Cleanup(Transform<T,U> cleanup) = 0;
virtual void Context(std::string key, std::string value) = 0;
};
using Operation = std::function<T()>;
using Compare = std::function<bool(const T&, const T&)>;
using Predicate = std::function<bool()>;
using Publisher = std::function<void(const Observation<T>&)>;
using Transform = std::function<U(const T&)>;
using Setup = std::function<void()>;
Template parameter T
denotes the result type of the operations, and U
the possible cleaned result type (See Cleanup)
Observations
For each experiment run, Scientist returns an observation in the following form:
template <class T>
class Observation
{
const std::string& Name() const;
bool Success() const;
std::chrono::nanoseconds ControlDuration() const;
std::exception_ptr ControlException() const;
T ControlResult() const;
std::chrono::nanoseconds CandidateDuration() const;
std::exception_ptr CandidateException() const;
T CandidateResult() const;
std::list<std::string> ContextKeys() const;
std::pair<bool, const std::string&> Context(std::string key) const;
};
An experiment is successful if:
- Control and candidate return identical results (according to the equal operator or given custom comparator)
- They both throw an exception
Try
is ignored (See Ignore)
To gather observations, register Publish
function:
int res = Scientist<int>::Science("", [](ExperimentInterface<int>& e)
{
...
e.Publish([](const Observation<int>& o)
{
...
}
});
There can be a number of registered publishers.
All Publish
functions are executed before the control result is returned.
There exists a asynchronous version, PublishAsync
, for long running operations.
See publish tests for more examples.
Comparison
You can specify a custom comparison function:
int res = Scientist<int>::Science("", [](ExperimentInterface<int>& e)
{
...
e.Compare([](const int&, const int&) { ... });
});
Comparison function is required if the result type does not have an equal operator. Otherwise, all experiments fail silently.
See comparison tests for more examples.
Control the stored results
The Observation
contains results from both operations, which might not be ideal in all cases.
You can register a Cleanup
function, which maps the returned data into desired form.
This cleaned value is available in observations.
struct Data
{
int Field;
};
Scientist<Data, int>::Science("", [](ExperimentInterface<Data, int>& e)
{
e.Use([]() { return Data { 0 };});
e.Try([]() { return Data { 1 };});
e.Publish([&](const Observation<int>& o)
{
...
});
e.Cleanup([](const Data& d ) { return d.Field; });
e.Compare([]( const Data& a, const Data& b) { return a.Field == b.Field;});
});
The second template parameter denotes the type of the cleaned value (int
in this case).
If the cleanup is required (i.e. T
!= U
) but is not given, Observation
has the
default constructed value U()
.
See cleanup tests for more examples.
Ignore known issues
You can ignore some test results with Ignore
function.
Experiment is ignored if any Ignore
function returns true
or throws an exception.
Exceptions are swallowed.
int res = Scientist<int>::Science("", [&](ExperimentInterface<int>& e)
{
e.Use([]() { return 1;});
e.Try([]() { return 1;});
e.Ignore([]() { return true; });
});
See Ignore tests for more examples.
Disable experiments
Experiments can be disabled with RunIf
function.
If at least one of them (or there is none) return true
, the experiment is run.
If an exception is caught from any RunIf
function, the experiment is not run.
int res = Scientist<int>::Science("", [&](ExperimentInterface<int>& e)
{
e.Use([]() { return 42;});
e.Try([]() { return 1;});
e.RunIf([]() { return true; });
});
See RunIf tests for more examples.
Context
You can add contextual information to observations as string key-value pairs.
Writing same key multiple times overwrites the previous value.
This information can be queried from Observation
as shown below:
Scientist<int>::Science("", [](ExperimentInterface<int>& e)
{
e.Use([]() { return 42;});
e.Try([]() { return 0;});
e.Context("key1", "value");
e.Context("key2", "value");
e.Publish([](const Observation<int>& o)
{
for (std::string key: o.ContextKeys())
{
std::pair<bool, std::string> value = o.Context(key);
if (value.first)
std::cout << key << " : " << value.second << std::endl;
}
});
});
The first value (bool
) of the returned pair from Observation::Context
is true
if the requested key was found.
Expensive Setup
If an experiment requires expensive setup that should only occur once for enabled experiments, register a setup with
BeforeRun
function:
Scientist<int>::Science("", [&](ExperimentInterface<int>& e)
{
...
e.BeforeRun([&]() { ... });
});
You can register any number of setup functions, and they are run in the order of registration. These functions are only run if the experiment is enabled (See RunIf).
See BeforeRun tests for more examples.
Exceptions
Exceptions from both Try
and Use
functions are caught and stored in the Observation
.
Exceptions from Use
function are rethrown after all Publish
functions have been called.
Exceptions can be rethrown with std::rethrow_exception
and handled accordingly.
Tests
Tests are written with Google Test. To run tests, install CMake and run the following commands:
cd path/to/scientist
git submodule update --init
mkdir build
cd build
cmake ..
make tests
./tests
TODO
- Come up with a better name
- Clean up the SFINAE magic if possible
- Finalize the API
- Allow more than one
Try
function - Define an interface for publishers and register them separately (See IResultPublisher in Scientist.NET)
- Add context to experiments (See Ruby version)