
Largest-Triangle time-series downsampling algorithm implementation for modern Java


These implementations are based on the paper "Downsampling Time Series for Visual Representation" by Sveinn Steinarsson from the Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland (2013). You can read the paper here

The goal of Largest-Triangle downsampling algorithms for data visualization is to reduce the number of points in a data series without losing important visual features of the resulting graph. Keep in mind, however, that these algorithms are not numerically correct: the output is meant for plotting, not for further numerical analysis.

See how this algorithm compares to other algorithms designed to keep local extrema in the input series at ggalmazor.com/blog/evaluating_downsampling_algorithms.html

Javadoc at ggalmazor.com/lttb_downsampling

Download

Latest version: 1.0.2

You can add this library to your Maven/Gradle/SBT/Leiningen project using a couple of source repositories.

JitPack.io

Please follow the instructions at the JitPack.io page for this project. Gradle example:

allprojects {
  repositories {
    maven { url 'https://jitpack.io' }
  }
}

dependencies {
  implementation 'com.github.ggalmazor:lttb_downsampling:1.0.2'
}

Largest-Triangle Three-Buckets

This version of the algorithm groups the points into buckets of the same size and then selects, from each bucket, the point that forms the triangle with the largest area together with points in the neighboring buckets.
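The selection step can be illustrated with a minimal, self-contained sketch. It is not the library's code: the `P` type and the `area` and `pick` helpers are hypothetical names made up for this example. Following the paper, the two neighboring reference points are assumed to be the point already chosen from the previous bucket and the average of the next bucket.

```java
import java.util.List;

final class LargestTriangleSketch {
  // Hypothetical minimal point type for this sketch; not the library's Point.
  static final class P {
    final double x;
    final double y;

    P(double x, double y) {
      this.x = x;
      this.y = y;
    }
  }

  // Area of the triangle a-b-c: half the absolute value of the cross product.
  static double area(P a, P b, P c) {
    return Math.abs((a.x - c.x) * (b.y - a.y) - (a.x - b.x) * (c.y - a.y)) / 2.0;
  }

  // Returns the point of a non-empty bucket that forms the largest triangle
  // with the two reference points taken from the neighboring buckets.
  static P pick(P previous, List<P> bucket, P next) {
    P best = bucket.get(0);
    double bestArea = -1;
    for (P candidate : bucket) {
      double candidateArea = area(previous, candidate, next);
      if (candidateArea > bestArea) {
        bestArea = candidateArea;
        best = candidate;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    P previous = new P(0, 0);
    P next = new P(4, 0);
    List<P> bucket = List.of(new P(1, 1), new P(2, 5), new P(3, 2));
    P chosen = pick(previous, bucket, next);
    System.out.println(chosen.x + ", " + chosen.y); // prints "2.0, 5.0"
  }
}
```

The spike at (2, 5) wins because it lies farthest from the line between the two reference points, which is why visually prominent peaks and troughs tend to survive downsampling.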

You can produce a downsampled version of an input series with:

List<Point> input = Arrays.asList(...);
int numberOfBuckets = 200;

List<Point> output = LTThreeBuckets.ofSorted(input, numberOfBuckets);

The first and last points of the original series are always in the output. The rest are grouped into the defined number of buckets, and the algorithm chooses the best point from each bucket, resulting in a list of 202 elements.
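To make the "buckets plus two" output size concrete, here is a rough, self-contained sketch of the whole procedure. It is an illustration under stated assumptions, not the library's implementation: `ThreeBucketsSketch`, `P`, and `downsample` are hypothetical names, bucket boundaries are computed with integer division (so sizes are only nearly equal when the input does not divide evenly), and inputs too short to fill every bucket are returned unchanged.

```java
import java.util.ArrayList;
import java.util.List;

final class ThreeBucketsSketch {
  // Hypothetical minimal point type for this sketch; not the library's Point.
  static final class P {
    final double x;
    final double y;

    P(double x, double y) {
      this.x = x;
      this.y = y;
    }
  }

  static double area(P a, P b, P c) {
    return Math.abs((a.x - c.x) * (b.y - a.y) - (a.x - b.x) * (c.y - a.y)) / 2.0;
  }

  // Expects points sorted by x. Returns buckets + 2 points for long enough inputs.
  static List<P> downsample(List<P> sorted, int buckets) {
    int n = sorted.size();
    // Assumption of this sketch: short inputs are returned as a plain copy.
    if (buckets <= 0 || n <= buckets + 2) return new ArrayList<>(sorted);

    int inner = n - 2; // every point except the first and the last
    P first = sorted.get(0);
    P last = sorted.get(n - 1);
    List<P> out = new ArrayList<>(buckets + 2);
    out.add(first);

    P previous = first;
    for (int i = 0; i < buckets; i++) {
      int from = 1 + (int) ((long) i * inner / buckets);
      int to = 1 + (int) ((long) (i + 1) * inner / buckets); // exclusive

      // Reference on the right: average of the next bucket, or the last point.
      P next = last;
      if (i < buckets - 1) {
        int nextTo = 1 + (int) ((long) (i + 2) * inner / buckets);
        double sumX = 0;
        double sumY = 0;
        for (int j = to; j < nextTo; j++) {
          sumX += sorted.get(j).x;
          sumY += sorted.get(j).y;
        }
        next = new P(sumX / (nextTo - to), sumY / (nextTo - to));
      }

      // Keep the point of this bucket that forms the largest triangle.
      P best = sorted.get(from);
      double bestArea = -1;
      for (int j = from; j < to; j++) {
        double candidateArea = area(previous, sorted.get(j), next);
        if (candidateArea > bestArea) {
          bestArea = candidateArea;
          best = sorted.get(j);
        }
      }
      out.add(best);
      previous = best;
    }

    out.add(last);
    return out;
  }

  public static void main(String[] args) {
    List<P> input = new ArrayList<>();
    for (int i = 0; i < 5000; i++) input.add(new P(i, Math.sin(i / 50.0)));
    System.out.println(downsample(input, 200).size()); // prints 202
  }
}
```

Each bucket contributes exactly one point and the two endpoints are added unconditionally, which is where the 202 comes from for 200 buckets.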

Notes on Point types

Users must provide this library with lists of instances of the Point supertype.
The library also provides, and uses internally, the DoublePoint subtype, which can be used to feed data to the library as well.
However, users can create their own implementations of Point that best fit their domain.
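As a sketch of what a domain-specific implementation might look like, the example below wraps a temperature reading. Everything in it is an assumption made for illustration: the `Point` interface declared here is a local stand-in that supposes the real supertype exposes `getX()` and `getY()` (check the Javadoc for the actual contract), and `TemperatureReading` is a hypothetical domain type.

```java
import java.time.Instant;

final class CustomPointSketch {
  // Local stand-in for the library's Point supertype. Assumption: the real
  // type exposes x and y as doubles; its actual methods may differ.
  interface Point {
    double getX();

    double getY();
  }

  // Hypothetical domain type: x is the timestamp, y is the measured value.
  static final class TemperatureReading implements Point {
    private final Instant timestamp;
    private final double celsius;

    TemperatureReading(Instant timestamp, double celsius) {
      this.timestamp = timestamp;
      this.celsius = celsius;
    }

    Instant timestamp() {
      return timestamp;
    }

    @Override
    public double getX() {
      return timestamp.toEpochMilli();
    }

    @Override
    public double getY() {
      return celsius;
    }
  }

  public static void main(String[] args) {
    Point point = new TemperatureReading(Instant.ofEpochMilli(1_000L), 21.5);
    System.out.println(point.getX() + " -> " + point.getY());
  }
}
```

Keeping the domain object as the point itself means the downsampled list still carries the original fields (here, the `Instant`), so there is no need to map the output back to the source records.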

Largest-Triangle Dynamic

Not yet implemented

Example

This is what a raw time series with ~5000 data points and its downsampled versions (2000, 500, and 250 buckets) look like (graphed by AirTable):

These are close-ups for 250, 500, 1000, and 2000 buckets with raw data in the back: