Largest-Triangle time-series downsampling algorithm implementation for modern Java
These implementations are based on the paper "Downsampling Time Series for Visual Representation" (2013) by Sveinn Steinarsson, from the Faculty of Industrial Engineering, Mechanical Engineering and Computer Science at the University of Iceland. You can read the paper here
The goal of Largest-Triangle downsampling algorithms for data visualization is to reduce the number of points in a series without losing important visual features of the resulting graph. However, keep in mind that these algorithms are not numerically correct: they aim to preserve how the graph looks, not the numerical properties of the data.
See how this algorithm compares to other algorithms designed to keep local extrema in the input series at ggalmazor.com/blog/evaluating_downsampling_algorithms.html
Javadoc at ggalmazor.com/lttb_downsampling
Download
Latest version: 1.0.2
You can add this library to your Maven/Gradle/SBT/Leiningen project using a couple of source repositories.
JitPack.io
Please follow the instructions at the JitPack.io page for this project. Gradle example:
allprojects {
    repositories {
        maven { url 'https://jitpack.io' }
    }
}
dependencies {
    implementation 'com.github.ggalmazor:lttb_downsampling:1.0.2'
}
Largest-Triangle Three-Buckets
This version of the algorithm groups the points into buckets of the same size and then selects, from each bucket, the point that forms the triangle with the largest area together with points in the neighboring buckets.
You can produce a downsampled version of an input series with:
List<Point> input = Arrays.asList(...);
int numberOfBuckets = 200;
List<Point> output = LTThreeBuckets.ofSorted(input, numberOfBuckets);
The first and last points of the original series are always in the output. The rest are grouped into the defined number of buckets, and the algorithm chooses the best point from each bucket. With 200 buckets, as in the example above, the result is a list of 202 elements.
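The bucket-and-triangle selection described above can be sketched in a few lines. This is a minimal, self-contained illustration of the Largest-Triangle Three-Buckets idea as described in the paper, not this library's actual implementation. The names `LttbSketch`, `XY`, and `downsample` are hypothetical and chosen only for this example, and the bucket boundaries use plain integer division, which may differ from how the library splits the series.

```java
import java.util.ArrayList;
import java.util.List;

class LttbSketch {
    /** Hypothetical stand-in for a 2D point; this is NOT the library's Point type. */
    record XY(double x, double y) {}

    /** Keeps the first and last points plus one point per bucket (input sorted by x). */
    static List<XY> downsample(List<XY> input, int buckets) {
        int n = input.size();
        // Nothing to reduce when every middle point could get its own bucket.
        if (buckets <= 0 || n <= buckets + 2) return new ArrayList<>(input);

        int middle = n - 2; // points between the first and the last one
        List<XY> out = new ArrayList<>(buckets + 2);
        XY prev = input.get(0);
        out.add(prev); // the first point is always kept

        for (int b = 0; b < buckets; b++) {
            // Bucket b covers the indices [start, end) of the input.
            int start = 1 + (int) ((long) b * middle / buckets);
            int end = 1 + (int) ((long) (b + 1) * middle / buckets);

            // Third vertex: average of the next bucket, or the last point
            // of the series when this is the final bucket.
            double cx;
            double cy;
            if (b == buckets - 1) {
                cx = input.get(n - 1).x();
                cy = input.get(n - 1).y();
            } else {
                int nextEnd = 1 + (int) ((long) (b + 2) * middle / buckets);
                double sumX = 0;
                double sumY = 0;
                for (int i = end; i < nextEnd; i++) {
                    sumX += input.get(i).x();
                    sumY += input.get(i).y();
                }
                cx = sumX / (nextEnd - end);
                cy = sumY / (nextEnd - end);
            }

            // Pick the candidate forming the largest triangle with the
            // previously selected point and the third vertex.
            XY best = input.get(start);
            double bestArea = -1;
            for (int i = start; i < end; i++) {
                XY p = input.get(i);
                double area = Math.abs(prev.x() * (p.y() - cy)
                        + p.x() * (cy - prev.y())
                        + cx * (prev.y() - p.y())) / 2.0;
                if (area > bestArea) {
                    bestArea = area;
                    best = p;
                }
            }
            out.add(best);
            prev = best;
        }

        out.add(input.get(n - 1)); // the last point is always kept
        return out;
    }

    public static void main(String[] args) {
        List<XY> series = new ArrayList<>();
        for (int i = 0; i < 5000; i++) series.add(new XY(i, Math.sin(i / 50.0)));
        System.out.println(downsample(series, 200).size()); // prints 202
    }
}
```

Because each bucket contributes exactly one point and the two endpoints are added separately, the output size is always the number of buckets plus two whenever the input is longer than that.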
Notes on Point types
- You must provide the library with lists of instances of the Point supertype.
- The library also provides, and uses internally, the DoublePoint subtype, which can also be used to feed data to the library.
- However, users can create implementations of Point that best fit their domain.
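As an illustration of the last note, a domain type can double as a point. The sketch below is self-contained and declares its own stand-in `Point` interface. The accessor names `getX()` and `getY()` are an assumption made for this example, so check the Javadoc for the methods the library's `Point` actually requires. `CustomPointSketch` and `TemperatureReading` are hypothetical names.

```java
import java.time.Instant;
import java.util.List;

class CustomPointSketch {
    /** Stand-in for the library's Point supertype; the accessor names are assumed. */
    interface Point {
        double getX();
        double getY();
    }

    /** Domain type: a temperature reading that can be treated as a point. */
    record TemperatureReading(Instant takenAt, double celsius) implements Point {
        @Override
        public double getX() {
            return takenAt.toEpochMilli(); // time on the horizontal axis
        }

        @Override
        public double getY() {
            return celsius; // measured value on the vertical axis
        }
    }

    public static void main(String[] args) {
        List<Point> input = List.of(
                new TemperatureReading(Instant.ofEpochMilli(0), 20.5),
                new TemperatureReading(Instant.ofEpochMilli(60_000), 21.0));
        for (Point p : input) {
            System.out.println(p.getX() + " -> " + p.getY());
        }
    }
}
```

With this approach the downsampled list still holds the original domain objects, so any extra fields (here, the `Instant`) remain available after downsampling.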
Largest-Triangle Dynamic
Not yet implemented
Example
This is what a raw time series with ~5000 data points and its downsampled versions (2000, 500, and 250 buckets) look like (graphed by AirTable)
These are close-ups for 250, 500, 1000, and 2000 buckets with raw data in the back: