k-Shape: Efficient and Accurate Clustering of Time Series

k-Shape is a highly accurate and efficient unsupervised method for clustering univariate and multivariate time series. k-Shape appeared at the ACM SIGMOD 2015 conference, where it was selected as one of the two best papers and received the inaugural 2015 ACM SIGMOD Research Highlight Award. An extended version appeared in ACM TODS in 2017. Since then, k-Shape has achieved state-of-the-art performance on both univariate and multivariate time-series datasets: it is among the fastest and most accurate time-series clustering methods and ranks in the top positions of established benchmarks with 100+ datasets.

k-Shape has been widely adopted across scientific areas (e.g., computer science, social science, space science, engineering, econometrics, biology, neuroscience, and medicine), Fortune 100-500 enterprises (e.g., Exelon, Nokia, and many financial firms), and organizations such as the European Space Agency.

If you use k-Shape in your project or research, cite the following two papers:

References

"k-Shape: Efficient and Accurate Clustering of Time Series"
John Paparrizos and Luis Gravano
2015 ACM SIGMOD International Conference on Management of Data (ACM SIGMOD 2015)

@inproceedings{paparrizos2015k,
  title={{k-Shape: Efficient and Accurate Clustering of Time Series}},
  author={Paparrizos, John and Gravano, Luis},
  booktitle={Proceedings of the 2015 ACM SIGMOD international conference on management of data},
  pages={1855--1870},
  year={2015}
}

"Fast and Accurate Time-Series Clustering"
John Paparrizos and Luis Gravano
ACM Transactions on Database Systems (ACM TODS 2017), volume 42(2), pages 1-49

@article{paparrizos2017fast,
  title={{Fast and Accurate Time-Series Clustering}},
  author={Paparrizos, John and Gravano, Luis},
  journal={ACM Transactions on Database Systems (ACM TODS)},
  volume={42},
  number={2},
  pages={1--49},
  year={2017}
}

Acknowledgements

We thank Teja Bogireddy for his valuable help on this repository.

k-Shape's MATLAB Repository

This repository contains the MATLAB implementation of k-Shape. A Python version is also available.

Data

To ease reproducibility, we share our results over two established benchmarks: 128 univariate datasets from the UCR archive and 28 multivariate datasets from the UEA archive.

For the preprocessing steps check here.

Usage

Univariate Example

$ matlab
> Datasets = {'Coffee'};             % names of the UCR datasets to cluster
> DS = LoadUCRdataset(Datasets{1});  % load the first (here, the only) dataset
> [labels, centroids] = kShape_univariate(DS.Data, length(DS.ClassNames));  % k = number of classes

Multivariate Example

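$ matlab
> Datasets = {'ERing'};              % names of the multivariate datasets to cluster
> DS = LoadUAEdataset(Datasets{1});  % load the first (here, the only) dataset
> [labels, centroids] = kShape_multivariate(DS.Data, length(DS.ClassNames));  % k = number of classes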

Check the Univariate and Multivariate code examples for benchmarking on the UCR and UEA datasets, respectively.

Results

The following tables contain the average Rand Index (RI), Adjusted Rand Index (ARI), and Normalized Mutual Information (NMI) accuracy values over 10 runs for k-Shape on the univariate and multivariate datasets.
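As a reminder of what two of these metrics measure, here is a minimal sketch that computes RI and ARI from a pair of label vectors using the standard pair-counting definitions. The vectors `truth` and `pred` are toy values chosen for illustration, not benchmark data, and the repository's own evaluation code may compute the scores differently.

```matlab
% Sketch: Rand Index (RI) and Adjusted Rand Index (ARI) from two labelings.
% truth and pred are illustrative toy labels (assumption, not benchmark data).
truth = [1 1 1 2 2 2];
pred  = [1 1 2 2 2 2];

% Contingency table: C(i,j) = number of points in true class i and cluster j.
[~, ~, ti] = unique(truth(:));
[~, ~, pj] = unique(pred(:));
C = accumarray([ti pj], 1);

n = numel(truth);
pairs = @(x) x .* (x - 1) / 2;     % number of unordered pairs, "x choose 2"

sumC   = sum(pairs(C(:)));         % pairs placed together in both labelings
sumRow = sum(pairs(sum(C, 2)));    % pairs placed together in truth
sumCol = sum(pairs(sum(C, 1)));    % pairs placed together in pred
total  = pairs(n);                 % all pairs of points

% RI: fraction of pairs on which the two labelings agree
% (together in both, or apart in both).
ri = (total + 2 * sumC - sumRow - sumCol) / total;

% ARI: the same agreement, corrected for the level expected by chance.
expected = sumRow * sumCol / total;
maxIndex = (sumRow + sumCol) / 2;
ari = (sumC - expected) / (maxIndex - expected);

fprintf('RI = %.4f, ARI = %.4f\n', ri, ari);
```

For these toy labels the script prints `RI = 0.6667, ARI = 0.3243`. RI alone can look high even for poor clusterings when there are many classes, which is why the chance-corrected ARI is reported alongside it.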

Note: We collected the results using a single-core implementation.

Server Specifications: Dual Intel(R) Xeon(R) Silver 4116 (24 cores/48 HT), 2.10 GHz, 196GB RAM.
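To collect comparable runtimes on a different machine, one option is to restrict MATLAB to a single computational thread and time repeated runs. The sketch below does this with the built-in `maxNumCompThreads`, `tic`, and `toc`; whether the reported numbers were collected this way is not stated here, so treat it as one possible setup. `clusterFn` is a hypothetical placeholder that keeps the script runnable on its own; in practice it would wrap a call such as `kShape_univariate`. The toy data layout (one series per row) is also an assumption.

```matlab
% Sketch: time repeated clustering runs on a single computational thread.
prevThreads = maxNumCompThreads(1);    % limit MATLAB to one thread, keep old setting

rng(0);                                % fixed seed so the toy data is reproducible
X = randn(20, 50);                     % toy data: 20 series of length 50 (assumed layout)
k = 2;                                 % number of clusters

% Hypothetical stand-in for the real clustering call, e.g.
%   clusterFn = @(X, k) kShape_univariate(X, k);
clusterFn = @(X, k) mod((1:size(X, 1))', k) + 1;

numRuns  = 10;                         % the results here are averaged over 10 runs
runtimes = zeros(numRuns, 1);
labels   = cell(numRuns, 1);
for r = 1:numRuns
    t = tic;
    labels{r} = clusterFn(X, k);       % one clustering run
    runtimes(r) = toc(t);              % elapsed wall-clock seconds for this run
end

maxNumCompThreads(prevThreads);        % restore the previous thread limit

fprintf('Mean runtime over %d runs: %.6f secs\n', numRuns, mean(runtimes));
```

Starting MATLAB with the `-singleCompThread` option is an alternative to calling `maxNumCompThreads` inside the script.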

Results on the 128 univariate datasets:

Datasets    RI    ARI    NMI    Runtime (secs)
ACSF10.7201300.1338530.3881616.44156
Adiac0.9502430.2451070.588554465.73705
AllGestureWiimoteX0.83127240.0979740.20686544.08482
AllGestureWiimoteY0.83226200.12985620.261207241.394241
AllGestureWiimoteZ0.83056390.08055510.183499836.600462
ArrowHead0.6230060.174254500.25334441.3054324
BME0.6232020.19056010.28772190.4999676
Beef0.65864400.0936080.275481890.8396471
BeetleFly0.522179480.044387710.055631890.536824
BirdChicken0.5571790.11474530.11158650.3971603
CBF0.87541160.72412170.767182.6086939
Car0.6621840.1358450.21613952.8708315
Chinatown0.52755380.0437590.0169240.3752007
ChlorineConcentration0.5261843-0.00098910.000764845.2590362
CinCECGTorso0.631440.06270540.105833118.6877229
Coffee0.77461030.5496420.51308210.1543948
Computers0.52968090.059599650.05736844.7628411
CricketX0.869680.177700.35846818.7591078
CricketY0.87162230.2029530.37246620.3061207
CricketZ0.87084780.1814790.36608623.5766044
Crop0.9228960.23788240.43795652016.38332
DiatomSizeReduction0.9191380.80004430.820792.0050213
DistalPhalanxOutlineAgeGroup0.70898050.408800.33273411.8217814
DistalPhalanxOutlineCorrect0.4994557-0.00103032.97467e-050.9867624
DistalPhalanxTW0.8612180.666772090.54124762.6783893
DodgerLoopDay0.76671770.20805490.4031201.537474
DodgerLoopGame0.55921950.1189730.10078040.4339152
DodgerLoopWeekend0.8837050.76399010.7264880.4244024
ECG2000.6157230.2210280.13552040.3517059
ECG50000.7942730.57895880.55108662.80226
ECGFiveDays0.84506220.690240.650358601.9896606
EOGHorizontalSignal0.86218250.221060.398858876.3076898
EOGVerticalSignal0.87126300.19874070.3630311136.628252
Earthquakes0.5154630.0024419350.003659345.8951894
ElectricDevices0.6997130.081027120.1900975798.8596981
EthanolLevel0.6227210.00328260.007663.510865
FaceAll0.9146470.4465070.62130377.628496
FaceFour0.7562740.373904660.4598480.666998
FacesUCR0.9054140.4072500.60298182.0091669
FiftyWords0.9512680.3538080.64682277.2777564
Fish0.784690.18856220.319317.698090
FordA0.57294170.145880.108051392.9051991
FordB0.5128850.0257690.0192114338.176240
FreezerRegularTrain0.6386380.2772770.21135821.8496562
FreezerSmallTrain0.639120.27824640.212177020.948636
Fungi0.83836080.3705850.74417872.4766722
GestureMidAirD10.9449960.29241810.63007820.7455286
GestureMidAirD20.9459830.325120.66828717.3725095
GestureMidAirD30.931910.12871440.46299520.0660984
GesturePebbleZ10.8828120.586720.6721855.3699548
GesturePebbleZ20.8656870.5312160.6277075.6506422
GunPoint0.497487-0.0050500.00.27812729
GunPointAgeSpan0.5321330.064425480.05343330.8154745
GunPointMaleVersusFemale0.79193890.5838640.5745840.9656176
GunPointOldVersusYoung0.5185690.03714190.027928630.8558587
Ham0.52517660.0503640.03937451.4598628
HandOutlines0.6842680.362750.2533108.3821474
Haptics0.6839340.064814230.08846718.5722705
Herring0.50180850.00384260.00795380.5510935
HouseTwenty0.5184450.0364080.02944338.1808792
InlineSkate0.73494550.0355250.101388760.3541959
InsectEPGRegularTrain0.70800330.36703330.38150316.3469509
InsectEPGSmallTrain0.7068290.36433300.3814496.4372878
InsectWingbeatSound0.8174020.2039790.417454106.50463
ItalyPowerDemand0.6320770.26438810.22901631.142456
LargeKitchenAppliances0.59598610.15876580.15829719.1498729
Lightning20.5317350.0578070.0919831.2679817
Lightning70.8098290.32456820.50334052.9859982
Mallat0.92807030.73190020.88074263.637825
Meat0.811520.5987690.66189770.769969
MedicalImages0.67581020.07286510.227334421.5877388
MelbournePedestrian0.8708590.3526340.47578753.1154542
MiddlePhalanxOutlineAgeGroup0.723160830.40754200.39443121.2056471
MiddlePhalanxOutlineCorrect0.49977174-0.0037363400.00089481.3280515
MiddlePhalanxTW0.8241020.5160910.44394754.2374399
MixedShapesRegularTrain0.8133640.4548670.517074281.8805552
MixedShapesSmallTrain0.80434810.428627250.4831717132.92262
MoteStrain0.8032140.6063980.49891541.9564175
NonInvasiveFetalECGThorax10.9506630.32847140.673888983.400489
NonInvasiveFetalECGThorax20.9666160.4609780.7620196914.974343
OSULeaf0.78455990.262668250.363220525.6948894
OliveOil0.845370.6416110.64826960.9241004
PLAID0.85759270.27578230.400212139.861927
PhalangesOutlinesCorrect0.505354510.010687590.01021173.467281
Phoneme0.92786810.0348550.21230284.5056299
PickupGestureWiimoteZ0.8560000.2732050.5209600.959337
PigAirwayPressure0.9128010.028880900.4359559108.75665
PigArtPressure0.9578550.253240.706551885.6010015
PigCVP0.9604290.1856620.6544195.0936539
Plane0.92716330.739008120.859890.5993027
PowerCons0.6034720.20731080.1925111.0429059
ProximalPhalanxOutlineAgeGroup0.7707492740.5130010.4768261.5074031
ProximalPhalanxOutlineCorrect0.5341420.06690390.08612430.5041136
ProximalPhalanxTW0.82901530.56444400.55016822.0008952
RefrigerationDevices0.5577990.005912100.00781414.5373435
Rock0.7039750.23644930.3458878814.234106
ScreenType0.5599130.011231460.012179019.1829763
SemgHandGenderCh20.54666220.0922210.05881712.8683178
SemgHandMovementCh20.74448370.13193580.225027762.862215
SemgHandSubjectCh20.72966060.20508910.275499449.4141228
ShakeGestureWiimoteZ0.9018980.4624420.67470471.5737051
ShapeletSim0.73778810.47573320.4479080.9869574
ShapesAll0.97808720.41802660.740378166.1683117
SmallKitchenAppliances0.37372100.00127300.02172314.4734788
SmoothSubspace0.6270100.1643960.1792252.1865543
SonyAIBORobotSurface10.6754730.35056790.351826601.6894352
SonyAIBORobotSurface20.60007750.19394380.13656852.9754101
StarLightCurves0.7663130.51225110.6024638964.46397
Strawberry0.504165-0.0193980.123394.6568317
SwedishLeaf0.89572850.321608170.558681324.8132561
Symbols0.8891960.65000500.777543011.0962628
SyntheticControl0.889202550.6139620.7108652.6594387
ToeSegmentation10.50201240.004073190.00508041.2348743
ToeSegmentation20.6351880.262366560.2100260.8832202
Trace0.69850250.43200410.581080.748028
TwoLeadECG0.54481220.08964790.069460364.2016234
TwoPatterns0.6874060.23545460.337780467.9150568
UMD0.6010050.120674200.16850330.6761436
UWaveGestureLibraryAll0.914055890.61469670.682234306.4056649
UWaveGestureLibraryX0.8568310.3640440.4629476164.2872381
UWaveGestureLibraryY0.8299820.24564310.34295234302.7145652
UWaveGestureLibraryZ0.84660150.347109140.456047186.73311
Wafer0.5383200.01382770.0034295932.045277
Wine0.4964785-0.00518790.00105640.5772297
WordSynonyms0.89550530.2263089830.452164465145.8095883
Worms0.65179010.04007970.0640355.9496994
WormsTwoClass0.5028260.00548060.00837002.8476746
Yoga0.4999203-0.000260.00015200151.1891111

Results on the 28 multivariate datasets:

Datasets    RI    ARI    NMI    Runtime (secs)
ArticularyWordRecognition0.776716250.071493550.30117891546.01149
AtrialFibrillation0.56873560.0119880.080268035.360911
BasicMotions0.83018980.54324720.59845530.48984
CharacterTrajectories0.68447120.10702940.3221812456.18020
Cricket0.83565480.2760090.50898411429.36256
DuckDuckGeese0.6328480.019254160.07765418474.327226
ERing0.817924190.367326250.455813686163.9078219
Epilepsy0.79845780.47439800.51779585268.4851809
EthanolConcentration0.5652897-0.000745360.003556034611.921332
FaceDetection0.50005010.000101080.00015046169734.49986
FingerMovements0.49992460.000771810.00301908146.7177615
HandMovementDirection0.5646850.00053540.015490653.58844
Handwriting0.90880800.03315150.2049001473.25828
Heartbeat0.5034637-0.0011990.000837432197.214505
InsectWingbeat0.752760.001450.00373868301.6473
JapaneseVowels0.778929960.06489880.143400386.25808
LSST0.781722310.053180.0993892098.3571827
Libras0.8697260950.1726330.419726405.35505
MotorImagery0.50062730.00149680.0032002010928.88054
NATOPS0.7359450.08596920.14600909514.7707
PenDigits0.78920670.1631340.2910981503.162750
PhonemeSpectra0.94439790.01898950.11997013252.5714916
RacketSports0.6078550.04295150.0708733107.0638691
SelfRegulationSCP10.5426620.08547830.0728741256.628221
SelfRegulationSCP20.499026-0.001945600.0004991021.38219
SpokenArabicDigits0.8445460.21308300.3320066801.11004
StandWalkJump0.584900.112690.185295623.50776
UWaveGestureLibrary0.779150.19281670.3521283785.27181