Time-Series Clustering in R Using the dtwclust Package

Abstract:

Most clustering strategies have not changed considerably since their initial definition. The common improvements are either related to the distance measure used to assess dissimilarity, or the function used to calculate prototypes. Time-series clustering is no exception, with the Dynamic Time Warping distance being particularly popular in that context. This distance is computationally expensive, so many related optimizations have been developed over the years. Since no single clustering algorithm can be said to perform best on all datasets, different strategies must be tested and compared, so a common infrastructure can be advantageous. In this manuscript, a general overview of shape-based time-series clustering is given, including many specifics related to Dynamic Time Warping and associated techniques. At the same time, a description of the dtwclust package for the R statistical software is provided, showcasing how it can be used to evaluate many different time-series clustering procedures.


Author

Alexis Sardá-Espinosa

Published

Aug. 15, 2019

Received

Apr. 4, 2018

DOI

10.32614/RJ-2019-023

Volume

11/1

Pages

22-43

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2019-023.zip

CRAN packages used

dtwclust, flexclust, cluster, TSdist, TSclust, pdc, dtw, proxy, clue, foreach, RcppParallel, doParallel

CRAN Task Views implied by cited packages

TimeSeries, Cluster, Multivariate, HighPerformanceComputing, Environmetrics, Optimization, Robust

Footnotes

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Sardá-Espinosa, "Time-Series Clustering in R Using the dtwclust Package", The R Journal, 2019

    BibTeX citation

    @article{RJ-2019-023,
      author = {Sardá-Espinosa, Alexis},
      title = {Time-Series Clustering in R Using the dtwclust Package},
      journal = {The R Journal},
      year = {2019},
      note = {https://doi.org/10.32614/RJ-2019-023},
      doi = {10.32614/RJ-2019-023},
      volume = {11},
      issue = {1},
      issn = {2073-4859},
      pages = {22-43}
    }