RStorm: Developing and Testing Streaming Algorithms in R

Abstract:

Streaming data, consisting of indefinitely evolving sequences, are becoming ubiquitous in many branches of science and in various applications. Computer scientists have developed streaming applications such as Storm and the S4 distributed stream computing platform to deal with data streams. However, in current production packages, testing and evaluating streaming algorithms is cumbersome. This paper presents RStorm, a package for developing and evaluating streaming algorithms that is analogous to these production packages but implemented fully in R. RStorm allows developers to quickly test, iterate on, and evaluate different implementations of streaming algorithms. The paper provides both a canonical computer science example, the streaming word count, and examples of several statistical applications of RStorm.

Author

Maurits Kaptein

Published

March 17, 2014

Received

Nov 18, 2013

DOI

10.32614/RJ-2014-012

Volume

6/1

Pages

123-132

CRAN packages used

RStorm, stream

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Kaptein, "RStorm: Developing and Testing Streaming Algorithms in R", The R Journal, 2014

    BibTeX citation

    @article{RJ-2014-012,
      author = {Kaptein, Maurits},
      title = {RStorm: Developing and Testing Streaming Algorithms in R},
      journal = {The R Journal},
      year = {2014},
      note = {https://doi.org/10.32614/RJ-2014-012},
      doi = {10.32614/RJ-2014-012},
      volume = {6},
      issue = {1},
      issn = {2073-4859},
      pages = {123-132}
    }