Comparing namedCapture with other R packages for regular expressions

Abstract:

Regular expressions are powerful tools for manipulating non-tabular textual data. For many tasks (visualization, machine learning, etc.), tables of numbers must be extracted from such data before processing by other R functions. We present the R package namedCapture, which facilitates such tasks by providing a new user-friendly syntax for defining regular expressions in R code. We begin by describing the history of regular expressions and their usage in R. We then describe the new features of the namedCapture package, and provide detailed comparisons with related R packages (rex, stringr, stringi, tidyr, rematch2, re2r).

Author

Toby Dylan Hocking

Published

Dec. 26, 2019

Received

Feb. 25, 2019

DOI

10.32614/RJ-2019-050

Volume

11/2

Pages

328 - 346

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2019-050.zip.

CRAN packages used

namedCapture, rex, stringr, stringi, tidyr, rematch2, re2r, microbenchmark

CRAN Task Views implied by cited packages

NaturalLanguageProcessing

Footnotes

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Hocking, "Comparing namedCapture with other R packages for regular expressions", The R Journal, 2019

    BibTeX citation

    @article{RJ-2019-050,
      author = {Hocking, Toby Dylan},
      title = {Comparing namedCapture with other R packages for regular expressions},
      journal = {The R Journal},
      year = {2019},
      note = {https://doi.org/10.32614/RJ-2019-050},
      doi = {10.32614/RJ-2019-050},
      volume = {11},
      issue = {2},
      issn = {2073-4859},
      pages = {328-346}
    }