The R package GrpString was developed as a comprehensive toolkit for quantitatively analyzing and comparing groups of strings. It offers functions for researchers and data analysts to prepare strings from event sequences, extract common patterns from strings, and compare patterns between string vectors. The package also computes transition matrices and the complexity of strings, determines clusters in a string vector, and tests for statistical differences between two groups of strings.
Supplementary materials are available in addition to this article. They can be downloaded at RJ-2018-002.zip
CRAN packages used: stringr, stringb, stringi, gsubfn, uniqtag, stringdist, TraMineR, informR, GrpString, entropy
CRAN Task Views implied by cited packages: NaturalLanguageProcessing, OfficialStatistics, Survival
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Figures reused from other sources are not covered by this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Tang et al., "GrpString: An R Package for Analysis of Groups of Strings", The R Journal, 2018
BibTeX citation
@article{RJ-2018-002,
  author = {Tang, Hui and Day, Elizabeth L. and Atkinson, Molly B. and Pienta, Norbert J.},
  title = {GrpString: An R Package for Analysis of Groups of Strings},
  journal = {The R Journal},
  year = {2018},
  note = {https://doi.org/10.32614/RJ-2018-002},
  doi = {10.32614/RJ-2018-002},
  volume = {10},
  issue = {1},
  issn = {2073-4859},
  pages = {359-369}
}