Stylometry with R: A Package for Computational Text Analysis

Abstract:

This software paper describes ‘Stylometry with R’ (stylo), a flexible R package for the high-level analysis of writing style. Stylometry (computational stylistics) is concerned with the quantitative study of writing style, for example authorship verification, an application with considerable potential in forensic contexts as well as in historical research. In this paper we introduce the possibilities of stylo for computational text analysis through a number of dummy case studies from English and French literature. We demonstrate how the package is particularly useful in the exploratory statistical analysis of texts, e.g. with respect to authorial writing style. Because stylo provides an attractive graphical user interface for high-level exploratory analyses, it is especially suited to novices without programming skills (e.g. from the Digital Humanities). More experienced users can benefit from our implementation of a series of standard pipelines for text processing, as well as a number of similarity metrics.

Authors

Maciej Eder

Jan Rybicki

Mike Kestemont

Published

Dec. 21, 2015

Received

Apr. 7, 2015

DOI

10.32614/RJ-2016-007

Volume

8/1

Pages

107-121

CRAN packages used

stylo

Footnotes

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Eder, et al., "Stylometry with R: A Package for Computational Text Analysis", The R Journal, 2015

    BibTeX citation

    @article{RJ-2016-007,
      author = {Eder, Maciej and Rybicki, Jan and Kestemont, Mike},
      title = {Stylometry with R: A Package for Computational Text Analysis},
      journal = {The R Journal},
      year = {2015},
      note = {https://doi.org/10.32614/RJ-2016-007},
      doi = {10.32614/RJ-2016-007},
      volume = {8},
      issue = {1},
      issn = {2073-4859},
      pages = {107-121}
    }