RTextTools: A Supervised Learning Package for Text Classification

Abstract:

Social scientists have long hand-labeled texts to create datasets for studying topics ranging from congressional policymaking to media reporting. Many have now begun to incorporate machine learning into their toolkits. RTextTools was designed to make machine learning accessible by providing a start-to-finish workflow in fewer than 10 steps. After installing RTextTools, the first step is to generate a document-term matrix. Second, a container object is created, which holds all the objects needed for further analysis. Third, users can train models on their data with up to nine algorithms. Fourth, the data are classified with the trained models. Fifth, the classification is summarized. Sixth, functions are available for evaluating performance. Seventh, ensemble agreement among the algorithms is computed. Eighth, users can cross-validate their data. Finally, users write their results to a spreadsheet, allowing for further manual coding if required.
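The steps listed in the abstract can be sketched in R roughly as follows. This is a minimal illustration rather than the authors' exact script. It assumes that RTextTools is installed and that its bundled USCongress example data (hand-labeled bill titles with a `major` topic code) is available. The train/test split, the choice of two of the nine algorithms, the preprocessing options, and the output file location are illustrative assumptions.

```r
# Sketch of the workflow described in the abstract; assumes RTextTools is installed.
library(RTextTools)

# Example data shipped with the package: hand-labeled congressional bills.
data(USCongress)

# Illustrative split (an assumption): first 4000 rows train, the rest test.
train_rows <- 1:4000
test_rows  <- 4001:nrow(USCongress)

# Step 1: build the document-term matrix from the raw text.
doc_matrix <- create_matrix(USCongress$text, language = "english",
                            removeNumbers = TRUE, removeSparseTerms = 0.998)

# Step 2: wrap the matrix and the labels in a container object.
container <- create_container(doc_matrix, USCongress$major,
                              trainSize = train_rows, testSize = test_rows,
                              virgin = FALSE)

# Steps 3 and 4: train two of the available algorithms, then classify.
algorithms <- c("SVM", "MAXENT")
models  <- train_models(container, algorithms = algorithms)
results <- classify_models(container, models)

# Steps 5 and 6: summarize the classification and evaluate performance.
analytics <- create_analytics(container, results)
summary(analytics)

# Step 7: ensemble agreement across the trained algorithms.
create_ensembleSummary(analytics@document_summary)

# Step 8: 4-fold cross-validation for a single algorithm.
cv_svm <- cross_validate(container, 4, "SVM")

# Step 9: export per-document results for further manual coding.
out_file <- file.path(tempdir(), "DocumentSummary.csv")
write.csv(analytics@document_summary, out_file)
```

Setting `virgin = FALSE` tells the container that the test rows already carry true labels, which is what makes the performance evaluation in steps 5 and 6 possible. For genuinely unlabeled data, `virgin = TRUE` would be the likely choice, in which case the accuracy figures are not available.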

Published

June 2, 2013

Received

Aug 19, 2011

DOI

10.32614/RJ-2013-001

Volume

5/1

Pages

6 - 12

CRAN packages used

RTextTools, glmnet, maxent, e1071, tm, ipred, caTools, randomForest, nnet, tree

CRAN Task Views implied by cited packages

MachineLearning, Environmetrics, NaturalLanguageProcessing, Survival, Cluster, Distributions, Econometrics, HighPerformanceComputing, Multivariate, Psychometrics, SocialSciences

Footnotes

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Jurka, et al., "RTextTools: A Supervised Learning Package for Text Classification", The R Journal, 2013

    BibTeX citation

    @article{RJ-2013-001,
      author = {Jurka, Timothy P. and Collingwood, Loren and Boydstun, Amber E. and Grossman, Emiliano and Atteveldt, Wouter van},
      title = {RTextTools: A Supervised Learning Package for Text Classification},
      journal = {The R Journal},
      year = {2013},
      note = {https://doi.org/10.32614/RJ-2013-001},
      doi = {10.32614/RJ-2013-001},
      volume = {5},
      issue = {1},
      issn = {2073-4859},
      pages = {6-12}
    }