A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years

Abstract:

The flexibility of R and the diversity of the R community lead to a large number of programming styles being applied in R packages. We analyzed 108 million lines of R code from CRAN and quantified the evolution in popularity of 12 style elements from 1998 to 2019. We attribute changes in programming style to three main factors: the effect of style guides, the effect of introducing new features, and the effect of editors. We observe in the data that a consensus in programming style is forming, such as using lower snake case for function names (e.g. softplus_func) and <- rather than = for assignment.
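To make the two style elements named in the abstract concrete, here is a minimal Python sketch of how function names and assignment operators in R source lines might be classified with regular expressions. It is an illustration only and not the authors' pipeline: the category labels, the patterns, and the helpers `classify_name` and `assignment_operator` are all assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical naming-convention patterns, checked in order.
# The paper's actual classification rules may differ.
NAMING_PATTERNS = [
    ("alllower", re.compile(r"^[a-z0-9]+$")),
    ("ALLUPPER", re.compile(r"^[A-Z0-9]+$")),
    ("lower_snake", re.compile(r"^[a-z0-9]+(_[a-z0-9]+)+$")),
    ("dotted.func", re.compile(r"^[a-z0-9]+(\.[a-z0-9]+)+$")),
    ("lowerCamel", re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$")),
    ("UpperCamel", re.compile(r"^([A-Z][a-z0-9]*)+$")),
]

# Matches a simple assignment at the start of a line of R code,
# e.g. "x <- 1" or "x = 1", but not a comparison such as "x == 1".
ASSIGNMENT = re.compile(r"^\s*[A-Za-z.][\w.]*\s*(<-|=)(?!=)")


def classify_name(name):
    """Return the label of the first matching convention, else 'other'."""
    for label, pattern in NAMING_PATTERNS:
        if pattern.match(name):
            return label
    return "other"


def assignment_operator(line):
    """Return '<-' or '=' for a line-initial assignment, else None."""
    match = ASSIGNMENT.match(line)
    return match.group(1) if match else None


# Tally naming styles over a small, made-up list of function names.
names = ["softplus_func", "softplusFunc", "softplus.func", "SoftplusFunc"]
name_counts = Counter(classify_name(n) for n in names)

# Tally assignment operators over a few made-up lines of R code.
lines = ["x <- 1", "y = 2", "if (x == 1) z <- 3", "f(a = 1)"]
operator_counts = Counter(
    op for op in map(assignment_operator, lines) if op is not None
)
```

A line-based regular expression is a deliberate simplification: it only sees assignments at the start of a line and knows nothing about strings or comments, so a real analysis would more plausibly work on parsed R code.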

Published

June 20, 2022

Received

Apr 4, 2020

DOI

10.32614/RJ-2022-006

Volume

14/1

Pages

6 - 21

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded as RJ-2022-006.zip

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Yen et al., "A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years", The R Journal, 2022

    BibTeX citation

    @article{RJ-2022-006,
      author = {Yen, Chia-Yi and Chang, Mia Huai-Wen and Chan, Chung-hong},
      title = {A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years},
      journal = {The R Journal},
      year = {2022},
      note = {https://doi.org/10.32614/RJ-2022-006},
      doi = {10.32614/RJ-2022-006},
      volume = {14},
      issue = {1},
      issn = {2073-4859},
      pages = {6-21}
    }