PivotalR: A Package for Machine Learning on Big Data

Abstract:

PivotalR is an R package that provides a front-end to PostgreSQL and all PostgreSQL-like databases, such as Pivotal Inc.’s Greenplum Database (GPDB) (Pivotal Inc., 2013a) and HAWQ (Pivotal Inc., 2013b). When running on Pivotal Inc.’s products, PivotalR uses the full power of parallel computation and distributed storage, and thus gives the ordinary R user access to big data. PivotalR also provides an R wrapper for MADlib, an open-source library for scalable in-database analytics that offers data-parallel implementations of mathematical, statistical and machine-learning algorithms for structured and unstructured data. PivotalR thus also enables the user to apply machine learning algorithms to big data.


Author

Hai Qian

Published

May 26, 2014

Received

Sep 21, 2013

DOI

10.32614/RJ-2014-006

Volume

6/1

Pages

57 - 67

CRAN packages used

PivotalR, RPostgreSQL, shiny

CRAN Task Views implied by cited packages

WebTechnologies

Footnotes

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

    Citation

    For attribution, please cite this work as

    Qian, "PivotalR: A Package for Machine Learning on Big Data", The R Journal, 2014

    BibTeX citation

    @article{RJ-2014-006,
      author = {Qian, Hai},
      title = {PivotalR: A Package for Machine Learning on Big Data},
      journal = {The R Journal},
      year = {2014},
      note = {https://doi.org/10.32614/RJ-2014-006},
      doi = {10.32614/RJ-2014-006},
      volume = {6},
      issue = {1},
      issn = {2073-4859},
      pages = {57-67}
    }