Guforu - 1 month ago
R Question

Installing SparkR

I have the latest version of R (3.2.1) and want to install SparkR. After I execute:

> install.packages("SparkR")

I got back:

Installing package into ‘/home/user/R/x86_64-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Warning in install.packages :
package ‘SparkR’ is not available (for R version 3.2.1)

I have also installed Spark on my machine:

Spark 1.4.0

How can I solve this problem? (I use RStudio, or R from the terminal.)


You can install directly from a GitHub repository:

if (!require('devtools')) install.packages('devtools')
devtools::install_github('apache/spark@v1.4.0', subdir='R/pkg')

You should choose the tag (v1.4.0 above) corresponding to the version of Spark you use. You can find a full list of tags on the project page or directly from R using the GitHub API.
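
A minimal sketch of the GitHub API lookup, assuming the jsonlite package is installed and the public tags endpoint for the apache/spark repository is reachable. The helper names list_spark_tags and release_tags are illustrative only and not part of any package:

```r
# Assumes jsonlite is installed: install.packages("jsonlite")

# Hypothetical helper: fetch one page of tag names for apache/spark.
# GitHub paginates this endpoint; 100 is the maximum page size.
list_spark_tags <- function(page = 1) {
  url <- sprintf(
    "https://api.github.com/repos/apache/spark/tags?per_page=100&page=%d",
    page
  )
  jsonlite::fromJSON(url)$name
}

# Hypothetical helper: keep only final release tags such as "v1.4.0",
# dropping release candidates and other tags.
release_tags <- function(tags) {
  grep("^v[0-9]+\\.[0-9]+\\.[0-9]+$", tags, value = TRUE)
}
```

For example, release_tags(list_spark_tags()) returns the release tags found on the first page; request further pages if the tag you need is not among them.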


If you've downloaded a binary package from the downloads page, the R library is in the R/lib/SparkR subdirectory. It can be used to install SparkR directly. For example:

$ export SPARK_HOME=/path/to/spark/directory
$ cd $SPARK_HOME/R/lib/SparkR/
$ R -e "devtools::install('.')"

You can also add the R lib directory to .libPaths (taken from here):

.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'), .libPaths()))
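
Once the library path is set, a quick check that SparkR loads and can talk to Spark might look like the sketch below. It assumes a Spark 1.4.x binary distribution, a local master, and the placeholder path used above; the calls shown are the SparkR 1.4 entry points and differ in later Spark releases:

```r
# Placeholder path: point SPARK_HOME at your own Spark 1.4.x directory
Sys.setenv(SPARK_HOME = "/path/to/spark/directory")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

library(SparkR)

# Start a local Spark context and the SQL context built on top of it
sc <- sparkR.init(master = "local")
sqlContext <- sparkRSQL.init(sc)

# Turn a built-in R data set into a Spark DataFrame and pull back a few rows
df <- createDataFrame(sqlContext, faithful)
first_rows <- head(df)
print(first_rows)

# Shut the Spark context down again
sparkR.stop()
```

If library(SparkR) fails here, the most likely cause is that SPARK_HOME does not point at a directory containing R/lib/SparkR.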

Finally, you can use the sparkR shell without any additional steps:

$ /path/to/spark/directory/bin/sparkR