At my job, the edge node server that accesses our cluster is re-instantiated every day. This means I have to clone our repo and install a bunch of R packages every morning. I wrote a bash script (using https://github.com/eddelbuettel/littler to open R from the command line) to automate this and it works except for one hiccup.
The script opens an R session and runs the following:
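repos <- "http://... [our CRAN equivalent]"
install.packages("[name_of_package]", repos = repos)

R then prints the following and waits for input: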
Installing package into ‘/usr/hdp/18.104.22.168-2/spark2/R/lib’ (as ‘lib’ is unspecified)
Warning in install.packages("[name_of_package]", repos = "http://...") :
'lib = "/usr/hdp/22.214.171.124-2/spark2/R/lib"' is not writable
Would you like to use a personal library instead? (y/n)
I figured out a way to do it.
First, create or modify .Rprofile. Add the installations and responses at the top of that file. For example:
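# Only base is attached while .Rprofile runs, hence the utils:: prefix.
repos = [CRAN or wherever you're sourcing from]
utils::install.packages(package1, repos = repos)
y
y
utils::install.packages(package2, repos = repos)
utils::install.packages(package3, repos = repos)
... # etc.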
The 'y's should only be necessary for the first package, when R asks if you want to create a new library. The subsequent packages will also be dropped in there.
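The prompt appears because the default library is not writable and no personal library exists yet. As an alternative sketch (not part of the approach described here), the script could create a personal library up front and point R at it through the R_LIBS_USER environment variable, which R reads at startup. The directory below is a hypothetical choice; any writable path should do.

```shell
#!/usr/bin/env bash

# Hypothetical location for the personal library; any writable directory works.
export R_LIBS_USER="${HOME}/R/library"

# R only adds R_LIBS_USER to its library search path when the directory exists.
mkdir -p "$R_LIBS_USER"

# Optional check, skipped when R is not installed: list the library paths R will use.
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'print(.libPaths())'
fi
```

With a writable directory already on the library path, install.packages() has somewhere to install to and should not need to ask about creating one.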
If you really want it to be hands-free and the script does other things after installing the R packages, you can add a quit() statement at the end, which will exit R and let the script carry on with whatever else it needs to do. This might be annoying when you want to launch R later, so you'd want the script to pass around a couple of different .Rprofiles.
Second, in your bash script, simply open R with R. The .Rprofile will be run immediately on startup, and the packages will be downloaded automatically.
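One way to keep an install-only profile apart from the everyday one is the R_PROFILE_USER environment variable, which tells R which user profile to run at startup. The following is a minimal sketch under some assumptions: the file name .Rprofile.install, the CRAN_MIRROR environment variable, and the --run switch are all hypothetical, and the package names are the placeholders used above.

```shell
#!/usr/bin/env bash

# Hypothetical file name; kept apart from the everyday ~/.Rprofile.
install_profile="${HOME}/.Rprofile.install"

# The quoted 'EOF' stops bash from expanding anything inside the R code.
cat > "$install_profile" <<'EOF'
# Only base is attached while a profile runs, hence the utils:: prefix.
# CRAN_MIRROR is a hypothetical environment variable holding the repository URL.
repos <- Sys.getenv("CRAN_MIRROR")
for (pkg in c("package1", "package2", "package3")) {
  utils::install.packages(pkg, repos = repos)
}
# Leave R so the calling script can continue.
quit(save = "no")
EOF

# Opt-in run: R_PROFILE_USER is set for this one command only,
# so later R sessions still use the normal ~/.Rprofile.
if [ "${1:-}" = "--run" ]; then
  R_PROFILE_USER="$install_profile" R --no-save
fi
```

Running the script without arguments only writes the profile; passing --run also starts R with it. The personal-library prompt from the question can still appear for the first package if no writable library exists.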