How to run R programs on taki


Introduction

R can be run on taki in two modes: (i) interactive mode and (ii) batch mode.

(i) Generally, running R interactively should be done on the computer you are actually sitting at, in your office or on your own laptop. However, you do have the option to request an allocation of a core on the dedicated interactive node for interactive use; instructions are provided in the next section. Once you have been connected to that node, you can start R on it. Remember to exit repeatedly when you are done, to relinquish your allocation, as explained below! Using a login node to run R is not recommended.

(ii) Most of the examples provided here run R in batch mode. This means that your job is submitted to the scheduler just like any other executable, waits in the queue for an available node, and then runs R for you. You receive the output files and the captured stdout and stderr after the run. This mode of running R is well suited to a shared cluster. Running serial R code on the cluster is similar to running any other serial job. We give a serial example, and then demonstrate how to run parallel jobs using the Rmpi package. Make sure you have read the tutorial for C programs first, to understand the basics of serial and parallel programming on taki.

For more information about the software, see the R website.

To use R, load the R module.

[reetam1@taki-usr1 ~]$ module load R/3.6.0-intel-2019a
[reetam1@taki-usr1 ~]$

You can check the version using the command below.

[reetam1@taki-usr1 ~]$ R --version
R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

[reetam1@taki-usr1 ~]$

Running R interactively

Taki has a dedicated node (inter101) for running software interactively; details can be found on the System Description page. All interactive computational tasks should be run on this node, unless you are submitting jobs in batch mode or installing your own R libraries. Resources on the interactive node can be requested by issuing the following command on the login node.

[reetam1@taki-usr1 ~]$ salloc --partition=interactive --qos=short -N1 -n1 --time=00:05:00

The command asks SLURM to allocate the specified resources to the user. In this case, the user reetam1 requests a single core on the interactive node for five minutes. More explicit resource requirements can also be specified; for example, if your interactive job might require more memory than is typical, the '--mem=10G' option requests 10 GB of memory. However, note that requesting more resources may increase how long your jobs wait in the SLURM queue. For more details on these options, see the 'How to Run Programs on taki' webpage.
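
For example, a request for a single core with 10 GB of memory for a 30-minute interactive session might look like the following (the time limit here is only an illustration).

[reetam1@taki-usr1 ~]$ salloc --partition=interactive --qos=short -N1 -n1 --mem=10G --time=00:30:00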

When SLURM is able to allocate these resources, a message such as the following appears.

salloc: Granted job allocation 338352
[reetam1@taki-usr1 ~]$

At this point, the node is allocated, but you are still on the login node taki-usr1. To log into the allocated node while preserving the X11 tunnel, use the following command.

[reetam1@taki-usr1 ~]$ ssh -Y $SLURM_NODELIST

You should now see a new prompt indicating the assigned node inter101 (in place of "taki-usr1"). You can now launch R, as shown below. To quit R, type 'quit()' at the R prompt. After quitting R you are still logged in on the assigned interactive node, so you must explicitly exit the shell on that node by typing 'exit' at the Linux prompt. At that point you still hold the SLURM allocation; one more 'exit' relinquishes it. Always make sure to relinquish the interactive node when you are done with it! The message at the end of the transcript below confirms that we have relinquished the node.

[reetam1@inter101 ~]$ R

R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>
> quit()
Save workspace image? [y/n/c]: n
[reetam1@inter101 ~]$ exit
logout
Connection to inter101 closed.
[reetam1@taki-usr1 ~]$ exit
exit
salloc: Relinquishing job allocation 338352
[reetam1@taki-usr1 ~]$

Installing your own packages

You will never, ever, be given permission to install packages into any of the R module installations, so please do not attempt to do so. The solution is to maintain a separate collection of packages for yourself and your PI group (if applicable).

First load an R module. You can see if you’ve successfully loaded one with the which command.

[reetam1@taki-usr1 ~]$ which R
/usr/ebuild/software/R/3.6.0-intel-2019a/bin/R

Then create a directory to store all your personal packages in. Please note that this directory should be in your dedicated research storage area and NOT in your home folder! Installing packages in your home folder can fill it up, which locks you out of all system commands, even ls!

[reetam1@taki-usr1 researchArea]$ mkdir my_R_Packages

Now add that directory to your R library path by starting R and entering the following command:

> .libPaths("path/to/my_R_Packages")

with the path to the folder you just created. You can verify that the change has taken place.

> .libPaths()
[1] "path/to/my_R_Packages"
[2] "/umbc/ebuild-soft/skylake/software/R/3.6.0-intel-2019a/lib64/R/library"
>

The first path is the one you just set; the second one is where the system libraries of R are already installed. Note that setting the library path this way only applies to the current R session. If you intend to have these packages available all the time, set the R_LIBS_USER environment variable in your .bashrc. To do so, go to your home directory and open your .bashrc file for editing:

nano .bashrc

Add the following lines at the end of the file:

# R user library path
export R_LIBS_USER=path/to/my_R_Packages

Save and exit. You need to source the .bashrc file, or log out of taki and log back in, for the change to take effect. Note that using install.packages() involves extra steps. The default options try to install the latest version of a package, and you will get errors if, say, the latest version requires R 3.6.4 and is not compatible with R 3.6.0. You can find out which version of a package is compatible with the version of R you are using by going to the package's CRAN page and looking at the older versions. The documentation for install.packages() covers how to specify a particular version of a package. Also note that R will not install dependencies by default in this situation, and you will often have to list them out and install them manually. If you think a particular package is extremely complex for an end user to install, and is widely used enough that all users would benefit from a system-level installation, please raise a ticket.
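
As an illustration, a specific older version of a package can be installed into your personal library directly from its CRAN source archive. The package name, version, and library path below are placeholders only; substitute your own.

# Placeholder example: install an older version of the 'foreach' package
# from the CRAN source archive into the personal library created above.
.libPaths("path/to/my_R_Packages")
install.packages(
  "https://cran.r-project.org/src/contrib/Archive/foreach/foreach_1.4.7.tar.gz",
  repos = NULL,    # install from the given source tarball, not from a repository
  type  = "source",
  lib   = "path/to/my_R_Packages"
)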

Serial example

Note: This, and all following sections, involve running R in batch mode.

Let us demonstrate running a simple serial example on the batch nodes. Consider an R script, driver.R, that draws a sample of 50 values from the standard normal distribution and prints it.
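
The exact driver.R is not reproduced here; a minimal sketch consistent with the output shown further below might look like this.

# driver.R (sketch): draw 50 samples from the standard normal distribution
# and print them to stdout.
x <- rnorm(50)
cat("My sample from N(0,1) is:\n")
print(x)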

Here is a batch script to submit the job. We use the Rscript command to execute the script. There are other possible methods such as “R CMD BATCH driver.R”, whose behavior may vary (e.g. echoing your commands to the output and automatically saving your workspace before exiting).


Download: ../code/serial-R/run.slurm
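
The actual file is available from the download link above; a sketch of a typical batch script for this serial job is shown below. The partition and QOS names are placeholders, so adjust them to whatever your group normally uses.

#!/bin/bash
#SBATCH --job-name=serial_R           # name shown in the queue
#SBATCH --output=slurm.out            # captured stdout
#SBATCH --error=slurm.err             # captured stderr
#SBATCH --partition=batch             # placeholder partition
#SBATCH --qos=normal                  # placeholder QOS
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:05:00

# load R if it is not already loaded in your login environment
module load R/3.6.0-intel-2019a

Rscript driver.R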

Submitting the batch script and checking the output, we have the following.

[reetam1@taki-usr1 test]$ sbatch run.slurm
Submitted batch job 338354
[reetam1@taki-usr1 test]$ more slurm.err
[reetam1@taki-usr1 test]$ more slurm.out
My sample from N(0,1) is:
 [1]  1.47280837 -0.49388754 -0.51779773  1.01626656 -1.15459667 -0.85933441
 [7] -1.29499502 -1.89274071  0.54747534 -0.93967891  0.11989975 -1.00780679
[13] -0.56620424 -0.34565973 -2.02714776  0.06824074  0.06404495  1.03276057
[19]  1.08084910 -0.67778679  0.13303688 -1.80340654 -0.39460324  0.28272118
[25]  1.01035833  0.44216910  0.21397671  0.30999425 -1.61752353 -0.44939700
[31]  1.25310412  0.72013623  0.05111222  1.62579837  1.24943859 -0.58720802
[37] -1.57202843 -0.59879776  0.35340819  0.83023155 -0.03393120 -1.13104587
[43]  0.10485761 -0.05003300  1.13855805 -0.83661548  0.91838276  0.38404897
[49] -0.32699333 -1.74747361
[reetam1@taki-usr1 test]$

Serial example with plot

On taki, some special steps are needed to produce certain types of graphics, such as PNG and JPEG (PDF graphics do not require these steps). We will demonstrate with the following script, which creates a scatter plot and adds a regression line.


Download: ../code/plot-R/driver.R
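
A sketch consistent with the description and the mtcars.png output is given below; the plotted variables (mpg against wt from the built-in mtcars data) are an assumption, and the downloadable driver.R may differ.

# driver.R (sketch): scatter plot of the built-in mtcars data with a
# fitted least-squares regression line, written to a PNG file.
png("mtcars.png")
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(lm(mpg ~ wt, data = mtcars), col = "red")
dev.off()    # prints "null device 1", as seen in slurm.out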

Our batch script will take a special step to ensure that a “virtual display” is available on the compute nodes. This allows graphics operations to be carried out even though no display is available.


Download: ../code/plot-R/run.slurm
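
One common way to provide a virtual display is to wrap the R invocation in xvfb-run, as in the sketch below; the downloadable run.slurm may set up Xvfb differently, and the partition and QOS names are placeholders.

#!/bin/bash
#SBATCH --job-name=plot_R
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch             # placeholder partition
#SBATCH --qos=normal                  # placeholder QOS
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1

# xvfb-run starts a temporary virtual X server (Xvfb) so that bitmap
# graphics devices such as png() work on a display-less compute node.
xvfb-run Rscript driver.R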

Submitting the batch script gives the following result.

[reetam1@taki-usr1 Rplot]$ sbatch run.slurm
Submitted batch job 2995208
[reetam1@taki-usr1 Rplot]$ more slurm.err
: CommandLine Error: Option 'help-list' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
[reetam1@taki-usr1 Rplot]$ more slurm.out
null device
          1
[reetam1@taki-usr1 Rplot]$ ls -l
total 35
-rw-rw---- 1 reetam1 pi_nagaraj   289 Nov  4  2019 driver.R
-rw-rw---- 1 reetam1 pi_nagaraj 18331 May 28 11:16 mtcars.png
-rw-rw---- 1 reetam1 pi_nagaraj   389 May 28 11:15 run.slurm
-rw-rw---- 1 reetam1 pi_nagaraj   127 May 28 11:16 slurm.err
-rw-rw---- 1 reetam1 pi_nagaraj    26 May 28 11:16 slurm.out
[reetam1@taki-usr1 Rplot]$

The messages in slurm.err come from setting up the virtual display (Xvfb) and can be ignored. Notice that we have the graphic mtcars.png in the output. If mtcars.png is zero bytes, it likely means that the virtual display did not work correctly. Viewing the image should produce a result such as the following.

[Image: result shown in PNG format]

You can either download this image onto your personal computer, or set up an X server to view it through taki.


Parallel Computing using Rmpi in SPMD mode

Rmpi is a popular R package for parallel computing. The usual use of Rmpi follows a slightly different paradigm than traditional MPI. In Rmpi there is a master process that spawns slaves to work in parallel; the master usually maintains control of the overall execution. In contrast, the traditional MPI paradigm is “single program multiple data” (SPMD) where all processes are treated as equal peers, but processes may take on specific roles during the course of a program.

In this section, we show how to run Rmpi in SPMD mode. Rmpi features many familiar communication operations, such as "bcast" and "send". For more information about using Rmpi, a good place to check is the reference manual on CRAN.

Hello example


Download: ../code/Rmpi-hello-spmd/hello.R
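
The actual hello.R is available from the download link above; a rough sketch of such an SPMD hello program, consistent with the output below, follows. The use of communicator 0 and the final barrier/finalize calls are assumptions about the setup.

# hello.R (sketch): every MPI rank runs this same script, reports its rank
# and host name, and then shuts MPI down.
library(Rmpi)

id   <- mpi.comm.rank(comm = 0)          # this process's rank
np   <- mpi.comm.size(comm = 0)          # total number of processes
host <- mpi.get.processor.name()

cat(sprintf("Hello world from process %03d of %03d, on host %s\n", id, np, host))

mpi.barrier(comm = 0)
mpi.finalize()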


Download: ../code/Rmpi-hello-spmd/run.slurm
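
A sketch of the corresponding batch script follows; it requests 8 MPI tasks across 2 nodes to match the output below. The partition and QOS names are placeholders, and mpirun stands in for whatever launcher the downloadable script actually uses (srun is also common).

#!/bin/bash
#SBATCH --job-name=Rmpi_hello
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch             # placeholder partition
#SBATCH --qos=normal                  # placeholder QOS
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4

module load R/3.6.0-intel-2019a

# start one R process per MPI task; each task runs the same hello.R
mpirun Rscript hello.R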

[reetam1@taki-usr1 hello-example]$ sbatch run.slurm
Submitted batch job 2995224
[reetam1@taki-usr1 hello-example]$ cat slurm.err
[reetam1@taki-usr1 hello-example]$ cat slurm.out
Hello world from process 000 of 008, on host cnode101
Hello world from process 004 of 008, on host cnode102
Hello world from process 001 of 008, on host cnode101
[1]Hello world from process 006 of 008, on host cnode102
Hello world from process 002 of 008, on host cnode101
[1] 1
Hello world from process 007 of 008, on host cnode102
Hello world from process 003 of 008, on host cnode101
Hello world from process 005 of 008, on host cnode102
[1] 1
 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[reetam1@taki-usr1 hello-example]$

Hello example: send & receive


Download: ../code/Rmpi-hellosendrecv-spmd/driver.R
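
The actual driver.R is available from the download link above; the sketch below illustrates the send/receive pattern suggested by the output. The tag value and communicator 0 are assumptions.

# driver.R (sketch): every rank forms a greeting; rank 0 prints its own and
# then receives one greeting from each of the other ranks in turn.
library(Rmpi)

id  <- mpi.comm.rank(comm = 0)
np  <- mpi.comm.size(comm = 0)
msg <- sprintf("Hello world from process %03d", id)

if (id == 0) {
  cat(sprintf("Process 0: Received msg from process %d saying: %s\n", 0, msg))
  for (src in 1:(np - 1)) {
    recvd <- mpi.recv.Robj(source = src, tag = 0, comm = 0)
    cat(sprintf("Process 0: Received msg from process %d saying: %s\n", src, recvd))
  }
} else {
  mpi.send.Robj(msg, dest = 0, tag = 0, comm = 0)
}

mpi.barrier(comm = 0)
mpi.finalize()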


Download: ../code/Rmpi-hellosendrecv-spmd/run.slurm

[reetam1@taki-usr1 send-and-receive]$ sbatch run.slurm
Submitted batch job 2995474
[reetam1@taki-usr1 send-and-receive]$ more slurm.err
[reetam1@taki-usr1 send-and-receive]$ more slurm.out
Process 0: Received msg from process 0 saying: Hello world from process 000
NULL
NULL
NULL
NULL
NULL
NULL
NULL
[1] 1
Process 0: Received msg from process 1 saying: Hello world from process 001
Process 0: Received msg from process 2 saying: Hello world from process 002
Process 0: Received msg from process 3 saying: Hello world from process 003
Process 0: Received msg from process 4 saying: Hello world from process 004
Process 0: Received msg from process 5 saying: Hello world from process 005
Process 0: Received msg from process 6 saying: Hello world from process 006
Process 0: Received msg from process 7 saying: Hello world from process 007
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[reetam1@taki-usr1 send-and-receive]$

Gather example

The Rmpi package provides a function called mpi.gather that provides the functionality of MPI's Gather operation. In the example below, five standard normal random variables are generated on each of three processes and are 'gathered' to the process with rank 0. Each process writes the five random numbers it generates, along with the gathered result (an array of 15 real numbers), to its own log file. Note that only process 0 obtains the gathered result.


Download: testgather.R
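
The actual testgather.R is available from the download link above; a sketch of the logic it implements, based on the description above and the log files shown below, is given here. Writing each rank's output via sink() and the use of communicator 0 are assumptions.

# testgather.R (sketch): each rank draws 5 standard normal values, and the
# values are gathered (as doubles) onto rank 0; every rank writes its local
# sample and the gather result to its own log file.
library(Rmpi)

id <- mpi.comm.rank(comm = 0)
np <- mpi.comm.size(comm = 0)

sink(sprintf("process-%03d.log", id))    # redirect output to this rank's log

x <- rnorm(5)
cat("local x:\n")
print(x)

# type = 2 means double; rdata is the receive buffer, filled only on root
gather.result <- mpi.gather(x, type = 2, rdata = double(5 * np), root = 0, comm = 0)
cat("\ngather.result:\n")
print(gather.result)

sink()
mpi.finalize()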

Here is a batch script to submit the job.


Download: run.slurm

Running the above slurm script with the sbatch command produces three .log files in addition to slurm.err and slurm.out. If the program runs successfully, slurm.err should be empty, and slurm.out may contain some information related to R or MPI. The output we are interested in is shown below; note that process-000.log shows the gathered result.

[reetam1@taki-usr1 gather]$ sbatch run.slurm
Submitted batch job 2995477
[reetam1@taki-usr1 gather]$ ll -s
total 24
5 -rw-rw---- 1 reetam1 pi_nagaraj 264 May 28 13:14 process-000.log
5 -rw-rw---- 1 reetam1 pi_nagaraj 119 May 28 13:14 process-001.log
5 -rw-rw---- 1 reetam1 pi_nagaraj 119 May 28 13:14 process-002.log
5 -rw-rw---- 1 reetam1 pi_nagaraj 248 May 28 13:13 run.slurm
1 -rw-rw---- 1 reetam1 pi_nagaraj   0 May 28 13:14 slurm.err
1 -rw-rw---- 1 reetam1 pi_nagaraj  36 May 28 13:14 slurm.out
5 -rw-rw---- 1 reetam1 pi_nagaraj 605 Dec  5  2016 testgather.R
[reetam1@taki-usr1 gather]$ more slurm.out
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[reetam1@taki-usr1 gather]$ more process-000.log
local x:
[1]  1.3855826 -1.2747399 -0.4980705  0.8012684 -1.3744122

gather.result:
 [1]  1.3855826 -1.2747399 -0.4980705  0.8012684 -1.3744122 -0.9989818
 [7]  1.9006834 -0.4336543 -1.5094880  1.5557850  1.0416493  0.8077017
[13] -1.5424024  0.8711241  0.9287241
[reetam1@taki-usr1 gather]$ more process-001.log
local x:
[1] -0.9989818  1.9006834 -0.4336543 -1.5094880  1.5557850

gather.result:
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[reetam1@taki-usr1 gather]$

Allgather example

While Rmpi's mpi.gather function gathers the results from all the processes to one process (rank 0 in the previous example), the mpi.allgather function makes the gathered result available on all the processes. In this example, we again generate five standard normal variables on three processes and, using the mpi.allgather function, make the gathered result available on all of them. Note that the only difference between the code below and the code from the previous example is that the call to mpi.gather is replaced with mpi.allgather.


Download: ../code/Rmpi-allgather-spmd/driver.R
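
In a sketch of the change, the gather call from the previous example simply becomes the following; mpi.allgather takes no root argument because every rank receives the full result (communicator 0 is again an assumption).

# every rank receives the concatenated vector of length 5 * np
gather.result <- mpi.allgather(x, type = 2, rdata = double(5 * np), comm = 0)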


Download: ../code/Rmpi-allgather-spmd/run.slurm

Below is the output from our program. Note that all three processes output the gathered result, unlike in the mpi.gather example, where only process 0 did.

[reetam1@taki-usr1 allgather]$ sbatch run.slurm
Submitted batch job 2995480
[reetam1@taki-usr1 allgather]$ ls -l
total 24
-rw-rw---- 1 reetam1 pi_nagaraj 581 May 26  2012 driver.R
-rw-rw---- 1 reetam1 pi_nagaraj 254 May 28 13:34 process-000.log
-rw-rw---- 1 reetam1 pi_nagaraj 264 May 28 13:34 process-001.log
-rw-rw---- 1 reetam1 pi_nagaraj 264 May 28 13:34 process-002.log
-rw-rw---- 1 reetam1 pi_nagaraj 210 May 28 13:32 run.slurm
-rw-rw---- 1 reetam1 pi_nagaraj   0 May 28 13:34 slurm.err
-rw-rw---- 1 reetam1 pi_nagaraj  36 May 28 13:34 slurm.out
[reetam1@taki-usr1 allgather]$ more slurm.out
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[1] 1
[reetam1@taki-usr1 allgather]$ more process-000.log
local x:
[1] 2.058167 1.891789 0.507036 1.621419 0.893324

gather.result:
 [1]  2.0581673  1.8917888  0.5070360  1.6214189  0.8933240 -0.7242214
 [7] -0.3855879  0.4300060  0.8979178  0.1619027  1.5226275 -0.2504572
[13] -0.4530603 -1.0446812 -1.0411122
[reetam1@taki-usr1 allgather]$ more process-001.log
local x:
[1] -0.7242214 -0.3855879  0.4300060  0.8979178  0.1619027

gather.result:
 [1]  2.0581673  1.8917888  0.5070360  1.6214189  0.8933240 -0.7242214
 [7] -0.3855879  0.4300060  0.8979178  0.1619027  1.5226275 -0.2504572
[13] -0.4530603 -1.0446812 -1.0411122
[reetam1@taki-usr1 allgather]$