Parallel Computing on UConn HPC
1. Before we get started
I assume you have:
- knowledge of how to use `ssh` to connect to remote servers, via the terminal on Mac or via PuTTY on Windows
- knowledge of at least one of `vim`, `emacs`, or `nano` to modify files on the command line
- an account on UConn's HPC
2. What’s special about parallelizing on a cluster
When running jobs only on our local machine, we are, in "computer science" terminology, using cores on a single node. Such parallelization can be done easily via packages like `parallel` on Unix systems and `doParallel` on Windows. The `parallel` package, however, cannot break the node barrier when used on clusters: no matter how many cores you request, your code will still only run on cores within a single node.
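To make the node-local case concrete, here is a minimal sketch of squaring numbers on several cores of a single machine with the `parallel` package (the `sq` function and the core count are illustrative; `mclapply` works by forking and is Unix-only):

```r
library(parallel)

sq <- function(x) x * x

## fork the computation across 4 local cores -- still a single node;
## on Windows, use makeCluster() + parLapply() instead of mclapply()
result <- mclapply(1:100, sq, mc.cores = 4)
head(unlist(result))  # 1 4 9 16 25 36
```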
3. Break the barrier
3.1 Load MPI
MPI, short for "message passing interface", is designed to run on huge clusters and to enable the exchange of information between nodes. To use MPI on the HPC:

```sh
ssh netid@login.storrs.hpc.uconn.edu
echo "module load null gcc/5.4.0-alt r/3.5.1-openblas-gcc540 mpi/openmpi/1.10.1-gcc" >> ~/.bashrc
echo "export OMPI_MCA_mpi_warn_on_fork=0" >> ~/.bashrc
```
Adding the above lines to `.bashrc` ensures that every time you log in to the cluster, the MPI module is loaded automatically.
3.2 R Code!
For illustration we use this R code snippet:

```r
sq <- function(x) {
  x * x
}
```
Setting vectorization aside, computing squares for, say, 100,000 numbers would be time-consuming in a plain loop. Let's now make a cluster with as many cores as the system allows!
```r
options(echo = TRUE)  # have outputs saved to RLog
library(parallel)

## SLURM_NTASKS is set by the scheduler; makeCluster() needs a number, not a string
cl <- makeCluster(as.numeric(Sys.getenv("SLURM_NTASKS")), type = "MPI")
clusterExport(cl, varlist = ls())

## pass the 100,000 calculations to the worker cores!
result <- clusterApply(cl, 1:100000, fun = function(x) {
  sq(x)
})

stopCluster(cl)
save.image("squares.RData")
```
The `clusterExport` command exports objects in your workspace to each core so that all cores can use them. The result will be a large list whose length equals the number of replicates, with the i-th element being the result of your i-th replicate. You can then use functions in the `purrr` package (part of the tidyverse) to manipulate and process the results into your final desired output. I recommend `purrr` because it makes list operations simple, fast, and tractable.
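As a sketch of that post-processing step (the stand-in `result` list below mimics what `clusterApply` returns for the squares example; `map_dbl` is from `purrr`):

```r
library(purrr)

## stand-in for the list returned by clusterApply() above
result <- as.list((1:10)^2)

## collapse the one-number-per-replicate list into a numeric vector
squares <- map_dbl(result, 1)
squares  # 1 4 9 ... 100
```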
That covers the R side. As UConn's HPC uses SLURM to manage all jobs, we also need a `submit.sh` script that submits the job to the cluster. A typical script in this case would be:
```sh
#!/bin/bash
#SBATCH --partition=general
#SBATCH -n 100
#SBATCH --mail-type=END
#SBATCH --mail-user=first.last@uconn.edu
#SBATCH --mem 128000
#SBATCH -t 12:00:00

R CMD BATCH code.R
```
Finally, we run `sbatch submit.sh` to submit our job.
3.3 Run jobs that take command line arguments
It is sometimes desirable to change some of the parameter inputs in order to run different simulation settings, especially when the project involves tuning. Of course, one could copy the same code several times, hard-code the parameters, and use many different bash scripts to submit jobs. This would be tedious and error-prone.
As an example, suppose I now have a function that depends not only on `x` but also on other optional arguments:

```r
qd <- function(x, a, b, c) {
  a * x^2 + b * x + c
}
```
To make my code able to read command line arguments:

```r
args <- commandArgs(trailingOnly = TRUE)
a <- as.numeric(args[1])
b <- as.numeric(args[2])
c <- as.numeric(args[3])
## the usual makeCluster step follows...
```
And the `submit.sh` file should be modified accordingly:
```sh
#!/bin/bash
#SBATCH --partition=general
#SBATCH -n 100
#SBATCH --mail-type=END
#SBATCH --mail-user=first.last@uconn.edu
#SBATCH --mem 128000
#SBATCH -t 12:00:00

R CMD BATCH --vanilla --slave "--args $1 $2 $3" code.R RLog_$1_$2_$3
```
Now, in the terminal, we type `sbatch submit.sh 5 4 3` to submit a job. The three arguments are passed to R, and the console output will be written to `RLog_5_4_3`.
3.4 Passing a vector as an optional argument
In the previous section's example, we saw how to take command line inputs as arguments. These arguments, however, were all single numbers. What if we want to pass an entire vector as an argument? The solution is to pass the vector as one string and parse it inside R.
For example, I want to be able to specify my vector of true coefficients. Then:
```r
args <- commandArgs(trailingOnly = TRUE)
mybeta <- as.numeric(unlist(strsplit(args[1], split = ",")))
```
Everything else remains the same. When submitting, I would do:

```sh
sbatch submit.sh 2,0,0,4,8
```
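The parsing step is easy to check interactively. Assuming the first command line argument arrives in R as the single string "2,0,0,4,8", the `strsplit` line recovers the numeric vector:

```r
## what args[1] would hold after `sbatch submit.sh 2,0,0,4,8`
arg <- "2,0,0,4,8"
mybeta <- as.numeric(unlist(strsplit(arg, split = ",")))
mybeta  # 2 0 0 4 8
```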
3.5 Copy the results to the local machine
On a Windows machine, this can easily be done with a tool like WinSCP. On Unix/Linux systems, the command line is simpler. Take our previous `squares.RData` as an example: it now sits in the home directory of our HPC account. To copy it to the local Downloads folder, we can do:

```sh
scp netid@login.storrs.hpc.uconn.edu:~/squares.RData ~/Downloads/
```
4. Easy login
It can be a pain having to type `login.storrs.hpc.uconn.edu` every time we copy something to or from the remote machine. An easy solution is to save the user and host information in `~/.ssh/config`. To do that, type `vi ~/.ssh/config` in the terminal, press `i` to enter insert mode, and add the following (OpenSSH does not allow trailing comments, so the netid note goes on its own line):

```
Host hpc
    # replace net19001 with your own netid
    User net19001
    HostName login.storrs.hpc.uconn.edu
```
Then press `Esc` and type `:wq` to save and exit vim. Now, to copy `squares.RData`, we only need to type

```sh
scp hpc:~/squares.RData ~/Downloads/
```

and to connect to the HPC, we simply type

```sh
ssh hpc
```
5. Quick reference sheet
To see all jobs submitted by yourself:

```sh
sjobs
```

To cancel a specific job:

```sh
scancel jobid  # the job id can be found in sjobs
```

To cancel all jobs submitted by yourself:

```sh
scancel -u netid
```

To see your priority on the HPC:

```sh
sprio -u netid
```
Three factors influence your priority on the HPC: `AGE` (how long your job has been pending), `FAIRSHARE` (how much resource you have used within a certain period), and `PARTITION` (which depends on the partition; most people just submit to "general"). Intuitively, the longer you have been waiting, the less resource you have used recently, and the higher priority your partition has, the higher your job's priority will be.
6. Debug
The R package `Rmpi` needs to be installed for us to be able to use MPI. However, errors may occur when trying to install it. One common reason is that the MPI module is not loaded; even when it is, R can sometimes miss the location of MPI. In that case, the path to MPI needs to be entered manually. In an R session, type

```r
install.packages("Rmpi",
                 configure.args = c("--with-Rmpi-include=/apps2/openmpi/1.10.1-gcc/include/",
                                    "--with-Rmpi-libpath=/apps2/openmpi/1.10.1-gcc/lib/",
                                    "--with-Rmpi-type=OPENMPI"))
```

and `Rmpi` should install without error. This works on both R 3.4.1 and 3.5.1; other versions, however, have not been tested.