R on BioHPC
RStudio, Parallel R and BioconductoR

Updated for 2017-10-18

Today we’ll be looking at…

Why R?

• The dominant statistics environment in academia

• Large number of packages to do a lot of different analyses

• Excellent uptake in Bioinformatics – specialist packages

• (Relatively) easy to accomplish complex stats work

• Very active development right now: R Foundation, R Consortium, Revolution Analytics, RStudio, Microsoft…

Why not R?

• Quirky language – painful for e.g. Python programmers

• Generally thought to be quite slow – except for optimized linear algebra

• Complex ‘old-fashioned’ documentation

• Parallelization packages can be complex / outdated

… but it’s getting better quickly…

Exciting Recent Developments in R

RStudio – An IDE for R, on the web

http://rstudio.biohpc.swmed.edu

BioHPC optimized R, access to cluster storage, persistent sessions

When to use RStudio

• Development work with small datasets

• Creating R Markdown documents

• Working with Shiny for dataset visualizations

• Any small, short-running data analysis tasks

Large datasets, very long running jobs, parallel code?

Must use R on the cluster…

Using R on the cluster / clients

Default is R/3.3.2-gccmkl – also used by rstudio.biohpc.swmed.edu

R/3.2.1-intel (older) or R/3.4.1-gccmkl also recommended

Use ‘R’ for command line R, or run scripts with ‘Rscript’

RStudio in a GUI Session

Start a webGUI Session

$ module load R/3.3.2-gccmkl

$ module load rstudio-desktop

$ rstudio

Standard 20 hr limit

Whole node to yourself

You can choose which version of R

Portal DIGITS, RStudio & Jupyter – Coming 2018

Start & connect to dedicated Python, R, and DIGITS environments directly from the BioHPC Portal

Installing Packages

We have a set of common packages pre-installed in the R module.

You can install your own into your home directory (~/R)

install.packages(c("microbenchmark", "data.table"))

Some packages need additional libraries and won’t compile successfully.
- Ask us to install them for you ([email protected])
- gccmkl R is more compatible than intel R

You need to install at least one package manually before you can use install.packages via Rscript

This is for packages from CRAN – BioconductoR packages install differently. See later!
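The install step above can be scripted so that it only touches packages that are missing. This is a minimal sketch, not BioHPC tooling: the helper names missing_packages and install_if_missing are hypothetical, the CRAN mirror URL is an assumption, and the personal library path is taken from R's R_LIBS_USER setting rather than hard-coded. Creating that directory up front is one plausible way to avoid the interactive prompt that a non-interactive Rscript run cannot answer.

```r
# Return the subset of `pkgs` that cannot be loaded from any library path.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only what is missing, into the personal library.
# The mirror URL is an assumption; use whichever CRAN mirror you prefer.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) {
    user_lib <- path.expand(Sys.getenv("R_LIBS_USER"))
    dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)
    install.packages(todo, lib = user_lib, repos = repos)
  }
  invisible(todo)
}

# 'stats' ships with R, so this call installs nothing.
install_if_missing("stats")
```

In a script, install_if_missing(c("microbenchmark", "data.table")) would then be safe to run repeatedly.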

Our R is faster than standard downloads

Compiled using Intel compiler and Intel Math Kernel Library

Task                     Standard R   BioHPC R   Speedup
Matrix Multiplication        139.15       1.80       77x
Cholesky Decomposition        19.53       0.32       61x
SVD                           45.66       1.95       23x
PCA                          201.30       6.25       32x
LDA                          135.37      17.60        7x

This is on a cluster node – speedup is less on clients with fewer CPU cores

For your own Mac or PC see http://www.revolutionanalytics.com/revolution-r-open

mkl_test.R
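Timings like those in the table can be collected with system.time. The sketch below is not the contents of mkl_test.R, which is not shown here; the matrix size is an assumption chosen to run quickly, and only three of the five tasks are covered.

```r
# Time a few dense linear algebra operations. These are the kinds of calls
# that an optimized BLAS/LAPACK (such as MKL) accelerates.
set.seed(42)
n <- 300                        # assumed size, small enough for any machine
a <- matrix(rnorm(n * n), n, n)
spd <- crossprod(a) + diag(n)   # symmetric positive definite, for chol()

timings <- c(
  matmul   = system.time(a %*% a)[["elapsed"]],
  cholesky = system.time(chol(spd))[["elapsed"]],
  svd      = system.time(svd(a))[["elapsed"]]
)
print(timings)
```

Running the same script under two R builds gives a like-for-like comparison on your own hardware.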

Benchmarking functions in R (and compiling them)

Compiling a function that is called often can increase speed.
The microbenchmark package allows you to benchmark functions.

library(compiler)
f <- function(n, x) for (i in 1:n) x = (1 + sin(x))^(cos(x))
g <- cmpfun(f)

library(microbenchmark)
compare <- microbenchmark(f(1000, 1), g(1000, 1), times = 1000)

library(ggplot2)
autoplot(compare)

functions.R

For speed – always vectorize!

54x speedup!

Function compilation improved the median time somewhat (< 2x).
Using the vector form was much faster.

distnorm <- function(){
  x <- seq(-5, 5, 0.01)
  y <- rep(NA, length(x))
  for(i in 1:length(x)) {
    y[i] <- stdnorm(x[i])
  }
  return(list(x=x, y=y))
}

vdistnorm <- function(){
  x <- seq(-5, 5, 0.01)
  y <- stdnorm(x)
  return(list(x=x, y=y))
}

functions.R
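The slide code calls stdnorm without showing it. The sketch below assumes it is the standard normal density (an assumption, since the definition lives in functions.R and is not reproduced here) and checks that the loop and vector forms return the same curve, so that only the run time differs.

```r
# Assumed definition: density of the standard normal distribution.
stdnorm <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)

# Loop form: one call per element.
distnorm <- function() {
  x <- seq(-5, 5, 0.01)
  y <- rep(NA_real_, length(x))
  for (i in seq_along(x)) y[i] <- stdnorm(x[i])
  list(x = x, y = y)
}

# Vector form: a single call on the whole vector.
vdistnorm <- function() {
  x <- seq(-5, 5, 0.01)
  list(x = x, y = stdnorm(x))
}

loop_result   <- distnorm()
vector_result <- vdistnorm()
print(all.equal(loop_result$y, vector_result$y))

# Rough timing of 200 repetitions of each form (values vary by machine).
print(system.time(for (k in 1:200) distnorm())[["elapsed"]])
print(system.time(for (k in 1:200) vdistnorm())[["elapsed"]])
```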

Our Example Application

# Define a function that performs a random walk with a
# specified bias that decays
rw2d <- function(n, mu, sigma){
  steps=matrix(, nrow=n, ncol=2)
  for (i in 1:n){
    steps[i,1] <- rnorm(1, mean=mu, sd=sigma )
    steps[i,2] <- rnorm(1, mean=mu, sd=sigma )
    mu <- mu/2
  }
  return( apply(steps, 2, cumsum) )
}

mc_parallel.R
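The earlier advice to vectorize also applies to this example. The sketch below is one possible vectorized variant, not part of mc_parallel.R; the name rw2d_vec is hypothetical. It relies on the bias at step i being mu / 2^(i - 1), so all step means can be computed before a single rnorm call.

```r
# Loop version, as on the slide.
rw2d <- function(n, mu, sigma) {
  steps <- matrix(, nrow = n, ncol = 2)
  for (i in 1:n) {
    steps[i, 1] <- rnorm(1, mean = mu, sd = sigma)
    steps[i, 2] <- rnorm(1, mean = mu, sd = sigma)
    mu <- mu / 2
  }
  apply(steps, 2, cumsum)
}

# Vectorized sketch (hypothetical helper): precompute the decaying means
# and draw every step in one rnorm() call.
rw2d_vec <- function(n, mu, sigma) {
  means <- mu * 0.5^(0:(n - 1))
  # byrow = TRUE fills x then y for each step, matching the loop's draw order
  steps <- matrix(rnorm(2 * n, mean = rep(means, each = 2), sd = sigma),
                  ncol = 2, byrow = TRUE)
  apply(steps, 2, cumsum)
}

# With the same seed the two versions should agree to numerical tolerance.
set.seed(1); walk_loop <- rw2d(200, 3, 1)
set.seed(1); walk_vec  <- rw2d_vec(200, 3, 1)
print(all.equal(walk_loop, walk_vec))
```

The loop version is kept in the following slides because it makes a convenient slow task for demonstrating parallel execution.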

A bigger task…

# Generate random walks of lengths between 1000 and 5000
# foreach loop
system.time(
  results <- foreach(l=1000:5000) %do% rw2d(l, 3, 1)
)
#    user  system elapsed
#  85.872   0.145  86.242

# Apply
system.time(
  results <- lapply( 1000:5000, rw2d, 3, 1)
)
#    user  system elapsed
#  81.175   0.114  81.511

mc_parallel.R

Start a cluster (of R slave workers on a single machine)

Single node, multiple cores running multiple R slaves

# Parallel, single node
library(parallel)
library(doParallel)

# Create a cluster of workers using all cores
cl <- makeCluster( detectCores() )
# Tell foreach with %dopar% to use this cluster
registerDoParallel(cl)

…

stopCluster(cl)

mc_parallel.R
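A complete start, use, stop cycle can be sketched with the base parallel package alone. This is an illustration rather than the contents of mc_parallel.R: it uses two workers instead of detectCores() to stay light, and wraps the work in tryCatch so the workers are released even if the parallel call fails.

```r
library(parallel)

# Two workers keep the sketch light; on a cluster node detectCores() is the
# value the slide uses.
cl <- makeCluster(2)

squares <- tryCatch(
  parLapply(cl, 1:8, function(i) i^2),
  finally = stopCluster(cl)   # always release the workers, even on error
)
print(unlist(squares))
```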

Explicit Parallelization in R

Our optimized R automatically parallelizes linear algebra on a single machine - enough in a lot of cases!

Always prefer using vector/matrix form over for loops and apply functions to get the most out of these optimizations.

If you need more options you can control the parallelization:

library(parallel)   # Single-node and cluster parallelization
                    # apply functions and explicit execution

library(doParallel) # Simple parallel foreach loops

Can run parallel code on a single node (multicore) or across nodes (MPI)

R parallel vs MKL conflict

Intel MKL tries to use all cores for every linear algebra operation.
R is running multiple iterations of a loop in parallel using all cores.

If used together too many threads/processes are launched – far more than cores!

export OMP_NUM_THREADS=1 # on terminal before running R

Sys.setenv(OMP_NUM_THREADS="1") # within R

~ 5% improvement by disabling MKL multi-threading
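Worker processes are separate R sessions, so it is worth checking that they actually see the setting. The sketch below sets the variable before starting a small cluster, so the workers inherit it, and then asks each worker to report its value. It only verifies the environment variable; whether MKL picks up a change made after R has started is not something this sketch tests.

```r
library(parallel)

# Set the variable before the workers start so they inherit it.
Sys.setenv(OMP_NUM_THREADS = "1")
cl <- makeCluster(2)

# Ask each worker what it sees.
worker_threads <- tryCatch(
  unlist(clusterEvalQ(cl, Sys.getenv("OMP_NUM_THREADS"))),
  finally = stopCluster(cl)
)
print(worker_threads)
```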

This time in parallel!

cl <- makeCluster( detectCores() )
registerDoParallel(cl)
Sys.setenv(OMP_NUM_THREADS="1")

# Generate random walks of lengths between 1000 and 5000
# Parallel foreach loop
system.time(
  results <- foreach(l=1000:5000) %dopar% rw2d(l, 3, 1)
)
#    user  system elapsed
#   2.928   0.441  17.374

# Parallel apply
system.time(
  results <- parLapply( cl, 1000:5000, rw2d, 3, 1)
)
#    user  system elapsed
#   0.339   0.171   8.460

stopCluster(cl)

5x speedup (parallel foreach)

9x speedup (parLapply)

mc_parallel.sh

MPI parallelization – for really big jobs

MPI is available on R/3.3.2-gccmkl only – contact us if you need others.
Must ‘module add R/3.3.2-gccmkl openmpi/gcc/64/1.6.5-mlnx-ofed’

We will continue to use the simple parallel and doParallel packages

Lots online about ‘snow’ – this is now behind the scenes in new versions of R

Please join us for coffee to discuss MPI projects using R

Work in progress optimizations with your help

MPI parallelization – easy!

cl <- makeCluster( 128, type="MPI" )

Number of MPI tasks = cores per node * nodes (or fewer if RAM limited)

56 cores per node for 256GBv1/GPUv1 partition
48 cores per node for 256GB partition
32 cores per node for other partitions
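The task-count rule above is simple arithmetic and can be written down as a helper. This is a sketch only: mpi_task_count is a hypothetical name, and the memory figures in the second call are illustrative assumptions, not measured limits for any partition.

```r
# Number of MPI tasks: cores per node * nodes, reduced if each task needs
# more memory than an even share of the node can provide.
mpi_task_count <- function(nodes, cores_per_node,
                           mem_per_node_gb = Inf, mem_per_task_gb = 0) {
  tasks_per_node <- cores_per_node
  if (mem_per_task_gb > 0) {
    tasks_per_node <- min(cores_per_node,
                          floor(mem_per_node_gb / mem_per_task_gb))
  }
  nodes * tasks_per_node
}

# 4 nodes with 32 cores each, no memory constraint
print(mpi_task_count(4, 32))

# 4 nodes with 48 cores each, but each task assumed to need 8 GB of 256 GB
print(mpi_task_count(4, 48, mem_per_node_gb = 256, mem_per_task_gb = 8))
```

The result is the value passed as the first argument to makeCluster with type="MPI".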

mpi_parallel.R

mpi.exit()

Add to bottom of your R code to ensure tidy exit

MPI parallelization – submitting the job

#!/bin/bash

#SBATCH --job-name R_MPI_TEST

# Number of nodes required to run this job
#SBATCH -N 4
# Distribute n tasks per node
#SBATCH --ntasks-per-node=32

#SBATCH -t 0-2:0:0
#SBATCH -o job_%j.out
#SBATCH -e job_%j.err
#SBATCH --mail-type ALL
#SBATCH --mail-user [email protected]

module load R/3.3.2-gccmkl
module load openmpi/gcc/64/1.6.5-mlnx-ofed

ulimit -l unlimited
R --vanilla < mpi_parallel.R

# END OF SCRIPT

No mpirun!

mpi_parallel.sh
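Rather than hard-coding 128 in the R script, the task count could be read from the job environment so that the script and the batch file stay in step. This sketch assumes the scheduler exports SLURM_NTASKS for the job, which may not hold if no task count was requested; the helper name slurm_task_count is hypothetical, and it falls back to 1 outside a job.

```r
# Read the task count Slurm allocated; fall back to `default` outside a job
# or when the variable is missing or not a positive integer.
slurm_task_count <- function(default = 1L) {
  value <- suppressWarnings(as.integer(Sys.getenv("SLURM_NTASKS", unset = NA)))
  if (is.na(value) || value < 1L) default else value
}

n_tasks <- slurm_task_count()
# cl <- makeCluster(n_tasks, type = "MPI")   # as on the earlier slide
print(n_tasks)
```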

MPI Performance

# Sequential (with MKL multi-threading)
system.time(
  results <- lapply( 1000:10000, rw2d, 3, 1)
)
#    user  system elapsed
# 329.173   0.610 330.607

# Parallel apply, 4 nodes, 128 MPI tasks
system.time(
  results <- parLapply( cl, 1000:10000, rw2d, 3, 1)
)
#    user  system elapsed
#  18.815   0.951  19.848

16x Speedup
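The speedup figure follows directly from the two elapsed times, and dividing by the task count gives a rough parallel efficiency. The numbers below are copied from the timings above; note that the sequential baseline ran with MKL multi-threading enabled, so this is not a strict one-core comparison.

```r
# Elapsed times reported above (seconds, from system.time).
sequential_elapsed <- 330.607
parallel_elapsed   <- 19.848
mpi_tasks          <- 128

speedup    <- sequential_elapsed / parallel_elapsed
efficiency <- speedup / mpi_tasks   # fraction of ideal linear scaling

cat(sprintf("speedup: %.1fx, efficiency: %.0f%%\n", speedup, 100 * efficiency))
```

An efficiency well below 100% is expected here, since each walk is short relative to the cost of sending work to 128 tasks across nodes.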

Rmarkdown / Knitr

Write R code inside markdown documents

Create attractive HTML, PDF, Word output that includes the code and output
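As a rough illustration, an R Markdown document is a text file with a short header, markdown prose, and R code chunks. The sketch below writes a minimal one to a temporary file and renders it to HTML only if the rmarkdown package and pandoc happen to be available; the document contents are invented for the example.

```r
# Build the chunk fence from single backticks so it can sit inside this block.
fence <- strrep("`", 3)

# Write a minimal R Markdown document to a temporary file.
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Minimal example\"",
  "output: html_document",
  "---",
  "",
  "Some text in markdown.",
  "",
  paste0(fence, "{r}"),
  "summary(c(1, 2, 3, 4))",
  fence
), rmd_path)

# Render only if rmarkdown and pandoc are available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(html_path)
}
```

In RStudio the same file can be rendered with the Knit button.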

BioconductoR

A comprehensive set of Bioinformatics related packages for R

Software and datasets

Bioconductor

Base packages installed, plus some commonly used extras

Install additional packages to home directory:

source("http://bioconductor.org/biocLite.R")
biocLite('limma')

Ask [email protected] for packages that fail to compile
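The biocLite route matches the R 3.3 and 3.4 modules described in this talk. Bioconductor releases for R 3.5 and later use the BiocManager package instead, so a script meant to work across versions could choose by R version. The sketch below is illustrative only: the helper names are hypothetical, and nothing is installed when the block runs.

```r
# Pick the installer by R version: biocLite() for the R 3.3/3.4 modules in
# this talk, BiocManager for R >= 3.5.
bioc_install_method <- function(r_version = getRversion()) {
  if (r_version >= "3.5.0") "BiocManager" else "biocLite"
}

# Hypothetical wrapper around the two install routes.
install_bioc <- function(pkgs) {
  if (bioc_install_method() == "BiocManager") {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(pkgs)
  } else {
    source("http://bioconductor.org/biocLite.R")
    biocLite(pkgs)
  }
}

print(bioc_install_method())
# install_bioc("limma")   # not run here: needs network access
```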

BioconductoR

Bioconductor workflows are fantastic tutorials

http://www.bioconductor.org/help/workflows/

BioconductoR Example

DEMO

RNA-Seq Analysis
Bioconductor, Rmarkdown/Knitr

See bioconductor.Rmd

Dallas R Users Group

http://www.meetup.com/Dallas-R-Users-Group/

University of Dallas, Irving, Saturdays
