Ten things I DON’T hate about you: some things I didnt know when I started using R that I wish I had Ty Stanford May 22, 2012 ADELAIDE R USERS GROUP


Ten things I DON’T hate about you:
some things I didn’t know when I started using R that I wish I had

Ty Stanford

May 22, 2012

ADELAIDE R USERS GROUP


We all use R here, but...

I thought I might give a little self-affirmatory spruiking of R

As a statistician, I find R ticks almost all the boxes:

• amazing data handling, easy to use
• allows low-level programming to some extent
  - as well as vectorised code
• extensive range of packages
• works in memory: the only downside, but there are packages for that (memory is becoming cheaper too!)


Wish I knew... #1

R-bloggers

• A ‘blog’ of R-articles widely sourced from the webernet
• Also a mailing list

www.r-bloggers.com

“R news and tutorials contributed by (X ) R bloggers”

X = 365 as of 19th May 2012


Wish I knew... #2

We know cran.r-project.org as the place where we download R...

But two of the pages contained within are one-stop shops:

• Task views: cran.r-project.org/web/views
• R language definition: cran.r-project.org/doc/manuals/R-lang.html


Wish I knew... #3

Indexing syntax


> ### create a matrix and play with indexes
> A<-matrix(101:108,nrow=2)
> colnames(A)<-paste("C",1:4,sep="")
> rownames(A)<-paste("R",1:2,sep="")
> A
    C1  C2  C3  C4
R1 101 103 105 107
R2 102 104 106 108
> A[2,4]
[1] 108
> A[1:2,4]
 R1  R2
107 108
> A[3:5]
[1] 103 104 105
> A[-(3:5)]
[1] 101 102 106 107 108
> A[,"C3"]
 R1  R2
105 106
> ### which() is a great fn to get indexes
> which(A>105)
[1] 6 7 8
> A[A>105]
[1] 106 107 108
> A %in% c(106,101,103)
[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
> c(106,101,103) %in% A
[1] TRUE TRUE TRUE
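The single-index results on this slide (A[3:5], which(A>105)) work because R stores a matrix column by column. As a small sketch that is not part of the original slides, the same positions can be turned back into row/column pairs with the arr.ind argument of which(); the variable names lin, rc and vals are only illustrative.

```r
# Rebuild the matrix from the slide
A <- matrix(101:108, nrow = 2)
colnames(A) <- paste("C", 1:4, sep = "")
rownames(A) <- paste("R", 1:2, sep = "")

# Linear (column-major) positions of the entries greater than 105
lin <- which(A > 105)

# The same entries expressed as row/column pairs
rc <- which(A > 105, arr.ind = TRUE)

# Indexing with a two-column matrix picks one element per row of 'rc'
vals <- A[rc]
```

Here rc has one row per matching entry, with columns named "row" and "col", which can be easier to read than linear positions when the matrix is large.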


Wish I knew... #4

Vectorise your code!

• It will run faster
• It makes code easier to read
• But what is vectorised code...


> n_a<-5
> (a<-seq(101,length=n_a))
[1] 101 102 103 104 105
> (a<-sample(a))
[1] 104 102 103 105 101
> diff(a)
[1] -2  1  2 -4
> ### how can we get diff(a)?
> ### #1 - for loop over elements
> (diffa1<-rep(0,n_a-1))
[1] 0 0 0 0
> for(i in 1:(n_a-1)) diffa1[i]<-a[i+1]-a[i]
> diffa1
[1] -2  1  2 -4
> ### #2 let's vectorise
> (indxs1<-1:(n_a-1))
[1] 1 2 3 4
> (indxs2<-indxs1+1)
[1] 2 3 4 5
> (diffa2<-a[indxs2]-a[indxs1])
[1] -2  1  2 -4
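The two index vectors in the vectorised version can also be written with negative indexing, as on the indexing slide. This is a minimal sketch rather than anything from the talk; diffa3 and diffa4 are made-up names, and a is typed in by hand to match the shuffled vector printed above.

```r
a <- c(104, 102, 103, 105, 101)  # the shuffled vector shown on the slide

# Drop the first element, drop the last element, and subtract
diffa3 <- a[-1] - a[-length(a)]

# The same idea using head()/tail() from the utils package
diffa4 <- tail(a, -1) - head(a, -1)
```

Both give the same values as diff(a), with no explicit loop and no helper index vectors.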


Wish I knew... #5

The list() object

We’re all probably familiar with the data structures:

• c(), matrix, data.frame

A list() is a more generic structure in which you can bundle many types of objects together

• Handy for data structures of different lengths


Some examples

> #empty initialised list
> a.list<-list()
> a.list[[2]]<-c("a","b")
> a.list
[[1]]
NULL

[[2]]
[1] "a" "b"

> #known length list
> b.list<-vector(mode="list",length=2)
> b.list
[[1]]
NULL

[[2]]
NULL


More examples

> #list with named elements
> x<-matrix(1:4,nrow=2)
> y<-c("a","b","c")
> c.list<-list(x=x,letters=y)
> c.list
$x
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$letters
[1] "a" "b" "c"

> #note there are differences to extracting elements to matrices etc
> c.list$letters #extract element at pos 2 by using element name
[1] "a" "b" "c"
> c.list[2] #this returns equivalent to a 1 element list
$letters
[1] "a" "b" "c"

> c.list[[2]] #this returns the col vec at element 2
[1] "a" "b" "c"
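Since lists are handy for elements of different lengths, it may help to see how one is usually processed. The sketch below is an illustration only, not from the slides: the list scores and its contents are hypothetical, and it uses the base functions sapply() and lapply().

```r
# A list whose elements have different lengths (hypothetical data)
scores <- list(groupA = c(3, 5, 8), groupB = c(2, 4), groupC = 7)

# sapply() applies a function to every element and simplifies the result
n_per_group <- sapply(scores, length)
mean_per_group <- sapply(scores, mean)

# lapply() always returns a list, so results of unequal length are fine
doubled <- lapply(scores, function(v) 2 * v)
```

The element names carry through, so n_per_group and mean_per_group are named vectors and doubled$groupB can be extracted by name, just like c.list$letters above.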


Wish I knew... #6

system.time()

### how long does a function take?
system.time(y<-somefunc(x))

### OR ###

### how long does some system of statements take?
### start!
t0<-proc.time()[3]

### <<do stuff>>

### how long did it take?
time.taken<-proc.time()[3]-t0
cat("The process took",time.taken,"seconds \n")
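To make the template above runnable, here is a small sketch with a hypothetical stand-in for somefunc() (the slides do not define it). It also picks out the third element of proc.time() by its name, "elapsed", which is assumed here to be a little more readable than [3].

```r
# Hypothetical stand-in for somefunc(): sum of square roots
somefunc <- function(x) sum(sqrt(x))
x <- 1:1e5

# Option 1: time a single call
tm <- system.time(y <- somefunc(x))

# Option 2: time a block of statements; "elapsed" is element 3 of proc.time()
t0 <- proc.time()[["elapsed"]]
y2 <- somefunc(x)
time.taken <- proc.time()[["elapsed"]] - t0
cat("The process took", time.taken, "seconds \n")
```

The printed time depends on the machine, so no expected output is shown.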


Wish I knew... #7

You need to re-install ALL of your packages if you upgrade to a newer version of R!

How do you remember all the packages you’ve installed?

You don’t. [1]

[1] onertipaday.blogspot.com.au


Step 1, before you get rid of your old R version

setwd("<<where you wanna save>>")
tmp <- installed.packages()
installedpkgs <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
save(installedpkgs, file="installed_old.rda")


Step 2, install new R version and run...

setwd("<<where you saved the .rda file just now>>")
load("installed_old.rda")
tmp <- installed.packages()
installedpkgs.new <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
missing <- setdiff(installedpkgs, installedpkgs.new)
install.packages(missing)
update.packages()
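The two steps rest on two small ideas: filtering installed.packages() on its "Priority" column, and taking a set difference. The sketch below shows each in isolation and installs nothing; the package names pkgA, pkgB and pkgC are hypothetical placeholders, not real packages.

```r
# Matrix with one row per installed package
tmp <- installed.packages()

# Base and recommended packages have a non-missing "Priority" field;
# the remaining rows are the packages added on top of the R distribution
userpkgs <- as.vector(tmp[is.na(tmp[, "Priority"]), 1])

# setdiff(old, new) keeps what is in 'old' but not in 'new'
old <- c("pkgA", "pkgB", "pkgC")   # hypothetical: saved from the old R
new <- c("pkgB")                   # hypothetical: present in the new R
to_install <- setdiff(old, new)
```

In the real workflow, to_install plays the role of the vector called missing that is passed to install.packages().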


Wish I knew... #8

A LaTeX & R tip

Want to put syntax-highlighted code in a LaTeX document or Beamer slideshow?

pygments.org


An example

Say we have a file, someRcode.R, as seen below, and we want to incorporate it into a LaTeX document

### some R code
### include some maths: $\phi\left[\frac{\pi}{2}\right]=\Delta$

afunc<-function(x) return(x^2)
cat("We are outputting some text as it highlights nice! \n")
x<-seq(-5,5,length=100)
plot(x,afunc(x))


Install pygments...

Then in the command line shell

$ cd Dropbox/Rusers/pygments

## get preamble code, put this in Rstyle.tex
$ pygmentize -f tex -S autumn -a .syntax > Rstyle.tex

## now pygmentise "someRcode.R"
$ pygmentize -O mathescape=True,style=autumn \
    -P "verboptions=frame=lines,gobble=0,numbers=left,..." \
    -o someRcode.tex someRcode.R


Then your .tex document looks like:

\documentclass[11pt]{article}

%need these packages
\usepackage{fancyvrb}
\usepackage{color}

%input pygments style commands
%(not overly human readable)
\input{Rstyle.tex}

\begin{document}

%input the file created - pygments syntax highlighting
\input{someRcode.tex}

\end{document}


And the output...

someRcode.R
1 ### some R code
2 ### include some maths: φ[π/2] = Δ
3
4 afunc<-function(x) return(x^2)
5 cat("We are outputting some text as it highlights nice! \n")
6 x<-seq(-5,5,length=100)
7 plot(x,afunc(x))


Wish I knew... #9

The compiler package

Since R v2.13 the compiler package has been included with R


An example

require(compiler)

oursd<-function(x){
  nx<-length(x)
  sdout<-xbar<-0
  for(i in 1:nx) xbar<-xbar+x[i]
  xbar<-xbar/nx
  for(i in 1:nx) sdout<-sdout+(xbar-x[i])^2
  sdout<-sdout/(nx-1)
  return(sqrt(sdout))
}
compiledsd<-cmpfun(oursd)


Test it!

> set.seed(87455687)
> x<-runif(1e7) ### Ten million obs
> system.time(sd1<-oursd(x))
#   user  system elapsed
# 31.794   0.268  35.017
> system.time(sd2<-compiledsd(x))
#   user  system elapsed
#  7.083   0.062   7.787
> system.time(sd3<-sd(x))
#   user  system elapsed
#  0.091   0.000   0.095
> sd1
#[1] 0.2886422
> sd2
#[1] 0.2886422
> sd3
#[1] 0.2886422
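Tying this back to tip #4, the looped oursd() can also be written without any explicit loop. The function vecsd below is a sketch added for illustration, not something from the talk; it uses a smaller sample and an arbitrary seed, and makes no claim about timings.

```r
# Vectorised version of the looped oursd() from the slides
vecsd <- function(x) {
  sqrt(sum((x - mean(x))^2) / (length(x) - 1))
}

set.seed(1)            # arbitrary seed, not the one used in the talk
x_small <- runif(1e4)  # smaller sample than the ten million on the slide
sd_vec <- vecsd(x_small)
sd_ref <- sd(x_small)
```

The two results should agree up to floating-point rounding, which all.equal(sd_vec, sd_ref) can confirm.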


Wish I knew... #10

You can call C for computationally intensive algorithms

Use the R function

.C("C func name",arg1,arg2,...)

This returns a list of the arguments, holding any values the C code changed

[[1]]
arg1

[[2]]
arg2
...


Create a void C function

c_sd.c

#include <R.h>

void c_sd(double *varout, double *myvec, int *nv)
{
  double xbar=0;
  double tempval=0;
  int n=*nv;
  for(int i=0;i<n;i++)
    xbar+=myvec[i];
  xbar=xbar/n;
  for(int i=0;i<n;i++)
    tempval+=(myvec[i]-xbar)*(myvec[i]-xbar);

  *varout=tempval/(n-1);
}

Then compile it in the command window

$ R CMD SHLIB c_sd.c


This is how you call it in R

dyn.load("Dropbox/Rusers/Code/c_sd.so")
c_sd<-function(x){
  sdout<-as.double(0)
  x<-as.double(x)
  nx<-as.integer(length(x))
  sdout<-.C("c_sd",sdout,x,nx)[[1]]
  return(sqrt(sdout))
}
system.time(sd4<-c_sd(x))
#   user  system elapsed
#  0.190   0.126   0.334
sd4
#[1] 0.2886422


Wish I knew... #11

The R function dir.create() is somewhat limited

Sometimes you need to create output dynamically

• and want to create folders accordingly

dir.create() won’t solve all your problems...


A function to help

createMultDir<-function(dirloc,os=.Platform$OS.type){
  if(os=="unix"){
    ourdirs<-strsplit(dirloc,"/")[[1]]
    if(ourdirs[1]=="") ourdirs<-ourdirs[2:length(ourdirs)] #starts with "/"
    nd<-length(ourdirs)
    movingdir<-"/"
    anyCreate<-FALSE
    for(i in 1:nd){
      movingdir<-paste(movingdir,ourdirs[i],sep="")
      if(!file.exists(movingdir)){
        dir.create(movingdir)
        anyCreate<-TRUE
      }
      movingdir<-paste(movingdir,"/",sep="")
    }
    if(anyCreate) return(paste("sucessfully created:",movingdir))
    else return(paste("The file path:",movingdir,"already exists"))
  }else{
    return("not a mac - not yet implemented")
  }
}


Use the function

> createMultDir("/Dropbox/Rusers/TestFolder")
[1] "sucessfully created: /Dropbox/Rusers/TestFolder/"

### try again
> createMultDir("/Dropbox/Rusers/TestFolder")
[1] "The file path: /Dropbox/Rusers/TestFolder/ already exists"
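As an alternative to the hand-rolled function, and not something covered in the talk, dir.create() itself takes a recursive argument that may handle the nested-folder case on any platform. The sketch below assumes it is acceptable to work under tempdir(); the folder names simply echo the example above.

```r
# Work inside the session's temporary directory so nothing permanent is created
target <- file.path(tempdir(), "Rusers", "TestFolder")

# recursive=TRUE creates any missing parent folders along the way
created <- dir.create(target, recursive = TRUE, showWarnings = FALSE)

exists_now <- file.exists(target)

# A second call returns FALSE because the folder is already there
created_again <- dir.create(target, recursive = TRUE, showWarnings = FALSE)
```

The logical return value plays the same role as the two messages from createMultDir(): TRUE when something was created, FALSE when the path already existed.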


Wish I knew... #12

Nah, I’ll leave it there.

Thank you for your attention.