Introduction to R programming
Quantitative Data Analysis
Working with R
What is R
A computer language, with orientation toward statistical applications
Advantages
Completely free, just download from the Internet
Many add-on packages for specialized uses
Open source
Getting Started: Installing R
Have an Internet connection
Go to http://cran.r-project.org/
On the "R for Windows" screen, click "base"
Find and click on "Download R"
Click Run, OK, or Next for all screens
End up with the R icon on the desktop
At http://cran.r-project.org/
Downloading Base R
Click on Windows
Then in the next screen, click on "base"
Then screens for Run, OK, or Next
And finally "Finish"
This will put the R icon on the desktop
Rgui and R Console, ending with the R prompt (>)
The R prompt (>)
> This is the "R prompt." It says R is ready to take your command.
Enter these after the prompt and observe the output:
>2+3
>2^3+(5)
>6/2+(8+5)
>2 ^ 3 + (5)
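For reference, a minimal sketch of what these expressions evaluate to, relying only on R's standard operator precedence (^ before /, and / before +). The variable names r1 to r4 are introduced here purely for illustration.

```r
# The expressions from the slide, stored so the results can be inspected.
r1 <- 2 + 3            # addition
r2 <- 2^3 + (5)        # exponent first, then addition
r3 <- 6/2 + (8 + 5)    # division before addition
r4 <- 2 ^ 3 + (5)      # extra spaces make no difference to R
print(c(r1, r2, r3, r4))  # 5 13 16 13
```

Typing an expression at the prompt prints its value directly; assigning it with <- stores the value without printing.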
Installing Packages and Libraries
install.packages("akima")
install.packages("chron")
install.packages("lme4")
install.packages("mcmc")
# Note: odesolve has since been archived on CRAN; deSolve is its successor
install.packages("odesolve")
install.packages("spdep")
install.packages("spatstat")
install.packages("tree")
install.packages("lattice")
Installing Packages and Libraries
R.version
installed.packages()
update.packages()
setRepositories()
Help
help(mean)
?mean
help() will not find a function in a package unless you install the package and load it with library()
help.search("aspline") will find functions in packages that are installed but not loaded
apropos("lm")
Help
For help on a whole package:
help(package=akima)
objects(grep("akima",search()))
library("akima")
my.packages <- search()
aki <- grep("akima",my.packages)
my.objects <- objects(aki)
Help
example(mean)
demo()
demo(package = .packages(all.available = TRUE))
demo(graphics)
vignette(all=TRUE)
V <- vignette("sp")
print(V)
edit(V)
Maintenance
ls() / objects()
search()
class(a)
rm(a,b,c)
rm(list=ls())
Maintenance
getwd()
setwd()
source("myprogram.R")
save(list = ls(all=TRUE), file= "all.Rdata")
load("all.Rdata")
save.image()
savehistory()
To cite use of R
To cite the use of R for statistical work, R documentation recommends the following: R Development Core Team (2010). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.
Get the latest citation by typing citation() at the prompt.
Email Support Lists
http://r-project.org under "mailing lists"
r-help is the most general one
Before posting, read: http://www.R-project.org/posting-guide.html
Send the smallest possible example of your problem (generated data is handy)
sessionInfo() will list your computer and R details to cut and paste into your question
Quantitative Data Analysis
Programming with R
Basic concepts
Code
Commands
Programs
Objects
Types
Functions
Operators
Assignment
a <- 1
assign("b", 2)
Mathematical operators
+ - * / ^ arithmetic
> >= < <= == != relational
! & | logical
$ list indexing (the 'element name' operator)
: create a sequence
~ model formulae
Logical operators
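! logical NOT
& logical AND
| logical OR
< less than
<= less than or equal to
> greater than
>= greater than or equal to
== logical equals (double =)
!= not equal
&& AND with IF (compares single values)
|| OR with IF (compares single values)
xor(x,y) exclusive OR
isTRUE(x) an abbreviation of identical(TRUE,x)
all(x) TRUE if every element of x is TRUE
any(x) TRUE if at least one element of x is TRUE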
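A short sketch of how the element-wise operators differ from the single-value forms used with if. The vectors x and y and the value n are made-up illustration data, not part of the original slides.

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

elementwise_and <- x & y      # compares position by position
elementwise_or  <- x | y
exclusive_or    <- xor(x, y)  # TRUE where exactly one of the pair is TRUE

# && and || work on single values, which is what if() needs
n <- 5
if (n > 0 && n < 10) {
  in_range <- TRUE
} else {
  in_range <- FALSE
}

all_true <- all(x)  # FALSE, because one element is FALSE
any_true <- any(x)  # TRUE, because at least one element is TRUE
```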
Mathematical functions
log(x) log to base e of x
exp(x) antilog of x (e^x)
log(x,n) log to base n of x
log10(x) log to base 10 of x
sqrt(x) square root of x
factorial(x) x!
choose(n,x) binomial coefficients n!/(x!(n−x)!)
gamma(x) Γ(x) for real x; (x−1)! for integer x
lgamma(x) natural log of Γ(x)
Mathematical functions
floor(x) greatest integer ≤ x
ceiling(x) smallest integer ≥ x
trunc(x) integer part of x, dropping the fractional digits (rounds toward 0)
round(x, digits=0) round the value of x to an integer
abs(x) the absolute value of x, ignoring the minus sign if there is one
signif(x, digits=6) give x to 6 significant digits
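The rounding functions differ most visibly on negative numbers. A small sketch, using made-up input values:

```r
x <- -2.5
down    <- floor(x)    # -3: next integer below x
up      <- ceiling(x)  # -2: next integer above x
to_zero <- trunc(x)    # -2: fractional part dropped, moving toward 0

one_decimal <- round(2.567, digits = 1)       # 2.6
three_sig   <- signif(123456.789, digits = 3) # 123000
```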
Trigonometrical functions
cos(x) cosine of x in radians
sin(x) sine of x in radians
tan(x) tangent of x in radians
acos(x), asin(x), atan(x) inverse trigonometric transformations of real or complex numbers
acosh(x), asinh(x), atanh(x) inverse hyperbolic trigonometric transformations of real or complex numbers
Infinity and Things that Are Not a Number
Inf (is.finite, is.infinite)
3/0
2 / Inf
exp(-Inf)
(0:3)^Inf
NaN (is.nan)
0/0
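A sketch of what the expressions above evaluate to, together with the test functions named in parentheses. The variable names are illustrative only.

```r
pos_inf  <- 3 / 0        # Inf
zero1    <- 2 / Inf      # 0
zero2    <- exp(-Inf)    # 0
powers   <- (0:3)^Inf    # 0 1 Inf Inf
not_num  <- 0 / 0        # NaN

finite_check <- is.finite(c(pos_inf, zero1, not_num))  # FALSE TRUE FALSE
inf_check    <- is.infinite(pos_inf)                   # TRUE
nan_check    <- is.nan(not_num)                        # TRUE
na_check     <- is.na(not_num)                         # TRUE: NaN also counts as missing
```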
Vectors
a <- c(1,2,3,4,5)
a <- 1:5
a <- scan()
a <- seq(1,10,2)
b <- 1:4
a <- seq(1,10,along=b)
x <- runif(10)
which(a == 2)
Plotting functions
x<-seq(-10,10,0.1)
y<-x^3
plot(x,y,type="l")
Vector functions
max(x) maximum value in x
min(x) minimum value in x
sum(x) total of all the values in x
sort(x) a sorted version of x
rank(x) vector of the ranks of the values in x
order(x) an integer vector containing the permutation to sort x into ascending order
range(x) vector of min(x) and max(x)
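The difference between sort, rank and order is easiest to see on a tiny vector. A sketch with made-up values:

```r
x <- c(30, 10, 20)

sorted  <- sort(x)   # 10 20 30: the values themselves, in ascending order
ranks   <- rank(x)   # 3 1 2: the position each value would take once sorted
perm    <- order(x)  # 2 3 1: the indices that put x into ascending order
limits  <- range(x)  # 10 30: minimum and maximum

# Indexing x by order(x) gives the same result as sort(x)
same <- identical(x[perm], sorted)
```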
More functions
cumsum(x) vector containing the sum of all of the elements up to that point
cumprod(x) vector containing the product of all of the elements up to that point
cummax(x) vector of non-decreasing numbers which are the cumulative maxima of the values in x up to that point
cummin(x) vector of non-increasing numbers which are the cumulative minima of the values in x up to that point
pmax(x,y,z) vector, of length equal to the longest of x, y or z, containing the maximum of x, y or z for the ith position in each
pmin(x,y,z) vector, of length equal to the longest of x, y or z, containing the minimum of x, y or z for the ith position in each
rowSums(x) row totals of dataframe or matrix x
colSums(x) column totals of dataframe or matrix x
Functions
Geometric mean (p.49)
geometric<-function (x) exp(mean(log(x)))
Harmonic mean (p.51)
harmonic<-function (x) 1/mean(1/x)
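A quick check of the two functions on made-up data, with the arithmetic mean alongside for comparison. The sample vectors are illustrative assumptions.

```r
geometric <- function(x) exp(mean(log(x)))
harmonic  <- function(x) 1 / mean(1 / x)

growth <- c(1, 10, 100)
arith_mean <- mean(growth)       # 37
geom_mean  <- geometric(growth)  # 10, the cube root of 1 * 10 * 100

speeds <- c(1, 2, 4)
harm_mean <- harmonic(speeds)    # 3 / (1 + 1/2 + 1/4) = 12/7
```

The geometric mean suits multiplicative processes such as growth rates, and the harmonic mean suits averaging rates; both require positive values.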
Exercises
Finding the value in a vector that is closest to a specified value
closest<-function(xv,sv){
  xv[which(abs(xv-sv)==min(abs(xv-sv)))]
}
Calculate a trimmed mean of x which ignores both the smallest and largest values
trimmed.mean <- function (x) {
  mean(x[-c(which(x==min(x)),which(x==max(x)))])
}
Sets
union(x,y)
intersect(x,y)
setdiff(x,y)
setequal(x,y)
is.element(el,set)
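A sketch of the set functions on two small made-up vectors:

```r
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)

either   <- union(x, y)       # 1 2 3 4 5: values in x or y, without duplicates
both     <- intersect(x, y)   # 3 4: values in both
only_x   <- setdiff(x, y)     # 1 2: values in x that are not in y
same_set <- setequal(x, y)    # FALSE
has_3    <- is.element(3, y)  # TRUE, equivalent to 3 %in% y
```

Note that setdiff is not symmetric: setdiff(y, x) would return 5.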
Matrices
X<-matrix(c(1,0,0,0,1,0,0,0,1),nrow=3)
dim(X)
is.matrix(X)
vector<-c(1,2,3,4,4,3,2,1)
V<-matrix(vector,byrow=TRUE,nrow=2)
dim(vector) <- c(2,4)
Matrices
X<-rbind(X,apply(X,2,mean))
X<-cbind(X,apply(X,1,var))
sweep
matdata<-read.table("data\\sweepdata.txt")
cols<-apply(matdata,2,mean)
sweep(matdata,2,cols)
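Since sweepdata.txt is not included here, the following sketch applies the same steps to a small generated matrix (an assumption for illustration). By default sweep subtracts the supplied statistic, so sweeping out the column means centres each column on zero.

```r
# Generated stand-in for the data file: columns are 1:3 and 4:6
m <- matrix(1:6, nrow = 3)

cols     <- apply(m, 2, mean)   # column means: 2 and 5
centered <- sweep(m, 2, cols)   # subtract each column's mean from that column

new_means <- colMeans(centered) # both 0 after centring
```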
Lists
person <- list()
person$name <- "Alberto"
person$age <- 37
person$nationality <- "Spain"
class(person)
[1] "list"
> person
$name
[1] "Alberto"
$age
[1] 37
$nationality
[1] "Spain"
names(person)
[1] "name" "age" "nationality"
Strings
phrase<-"the quick brown fox jumps over the lazy dog"
letras <- table(strsplit(phrase,split=character(0)))
numwords<-1+table(strsplit(phrase,split=character(0)))[1]
words <- unlist(strsplit(phrase,split=" "))
words[grep("o",words)]
"fox" %in% unlist(strsplit(phrase,split=" "))
unlist(strsplit(phrase,split=" ")) %in% c("fox","dog")
Strings
nchar(words)
paste(words[1],words[2])
toupper(words)
Regular expressions
grep("^t", words)
words[grep("^t", words)]
words[grep("s$", words)]
gsub("o","O",words)
regexpr()
Dataframes
lista <- data.frame()
lista[1,1] = "Alberto"
lista[1,2] = 37
lista[2,1] = "Ana"
lista[2,2] = 23
names(lista) <- c("Name", "Age")
Missing values
NA (is.na)
x<-c(1:8,NA)
mean(x)
mean(x,na.rm=T)
which(is.na(x))
as.vector(na.omit(x))
x[!is.na(x)]
Dates and Times in R
date()
date<- as.POSIXlt(Sys.time())
unlist(unclass(date))
difftime()
excel.dates <- c("27/02/2004", "27/02/2005", "14/01/2003", "28/06/2005", "01/01/1999")
strptime(excel.dates,format="%d/%m/%Y")
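A sketch of how strptime and difftime can be combined, using the first two dates from the slide. The tz = "UTC" argument is an added assumption so that the result does not depend on the local time zone.

```r
excel.dates <- c("27/02/2004", "27/02/2005")

# Parse day/month/year strings into date-time objects
parsed <- strptime(excel.dates, format = "%d/%m/%Y", tz = "UTC")

# Elapsed time between the two dates, expressed in days
gap      <- difftime(parsed[2], parsed[1], units = "days")
gap_days <- as.numeric(gap)   # 366, because 2004 is a leap year

# Dates can be written back out in another format
iso_first <- format(parsed[1], "%Y-%m-%d")
```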
Testing and Coercing in R
if
if (y > 0) print(1) else print(-1)
z <- ifelse(y < 0, -1, 1)
Loops and Repeats
for (i in 1:10) print(i^2)
i <- 1
while (i <= 10) {
  print(i^2)
  i <- i + 1
}
i <- 1
repeat {
  if (i > 10) break
  print(i^2)
  i <- i + 1
}
Exercise
Compute the Fibonacci series 1, 1, 2, 3, 5, 8
fibonacci<-function(n) {
  a<-1
  b<-0
  while(n>0) {
    swap<-a
    a<-a+b
    b<-swap
    n<-n-1
  }
  b
}
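To check the function against the series in the exercise, it can be applied to the positions 1 to 6. The sketch below repeats the definition so that it runs on its own; first_six is an illustrative name.

```r
fibonacci <- function(n) {
  a <- 1
  b <- 0
  while (n > 0) {
    swap <- a      # remember the current value
    a <- a + b     # next term is the sum of the last two
    b <- swap      # shift the old value along
    n <- n - 1
  }
  b
}

# Apply the function to each position in turn
first_six <- sapply(1:6, fibonacci)
print(first_six)   # 1 1 2 3 5 8
```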
Avoid loops
x<-runif(10000000)
system.time(max(x))
pc<-proc.time()
cmax<-x[1]
for (i in 2:length(x)) {
if(x[i]>cmax) cmax<-x[i]
}
proc.time()-pc
switch
central<-function(y, measure) {switch(measure,
Mean = mean(y),
Geometric = exp(mean(log(y))),
Harmonic = 1/mean(1/y),
Median = median(y),
stop("Measure not included"))
}
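A usage sketch for central on made-up data. The function is repeated so that the example runs on its own, and tryCatch is used here only to capture the error message that the final, unnamed switch branch produces for an unknown measure.

```r
central <- function(y, measure) {
  switch(measure,
    Mean      = mean(y),
    Geometric = exp(mean(log(y))),
    Harmonic  = 1 / mean(1 / y),
    Median    = median(y),
    stop("Measure not included"))   # default branch for any other name
}

y <- c(1, 10, 100)
m_mean   <- central(y, "Mean")       # 37
m_median <- central(y, "Median")     # 10
m_geom   <- central(y, "Geometric")  # 10

# An unlisted measure falls through to stop(); capture its message
msg <- tryCatch(central(y, "Mode"), error = function(e) conditionMessage(e))
```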
Quantitative Data Analysis
Working with datasets
Help for Datasets
To list built-in datasets:
data()
data(package = .packages(all.available = TRUE))
data(swiss)
For help on a dataset:
help(swiss)
"Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888."
The attach Command
To access individual variables, do this:
> attach(swiss)
Now try:
> mean(Fertility)
> detach(swiss)
Using R Functions: Simple Stuff
rownames(swiss)
colnames(swiss)
summary(swiss)
Applying functions
mean(swiss$Fertility)
sd(swiss$Fertility)
apply(swiss,2,max)
Factors
class(Detergent)
nlevels(Detergent)
levels(Detergent)
as.factor()
Working with your dataset
fix(swiss)
hist(Agriculture)
plot(Catholic,Fertility)
Working with your own datasets
write.table(swiss, "swiss.txt")
swiss2 <- read.table("swiss.txt")
data<-read.table(file.choose(),header=T)
readLines()
Reading data from files
read.table(file) reads a file in table format and creates a data frame from it; the default separator sep="" is any whitespace; use header=TRUE to read the first line as a header of column names; use as.is=TRUE to prevent character vectors from being converted to factors; use comment.char="" to prevent "#" from being interpreted as a comment; use skip=n to skip n lines before reading data; see the help for options on row naming, NA treatment, and others
read.csv("filename", header=TRUE) the same, but with defaults set for reading comma-delimited files
read.delim("filename", header=TRUE) the same, but with defaults set for reading tab-delimited files
read.fwf(file,widths) reads a table of fixed-width formatted data into a data.frame; widths is an integer vector giving the widths of the fixed-width fields
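A self-contained sketch of reading a comma-delimited file. It writes a tiny made-up file to a temporary location first, so the file name and contents are assumptions for illustration only.

```r
# Create a small comma-delimited file in a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Alberto,37", "Ana,23"), tmp)

# header = TRUE takes the column names from the first line
people <- read.csv(tmp, header = TRUE)

col_names <- names(people)     # "name" "age"
n_rows    <- nrow(people)      # 2
mean_age  <- mean(people$age)  # 30

unlink(tmp)  # remove the temporary file
```

The same file could be read with read.table(tmp, header = TRUE, sep = ","), since read.csv only changes the defaults.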
Example
data<-read.table(".\\data\\daphnia.txt",header=T)
names(data)
attach(data)
table(Detergent)
tapply(Growth.rate,Detergent,mean)
aggregate(Growth.rate,list(Detergent), mean)
tapply(Growth.rate,list(Water,Daphnia),median)
with(data,boxplot(Growth.rate ~ Detergent))
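Because daphnia.txt is not included here, the sketch below shows the same group-wise summaries on a small generated data frame. The values in toy are made up; only the column names Growth.rate and Detergent follow the example above.

```r
# Generated stand-in for the daphnia data
toy <- data.frame(
  Growth.rate = c(2, 4, 3, 5, 6, 8),
  Detergent   = factor(c("A", "A", "B", "B", "C", "C"))
)

# Mean growth rate per detergent, returned as a named vector
means <- tapply(toy$Growth.rate, toy$Detergent, mean)   # A = 3, B = 4, C = 7

# The same summary as a data frame, using the formula interface
agg <- aggregate(Growth.rate ~ Detergent, data = toy, FUN = mean)
```

Referring to columns as toy$Growth.rate, or passing data = toy, avoids the need for attach() and the name clashes it can cause.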