R-Studio and Revolution Analytics have built additional functionality on top of base R.

Upload: rose-hines

Post on 02-Jan-2016


Revolution Analytics has moved onto the radar screen for predictive analytics

http://www.forrester.com/pimages/rws/reprints/document/85601/oid/1-KWYFVB

Enter Commands / View Results

Write Code / Program: Input Data, Analyze, Graphics. Datasets, etc.

Character Vector: b <- c("one","two","three")

Numeric Vector: a <- c(1,2,5.3,6,-2,4)

Matrix: y <- matrix(1:20, nrow=5, ncol=4)

Dataframe:
d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed")

List: w <- list(name="Fred", age=5.3)

Data Structures
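A short sketch of how these structures can be inspected once created. It reuses the objects from the slide; the calls to class(), dim() and str() are additions for illustration and are not part of the original deck.

```r
# Objects as defined on the slide
a <- c(1, 2, 5.3, 6, -2, 4)              # numeric vector
b <- c("one", "two", "three")            # character vector
y <- matrix(1:20, nrow = 5, ncol = 4)    # 5 x 4 matrix, filled column by column
w <- list(name = "Fred", age = 5.3)      # list mixing a string and a number

class(a)     # "numeric"
class(b)     # "character"
dim(y)       # 5 4
y[2, 3]      # row 2, column 3 of the matrix: 12
w$name       # "Fred"
str(w)       # compact overview of the list
```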

Framework Source: Hadley Wickham

Actor Heights

1) Create Vectors of Actor Names, Heights, Date of Birth, Gender

2) Combine the 4 Vectors into a DataFrame

• Numeric: e.g. heights

• String: e.g. names

• Dates: e.g. "12-03-2013"

• Factor: e.g. gender

• Boolean: TRUE, FALSE

Variable Types
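One way to see which type R has assigned is class(). The sketch below uses made-up single values for each type; the variable names are hypothetical, and reading "12-03-2013" as month-day-year is an assumption, since the slide does not say which order is meant.

```r
height <- 77                                          # numeric
name   <- "John"                                      # string (character)
dob    <- as.Date("12-03-2013", format = "%m-%d-%Y")  # date; month-day-year assumed
gender <- factor("male")                              # factor
passed <- TRUE                                        # Boolean (logical)

class(height)   # "numeric"
class(name)     # "character"
class(dob)      # "Date"
class(gender)   # "factor"
class(passed)   # "logical"
```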

• We use the c() function and put each value in quotation marks so that R knows it is string data.

• ?c opens the help page for c(): Combine Values into a Vector or List

Creating a Character / String Vector

• Create a variable (aka object) called ActorNames:

ActorNames <- c("John", "Meryl", "Jennifer", "Andre")

Class, Length, Index

class(ActorNames)

length(ActorNames)

ActorNames[2]
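For reference, a sketch of what these three calls return, plus two further indexing forms that are not on the slide (a vector of positions and a negative index).

```r
ActorNames <- c("John", "Meryl", "Jennifer", "Andre")

class(ActorNames)      # "character"
length(ActorNames)     # 4
ActorNames[2]          # "Meryl" (indexing starts at 1)

# Additional forms, shown for illustration only
ActorNames[c(1, 4)]    # "John" "Andre"
ActorNames[-1]         # all elements except the first
```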

• Create a variable called ActorHeights (inches):

ActorHeights <- c(77, 66, 70, 90)

Creating a Numeric Vector / Variable

• Use the as.Date() function:

ActorDoB <- as.Date(c("1930-10-27", "1949-06-22", "1990-08-15", "1946-05-19"))

• Each date has been entered as a text string (in quotations) in the appropriate format (yyyy-mm-dd).

• By enclosing these data in the as.Date() function, these strings are converted to date objects.

Creating a Date Variable
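Once the strings are Date objects they can be compared, subtracted and reformatted. A minimal sketch; the format argument and the "10/27/1930" input are illustrative additions showing how a string that is not in yyyy-mm-dd form might be handled.

```r
ActorDoB <- as.Date(c("1930-10-27", "1949-06-22", "1990-08-15", "1946-05-19"))

class(ActorDoB)                          # "Date"
birth_years <- as.numeric(format(ActorDoB, "%Y"))
birth_years                              # 1930 1949 1990 1946
ActorDoB[2] > ActorDoB[1]                # TRUE: the second date is later
ActorDoB[2] - ActorDoB[1]                # time difference in days

# A string in another layout needs an explicit format
other <- as.Date("10/27/1930", format = "%m/%d/%Y")
other == ActorDoB[1]                     # TRUE
```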

• Use the factor() function:

ActorGender <- c("male", "female", "female", "male")

class(ActorGender)

ActorGender <- factor(ActorGender)

Creating a Categorical / Factor Variable
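A sketch of what changes after the conversion: the vector goes from plain strings to a factor with a fixed set of levels. The calls to levels(), table() and as.integer() are additions for illustration.

```r
ActorGender <- c("male", "female", "female", "male")
class(ActorGender)                # "character"

ActorGender <- factor(ActorGender)
class(ActorGender)                # "factor"
levels(ActorGender)               # "female" "male" (sorted alphabetically by default)
table(ActorGender)                # counts per level
as.integer(ActorGender)           # underlying level codes: 2 1 1 2
```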

Actor.DF <- data.frame(Name=ActorNames, Height=ActorHeights, BirthDate=ActorDoB, Gender=ActorGender)

Vectors and DataFrames

dim(Actor.DF)
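A self-contained sketch that rebuilds the data frame from the four vectors and checks its shape. nrow(), ncol(), names() and str() are additions; stringsAsFactors = FALSE is an assumption added here so that Name stays a character column on older R versions too.

```r
ActorNames   <- c("John", "Meryl", "Jennifer", "Andre")
ActorHeights <- c(77, 66, 70, 90)
ActorDoB     <- as.Date(c("1930-10-27", "1949-06-22", "1990-08-15", "1946-05-19"))
ActorGender  <- factor(c("male", "female", "female", "male"))

Actor.DF <- data.frame(Name = ActorNames, Height = ActorHeights,
                       BirthDate = ActorDoB, Gender = ActorGender,
                       stringsAsFactors = FALSE)

dim(Actor.DF)      # 4 4 (rows, columns)
nrow(Actor.DF)     # 4
ncol(Actor.DF)     # 4
names(Actor.DF)    # "Name" "Height" "BirthDate" "Gender"
str(Actor.DF)      # type of each column at a glance
```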

Actor.DF[4,3] # row 4, column 3

Actor.DF[1,3] # row 1, column 3

Actor.DF[1,] # row 1, all columns

Actor.DF[2:3,] # rows 2 and 3, all columns

Actor.DF[,2] # column 2, all rows

Accessing Rows and Columns
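Columns can also be reached by name rather than by position, which tends to be easier to read. A small sketch on a reduced two-column version of the actor data; the reduced data frame and stringsAsFactors = FALSE are assumptions made to keep the example short.

```r
Actor.DF <- data.frame(Name = c("John", "Meryl", "Jennifer", "Andre"),
                       Height = c(77, 66, 70, 90),
                       stringsAsFactors = FALSE)

Actor.DF$Height                 # whole column as a vector: 77 66 70 90
Actor.DF[["Name"]]              # same idea with double brackets
Actor.DF[2, "Height"]           # row 2 of the Height column: 66
Actor.DF[, c("Name", "Height")] # several columns by name
```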

> getwd()
[1] "C:/Users/johnp_000/Documents"

> setwd("C:/Users/johnp_000/Documents") # setwd() needs the target directory as its argument

getwd() setwd()

• write.table(Actor.DF, "ActorData.txt", sep="\t", row.names = TRUE)

• write.csv(Actor.DF, "ActorData.csv")

Write / Create a File
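A sketch of a write-then-read round trip. To avoid touching the working directory it writes to a temporary folder via tempdir(); that choice, the variable names out_file and Actor.Back, and row.names = FALSE are assumptions for illustration.

```r
Actor.DF <- data.frame(Name = c("John", "Meryl", "Jennifer", "Andre"),
                       Height = c(77, 66, 70, 90),
                       stringsAsFactors = FALSE)

# Write to a temporary location instead of the working directory
out_file <- file.path(tempdir(), "ActorData.csv")
write.csv(Actor.DF, out_file, row.names = FALSE)

# Read the file back to confirm the contents survived
Actor.Back <- read.csv(out_file, stringsAsFactors = FALSE)
Actor.Back
```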

Add New Variable: Height -> Feet, Inches

Actor.DF$Feet <- floor(Actor.DF$Height/12)
Actor.DF$Inches <- Actor.DF$Height - (Actor.DF$Feet * 12)
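The same conversion can be sketched with R's integer-division (%/%) and remainder (%%) operators, which give the same result as floor() and subtraction for these positive heights. The Label column is a hypothetical extra, not part of the slides.

```r
Actor.DF <- data.frame(Name = c("John", "Meryl", "Jennifer", "Andre"),
                       Height = c(77, 66, 70, 90),
                       stringsAsFactors = FALSE)

Actor.DF$Feet   <- Actor.DF$Height %/% 12   # whole feet: 6 5 5 7
Actor.DF$Inches <- Actor.DF$Height %% 12    # leftover inches: 5 6 10 6

# Hypothetical display column combining both parts, e.g. 6'5"
Actor.DF$Label <- paste0(Actor.DF$Feet, "'", Actor.DF$Inches, "\"")
Actor.DF
```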

Sort

Actor.DF[with(Actor.DF, order(-Height)), ]
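order() returns the row positions in sorted order, and the minus sign flips a numeric column to descending. A sketch with a few variations on a reduced two-column version of the data; the result names are hypothetical.

```r
Actor.DF <- data.frame(Name = c("John", "Meryl", "Jennifer", "Andre"),
                       Height = c(77, 66, 70, 90),
                       stringsAsFactors = FALSE)

order(-Actor.DF$Height)                                  # row positions: 4 1 3 2

tallest_first  <- Actor.DF[order(-Actor.DF$Height), ]   # descending height
shortest_first <- Actor.DF[order(Actor.DF$Height), ]    # ascending height
by_name        <- Actor.DF[order(Actor.DF$Name), ]      # alphabetical by name

tallest_first$Name    # "Andre" "John" "Jennifer" "Meryl"
```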

Logical Operators / Filter

Actor.DF$Height > 68
Actor.DF$Gender == "female"

?'['
Actor.DF[Actor.DF$Gender == "female",]

http://www.statmethods.net/management/operators.html
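Conditions can be combined with & (and) and | (or), and subset() offers a shorter way to write the same filter. A sketch on a reduced version of the actor data; the result names are hypothetical.

```r
Actor.DF <- data.frame(Name = c("John", "Meryl", "Jennifer", "Andre"),
                       Height = c(77, 66, 70, 90),
                       Gender = factor(c("male", "female", "female", "male")),
                       stringsAsFactors = FALSE)

Actor.DF$Height > 68          # FALSE only for Meryl: TRUE FALSE TRUE TRUE

# Both conditions must hold
tall_women <- Actor.DF[Actor.DF$Gender == "female" & Actor.DF$Height > 68, ]
tall_women$Name               # "Jennifer"

# Either condition may hold, written with subset()
male_or_short <- subset(Actor.DF, Gender == "male" | Height < 70)
nrow(male_or_short)           # 3
```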