Writing functions in R
Some handy advice for creating your own functions
A quick review of R
R is a statistical software package and an object-oriented programming language
Terms to remember:
Vectors, matrices, and dataframes
Indices
Functions
Warm up
Download the data for lab 3
In RStudio, go to Workspace → Import Dataset → From Text File
Make sure to select the header option
If you're not using RStudio, the code is:
data_lab_3 <- read.csv("~/documents/classes/Psych 1950/mood.csv")
Where ~ is shorthand for your home directory; change the rest of the path to wherever you saved the file
Warming up a little more
Use the help() function to read about the read.csv() function
How could we use it to read in a file with no header?
read.csv("filename", header=FALSE)
We can also use R to read in SPSS files, but for now we'll stick with read.csv()
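A minimal sketch of what the header option changes, assuming a small made-up file written to a temporary location. The file contents and the names tmp, with_header, and no_header are illustrative and are not part of the lab data:

```r
# Write a tiny two-column file WITHOUT a header row to a temporary path
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,2", "3,4", "5,6"), tmp)

# Default (header=TRUE): the first row is consumed as column names
with_header <- read.csv(tmp)

# header=FALSE: every row is kept as data; columns are named V1, V2, ...
no_header <- read.csv(tmp, header = FALSE)

nrow(with_header)   # one row fewer than the file contains
nrow(no_header)     # all rows of the file
names(no_header)    # automatically generated column names
```

If a file really has no header row, leaving header at its default silently loses the first observation, so it is worth checking nrow() after importing.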
Last page of warm-up (I promise!)
Find the standard deviation (sd()) of the second column (puDay2call1) of your dataframe
Uh-oh! That output isn't helpful
Add the following argument to the standard deviation function: na.rm=TRUE
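To see why the first attempt is unhelpful, here is a small sketch using a made-up vector (scores is a hypothetical stand-in for the puDay2call1 column, not the lab data):

```r
# Hypothetical scores with one missing value
scores <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)

sd(scores)                 # NA: a single missing value makes the result missing
sd(scores, na.rm = TRUE)   # NAs are dropped before the calculation
```

With your own dataframe, the same call would take the second column in place of scores, for example data_lab_3[, 2].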
A slight modification
Suppose that we want to calculate the standard deviation using the population formula
Check the help file for sd(). Is there a way to do that?
Nope! We'll need to make our own....
Making a function
Let's start with something easier
We'll make our own mean() function
What should it do?
We'll pass* it a vector of numbers as arguments*
It should return* the mean
*programming jargon
The function syntax
getMean <- function(arguments){
  # commands go here
}
The name of the function is getMean() (a function name is usually a verb)
The arguments are the values and instructions we give to the function
The body is where the work happens
Iteration 1
getMean <- function(x){
  return(sum(x)/length(x))
}
Try this on the second column
How can we handle NAs in the function, assuming we ALWAYS want to remove them?
Iteration 2
getMean <- function(x){
  return(sum(x,na.rm=TRUE)/length(x))
}
Now try this one, and compare your results to R's built-in mean function
Why aren't the values the same?
Hint: what's the length of a vector that contains NAs?
Iteration 3
getMean <- function(x){
  return(sum(x,na.rm=TRUE)/length(na.omit(x)))
}
Another R function saves the day! Thanks, R!
Compare your results to the built-in function
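The three iterations can be compared side by side. This is a sketch on a made-up vector; the names getMean1, getMean2, and getMean3 are hypothetical labels for the three versions so that they can coexist in one session:

```r
# Hypothetical vector with one NA (not the lab data)
x <- c(2, 4, 6, NA)

getMean1 <- function(x) sum(x) / length(x)                          # iteration 1
getMean2 <- function(x) sum(x, na.rm = TRUE) / length(x)            # iteration 2
getMean3 <- function(x) sum(x, na.rm = TRUE) / length(na.omit(x))   # iteration 3

length(x)             # 4: the NA still counts toward the length
length(na.omit(x))    # 3: na.omit() drops the NA first

getMean1(x)           # NA, because sum() of a vector with an NA is NA
getMean2(x)           # 12 / 4 = 3, too small: the NA inflates the denominator
getMean3(x)           # 12 / 3 = 4
mean(x, na.rm = TRUE) # 4, matches iteration 3
```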
Another way to do it
We've been leaning heavily on the sum() function
Sometimes, though, we need to tell R to do a certain operation a number of times
To do that, we use an operation called a for loop
There are other loops as well, but we'll stick with a for loop
The anatomy of a for loop
getFactorial <- function(number){
  j <- 1
  for (index in 1:number){
    j <- j*index
  }
  return(j)
}
What will this function do?
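One way to check your answer is to run the function and compare it with R's built-in factorial(). The sketch below also shows an edge case of 1:number; getFactorialSafe is a hypothetical variant that is not part of the original slides:

```r
getFactorial <- function(number){
  j <- 1
  for (index in 1:number){
    j <- j * index        # multiply the running product by 1, 2, ..., number
  }
  return(j)
}

getFactorial(5)   # 1*2*3*4*5 = 120
factorial(5)      # built-in function, for comparison

# Caveat: 1:0 counts DOWN (1, then 0), so the loop multiplies by 0
getFactorial(0)   # 0, although 0! is defined as 1

# Hypothetical variant: seq_len(0) is empty, so the loop body never runs
getFactorialSafe <- function(number){
  j <- 1
  for (index in seq_len(number)){
    j <- j * index
  }
  return(j)
}
getFactorialSafe(0)   # 1
```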
One more concept
Sometimes, we need a function to make a decision
Here, we use conditionals
if (condition){      #if the condition is true
  something          #do this
} else {             #if it's false (else goes on the same line as the })
  something_else     #do this instead
}
For example
if (!is.na(x)){   #if x isn't an NA
  print(x)        #write x. If it is, nothing
}                 #will happen
if (x<=4){        #if x is less than or equal to 4
  print(x-1)
}
if (x==5){        #if x is exactly 5
  print("Five")
}
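The checks above can also be chained with else if so that exactly one branch runs. This is a sketch only; describeNumber and the strings it returns are hypothetical, and it assumes x is a single value, since if() tests one condition at a time:

```r
# Hypothetical helper combining the conditions from the examples
describeNumber <- function(x){
  if (is.na(x)){            # check for NA first; comparisons with NA fail in if()
    return("missing")
  } else if (x <= 4){       # less than or equal to 4
    return("four or less")
  } else if (x == 5){       # exactly 5
    return("Five")
  } else {                  # everything else
    return("more than five")
  }
}

describeNumber(NA)
describeNumber(3)
describeNumber(5)
describeNumber(8)
```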
Looping to get the mean
getMean_3 <- function(x){
  total <- 0                    #avoid reusing the names sum and length
  count <- 0
  for (i in seq_along(x)){      #seq_along() also copes with an empty vector
    if (!is.na(x[i])){          #exclude NAs
      total <- total+x[i]       #keep a running tally of the sum
      count <- count+1          #and the length
    }
  }
  return(total/count)           #this is the mean
}
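A quick check of the looping version against the built-in function. The function is repeated here, with its counters named total and count, so that the block runs on its own; x is a made-up vector rather than the lab data:

```r
getMean_3 <- function(x){
  total <- 0
  count <- 0
  for (i in seq_along(x)){
    if (!is.na(x[i])){        # skip missing values
      total <- total + x[i]   # running sum of the non-missing values
      count <- count + 1      # how many non-missing values so far
    }
  }
  return(total / count)
}

x <- c(2, 4, 6, NA)      # hypothetical data
getMean_3(x)             # (2 + 4 + 6) / 3 = 4
mean(x, na.rm = TRUE)    # built-in result, for comparison
```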
Adding some complexity
It's your turn now:
Write two functions to compute the sum of squared deviations from the mean of a vector
In one version, use the sum() function
In the other, use a for loop
Try to allow your function to work with a vector that includes some NAs
Remember
The formula for the sum of squares of a set of numbers is $\sum_{i} (x_i - \bar{x})^2$, where $\bar{x}$ is mean(x)
Now make R do it for you!
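If you get stuck, here is one possible sketch of the two versions. The names getSS and getSS_loop are hypothetical, and dropping the NAs up front is just one of several reasonable ways to handle them:

```r
# Version 1: vectorised, using sum()
getSS <- function(x){
  x <- x[!is.na(x)]               # drop missing values first
  return(sum((x - mean(x))^2))    # sum of squared deviations from the mean
}

# Version 2: the same calculation with a for loop
getSS_loop <- function(x){
  x <- x[!is.na(x)]
  m <- mean(x)
  ss <- 0
  for (i in seq_along(x)){
    ss <- ss + (x[i] - m)^2       # add each squared deviation to the tally
  }
  return(ss)
}

scores <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)   # hypothetical data, mean is 5
getSS(scores)        # 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32
getSS_loop(scores)   # same result
```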
Last of all
Make a new function that finds the (population) standard deviation of the vector
Find the sum of squares, divide by the number of observations, and take the square root
Test your function to make sure it's working
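One possible sketch of the final function, again with a hypothetical name (getPopSD) and made-up data. A useful test is that the result should equal sd() rescaled by sqrt((n-1)/n), because sd() divides by n-1 rather than n:

```r
getPopSD <- function(x){
  x <- x[!is.na(x)]                # drop missing values
  ss <- sum((x - mean(x))^2)       # sum of squares
  return(sqrt(ss / length(x)))     # divide by n (not n - 1), then take the root
}

scores <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)   # hypothetical data
n <- sum(!is.na(scores))                  # 8 non-missing observations

getPopSD(scores)                               # sqrt(32 / 8) = 2
sd(scores, na.rm = TRUE)                       # sample version, slightly larger
sd(scores, na.rm = TRUE) * sqrt((n - 1) / n)   # rescaled to the population version
```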