data & graphing

Data & Graphing vectors data frames importing data contingency tables barplots 18 September 2014 Sherubtse Training

TRANSCRIPT

Page 1: Data & Graphing

Data & Graphing

vectors

data frames

importing data

contingency tables

barplots

18 September 2014 Sherubtse Training

Page 2: Data & Graphing

Data CLASSES in R

• Vector: a single string of data

• Factor: categorical data, stored as integer codes with category levels

• Matrix: 2D table of data

• Array: >2D table of data

• Data Frame: 2D table that can accept different data modes

• List: General structure for organizing all project data

(Slide graphic: arrow labeled "memory used (object.size)")
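As a rough illustration of the classes listed above, the sketch below builds one small object of each class and compares the memory each one uses with object.size(). All object names and values are invented for illustration and are not from the training data.

```r
# Hypothetical example objects, one per class (values are invented)
v  <- c(2.5, 3.9, 0.7)                                       # vector
f  <- factor(c("cat", "dog", "cat"))                         # factor
m  <- matrix(1:6, nrow = 2)                                  # matrix (2D)
a  <- array(1:24, dim = c(2, 3, 4))                          # array (>2D)
df <- data.frame(id = 1:3, animal = c("cat", "dog", "cat"))  # data frame
l  <- list(numbers = v, animals = f, table = df)             # list

# Check what R considers each object to be
is.vector(v); is.factor(f); is.matrix(m); is.array(a)
is.data.frame(df); is.list(l)

# A factor keeps its category levels; table() reports the frequencies
levels(f)
table(f)

# Memory used by each object, in bytes
sapply(list(vector = v, factor = f, matrix = m, array = a,
            data.frame = df, list = l), object.size)
```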

Page 3: Data & Graphing

Data MODES in R

• Character/String: letters and text in quotation marks

• Numeric/Integer: numbers

• Logical: TRUE, FALSE, T, F (must be capital letters, no quotes; TRUE converts to 1 and FALSE to 0 for arithmetic)
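The modes above can be checked with mode(). The short sketch below also shows how logical values behave in arithmetic. The variable name answer is borrowed from the later slides, and the values are only examples.

```r
mode("dog")      # "character"
mode(2.5)        # "numeric"
mode(5L)         # "numeric" (integers are stored differently but share this mode)
mode(TRUE)       # "logical"

# T and F are shorthand for TRUE and FALSE
identical(T, TRUE)

# In arithmetic, TRUE counts as 1 and FALSE as 0
answer <- c(TRUE, FALSE, TRUE, TRUE)
sum(answer)      # 3
mean(answer)     # 0.75
```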

Page 4: Data & Graphing

Data Classes: Vectors

VECTOR
A single string of data of the same “mode”

Examples: Numeric or Integer Mode

x <- c(1, 0, -5, 10, 300)
x <- c(2+2, 9-6, 5)
x <- c(2.5, 3.9, 0.7, 4.0)

numeric or integer mode (spaces are for easy reading)

Examples: Logical Mode

answer <- c(TRUE, FALSE, TRUE, TRUE)
answer <- c(T, F, T, T)

logical mode

Page 5: Data & Graphing

Data Classes: Vectors

VECTOR
A single string of data of the same “mode”

Examples: Character Mode

animals <- c("dog", "cat", "bird")
string <- c("a", "c", "d", "z", "p")
answer <- c("T", "F", "T", "T")
values <- c("-9", "0.2", "1.4")

character mode (single quotes also okay)
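Because a vector holds only one mode, R silently converts mixed input to a single common mode. The sketch below shows that conversion, and how quoted numbers such as those in values can be turned back into numeric data. The names mixed and nums are made up for this example.

```r
# Mixing modes: everything becomes character
mixed <- c(1, "dog", TRUE)
mode(mixed)          # "character"
mixed                # "1" "dog" "TRUE"

# Numbers and logicals together: logicals become 1 and 0
nums <- c(2.5, TRUE, FALSE)
mode(nums)           # "numeric"

# Quoted numbers are character data until converted
values <- c("-9", "0.2", "1.4")
mode(values)         # "character"
as.numeric(values)
```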

Page 6: Data & Graphing

Working with Vectors

Use subscripts to refer to elements of a vector: x[vector_position]

> x <- c(1, 0, -5, 10, 300)

x[3]             # -5

x[c(1, 4, 5)]    # 1 10 300

x[-2]            # 1 -5 10 300

x[1:4]           # 1 0 -5 10

Page 7: Data & Graphing

Logical Operators
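The content of this slide is not preserved in the transcript. As a stand-in, the sketch below applies R's standard comparison and logical operators to the vector x from the previous slide. It is an illustration, not a reconstruction of the original slide.

```r
x <- c(1, 0, -5, 10, 300)

# Comparison operators return one TRUE/FALSE per element
x > 0            # TRUE FALSE FALSE TRUE TRUE
x == 10          # FALSE FALSE FALSE TRUE FALSE
x != 0           # TRUE FALSE TRUE TRUE TRUE
x >= 10          # FALSE FALSE FALSE TRUE TRUE

# Combine tests with & (and), | (or), ! (not)
x > 0 & x < 100  # TRUE FALSE FALSE TRUE FALSE
x < 0 | x > 100  # FALSE FALSE TRUE FALSE TRUE
!(x > 0)         # FALSE TRUE TRUE FALSE FALSE

# A logical vector can be used as a subscript
x[x > 0 & x < 100]   # 1 10
```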

Page 8: Data & Graphing

Working with Vectors

Edit the vector:
> x <- c(1, 0, -5, 10, 300)

Append (add) data to the end of the vector:

x <- c(x, 400, 500, 700) # NOTE: Also try append()

1 0 -5 10 300 400 500 700

Change a single value in the vector:

x[6] <- 90

1 0 -5 10 300 90 500 700

Replace values > 100 with NA:

x[x>100] <- NA
x[which(x>100)] <- NA
# Also try replace()

1 0 -5 10 NA 90 NA NA
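The slide suggests trying append() and replace(). A minimal sketch of both follows, using the same numbers as the slide. The name y is introduced here only to show that replace() returns a copy.

```r
x <- c(1, 0, -5, 10, 300)

# append() adds values at the end by default
x <- append(x, c(400, 500, 700))
x[6] <- 90

# The 'after' argument inserts at a chosen position instead of the end
append(c(1, 2, 5), c(3, 4), after = 2)   # 1 2 3 4 5

# replace() returns a modified copy; x is unchanged unless reassigned
y <- replace(x, x > 100, NA)
y                                        # 1 0 -5 10 NA 90 NA NA
```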

Page 9: Data & Graphing

Importing Data

OPTION 1
Type data directly into R

OPTION 2
Use job <- scan(what="character") to paste in the following data copied from an Excel column

Import the ‘job’ column data (exclude column heading) from the ‘Work’ tab in Excel, and assign it the variable name ‘job’
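At the console, scan(what="character") waits for pasted input and stops at a blank line. To try the same idea in a script, the text argument of scan() can stand in for the pasted column. The values below are a made-up sample, not the actual 'Work' tab data.

```r
# Hypothetical stand-in for a column copied from Excel (one value per line)
pasted <- "farmer
teacher
farmer
laborer
government"

# sep = "\n" keeps each line as one value, even if it contains spaces
job <- scan(text = pasted, what = "character", sep = "\n", quiet = TRUE)
job
length(job)   # 5
```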

Page 10: Data & Graphing

How might we graph these data?

Here's a hint...

table(job)

Page 11: Data & Graphing

For example, you can just create a vector with labels, then make a barplot of the vector, or put the vector directly in barplot:

job.count <- c("farmer"=12, "government"=2, "laborer"=4, "teacher"=2)
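The two routes to a bar plot hinted at here can be sketched as follows. The counts come from the slide. The raw job vector is rebuilt from those counts purely for illustration, and job.table is a name introduced for this example.

```r
job.count <- c("farmer" = 12, "government" = 2, "laborer" = 4, "teacher" = 2)

# Rebuild a raw 'job' vector from the counts (for illustration only)
job <- rep(names(job.count), times = job.count)

# table() tallies how many times each category occurs
job.table <- table(job)
job.table

# Either the table or the named vector can go straight into barplot()
barplot(job.table)
barplot(job.count)
```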

Page 12: Data & Graphing

Importing Data

OPTION 3
Export the data as a comma-delimited (csv) or tab-delimited text file, then import the text file into R

Import the ‘HtWt’ dataset (notice how the data are arranged in Excel)
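A minimal sketch of Option 3 follows. Since the real HtWt file is not available here, the code first writes a small invented data set to a temporary file and then reads it back. The column names institute, sex and cm appear later in the slides, while kg and all of the values are assumptions.

```r
# Hypothetical data; the real HtWt columns and values may differ
example <- data.frame(
  institute = c("UWICE", "SFS", "UWICE"),
  sex       = c("M", "F", "F"),
  kg        = c(68, 55, 60),
  cm        = c(172, 160, 165)
)

# Stand-in for the csv file exported from Excel
csv.file <- tempfile(fileext = ".csv")
write.csv(example, csv.file, row.names = FALSE)

# read.csv() expects a header row and comma-separated values
HtWt <- read.csv(csv.file)
str(HtWt)

# For a tab-delimited export, read.delim() plays the same role
tab.file <- tempfile(fileext = ".txt")
write.table(example, tab.file, sep = "\t", row.names = FALSE, quote = FALSE)
HtWt.tab <- read.delim(tab.file)
```

With a real export, the temporary file would be replaced by the path to the saved file, or by file.choose() to pick it interactively.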

Page 13: Data & Graphing

Data Classes: Data Frames

DATA FRAMES
A data frame is similar to the data format used in SPSS... different columns can have different modes (numeric, character, factor, etc.)

Page 14: Data & Graphing

Working with Data Frames

There are many ways to refer to the elements in data frames... but we will focus on just a few

To access the height column
HtWt$cm
HtWt["cm"]
HtWt[4]

Page 15: Data & Graphing

Working with Data Frames

To access a row
HtWt[5,]

To access an element
HtWt[5,4]
HtWt[5,"cm"]
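The access forms shown on these two slides do not all return the same kind of object. The sketch below uses a small invented stand-in for HtWt to show the difference. The kg column and all values are assumptions, and cm is placed in column 4 to match the slides.

```r
# Hypothetical stand-in for the HtWt data (values are invented)
HtWt <- data.frame(
  institute = c("UWICE", "UWICE", "SFS", "SFS", "UWICE", "SFS"),
  sex       = c("M", "F", "M", "F", "F", "M"),
  kg        = c(68, 55, 72, 58, 60, 75),
  cm        = c(172, 160, 178, 162, 165, 180)
)

HtWt$cm        # a plain numeric vector
HtWt[["cm"]]   # the same vector, selected by name
HtWt["cm"]     # a one-column data frame
HtWt[4]        # the same one-column data frame, selected by position

HtWt[5, ]      # row 5, all columns
HtWt[5, 4]     # 165
HtWt[5, "cm"]  # 165
```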

Page 16: Data & Graphing

What kinds of interesting questions can we ask?

What graphs would we make to answer them?

HtWt Data

• Is there a difference in height between UWICE & SFS personnel? Does it differ for males vs. females?

• Is there a difference in weight between UWICE & SFS personnel? Does it differ for males vs. females?

• Is there a relationship between height and weight for UWICE personnel? How about for SFS personnel?

• Is there a relationship between height and weight for males? How about for females?

Page 17: Data & Graphing

Bar Plots

For comparing COUNTS, PROPORTIONS (%) or MEANS of data in different qualitative categories. Often we make bar plots of summary data.

Page 18: Data & Graphing

Working with Data Frames

Use the table() function to create a contingency table of sample counts by INSTITUTE and SEX. Try it also using with()

table(HtWt$institute,HtWt$sex)
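The with() variant mentioned on this slide might look like the sketch below. The data are an invented stand-in for HtWt. The name tab.HtWt is the one used on a later slide, while tab.with is introduced here only for comparison.

```r
# Hypothetical stand-in for the HtWt data (values are invented)
HtWt <- data.frame(
  institute = c("UWICE", "UWICE", "SFS", "SFS", "UWICE", "SFS"),
  sex       = c("M", "F", "M", "F", "F", "M")
)

# Contingency table using $ to name each column
tab.HtWt <- table(HtWt$institute, HtWt$sex)

# Same counts using with(); this version also keeps the variable names
tab.with <- with(HtWt, table(institute, sex))
tab.with

# Row and column totals
addmargins(tab.with)
```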

Page 19: Data & Graphing

Now make a stacked barplot from the table you just created

Page 20: Data & Graphing

Add title, labels, legend and color...
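One possible way to build the stacked barplot and then add a title, labels, legend and color is sketched below. The counts are invented, and the titles and colors are arbitrary choices rather than the ones used in the training.

```r
# Hypothetical institute-by-sex counts standing in for table(HtWt$institute, HtWt$sex)
tab.HtWt <- as.table(matrix(c(1, 2, 2, 1), nrow = 2,
                            dimnames = list(institute = c("SFS", "UWICE"),
                                            sex = c("F", "M"))))

# Plain stacked barplot: one bar per column (sex), stacked by row (institute)
barplot(tab.HtWt)

# Same plot with title, axis labels, colors and a legend
mids <- barplot(tab.HtWt,
                main = "Personnel by sex and institute",
                xlab = "Sex", ylab = "Number of people",
                col = c("steelblue", "orange"),
                legend.text = TRUE)   # TRUE uses the row names as legend labels
```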

Page 21: Data & Graphing

Convert it to a side-by-side barplot

Page 22: Data & Graphing

Move the legend to the top center

ADD AS AN ARGUMENT: args.legend=list(horiz=T, x="top")

Page 23: Data & Graphing

Transpose the data: t(tab.HtWt)
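The last three steps (side-by-side bars, legend at the top center, transposed table) can be sketched together as follows. The counts are again invented, and the colors are arbitrary.

```r
# Hypothetical institute-by-sex counts (invented values)
tab.HtWt <- as.table(matrix(c(1, 2, 2, 1), nrow = 2,
                            dimnames = list(institute = c("SFS", "UWICE"),
                                            sex = c("F", "M"))))

# beside = TRUE draws the bars side by side instead of stacked;
# args.legend passes extra settings on to legend()
mids <- barplot(tab.HtWt, beside = TRUE,
                col = c("steelblue", "orange"),
                legend.text = TRUE,
                args.legend = list(horiz = TRUE, x = "top"))

# t() swaps rows and columns, so bars are now grouped by institute
barplot(t(tab.HtWt), beside = TRUE,
        col = c("gray30", "gray80"),
        legend.text = TRUE,
        args.legend = list(horiz = TRUE, x = "top"))
```

If the legend overlaps the tallest bars, widening the y-axis with the ylim argument leaves room above them.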

Page 24: Data & Graphing

Working with Data Frames

Use the function subset() to create a new data frame called ‘UWICE’ that includes only UWICE data

UWICE <- subset(HtWt,institute=="UWICE")

Now subset the HtWt data to get a data frame with only 'SFS' data and only the 'INSTITUTE' and 'SEX' columns. Call this data frame 'SFS.sex'

SFS.sex <- subset(HtWt,institute=="SFS",select=1:2)
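The select argument also accepts column names, which avoids depending on column order. The sketch below shows that form and an equivalent using square brackets. The data are an invented stand-in for HtWt, and SFS.sex2 is a name introduced only for the comparison.

```r
# Hypothetical stand-in for the HtWt data (values are invented)
HtWt <- data.frame(
  institute = c("UWICE", "UWICE", "SFS", "SFS", "UWICE", "SFS"),
  sex       = c("M", "F", "M", "F", "F", "M"),
  kg        = c(68, 55, 72, 58, 60, 75),
  cm        = c(172, 160, 178, 162, 165, 180)
)

# Keep only the UWICE rows
UWICE <- subset(HtWt, institute == "UWICE")

# Keep only the SFS rows, and select columns by name instead of position
SFS.sex <- subset(HtWt, institute == "SFS", select = c(institute, sex))

# Equivalent selection with square brackets
SFS.sex2 <- HtWt[HtWt$institute == "SFS", c("institute", "sex")]
```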

Page 25: Data & Graphing

Reshaping Data

1) Install & load the package reshape2

2) Import the Livestock data and save it to a variable called farms

3) Use the function acast() to reformat the farms data to a matrix form for stacked barplots:

m.farms<-acast(farms,town~livestock)

4) Make a stacked barplot from m.farms
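The steps above can be sketched as follows. The layout of the Livestock data is not shown in the transcript, so the farms data frame below, with columns town, livestock and count, is an assumption, as are all of its values. The code falls back to base R's xtabs() if reshape2 is not installed.

```r
# Hypothetical long-format data; the real Livestock file may be arranged differently
farms <- data.frame(
  town      = c("A", "A", "B", "B", "C", "C"),
  livestock = c("cattle", "goats", "cattle", "goats", "cattle", "goats"),
  count     = c(10, 4, 6, 8, 3, 12)
)

if (requireNamespace("reshape2", quietly = TRUE)) {
  # One row per town, one column per livestock type
  m.farms <- reshape2::acast(farms, town ~ livestock, value.var = "count")
} else {
  # Base-R fallback producing the same town-by-livestock layout
  m.farms <- unclass(xtabs(count ~ town + livestock, data = farms))
}
m.farms

# Stacked barplot: one bar per livestock type, stacked by town
barplot(m.farms, legend.text = TRUE)
```

Using t(m.farms) instead would give one bar per town, stacked by livestock type.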

Page 26: Data & Graphing
Page 27: Data & Graphing

Make this graph—note that the y-axis values should be from 0 to 60
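The target graph is not preserved in the transcript, so it cannot be reproduced here. The sketch below only shows how the y-axis range can be fixed with the ylim argument of barplot(). The counts and labels are invented.

```r
# Hypothetical counts (invented values)
counts <- c("group1" = 12, "group2" = 45, "group3" = 30)

# ylim = c(0, 60) forces the y-axis to run from 0 to 60
mids <- barplot(counts, ylim = c(0, 60), ylab = "Count")
```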