R lab Fall 2018 Draft Lottery - UMass
people.umass.edu/biep540w/pdf/R lab Fall 2018 Draft Lottery.pdf

R Handouts 2018-19 ACTIVITY: 1970 Lottery

…\R lab Fall 108 Draft Lottery.docx Page 1 of 10

Introduction to R and R-Studio 2018-19

In-Class Lab Activity The 1970 Draft Lottery

Summary
The goal of this activity is to give you practice with R Markdown for saving your work. It’s also a fun bit of sleuthing with real data, namely the lottery that determined who would be drafted in 1970 and sent to Viet Nam.

R Datasets Used in This Illustration

Download the following 2 datasets from the course website page, THIS WEEK:
lottery1970.Rdata
lottery1970monthly.Rdata

Packages Used in This Illustration

knitr
ggplot2

To install these packages (one time), at the console window, type
install.packages("knitr")
install.packages("ggplot2")

Reminders -
1) When installing packages, don’t forget: install.packages has a period between install and packages.
2) When installing packages, the package name must be enclosed in plain straight quotes, with no extra spaces inside them.
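If you are not sure whether the two packages are already on your computer, the one-time install can be written so that it is safe to re-run. This is only a sketch, not part of the original lab: the names needed, installed and to_install are made up for illustration, and the interactive() guard is an added assumption so that nothing is downloaded unless you are typing at the console.

```r
# Packages this activity relies on (named in the handout).
needed <- c("knitr", "ggplot2")

# requireNamespace() returns TRUE when a package is installed and can be loaded.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
to_install <- needed[!installed]

# Install only what is missing, and only when working at the console.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

If to_install comes back empty, both packages are already installed and nothing more needs to be done.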

Background – The 1970 Lottery in the US was NOT Random
Source: http://ww2.amstat.org/publications/jse/v5n2/datasets.starr.html#fienberg1

“This lottery was a source of considerable discussion before being held on December 1, 1969. Soon afterwards a pattern of unfairness in the results led to further publicity: those with birthdates later in the year seemed to have had more than their share of low lottery numbers and hence were more likely to be drafted. On January 4, 1970, the New York Times ran a long article, "Statisticians Charge Draft Lottery Was Not Random," illustrated with a bar chart of the monthly averages (Rosenbaum 1970a). It described the way the lottery was carried out, and with hindsight one can see how the attempt at randomization broke down. The capsules were put in a box month by month, January through December, and subsequent mixing efforts were insufficient to overcome this sequencing. The details of the procedure are quoted in Fienberg (1971a) and the first three editions of Moore (1979, 1985, 1991).”

Details of the 1970 Lottery Randomization Procedures Source: http://science.sciencemag.org/content/sci/171/3968/255.full.pdf

1. Begin your R-Studio Session by Opening a New R Markdown File

Step 1. Launch R Studio
Step 2. From the top menu bar: FILE > NEW FILE > R Markdown

You should see something like the following (note: yours won’t say Carol Bigelow, of course):

• At top right, at title: Type in a title of your choosing
• Just below, under Default Output Format: choose your output format
  o HTML: This is the default selection. It’s fine to choose this.
• Click OK

Example –

REVIEW - Recall what you are looking at above:
* A brand new R Markdown file comes with some starter content (helpful to read but not necessary to keep).
* Each gray shaded area is called a chunk. A chunk is a set of R commands with a “beginning” and an “end”.

CHUNK BEGINNING: Each chunk begins with ```{r} or with ```{r SOMETHING YOU CHOOSE HERE}
- If you choose ```{r include=FALSE}, the code still runs, but neither the code nor its output (including messages) is SHOWN in the knitted document (I do not recommend this).
- If you choose ```{r echo=FALSE}, the code is NOT SHOWN but its output is (I do not recommend this either).
- Personally, I recommend beginning each chunk with ```{r}

CHUNK END: Each chunk ends with ```

Step 3. Clear your brand new R Markdown file so that it is empty of everything except your header:
- Place your cursor at line 7 of the “shell” R Markdown file. Drag to highlight and select everything below.
- Press Delete.

You should now see something like the following (with your name, not mine, obviously):

REVIEW - We will work “chunk by chunk”: writing code, fixing code, executing code

1st - We open a new blank chunk (to do a specific task that we want to do).
2nd - We type some commands into this chunk and then we run it.
3rd - As is typically the case, we EDIT the commands in this chunk and re-run it until we get what we like.
4th - Once we’re happy with the current chunk and the current task, we move on to the next chunk/next task!

1st - How to Open a New Blank Chunk
Click on the little green “insert a chunk” icon at top (on the right). From the drop down menu, choose R.

You should see the following:
- The gray chunk start ```{r}
- Your cursor placed inside
- The chunk end ```
- TIP - NEVER delete the chunk start or end!

Key:
BLACK - commands (you type these)
BROWN - comments (optional, you type these)
BLUE - output

 

1. Read in R dataset lottery1970.Rdata

# input rdataset lottery1970.Rdata. Check.
setwd("/Users/cbigelow/Desktop")
load(file="lottery1970.Rdata")

2. Produce basic plot (no frills). No special package required.

# command is plot(dataframe$xvar, dataframe$yvar)
plot(lotterydata$day,lotterydata$rank)

What do you think?
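Before reading too much into the plot, it can help to confirm what load() actually brought into your workspace (the "Check" noted in step 1). The sketch below is an illustration, not part of the original lab: because the real file is not reproduced here, it saves and reloads a tiny made-up stand-in. The object name lotterydata and the columns day, month and rank are taken from the commands in this handout; the values and the name demo_file are hypothetical.

```r
# Made-up stand-in for lottery1970.Rdata (values are NOT real lottery numbers).
lotterydata <- data.frame(day = 1:3, month = c(1, 1, 1), rank = c(10, 20, 30))
demo_file <- file.path(tempdir(), "lottery_demo.Rdata")
save(lotterydata, file = demo_file)
rm(lotterydata)                     # start from a workspace without the object

# load() returns the names of the objects it restored, which is a quick check.
loaded <- load(file = demo_file)
print(loaded)

# Look at the structure and the first few rows of what was loaded.
str(lotterydata)
head(lotterydata)
```

With the real file, the same three checks (the names returned by load, then str and head) show which data frame was loaded and which variables are available to plot.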

3. Produce fancy scatterplot of raw scatter using package ggplot2

# command is ggplot(dataframe, aes(x=xvar, y=yvar)) + geom_point( )
library(ggplot2)
p <- ggplot(lotterydata, aes(x=day,y=rank))
p <- p + geom_point()
p <- p + xlab("Birth Date")
p <- p + ylab("Selective Service Lottery Number")
p <- p + ggtitle("Scatterplot of 1970 Draft Lottery")
p <- p + theme_bw()
p

## Warning: Removed 1 rows containing missing values (geom_point).
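The warning means that one row had a missing value (NA) in a plotted variable, so ggplot2 left that point out. A minimal sketch for finding such rows follows. It is not part of the original lab: it uses a small made-up data frame called demo, with column names borrowed from the plot command; with the real data you would put lotterydata in its place.

```r
# Made-up data with one missing rank, mimicking the situation in the warning.
demo <- data.frame(day = c(1, 2, 3, 4), rank = c(10, NA, 30, 40))

# complete.cases() is TRUE for rows with no NA in the chosen columns.
incomplete <- !complete.cases(demo[, c("day", "rank")])

sum(incomplete)      # how many rows would be dropped from the plot
demo[incomplete, ]   # which rows they are
```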

 

               

4. Obtain average lottery number, by month. No special package required.

# command is aggregate()
aggregate(lotterydata$rank,list(month=lotterydata$month),mean)

##    month        x
## 1      1 201.1613
## 2      2 202.9655
## 3      3 225.8065
## 4      4 203.6667
## 5      5 207.9677
## 6      6 195.7333
## 7      7 181.5484
## 8      8 173.4516
## 9      9 157.3000
## 10    10 182.4516
## 11    11 148.7333
## 12    12 121.5484

The output you got above should match the following: [comparison table not reproduced]
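A variation on step 4: aggregate() also accepts a formula, which labels the result column rank instead of x. The sketch below is an illustration, not part of the original lab; it uses a tiny made-up data frame called demo (two months, two birthdays each) so that the arithmetic is easy to check by hand. With the real data you would pass data = lotterydata.

```r
# Made-up data: month 1 has ranks 10 and 20, month 2 has ranks 30 and 50.
demo <- data.frame(month = c(1, 1, 2, 2), rank = c(10, 20, 30, 50))

# Formula interface: "mean of rank, by month".
monthly <- aggregate(rank ~ month, data = demo, FUN = mean)
monthly
```

One design point: the formula interface drops rows with missing values by default, whereas mean() returns NA for any group that contains an NA unless na.rm = TRUE is passed along.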

5. So now let's work with the monthly means. Next, load lottery1970monthly.Rdata

load(file="lottery1970monthly.Rdata")

6. Produce fancy scatterplot, with overlay linear regression, of monthly means using package ggplot2

# command is ggplot(dataframe, aes(x=xvar, y=yvar)) + geom_point( ) + geom_smooth( )
library(ggplot2)
p <- ggplot(monthlydata, aes(x=xmonth,y=yave_rank))
p <- p + geom_point()
p <- p + geom_smooth(method=lm, se=FALSE)
p <- p + xlab("Month")
p <- p + ylab("Average Selective Service Lottery Number")
p <- p + ggtitle("1970 Draft Lottery - Monthly Average")
p <- p + theme_bw()
p

 

Now what do you think?
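To put a number on the trend that the overlaid line shows, the same straight line can be fitted directly with lm(). This is a sketch, not part of the original lab. It assumes that lottery1970monthly.Rdata holds the twelve monthly averages printed in step 4, stored under the names used in step 6 (monthlydata, xmonth, yave_rank); the data frame is rebuilt here from that printed output so the example runs on its own.

```r
# Monthly averages copied from the aggregate() output in step 4.
monthlydata <- data.frame(
  xmonth = 1:12,
  yave_rank = c(201.1613, 202.9655, 225.8065, 203.6667, 207.9677, 195.7333,
                181.5484, 173.4516, 157.3000, 182.4516, 148.7333, 121.5484)
)

# Least-squares line: average lottery number as a function of month.
fit <- lm(yave_rank ~ xmonth, data = monthlydata)
coef(fit)

# The slope is the change in average lottery number per month.
slope <- coef(fit)[["xmonth"]]
slope < 0
```

In a fair lottery the monthly averages should show no trend, so the slope should be close to zero. A clearly negative slope means later birth months received lower numbers on average.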

All done? Save (archive) your work (nifty either as a record of your work, or for re-use later!)

Knitting is how you save your work: it runs every chunk and writes the code together with its output into a single document. How to knit:

- At top, click on the drop down menu for the knit icon
- From the drop down menu, I recommend that you choose KNIT TO WORD
  (Why? Answer: so that you can open this file later in Word and perhaps fancy it up a bit)
- Tip: Take care to choose a destination folder that you’ll remember (I always choose DESKTOP)