An introduction to R
S. Manzi & A. Salmon
South West Peninsula Collaboration for Leadership in Applied Health Research and Care (PenCLAHRC)
University of Exeter
January 2019
Outline
1 Aims and topics of the training
   Aims of the training
   Training topics – Introduction
   Training topics – Core R functionality
   Training topics – Data visualisation and statistics
2 Introduction
   Installing R and R studio
      Windows installation
      Linux installation
   The power and limitations of R
   Familiarisation with the R studio environment
   Setting the working directory
   Packages for the extension of R functionality
   R conventions
Aims and topics of the training
Aims of the training
To familiarise you with the purpose and use of R and RStudio
To provide you with an understanding of the basic data structures and functionality of R
To provide you with the skills to produce data visualisations in R
To introduce the use of packages for extending the functionality of R
Training topics – Introduction
Introduction
Installation of R and RStudio
The power and limitations of R
Familiarisation with the RStudio environment
Packages for the extension of R functionality
The R language
Training topics – Core R functionality
Core R functionality and skills
Sequence generation and replication
Variable types and conversion
Data structures and conversion
Vector arithmetic
Identifying unique values
Array sizes
String concatenation and splitting
Conditional logic and functions
Reading from and writing to files
Subscripting and subsetting
Merging and appending data
Sorting data
Date and time data handling
Tabulation
Loops and the apply based functions
User defined functions
Training topics – Data visualisation and statistics
Data visualisation and statistics
Plotting
   Scatter and line graphs
   Histograms
   Bar charts
   Statistical process control charts
Statistics
   Descriptive statistics
   T-tests
   Linear regression
Introduction
Installing R and R studio
Windows installation – base R
Go to the R project website https://www.r-project.org/
Click on 'download R' in the 'Getting Started' section
R uses a hosting structure called CRAN (Comprehensive R Archive Network). Scroll down the list and select your local CRAN mirror. This will most likely be Bristol
Select 'Download R for Windows'
Click on the 'base' link as we want to install base R.
Click on 'Download R 3.5.2 for Windows'
Your download should begin
Launch the .exe installer and follow the on-screen instructions
Windows installation – R Studio
Go to the RStudio download page https://www.rstudio.com/products/rstudio/download/
Select the Windows installer
Your download should begin
Launch the .exe installer and follow the on-screen instructions
Linux installation - R and R Studio
Base R is available through the Ubuntu Software Center, or you can install it from the command line using sudo apt install r-base
RStudio can be installed in a couple of different ways but the easiest (thanks Mike) seems to be:
Download the RStudio installer from https://download1.rstudio.org/rstudio-1.1.463-amd64.deb
Go to your downloads folder, open a terminal there, then type sudo dpkg -i rstudio... (use tab completion to get the full filename)
To complete the installation some dependencies need to be installed. Use sudo apt install -f
RStudio should now be installed
The power and limitations of R
S as the basis for R
Developed as part of the GNU project – free and open source
The R language is based on the S language, developed by John Chambers in 1976 while at Bell Laboratories specifically for statistical computing
The development of R
R was created by Ross Ihaka and Robert Gentleman at the University of Auckland
R was named partly after the first names of the two creators and partly as a play on the name of S.
The project began in 1992, the first version was released in 1995 and a stable beta version was released in 2000
The power of R
Designed for vector and matrix arithmetic, enabling faster calculation over larger matrices than possible with object orientated languages such as C++, Java and Python
Designed as a script based language rather than an object orientated one, although it does allow meta-programming for object-like tasks
Supports parallel processing and multithreading
Can interface with other languages, e.g. C, C++, Fortran, Python and LaTeX
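As a minimal sketch of what vector and matrix arithmetic looks like in practice, the example below applies operations to whole vectors and matrices without writing a loop. The variable names (a, b, m and so on) are illustrative only and do not come from the training material.

```r
# Two numeric vectors of equal length (illustrative values)
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

# Arithmetic is applied element by element, with no explicit loop
vec_sum <- a + b   # element-wise addition
scaled  <- a * 2   # each element multiplied by a scalar

# A 2 x 2 matrix, filled column by column
m <- matrix(1:4, nrow = 2)

# %*% performs matrix multiplication (m * m would be element-wise)
m_prod <- m %*% m

print(vec_sum)
print(scaled)
print(m_prod)
```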
Familiarisation with the R studio environment
The R console
R console window - directly input commands into R, view inputs and outputs
R Scripting window
Script window - write and run an R script
The variable explorer
Variable explorer - lists active variables/objects
The plot window
Plot window - displays and retains all plots
Setting the working directory
Setting your working directory in R Studio
Navigate to the Tools drop-down on the top toolbar in RStudio
Select Global Options, which opens a new window
The first option you will see is the working directory setting
Browse to, or manually enter, the location of your R working directory
Setting your working directory in an R script
Normally you will want to place your R scripts in specific sub-folders of your working directory to keep things organised
The command to set the working directory manually in a script is setwd('folder location')
When reading or writing anything to and from sub-folders of your working directory you need to add the sub-folder name(s) in the standard drive location format, e.g. '/subfolder/subsubfolder/subsubsubfolder'
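The following is a small sketch of setting a working directory and then writing to and reading from a sub-folder of it. The folder name my_project, the sub-folder data and the file example.csv are all hypothetical, and a temporary directory is assumed so the example can be run without touching your own files.

```r
# Remember the current working directory so it can be restored afterwards
original_wd <- getwd()

# Hypothetical project folder inside R's temporary directory
project_dir <- file.path(tempdir(), "my_project")

# Create the project folder and a 'data' sub-folder inside it
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Set the working directory for the rest of the script
setwd(project_dir)

# Paths are now given relative to the working directory
out_path <- file.path("data", "example.csv")

# Write a small data frame to the sub-folder and read it back
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)), out_path, row.names = FALSE)
read_back <- read.csv(out_path)
print(read_back)

# Restore the original working directory
setwd(original_wd)
```

Using file.path() to build paths is one way to avoid typing separators by hand; a plain string such as 'data/example.csv' would be equivalent here.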
Packages for the extension of R functionality
Installing and using packages
R makes use of packages to extend its base functionality
Packages need to be installed for each distribution of R that you run
Packages are installed in RStudio by going to the Tools drop-down on the top toolbar, selecting Install Packages, searching for the required package and selecting Install
Packages can also be installed from the R command line using the install.packages() function
Packages need to be initialised (loaded) in an R script. This is done by convention at the start of the script using library(package name)
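A common pattern, sketched below, is to install a package only if it is missing and then load it at the top of the script. The package name stats is used purely as a stand-in because it ships with R, so the install step will not run here; replace it with the package you actually need.

```r
# Name of the package to use (stand-in: 'stats' is bundled with R)
pkg <- "stats"

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package; character.only = TRUE is needed because the
# name is held in a variable rather than typed directly
library(pkg, character.only = TRUE)

# With the name typed directly, the call would simply be library(stats)
```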
R conventions
R does not require correct indentation for the code to run; however, convention dictates that standard indents are used
The # (hash) symbol is used to insert comments into the code
R uses specific assignment operators
<- assigns the output from the right to the left
= is the same as <- but it is poor practice to use this operator in this way
-> assigns the output from the left to the right
<<- assigns the output from the right to the global/parent variable on the left
->> assigns the output from the left to the global/parent variable on the right
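The short sketch below illustrates comments and the assignment operators side by side. The variable and function names (x, y, z, counter, increment) are made up for illustration.

```r
# Lines starting with a hash are comments and are ignored by R

x <- 5        # assign the value on the right to the name on the left
10 -> y       # assign the value on the left to the name on the right
z = x + y     # works like <- here, but is discouraged by convention

counter <- 0

increment <- function() {
  # <<- assigns to 'counter' in the parent environment,
  # not to a new local variable inside the function
  counter <<- counter + 1
  invisible(counter)
}

increment()
increment()

print(z)
print(counter)
```

Had the function used <- instead of <<-, the outer counter would have stayed at 0 because the assignment would only have created a local copy.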