R Basics

Xudong Zou, Prof. Yundong Wu, Dr. Zhiqiang Ye. 18th Dec. 2013


Page 1: R Basics

R Basics

Xudong Zou, Prof. Yundong Wu, Dr. Zhiqiang Ye
18th Dec. 2013

Page 2: R Basics

R Basics

History of R language

How to use R

Data type and Data Structure

Data input

R programming

Summary

Case study

Page 3: R Basics


History of R language

Page 4: R Basics

Page 5: R Basics

Robert Gentleman, Ross Ihaka

Pages 6-16: R Basics

History of R language

Page 17: R Basics

2013-09-25: Version R-3.0.2

Pages 18-23: R Basics

History of R language

5088 packages
Page 24: R Basics

What is R?

• R is a programming language and also an environment for statistical analysis and graphics.

Why use R?

• R is open and free. It currently contains 5088 packages, which make R a powerful tool for financial analysis, bioinformatics, social network analysis, natural language processing, and so on.

• More and more people in science tend to learn and use R.

# Bioconductor: bioinformatics analysis (microarray)
# survival: survival analysis

Page 25: R Basics

Console: type commands here

How to use R

Page 26: R Basics

? is used to get help

Create or open an R script

Click here to add R packages

How to use R
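The same two actions, getting help and adding packages, can also be done from the console. This is a minimal sketch; the function and package looked up here (mean, survival) are only illustrative choices.

```r
# Open the help page for a function: ?name is shorthand for help("name")
?mean
help("mean")

# Install a package from CRAN (needs a network connection, so it is
# left commented out here), then load a package for the current session
# install.packages("survival")
library(stats)   # 'stats' ships with R, so loading it always works

# List the packages installed in the local library
pkgs <- rownames(installed.packages())
head(pkgs)
```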

Page 27: R Basics

Data type and Data structure

Data type in R:

numeric (integer, double-precision float), character, complex, logical

Data structure in R:

Object       Class                                          Mixed-class permitted?
Vector       numeric, char, complex, logical                no
Factor       numeric, char                                  no
Array        numeric, char, complex, logical                no
Matrix       numeric, char, complex, logical                no
Data frame   numeric, char, complex, logical                yes
List         numeric, char, complex, logical, func, exp…    yes
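The "mixed-class permitted?" column can be checked directly. The sketch below uses made-up values to show what happens when classes are mixed in a vector, a list, and a data frame.

```r
# A vector holds one class only: mixed input is coerced to a common class
v <- c(1, "a", TRUE)
class(v)           # "character"
v                  # "1" "a" "TRUE"

# A list keeps each component's own class
l <- list(1, "a", TRUE)
sapply(l, class)   # "numeric" "character" "logical"

# A data frame allows a different class per column
d <- data.frame(id = 1:2, ok = c(TRUE, FALSE))
sapply(d, class)   # id: "integer", ok: "logical"
```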

Page 28: R Basics

Vector and vector operation

A vector is the simplest data structure in R: a single entity containing a collection of numbers, characters, complex values, or logical values.

# Create two vectors:

# Check the attributes:

# basic operation on vector:

Note the left-pointing arrow (<-): it is the assignment operator.
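The code for these three steps did not survive in the transcript. A minimal sketch of what they could look like follows; the values in vec1 and vec2 are illustrative assumptions, not the original slide data.

```r
# Create two vectors with c(); '<-' assigns the result to a name
vec1 <- c(3, 8, 1, 10, 6, 4)     # numeric vector (illustrative values)
vec2 <- c("GLY", "ALA", "SER")   # character vector

# Check the attributes
length(vec1)   # 6
class(vec1)    # "numeric"
class(vec2)    # "character"

# Basic operations are applied element by element
vec1 + 1       # 4 9 2 11 7 5
vec1 * 2       # 6 16 2 20 12 8
vec1 > 5       # FALSE TRUE FALSE TRUE TRUE FALSE
```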

Page 29: R Basics

Vector and vector operation

# basic operation on vector:

> max(vec1)
> min(vec1)
> mean(vec1)
> median(vec1)
> sum(vec1)
> summary(vec1)

> vec1
> vec1[1]
> x <- vec1[-1]; x[1]
> vec1[7] <- 15; vec1
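To make the results of these calls concrete, here is the same sequence run on a small vector. The contents of vec1 are an assumption chosen for illustration.

```r
vec1 <- c(3, 8, 1, 10, 6, 4)   # illustrative values

# Summary statistics, stored so they can be inspected afterwards
stats <- c(max = max(vec1), min = min(vec1),
           median = median(vec1), sum = sum(vec1))
stats          # max 10, min 1, median 5, sum 32

vec1[1]        # 3: indexing starts at 1
x <- vec1[-1]  # a negative index drops that element
x[1]           # 8

vec1[7] <- 15  # assigning one past the end extends the vector
vec1           # 3 8 1 10 6 4 15
```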

Page 30: R Basics

array and matrix

An array can be considered as a multiply subscripted collection of data entries.

> x <- 1:24
> dim(x) <- c(4, 6)      # create a 2D array with 4 rows and 6 columns
> dim(x) <- c(2, 3, 4)   # create a 3D array
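Assigning to dim() only reshapes the existing values; it does not reorder them. A short sketch of how the 24 values are laid out, and of what happens when the dimensions do not fit:

```r
x <- 1:24

dim(x) <- c(4, 6)      # 4 rows, 6 columns; values fill column by column
x[, 1]                 # 1 2 3 4
x[1, ]                 # 1 5 9 13 17 21

dim(x) <- c(2, 3, 4)   # the same 24 values, now 2 x 3 x 4
length(dim(x))         # 3

# The product of the dimensions must equal the number of elements
res <- try(dim(x) <- c(5, 5), silent = TRUE)
inherits(res, "try-error")   # TRUE
```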

Page 31: R Basics

array and matrix

array()

> x <- 1:24
> array(data = x, dim = c(4, 6))
> array(x, dim = c(2, 3, 4))

array indexing

> x <- 1:24
> y <- array(data = x, dim = c(2, 3, 4))
> y[1, 1, 1]   # a single element
> y[, , 2]     # the second 2 x 3 slice
> y[, , 1:2]   # the first two slices
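The indexing calls above return the following for the 2 x 3 x 4 array; an empty subscript selects everything along that dimension.

```r
y <- array(1:24, dim = c(2, 3, 4))

y[1, 1, 1]        # 1
y[2, 3, 4]        # 24, the last element
y[1, 1, 2]        # 7: the second slice starts after the first 6 values

dim(y[, , 2])     # 2 3    (a single slice drops to a matrix)
dim(y[, , 1:2])   # 2 3 2  (two slices stay a 3D array)
```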

Page 32: R Basics

array and matrix

A matrix is a special case of an array: one with exactly two dimensions.

> class(potentials)      # "matrix" (R 4.0 and later report "matrix" "array")
> dim(potentials)        # 20 20
> rownames(potentials)   # GLY ALA SER …
> colnames(potentials)   # GLY ALA SER …
> min(potentials)        # -4.4
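The potentials matrix itself is not included in the transcript. As a stand-in, the sketch below builds a hypothetical 3 x 3 matrix with residue names as row and column names; the numbers are invented for illustration.

```r
aa <- c("GLY", "ALA", "SER")

# matrix() fills column by column; dimnames sets row and column names
m <- matrix(c(-0.1,  0.2,  0.0,
               0.2, -0.4,  0.3,
               0.0,  0.3, -0.2),
            nrow = 3, dimnames = list(aa, aa))

is.matrix(m)      # TRUE
dim(m)            # 3 3
m["ALA", "SER"]   # 0.3: names can be used as subscripts
min(m)            # -0.4

# Row and column position of the minimum
which(m == min(m), arr.ind = TRUE)
```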

Page 33: R Basics

list

A list is an object that contains other objects as its components; a component can be a numeric vector, a logical value, a character string, another list, and so on. The components of a list do not need to be of one type; they can be mixed.

> Lst <- list(drugName = "warfarin", no.target = 3, price = 500,
+             symb.target = c("geneA", "geneB", "geneC"))

> length(Lst)         # 4
> attributes(Lst)
> names(Lst)
> Lst[[1]]
> Lst[["drugName"]]
> Lst$drugName
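One detail worth seeing in action is the difference between single and double brackets on a list. The sketch reuses the Lst example; the added component name (approved) is hypothetical.

```r
Lst <- list(drugName = "warfarin", no.target = 3, price = 500,
            symb.target = c("geneA", "geneB", "geneC"))

class(Lst["drugName"])     # "list": single brackets return a sub-list
class(Lst[["drugName"]])   # "character": double brackets return the component

Lst$symb.target[2]         # "geneB"

Lst$approved <- TRUE       # assigning to a new name adds a component
length(Lst)                # 5
```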

Page 34: R Basics

Data Frame

A data frame is a list with some restrictions:

① The components must be vectors, factors, numeric matrices, lists, or other data frames.

② Numeric vectors, logicals, and factors are included as is. In R versions before 4.0.0, character vectors were coerced to factors by default, with the unique values appearing in the vector as levels; since R 4.0.0 they stay character unless stringsAsFactors = TRUE is set.

③ Vector structures appearing as variables of the data frame must all have the same length, and matrix structures must all have the same number of rows.

Names of components
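The restrictions above can be tried out on a small data frame. The column names and values below are invented for illustration; stringsAsFactors is set explicitly so the result does not depend on the R version.

```r
df <- data.frame(
  gene  = c("geneA", "geneB", "geneC"),
  score = c(1.5, 2.0, 3.2),
  hit   = c(TRUE, FALSE, TRUE),
  stringsAsFactors = TRUE    # request the factor coercion explicitly
)

class(df$gene)     # "factor"
levels(df$gene)    # "geneA" "geneB" "geneC"
names(df)          # "gene" "score" "hit"

# Columns whose lengths do not match are rejected
bad <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
inherits(bad, "try-error")   # TRUE
```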

Page 35: R Basics

Data Frame

> names(cars)       # [1] "speed" "dist"
> length(cars)      # 2
> cars[[1]]
> cars$speed        # recommended

> attach(cars)      # adds the columns of cars to the search path, so speed and dist can be used by name
> detach(cars)      # removes them again

> summary(cars$speed)   # anything we can do with a vector also works on a column
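For reference, the built-in cars data set has two columns, speed and dist. The sketch below runs the accessors on it and shows with() as a scoped alternative to attach() and detach(); suggesting with() is my addition, not something the slide mentions.

```r
names(cars)    # "speed" "dist"
length(cars)   # 2 (the number of columns)
nrow(cars)     # 50

# Three equivalent ways to reach the first column
identical(cars[[1]], cars$speed)         # TRUE
identical(cars[["speed"]], cars$speed)   # TRUE

summary(cars$speed)

# with() evaluates an expression inside the data frame
# without changing the search path
avg_dist <- with(cars, mean(dist))
```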

Page 36: R Basics

Data Input

scan(file, what = double(), sep = "", …)
# scan returns a vector whose data type matches the what argument.

read.table(file, header = FALSE, sep = "", row.names, col.names, …)
# read.table returns a data.frame object
# my_data.frame <- read.table("MULTIPOT_lu.txt", row.names = 1, header = TRUE)

From other software (load the required package first):

# from SPSS and SAS
library(Hmisc)
mydata <- spss.get("test.file", use.value.labels = TRUE)
mydata <- sasxport.get("test.file")

# from Stata and Systat
library(foreign)
mydata <- read.dta("test.file")
mydata <- read.systat("test.file")

# from Excel
library(RODBC)
channel <- odbcConnectExcel("D:/myexcel.xls")
mydata <- sqlFetch(channel, "mysheet")
odbcClose(channel)
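Since MULTIPOT_lu.txt is not part of the transcript, the sketch below writes a tiny stand-in file to a temporary location and reads it back. The file contents are invented; only the read.table and scan calls mirror the slide.

```r
# Write a small whitespace-separated table to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("name GLY ALA",
             "GLY -0.1 0.2",
             "ALA 0.2 -0.3"), tmp)

# header = TRUE takes column names from the first line;
# row.names = 1 uses the first column as row names
pot <- read.table(tmp, header = TRUE, row.names = 1)
class(pot)           # "data.frame"
dim(pot)             # 2 2
pot["GLY", "ALA"]    # 0.2

# scan() returns a plain vector of the type given by 'what'
nums <- scan(text = "1 2 3 4", what = double(), quiet = TRUE)
nums                 # 1 2 3 4

unlink(tmp)          # remove the temporary file
```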

Page 37: R Basics


Operators
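The operator table from this slide is not preserved in the transcript. As a rough substitute, here is a sketch of the standard arithmetic, comparison, logical, and assignment operators in R; the selection is mine and may not match the original slide.

```r
x <- 7
y <- 2

# Arithmetic
x + y      # 9
x - y      # 5
x * y      # 14
x / y      # 3.5
x ^ y      # 49
x %% y     # 1  (remainder)
x %/% y    # 3  (integer division)

# Comparison
x == y     # FALSE
x != y     # TRUE
x > y      # TRUE

# Logical
(x > 5) & (y > 5)   # FALSE
(x > 5) | (y > 5)   # TRUE
!(x > 5)            # FALSE

# Assignment: '<-' is the usual form; '->' assigns to the right
10 -> w
```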

Page 38: R Basics


Control Statements

R Programming

# switch( statement, list)

# repeat {…}
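Only the switch and repeat comments survive from this slide. The sketch below shows those two alongside the other common control statements (if/else, for, while); all variable names and values are illustrative.

```r
# if / else
x <- 7
if (x %% 2 == 0) {
  parity <- "even"
} else {
  parity <- "odd"
}

# for: iterate over the elements of a vector
total <- 0
for (i in 1:5) total <- total + i    # total is 15

# while: loop as long as the condition holds
n <- 0
while (n < 3) n <- n + 1             # n is 3

# repeat: loops forever, so it needs an explicit break
k <- 0
repeat {
  k <- k + 1
  if (k >= 4) break
}                                    # k is 4

# switch: choose one of several alternatives by name
kind <- "median"
centre <- switch(kind,
                 mean   = mean(c(1, 2, 10)),
                 median = median(c(1, 2, 10)))   # centre is 2
```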

Page 39: R Basics

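Function

R Programming

Definition:

Example:

matrix.axes <- function(data) {
  # Label the axes of an existing plot with the row and column names of data.
  # ':' binds tighter than '-', so 1:n - 1 is 0, 1, ..., n - 1;
  # dividing by n - 1 spreads the tick positions evenly over [0, 1].
  x <- (1:dim(data)[1] - 1) / (dim(data)[1] - 1)
  axis(side = 1, at = x, labels = rownames(data), las = 2)

  x <- (1:dim(data)[2] - 1) / (dim(data)[2] - 1)
  axis(side = 2, at = x, labels = colnames(data), las = 2)
}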
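The body of the "Definition" item is missing from the transcript. In general, a function is created with the function keyword and assigned to a name. The sketch below uses a hypothetical helper, scale01, and then checks the tick-position arithmetic used in matrix.axes.

```r
# General form: name <- function(arguments) { body }
# The value of the last expression is returned.
scale01 <- function(v, na.rm = TRUE) {   # na.rm has a default value
  rng <- range(v, na.rm = na.rm)
  (v - rng[1]) / (rng[2] - rng[1])
}

scale01(c(2, 4, 6))   # 0 0.5 1

# The tick positions computed in matrix.axes for n = 5 rows
n <- 5
ticks <- (1:n - 1) / (n - 1)
ticks                 # 0 0.25 0.5 0.75 1
```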

Page 40: R Basics


Summary

Data type and Data Structure

numeric, character, complex, logical

vector, array/matrix, list, data frame

Data Input

scan, read.table

load from other software: SPSS, SAS, Excel

Operators : <-

R Programming:

Page 41: R Basics


Case study

Residue based Protein-Protein Interaction potential analysis:

Lu et al. (2003) Development of Unified Statistical Potentials Describing Protein-Protein Interactions, Biophysical Journal 84(3), p1895-1901

Page 42: R Basics

Reference

CRAN-Manual: http://cran.r-project.org/
Quick-R: http://www.statmethods.net/index.html
R tutorial: http://www.r-tutor.com/
MOAC: http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/r/matrix_contour/

Page 43: R Basics


Thanks for your attention