basic operation of r-16!2!2016

Upload: st

Post on 06-Jul-2018


  • 8/16/2019 Basic Operation of R-16!2!2016


    Department of Mathematics, Statistics & Computer Sc.

    Learning R

    Laboratory Exercise-III

    BPS651 Research Methodology

    R.S. Rajput, Assistant Professor (Computer Science)

    Laboratory Instructor


    Laboratory Exercise-III

    Correlation

    Regression

    Analysis of Variance


    Correlation

    Correlation is used to test for a relationship between two numerical variables or two ranked (ordinal) variables. Correlation is a bivariate analysis that measures the strength of association between two variables. In statistics, the value of the correlation coefficient varies between +1 and -1. Usually, in statistics, we measure three types of correlations:

    Pearson correlation

    Kendall rank correlation

    Spearman correlation


    Pearson Correlation 

    Pearson r correlation: Pearson r correlation is widely used in statistics to measure the degree of the relationship between linearly related variables. For Pearson r correlation, both variables should be normally distributed.

    The following formula is used to calculate the Pearson r correlation:

    $$r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}}$$

    Where:

    r = Pearson r correlation coefficient

    N = number of values in each data set

    ∑xy = sum of the products of paired scores

    ∑x = sum of x scores

    ∑y = sum of y scores

    ∑x² = sum of squared x scores

    ∑y² = sum of squared y scores
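As a minimal sketch, the sums defined above can be computed directly in R and checked against the built-in cor(). The data are the protein and fat intakes from Exercise 10. The variable names (sum_xy, r_manual and so on) are illustrative choices, and the formula in the comment is the standard computational form of Pearson's r, assumed here to be the one the slide intends.

```r
# Protein intake (x) and fat intake (y), taken from Exercise 10
x <- c(56, 47, 33, 39, 42, 38, 46, 47, 38, 32)
y <- c(56, 83, 49, 52, 65, 52, 56, 48, 59, 70)

N      <- length(x)   # number of paired values
sum_x  <- sum(x)      # sum of x scores
sum_y  <- sum(y)      # sum of y scores
sum_xy <- sum(x * y)  # sum of the products of paired scores
sum_x2 <- sum(x^2)    # sum of squared x scores
sum_y2 <- sum(y^2)    # sum of squared y scores

# Assumed formula:
# r = (N*Sxy - Sx*Sy) / sqrt((N*Sx2 - Sx^2) * (N*Sy2 - Sy^2))
r_manual <- (N * sum_xy - sum_x * sum_y) /
  sqrt((N * sum_x2 - sum_x^2) * (N * sum_y2 - sum_y^2))

# Built-in equivalent, for comparison
r_builtin <- cor(x, y, method = "pearson")

print(r_manual)
print(r_builtin)
```

Both values should agree up to floating-point rounding, which is a quick way to confirm that the hand calculation was set up correctly.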


    Kendall rank Correlation 

    Kendall rank correlation is a non-parametric test that measures the strength of dependence between two variables. If we consider two samples, a and b, where each sample size is n, the total number of pairings of a with b is n(n-1)/2. The following formula is used to calculate the value of Kendall rank correlation:

    $$\tau = \frac{N_c - N_d}{\frac{1}{2}n(n-1)}$$

    Where:

    Nc = number of concordant pairs

    Nd = number of discordant pairs

    Concordant: ordered in the same way

    Discordant: ordered differently
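To make the counting of concordant and discordant pairs concrete, here is a small sketch in R. The two samples are hypothetical values chosen to have no ties, and the formula (Nc - Nd) / (n(n-1)/2) is assumed to be the one the slide refers to. With tied values the built-in result may differ from this simple count.

```r
# Hypothetical samples with no tied values (illustration only)
a <- c(1, 2, 3, 4, 5)
b <- c(2, 1, 4, 3, 5)
n <- length(a)

# All n(n-1)/2 index pairs (i, j) with i < j, one pair per column
idx <- combn(n, 2)

# A pair is concordant when a and b move in the same direction,
# discordant when they move in opposite directions
direction <- sign(a[idx[1, ]] - a[idx[2, ]]) * sign(b[idx[1, ]] - b[idx[2, ]])
nc <- sum(direction > 0)  # number of concordant pairs
nd <- sum(direction < 0)  # number of discordant pairs

tau_manual  <- (nc - nd) / (n * (n - 1) / 2)
tau_builtin <- cor(a, b, method = "kendall")

print(c(nc = nc, nd = nd))
print(tau_manual)
print(tau_builtin)
```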


    Spearman Correlation 

    Spearman rank correlation is a non-parametric test that is used to measure the degree of association between two variables. The Spearman rank correlation test does not make any assumptions about the distribution of the data and is the appropriate correlation analysis when the variables are measured on a scale that is at least ordinal.

    The following formula is used to calculate the Spearman rank correlation:

    $$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

    Where:

    ρ = Spearman rank correlation

    di = the difference between the ranks of corresponding values Xi and Yi

    n = number of values in each data set
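The rank differences described above can be worked through in a few lines of R. This is a sketch under stated assumptions: the data are hypothetical and free of ties, and the formula 1 - 6*sum(d^2) / (n(n^2 - 1)) is the usual no-ties form of Spearman's coefficient, assumed to be the one the slide intends.

```r
# Hypothetical paired observations with no tied values (illustration only)
x <- c(10, 20, 30, 40, 50)
y <- c(12, 25, 22, 48, 41)
n <- length(x)

rank_x <- rank(x)      # ranks of the X values
rank_y <- rank(y)      # ranks of the Y values
d <- rank_x - rank_y   # difference between the ranks of corresponding values

rho_manual  <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_builtin <- cor(x, y, method = "spearman")

print(d)
print(rho_manual)
print(rho_builtin)
```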


    The cor( ) function 

    The cor( ) function produces correlations. A simplified format is

    cor(x, use=, method= )

    where

    x: Matrix or data frame

    use: Specifies the handling of missing data. Options are all.obs (assumes no missing data; missing data will produce an error), complete.obs (listwise deletion), and pairwise.complete.obs (pairwise deletion)

    method: Specifies the type of correlation. Options are pearson, spearman or kendall.
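A short sketch of how the use= and method= arguments behave may help. The sales and expenses figures are those of Exercise 11. The data frame name firms and the artificially introduced missing value are assumptions made purely for illustration.

```r
# Sales and expenses (Rs lakhs) for ten firms, taken from Exercise 11
firms <- data.frame(
  sales    = c(50, 50, 55, 60, 65, 65, 65, 60, 60, 50),
  expenses = c(11, 13, 14, 16, 16, 15, 15, 14, 13, 13)
)

# No missing data, so all.obs works and returns a 2 x 2 correlation matrix
cor_all <- cor(firms, use = "all.obs", method = "pearson")

# Assumption for illustration only: pretend one expense figure is missing
firms_na <- firms
firms_na$expenses[3] <- NA

# Default (use = "everything"): the missing value propagates as NA
cor_default <- cor(firms_na)

# Listwise deletion: rows containing any NA are dropped first
cor_complete <- cor(firms_na, use = "complete.obs")

# Pairwise deletion combined with a rank-based method
cor_spearman <- cor(firms_na, use = "pairwise.complete.obs", method = "spearman")

print(cor_all)
print(cor_default)
print(cor_complete)
print(cor_spearman)
```

With only two columns, listwise and pairwise deletion drop the same row. The two options differ once a data frame has three or more columns with missing values in different rows.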


    Visualizing Correlations 

    plot(), abline(), lowess(), lines(), pairs()

    plot(): the basic plotting function; x and y give the coordinates of the points to plot

    > plot(x, y)

    pairs(): creates scatterplot matrices

    > pairs(y ~ x)

    lines(): joins the points with line segments

    > lines(x, y, col="black")

    abline(): draws the regression line of y on x

    > abline(lm(y ~ x), col="red")

    lowess(): computes a smoothed (lowess) line, which is drawn with lines()

    > lines(lowess(x, y), col="blue")
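The plotting calls above can be combined into one script. The following is a sketch using the Exercise 10 data. The axis labels and the use of pdf(NULL) are assumptions: pdf(NULL) sends the drawing to a null device so the script also runs non-interactively, and it can be removed together with dev.off() to see the plots on screen.

```r
# Protein intake (x) and fat intake (y), taken from Exercise 10
x <- c(56, 47, 33, 39, 42, 38, 46, 47, 38, 32)
y <- c(56, 83, 49, 52, 65, 52, 56, 48, 59, 70)

# Assumption: draw to a null device; drop this line and dev.off() for on-screen plots
pdf(NULL)

# Scatter plot of the raw points
plot(x, y, xlab = "Protein intake (gm)", ylab = "Fat intake (gm)")

# Regression line of y on x
fit <- lm(y ~ x)
abline(fit, col = "red")

# Smoothed lowess curve through the same points
smooth <- lowess(x, y)
lines(smooth, col = "blue")

# Scatter plot matrix of both variables
pairs(data.frame(protein = x, fat = y))

invisible(dev.off())
```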


    Exercise 10

    Protein intake X and fat intake Y (in gm) for ten old women are given as:

    X 56,47,33,39,42,38,46,47,38,32

    Y 56,83,49,52,65,52,56,48,59,70

    Calculate the correlation coefficient (Pearson)

    Draw a scatter plot matrix and a scatter plot


    Exercise 11 

    Find the correlation coefficient (Pearson) between the sales and expenses from the data given below:

    Firm:                  1   2   3   4   5   6   7   8   9  10
    Sales (Rs Lakhs):     50  50  55  60  65  65  65  60  60  50
    Expenses (Rs Lakhs):  11  13  14  16  16  15  15  14  13  13

    Draw a scatter plot matrix and a scatter plot


    Simple Linear Regression

    A simple linear regression model that describes the relationship between two variables x and y can be expressed by the following equation. The numbers α and β are called parameters, and ϵ is the error term.

    $$y = \alpha + \beta x + \epsilon$$

    Estimated Simple Regression Equation

    Coefficient of Determination

    Significance Test for Linear Regression

    Confidence Interval for Linear Regression

    Prediction Interval for Linear Regression

    Residual Plot

    Standardized Residual

    Normal Probability Plot of Residuals
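The topics listed above map onto a handful of standard R calls. The sketch below walks through them in order. It assumes the built-in cars data set (stopping distance against speed) rather than any data from this course, and the object names (fit, ci_mean, pi_new and so on) are illustrative.

```r
# Assumption: built-in 'cars' data (speed in mph, stopping distance in ft)
fit <- lm(dist ~ speed, data = cars)

# Estimated simple regression equation: intercept and slope
print(coef(fit))

# Coefficient of determination
s  <- summary(fit)
r2 <- s$r.squared

# Significance test: p-value for the slope
p_slope <- s$coefficients["speed", "Pr(>|t|)"]

# Confidence interval (mean response) and prediction interval (new observation)
new_x   <- data.frame(speed = 15)
ci_mean <- predict(fit, new_x, interval = "confidence", level = 0.95)
pi_new  <- predict(fit, new_x, interval = "prediction", level = 0.95)

# Residuals and standardized residuals
res     <- resid(fit)
std_res <- rstandard(fit)

# Residual plot and normal probability plot, drawn to a null device
pdf(NULL)
plot(cars$speed, res, xlab = "speed", ylab = "residual")
abline(h = 0, lty = 2)
qqnorm(std_res)
qqline(std_res)
invisible(dev.off())

print(r2)
print(p_slope)
print(ci_mean)
print(pi_new)
```

The prediction interval is wider than the confidence interval at the same speed because it also accounts for the scatter of an individual observation around the fitted line.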


    Simple Linear Regression cont. 

    Estimated Simple Regression Equation

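    > w = lm(y ~ x)

    Where

    x = the independent (predictor) variable

    y = the dependent (response) variable

    w = the fitted linear model object returned by lm()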
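As a small illustration of reading the estimated equation off the fitted object, the sketch below fits lm(y ~ x) to hypothetical data and uses the coefficients to estimate y for a new x. The data values and the new point x = 6 are assumptions chosen only to keep the arithmetic easy to follow.

```r
# Hypothetical data, roughly following y = 1 + 2x (illustration only)
x <- c(1, 2, 3, 4, 5)
y <- c(3.1, 4.9, 7.2, 8.8, 11.0)

# Fit the simple linear regression of y on x
w <- lm(y ~ x)

# Estimated intercept (a) and slope (b) of y = a + b*x
b <- coef(w)
print(b)

# Estimate y at x = 6, first by hand from the coefficients...
y_hat_manual <- b[["(Intercept)"]] + b[["x"]] * 6

# ...then with predict(), which must agree
y_hat_predict <- predict(w, newdata = data.frame(x = 6))

print(y_hat_manual)
print(y_hat_predict)
```

The same pattern applies to the exercises that ask for an estimate at a given x: fit the model, then either substitute into the equation or call predict() with a one-row data frame.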


    Multiple Linear Regressions

    Estimated Multiple Regression Equation

    Multiple Coefficient of Determination

    Adjusted Coefficient of Determination

    Significance Test for MLR

    Confidence Interval for MLR

    Prediction Interval for MLR
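The same lm() workflow extends to several predictors. The sketch below is one possible way to cover the topics listed above. It assumes the built-in mtcars data set with mpg as the response and wt and hp as predictors, and the new observation (wt = 3, hp = 110) is hypothetical.

```r
# Assumption: built-in 'mtcars' data; mpg modelled on weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Estimated multiple regression equation: one coefficient per term
print(coef(fit))

# Multiple and adjusted coefficients of determination
s      <- summary(fit)
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# Overall significance test: p-value of the F statistic
f <- s$fstatistic
p_model <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Confidence and prediction intervals for a hypothetical new car
new_car <- data.frame(wt = 3, hp = 110)
ci_mean <- predict(fit, new_car, interval = "confidence", level = 0.95)
pi_new  <- predict(fit, new_car, interval = "prediction", level = 0.95)

print(c(r2 = r2, adj_r2 = adj_r2))
print(p_model)
print(ci_mean)
print(pi_new)
```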


    Logistic Regression

    Estimated Logistic Regression Equation

    Significance Test for Logistic Regression
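A logistic regression is typically fitted in R with glm() and family = binomial. The following sketch assumes the built-in mtcars data set, with the binary variable am (transmission type) as the response and hp and wt as predictors. The new observation is hypothetical.

```r
# Assumption: built-in 'mtcars' data; am is 0 (automatic) or 1 (manual)
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Estimated logistic regression equation: coefficients on the log-odds scale
b <- coef(fit)
print(b)

# Significance test: z statistics and p-values for each coefficient
coef_table <- summary(fit)$coefficients
print(coef_table)

# Estimated probability for a hypothetical car, computed by hand...
eta      <- b[["(Intercept)"]] + b[["hp"]] * 120 + b[["wt"]] * 2.8
p_manual <- 1 / (1 + exp(-eta))

# ...and with predict(), which applies the same inverse-logit transformation
new_car   <- data.frame(hp = 120, wt = 2.8)
p_predict <- predict(fit, new_car, type = "response")

print(p_manual)
print(p_predict)
```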


    Exercise 11

    Geographical area x and area under paddy cultivation y (in hectares) for 15 villages of a district are given below:

    X 103,106,120,120,100,151,160,155,136,178,196,140,160,166,112

    Y 41,33,87,78,35,81,90,85,70,100,102,70,82,85,50

    Calculate the correlation coefficient

    Calculate the regression equation of y on x

    Estimate the area under paddy cultivation for a village whose geographical area is 136 hectares


    Exercise 12

    Calculate the correlation coefficient between marks obtained in the 1st prefinal and 2nd prefinal examinations on the basis of the following data collected for a sample of 12 students:

    I 12,14,9.5,10.5,8,11.5,10,14,8,9.5,11,12

    II 11.5,13.5,12,14,7,14,8,12.5,6.5,10,9,12

    Calculate the correlation coefficient

    Calculate the regression equation of y on x


    Exercise 10

    Protein intake X and fat intake Y (in gm) for ten old women are given as:

    X 56,47,33,39,42,38,46,47,38,32

    Y 56,83,49,52,65,52,56,48,59,70

    Calculate the correlation coefficient

    Calculate the regression equation of y on x

    Estimate the fat intake of a woman whose protein intake is 38 gm.


    Exercise 13

    Twelve students obtained the following percentages of marks in physics and statistics. For the data given below, calculate:

    Correlation coefficient

    Linear regression equation of y on x

    Predicted value and residual value for x=80

    X 73,42,88,38,68,75,80,54,64,48,35,37

    Y 73,48,86,58,65,60,76,54,50,38,32,30


    Exercise 14


    Exercise 15


    Exercise 16


    Exercise 17


    Exercise 18


    Exercise 19


    Exercise 20


    Analysis of Variance 


    Completely Randomized Design 
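A completely randomized design with one treatment factor is commonly analysed in R as a one-way analysis of variance with aov(). The sketch below assumes the built-in PlantGrowth data set (plant weight under a control and two treatments), not data from this course.

```r
# Assumption: built-in 'PlantGrowth' data; 'group' is the treatment factor
fit_crd <- aov(weight ~ group, data = PlantGrowth)

# ANOVA table: treatment and residual sums of squares, F value and p-value
print(summary(fit_crd))

# Treatment means, for interpreting a significant F test
print(tapply(PlantGrowth$weight, PlantGrowth$group, mean))
```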


    Randomized Block Design 
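In a randomized block design the block factor is added to the model alongside the treatment. As a sketch, the built-in sleep data set is assumed here, with each patient (ID) treated as a block that receives both drugs (group). This is an illustrative choice, not the course's own example.

```r
# Assumption: built-in 'sleep' data; 'group' is the treatment, 'ID' the block
fit_rbd <- aov(extra ~ group + factor(ID), data = sleep)

# ANOVA table with separate lines for treatment, block and residual
print(summary(fit_rbd))
```

Removing the block term from the formula would fold the patient-to-patient variation into the residual, which generally makes the treatment comparison less sensitive.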


    Factorial Design 
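A factorial design crosses two or more factors, and the * operator in an R formula fits both main effects and their interaction. The sketch below assumes the built-in warpbreaks data set (two wool types by three tension levels), chosen only for illustration.

```r
# Assumption: built-in 'warpbreaks' data; wool (2 levels) x tension (3 levels)
fit_fact <- aov(breaks ~ wool * tension, data = warpbreaks)

# ANOVA table: wool, tension, wool:tension interaction and residual
print(summary(fit_fact))

# Cell means for each wool/tension combination
print(with(warpbreaks, tapply(breaks, list(wool, tension), mean)))
```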
