basic operation of r-16!2!2016
8/16/2019 Basic Operation of R-16!2!2016
Department of Mathematics, Statistics & Computer Sc.
Learning R
Laboratory Exercise-III
BPS651 Research Methodology
R.S. Rajput, Assistant Professor (Computer Science)
Laboratory Instructor
-
Laboratory Exercise-III
Correlation
Regression
Analysis of Variance
-
Correlation
Correlation is used to test for a relationship between two
numerical variables or two ranked (ordinal) variables.
Correlation is a bivariate analysis that measures the strength
of association between two variables. In statistics, the value of
the correlation coefficient varies between +1 and -1.
Usually, in statistics, we measure three types of correlations:
Pearson correlation
Kendall rank correlation
Spearman correlation
-
Pearson Correlation
Pearson r correlation: Pearson r correlation is widely used in
statistics to measure the degree of the relationship between
linearly related variables. For Pearson r correlation, both variables
should be normally distributed.
The following formula is used to calculate the Pearson r correlation:
$$ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} $$
Where:
r = Pearson r correlation coefficient
N = number of values in each data set
∑xy = sum of the products of paired scores
∑x = sum of x scores
∑y = sum of y scores
∑x² = sum of squared x scores
∑y² = sum of squared y scores
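As a rough check on the quantities defined above, the sums can be computed directly in R and the result compared with the built-in cor(). This is only a sketch: the vectors x and y are small hypothetical values chosen for the demonstration, not data from the exercises.

```r
# Hypothetical paired scores, used only to illustrate the calculation
x <- c(2, 4, 5, 7, 9)
y <- c(1, 3, 6, 8, 10)
N <- length(x)

# The sums that appear in the Pearson formula
sum_xy <- sum(x * y)
sum_x  <- sum(x)
sum_y  <- sum(y)
sum_x2 <- sum(x^2)
sum_y2 <- sum(y^2)

# Pearson r assembled from the sums
r_manual <- (N * sum_xy - sum_x * sum_y) /
  sqrt((N * sum_x2 - sum_x^2) * (N * sum_y2 - sum_y^2))

# The same coefficient from the built-in function
r_builtin <- cor(x, y, method = "pearson")

print(c(manual = r_manual, builtin = r_builtin))
```

The two printed values should agree up to floating-point rounding.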
-
Kendall rank Correlation
Kendall rank correlation is a non-parametric test that measures
the strength of dependence between two variables. If we
consider two samples, a and b, where each sample size is n, we
know that the total number of pairings of a with b is n(n-1)/2. The
following formula is used to calculate the value of Kendall rank
correlation:
$$ \tau = \frac{N_c - N_d}{n(n-1)/2} $$
Where:
Nc = number of concordant pairs
Nd = number of discordant pairs
Concordant: ordered in the same way
Discordant: ordered differently
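The counting of concordant and discordant pairs can be sketched in R as follows. The vectors a and b are hypothetical and contain no ties, which is the case where this simple count agrees with cor(..., method = "kendall").

```r
# Hypothetical samples without ties, used only for illustration
a <- c(1, 2, 3, 4, 5)
b <- c(2, 1, 4, 3, 5)
n <- length(a)

# Every pair (i, j) with i < j; each column of idx is one pair
idx <- combn(n, 2)
da <- a[idx[1, ]] - a[idx[2, ]]
db <- b[idx[1, ]] - b[idx[2, ]]

# A pair is concordant when both differences have the same sign
Nc <- sum(da * db > 0)
Nd <- sum(da * db < 0)

tau_manual  <- (Nc - Nd) / (n * (n - 1) / 2)
tau_builtin <- cor(a, b, method = "kendall")

print(c(Nc = Nc, Nd = Nd, manual = tau_manual, builtin = tau_builtin))
```

With tied values R applies a correction to the denominator, so the manual count above would no longer match exactly.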
-
Spearman Correlation
Spearman rank correlation is a non-parametric test that is used to
measure the degree of association between two variables. The Spearman
rank correlation test makes no assumptions about the
distribution of the data and is the appropriate correlation analysis when
the variables are measured on a scale that is at least ordinal.
The following formula is used to calculate the Spearman rank correlation:
$$ \rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} $$
Where:
ρ = Spearman rank correlation
di = the difference between the ranks of corresponding values Xi and Yi
n = number of values in each data set
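A minimal sketch of the rank-difference calculation in R, assuming hypothetical vectors x and y with no tied values (with ties, the shortcut formula is only approximate and cor() uses the ranks directly):

```r
# Hypothetical data without ties, used only for illustration
x <- c(10, 20, 30, 40, 50)
y <- c(12, 25, 22, 48, 41)
n <- length(x)

# Difference between the ranks of corresponding values
d <- rank(x) - rank(y)

rho_manual  <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_builtin <- cor(x, y, method = "spearman")

print(c(manual = rho_manual, builtin = rho_builtin))
```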
-
The cor( ) function
The cor() function produces correlations. A simplified format is
cor(X, use= , method= )
where
X: matrix or data frame
use: specifies the handling of missing data. Options are
all.obs (assumes no missing data; missing data will
produce an error), complete.obs (listwise deletion), and
pairwise.complete.obs (pairwise deletion)
method: specifies the type of correlation. Options are
pearson, spearman, or kendall.
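The effect of the use and method arguments can be tried on the airquality data set that ships with R, which has missing values in two columns. The choice of data set and columns is an assumption made for illustration.

```r
# airquality is a built-in data set; Ozone and Solar.R contain NA values
X <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# Listwise deletion: any row with a missing value is dropped first
r_complete <- cor(X, use = "complete.obs", method = "pearson")

# Pairwise deletion: each pair of columns uses the rows where both are present
r_pairwise <- cor(X, use = "pairwise.complete.obs", method = "spearman")

print(round(r_complete, 2))
print(round(r_pairwise, 2))
```

Calling cor(X, use = "all.obs") on the same data stops with an error, because missing values are present.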
-
Visualizing Correlations
plot(), abline(), lowess(), lines(), pairs()
plot(): the basic plotting function; plot(x, y) draws the (x, y) points
> plot(x, y)
pairs(): creates scatterplot matrices
> pairs(y ~ x)
lines(): joins the points with line segments
> lines(x, y, col="black")
abline(): draws the regression line of y on x (h=0 and v=0 also draw the lines y = 0 and x = 0)
> abline(lm(y ~ x), h=0, v=0, col="red")
lowess(): computes a smoothed (lowess) line, drawn with lines()
> lines(lowess(x, y), col="blue")
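Put together, the plotting calls above might be used as in the following sketch. It assumes the built-in cars data set (speed and stopping distance) purely as example data.

```r
# Built-in example data: speed (x) and stopping distance (y)
x <- cars$speed
y <- cars$dist

# Scatter plot of the (x, y) points
plot(x, y, xlab = "speed", ylab = "dist", main = "Scatter plot")

# Fitted regression line of y on x
abline(lm(y ~ x), col = "red")

# Lowess smoother through the same points
smooth <- lowess(x, y)
lines(smooth, col = "blue")

# Scatter plot matrix of all columns in the data frame
pairs(cars)
```

When run non-interactively (for example with Rscript), the plots are written to a file instead of a window.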
-
Exercise 10
Protein intake X and fat intake Y (in gm) for ten elderly
women are given as
X: 56, 47, 33, 39, 42, 38, 46, 47, 38, 32
Y: 56, 83, 49, 52, 65, 52, 56, 48, 59, 70
Calculate the correlation coefficient (Pearson)
Draw the scatter plot matrix and the scatter plot
-
Exercise 11
Find the correlation coefficient (Pearson) between the
sales and expenses from the data given below:
Firm:                 1  2  3  4  5  6  7  8  9  10
Sales (Rs Lakhs):     50 50 55 60 65 65 65 60 60 50
Expenses (Rs Lakhs):  11 13 14 16 16 15 15 14 13 13
Draw the scatter plot matrix and the scatter plot
-
Simple Linear Regression
A simple linear regression model that describes the relationship
between two variables x and y can be expressed by the
following equation. The numbers α and β are called parameters,
and ϵ is the error term.
$$ y = \alpha + \beta x + \epsilon $$
Estimated Simple Regression Equation
Coefficient of Determination
Significance Test for Linear Regression
Confidence Interval for Linear Regression
Prediction Interval for Linear Regression
Residual Plot
Standardized Residual
Normal Probability Plot of Residuals
-
Simple Linear Regression cont.
Estimated Simple Regression Equation
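> w = lm(y ~ x)
Where
x: the predictor (independent) variable
y: the response (dependent) variable
w: the fitted model object returned by lm()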
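The topics listed for simple linear regression can be worked through on a fitted lm() object roughly as below. The sketch assumes the built-in cars data set as example data, names the fitted object w, and picks x = 15 as an arbitrary point at which to predict.

```r
# Built-in example data: speed (x) and stopping distance (y)
x <- cars$speed
y <- cars$dist

# Estimated simple regression equation: intercept and slope
w <- lm(y ~ x)
print(coef(w))

# Coefficient of determination and significance tests
s <- summary(w)
print(s$r.squared)
print(s$coefficients)   # estimates, standard errors, t values, p-values

# Confidence and prediction intervals at a new x value
new_x <- data.frame(x = 15)
ci <- predict(w, new_x, interval = "confidence", level = 0.95)
pi <- predict(w, new_x, interval = "prediction", level = 0.95)
print(ci)
print(pi)

# Residual plot
plot(x, resid(w), ylab = "Residuals")
abline(h = 0)

# Standardized residuals and their normal probability plot
std_res <- rstandard(w)
qqnorm(std_res)
qqline(std_res)
```

The prediction interval is wider than the confidence interval because it also accounts for the error term of a single new observation.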
-
Multiple Linear Regressions
Estimated Multiple Regression Equation
Multiple Coefficient of Determination
Adjusted Coefficient of Determination
Significance Test for MLR
Confidence Interval for MLR
Prediction Interval for MLR
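A possible walk-through of these multiple regression topics in R, assuming the built-in mtcars data set with mpg as the response and wt and hp as predictors (the variables and the new observation are illustrative choices):

```r
# Estimated multiple regression equation
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(coef(fit))

s <- summary(fit)

# Multiple and adjusted coefficients of determination
print(s$r.squared)
print(s$adj.r.squared)

# Significance tests: t-test per coefficient and the overall F statistic
print(s$coefficients)
print(s$fstatistic)

# Confidence and prediction intervals for a hypothetical new observation
new_car <- data.frame(wt = 3.0, hp = 120)
ci <- predict(fit, new_car, interval = "confidence", level = 0.95)
pi <- predict(fit, new_car, interval = "prediction", level = 0.95)
print(ci)
print(pi)
```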
-
Logistic Regression
Estimated Logistic Regression Equation
Significance Test for Logistic Regression
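Logistic regression in R is usually fitted with glm() and family = binomial. The sketch below assumes the built-in mtcars data set, with the binary variable am (transmission type) as the response; the predictors and the new observation are illustrative choices.

```r
# Estimated logistic regression equation (coefficients on the log-odds scale)
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
print(coef(fit))

# Significance test: Wald z statistics and p-values per coefficient
print(summary(fit)$coefficients)

# Estimated probability for a hypothetical new observation
new_car <- data.frame(hp = 120, wt = 2.8)
p <- predict(fit, new_car, type = "response")
print(p)
```

Using type = "response" returns a probability between 0 and 1 rather than the log-odds.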
-
Exercise 11
Geographical area x and area under paddy
cultivation y (in hectares) for 15 villages of a
district are given below:
X: 103, 106, 120, 120, 100, 151, 160, 155, 136, 178, 196, 140, 160, 166, 112
Y: 41, 33, 87, 78, 35, 81, 90, 85, 70, 100, 102, 70, 82, 85, 50
Calculate the correlation coefficient
Calculate the regression equation of y on x
Estimate the area under paddy cultivation where the geographical area is 136 hectares
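One way to set this exercise up in R is sketched below, using the data from the slide. The object names r, fit and est are arbitrary, and the numerical results are left for the reader to obtain by running the code.

```r
# Data from the exercise: geographical area (x) and area under paddy (y)
x <- c(103, 106, 120, 120, 100, 151, 160, 155, 136, 178, 196, 140, 160, 166, 112)
y <- c(41, 33, 87, 78, 35, 81, 90, 85, 70, 100, 102, 70, 82, 85, 50)

# Correlation coefficient (Pearson is the default method)
r <- cor(x, y)
print(r)

# Regression equation of y on x: intercept and slope
fit <- lm(y ~ x)
print(coef(fit))

# Estimated area under paddy where the geographical area is 136 hectares
est <- predict(fit, data.frame(x = 136))
print(est)
```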
-
Exercise 12
Calculate the correlation coefficient between
marks obtained in the 1st prefinal and 2nd prefinal
examinations on the basis of the following data
collected for a sample of 12 students
I:  12, 14, 9.5, 10.5, 8, 11.5, 10, 14, 8, 9.5, 11, 12
II: 11.5, 13.5, 12, 14, 7, 14, 8, 12.5, 6.5, 10, 9, 12
Calculate the correlation coefficients
Calculate the regression equation of y on x
-
Exercise 10
Protein intake X and fat intake Y (in gm) for
ten elderly women are given as
X: 56, 47, 33, 39, 42, 38, 46, 47, 38, 32
Y: 56, 83, 49, 52, 65, 52, 56, 48, 59, 70
Calculate the correlation coefficient
Calculate the regression equation of y on x
Estimate the fat intake of a woman whose protein intake is 38 gm.
-
Exercise 13
Twelve students obtained the following percentages
of marks in physics and statistics. Calculate:
Correlation coefficient
Linear regression equation of y on x
Predicted value and residual value
for x = 80, from the data given below
X: 73, 42, 88, 38, 68, 75, 80, 54, 64, 48, 35, 37
Y: 73, 48, 86, 58, 65, 60, 76, 54, 50, 38, 32, 30
-
Exercise 14
-
Exercise 15
-
Exercise 16
-
Exercise 17
-
Exercise 18
-
Exercise 19
-
Exercise 20
-
Analysis of Variance
-
Completely Randomized Design
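In a completely randomized design, treatments are assigned to units at random and compared with a one-way analysis of variance. A minimal sketch in R, assuming the built-in PlantGrowth data set (plant weight under a control and two treatments) is treated as such a design for illustration:

```r
# One-way ANOVA: does mean weight differ between the three groups?
fit <- aov(weight ~ group, data = PlantGrowth)
tab <- summary(fit)
print(tab)

# Pairwise comparisons between group means
print(TukeyHSD(fit))
```

The F test in the summary table addresses the overall hypothesis of equal group means, while TukeyHSD() shows which pairs of groups differ.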
-
Randomized Block Design
-
Factorial Design
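A factorial design studies two or more factors together, including their interaction. The following sketch assumes the built-in warpbreaks data set (factors wool and tension, response breaks) as an example of a two-factor layout.

```r
# Two-way ANOVA with main effects and the wool:tension interaction
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
tab <- summary(fit)
print(tab)

# Interaction plot: mean breaks by tension, one line per wool type
with(warpbreaks, interaction.plot(tension, wool, breaks))
```

The formula wool * tension is shorthand for wool + tension + wool:tension.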