Regression and Analysis Variance Linear Models in R

Upload: rafe-briggs

Post on 02-Jan-2016

Page 1: Regression and Analysis Variance Linear Models in R

Regression and Analysis Variance

Linear Models in R

What do you know already?

Regression

Continuous Dependent Variable
Continuous Independent Variable
Assumptions
    Normality
    Independence
    Constant variance N(0, σ²)

Linear or curvilinear
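
The regression assumptions can be checked from the residuals of a fitted model. The following is a minimal sketch using base R only. The data are simulated, and the names `x`, `y` and `fit` and the seed are illustrative choices, not part of the slides.

```r
# Simulated data for illustration only: a straight-line relationship with
# independent N(0, sigma^2) errors, so the assumptions hold by construction.
set.seed(1)
x <- runif(50, min = 0, max = 10)
y <- 2 + 3 * x + rnorm(50, mean = 0, sd = 1.5)

fit <- lm(y ~ x)
res <- residuals(fit)

# Normality: Shapiro-Wilk test on the residuals. A small p-value would
# suggest the residuals are not normally distributed.
sw <- shapiro.test(res)
sw$p.value

# With an intercept in the model the residuals sum to zero, so the checks
# concentrate on their shape and spread rather than their mean.
sum(res)
```

Graphical checks are usually more informative than a single test. `plot(fit, which = 1:2)` draws the residuals against the fitted values (for constant variance and linearity) and a normal Q-Q plot (for normality).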

ANOVA

Continuous Dependent Variable
Discrete Independent Variable
Assumptions
    Normality
    Independence
    Constant variance N(0, σ²)
    Factor level variances are equal
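
One way to examine whether the factor level variances are equal is to compare the sample variance within each level and apply Bartlett's test. This is a sketch under stated assumptions: the data are simulated, and the names `group` and `response` are hypothetical.

```r
# Simulated data for illustration only: three factor levels with ten
# observations each and a common standard deviation.
set.seed(2)
group <- factor(rep(c("A", "B", "C"), each = 10))
response <- rnorm(30, mean = rep(c(10, 12, 15), each = 10), sd = 2)

# Sample variance within each factor level
level_vars <- tapply(response, group, var)
level_vars

# Bartlett's test: the null hypothesis is that all level variances are equal
bt <- bartlett.test(response ~ group)
bt$p.value
```

Bartlett's test is sensitive to departures from normality, so it is best read alongside a plot of the residuals by factor level.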

Linear Models

Regression and ANOVA (and in fact ANCOVA) are all related mathematically to one another.

Exactly the same mathematics is used throughout.

The only difference is the type (and number) of independent variables that you are working with.

The base assumptions are required for all linear models.
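
The shared mathematics can be seen in the design matrix that R builds for each kind of independent variable. In this sketch the toy data and the helper `ols()` are hypothetical and exist only for illustration. The same least-squares formula reproduces the `lm()` coefficients for a continuous predictor and for a discrete one.

```r
# Toy data for illustration only
x_num <- c(1, 2, 3, 4, 5, 6)                      # continuous predictor
x_fac <- factor(c("a", "a", "b", "b", "c", "c"))  # discrete predictor
y     <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

# Regression: an intercept column plus the numeric values themselves
X_reg <- model.matrix(~ x_num)

# ANOVA: an intercept column plus 0/1 indicator columns for levels "b" and
# "c" (level "a" is the baseline under R's default treatment contrasts)
X_aov <- model.matrix(~ x_fac)

# Hypothetical helper: ordinary least squares, solving (X'X) b = X'y
ols <- function(X, y) drop(solve(crossprod(X), crossprod(X, y)))

b_reg <- ols(X_reg, y)
b_aov <- ols(X_aov, y)

# lm() returns the same coefficients in both cases
all.equal(unname(b_reg), unname(coef(lm(y ~ x_num))))
all.equal(unname(b_aov), unname(coef(lm(y ~ x_fac))))
```

Only the columns of the design matrix differ between the two fits. The estimation step is identical.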

What procedure are we going to use to analyse linear model data?

Wagga House Prices

A Wagga Wagga real estate agent wishes to use data from 30 recent house sales to predict future selling prices ($000) from land area (m²).

The data were collected from internet real estate listings that included both the land size and the listing price.

Most of the included listings were for houses with 2 bedrooms, 1 bathroom and 1 garage.

Call:
lm(formula = Price ~ Land, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max
-169.486  -57.992    1.337   68.666  169.565

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  53.9420    56.8561   0.949    0.351
Land          0.6202     0.0931   6.662 3.14e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 92.94 on 28 degrees of freedom
Multiple R-squared:  0.6132,  Adjusted R-squared:  0.5994
F-statistic: 44.39 on 1 and 28 DF,  p-value: 3.141e-07
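
The coefficient table gives the fitted line Price = 53.9420 + 0.6202 × Land, with Price in $000 and Land in m². The sketch below applies that line to three hypothetical land areas and recomputes the t values from the printed estimates and standard errors. The slides do not state the range of land areas in the 30 sales, so whether these example areas fall inside it is an assumption.

```r
# Coefficients and standard errors copied from the summary output
b0 <- 53.9420; se_b0 <- 56.8561   # intercept ($000)
b1 <- 0.6202;  se_b1 <- 0.0931    # slope ($000 per m^2 of land)

# Hypothetical land areas (m^2), chosen only for illustration
land <- c(400, 600, 800)

# Predicted prices ($000): 302.022, 426.062 and 550.102
predicted_price <- b0 + b1 * land
predicted_price

# Each t value is the estimate divided by its standard error
t_intercept <- b0 / se_b0   # about 0.949
t_land      <- b1 / se_b1   # about 6.662
```

With the fitted model object available, `predict(dat.lm, newdata = data.frame(Land = land))` would return the same predictions at full precision, and adding `interval = "prediction"` would attach prediction intervals.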

anova(dat.lm)

Analysis of Variance Table

Response: Price
          Df Sum Sq Mean Sq F value    Pr(>F)
Land       1 383397  383397  44.385 3.141e-07 ***
Residuals 28 241863    8638
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
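
Every entry in the ANOVA table follows from the two sums of squares and their degrees of freedom, and the same quantities give the residual standard error and R-squared values reported for the model. The sketch below works through the arithmetic. Because the inputs are the rounded values printed in the table, the results agree with the output only to the precision shown.

```r
# Values copied from the ANOVA table
ss_land <- 383397; df_land <- 1
ss_res  <- 241863; df_res  <- 28

# Mean squares are sums of squares divided by their degrees of freedom
ms_land <- ss_land / df_land
ms_res  <- ss_res / df_res       # printed as 8638

# F is the ratio of mean squares; its p-value is the upper tail of F(1, 28)
f_value <- ms_land / ms_res      # about 44.385
p_value <- pf(f_value, df_land, df_res, lower.tail = FALSE)

# Quantities reported in the model summary
rse   <- sqrt(ms_res)                         # about 92.94
r_sq  <- ss_land / (ss_land + ss_res)         # about 0.6132
ms_total <- (ss_land + ss_res) / (df_land + df_res)
adj_r_sq <- 1 - ms_res / ms_total             # about 0.5994
```

With a single predictor the F value is the square of the slope's t value, which is why both tests report the same p-value.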

Bottlenose Dolphins

Neonate bottlenose dolphins produce many sounds just after birth. Prior to suckling these sounds intensify and then, as the neonate prepares to feed, the sounds cease; this is called the latency period (LP). It is thought that the LP is related to the suckling frequency. A study was conducted to collect information about the length of the LP and the suckling frequency, with the aim of defining this relationship if it exists.

Johne’s Disease

To eliminate Johne’s disease from an infected farm or to prevent transmission, it is essential that susceptible animals are not exposed to an environment contaminated with the causative bacterium. The bacterium causing Johne’s disease is capable of persisting in the environment for long periods due to the high lipid content in the cell wall and the metabolic inactivity of the organism. Factors that could influence the survival of the bacterium in the soil, including temperature, pH, organic matter, exposure to ultraviolet light and moisture content, were investigated under controlled conditions.

Johne’s Disease continued

This experiment involved trays of contaminated soil randomised to 12 unique treatments, involving changes to the pH, UV light and moisture content. They are uniquely defined as Treatment 1:12. The treatments were randomised to the trays of soil in a completely randomised fashion so that each treatment was replicated 5 times. The ln(number of bacteria) remaining was the response measured as an indication of the effectiveness of the treatment. The aim of the experiment is to determine the “best” treatment for removing the bacterium from the soil.
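
A completely randomised design with 12 treatments and 5 replicates is a one-way ANOVA with 60 observations. The sketch below shows one possible analysis, not the analysis used in the study. The data are simulated, and the names `trays`, `treatment` and `ln_count` are hypothetical. It also assumes that a lower remaining ln(count) indicates a more effective treatment.

```r
# Simulated data for illustration only: 12 treatments x 5 trays each
set.seed(3)
treatment  <- factor(rep(1:12, each = 5))
true_means <- seq(8, 5.8, length.out = 12)   # hypothetical mean ln(count)
ln_count   <- rnorm(60, mean = true_means[as.integer(treatment)], sd = 0.5)
trays      <- data.frame(treatment, ln_count)

# One-way ANOVA: 11 treatment and 48 residual degrees of freedom
fit <- aov(ln_count ~ treatment, data = trays)
tab <- anova(fit)
tab

# Treatment means, smallest remaining ln(count) first
trt_means <- sort(tapply(trays$ln_count, trays$treatment, mean))
trt_means

# Tukey's honest significant differences for all 66 pairwise comparisons
tukey <- TukeyHSD(fit, "treatment")
head(tukey$treatment)
```

The overall F test asks only whether any treatments differ. Identifying a "best" treatment needs the pairwise comparisons, because the treatment with the lowest mean may not differ significantly from the next few.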