The following dataset collects head capsule widths and body masses
from Hydropsyche (a genus of caddisflies) in the Danube in Austria. Note
that this data file is tab delimited, not comma delimited, so you will
need read.table.
hydrop = read.table("data/Hydropsyche.txt", header = TRUE)
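As a small sketch of why the delimiter matters, the following writes a made-up tab-delimited file to a temporary location and reads it back. The file contents and the column names (site, width) are invented for illustration and are not the Hydropsyche data.

```r
# Hypothetical example data; this is not the Hydropsyche file
tmp <- tempfile(fileext = ".txt")
writeLines(c("site\twidth", "Upper Danube\t1.2", "Lower Danube\t1.5"), tmp)

# sep = "\t" splits on tabs only, so fields containing spaces stay intact
demo <- read.table(tmp, header = TRUE, sep = "\t")
str(demo)

# read.delim wraps read.table with header = TRUE and sep = "\t" as defaults
demo2 <- read.delim(tmp)
```

If no field contains spaces, the default whitespace separator of read.table also handles tabs, which is why the call above works without sep.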
Use lm to fit a simple linear model (one variable, no
transformations), using body mass as the response (y) variable, and
width as the predictor. Is the regression significant? Report the
statistics as shown in the lecture.
# use the formula syntax within lm to describe the model you want to fit
mod1 = lm(weight ~ width, data = hydrop)
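# use the summary function to get information from your model
# how do you interpret the output?
summary(mod1)       # coefficient table, R-squared, overall F-test
coefficients(mod1)  # intercept and slope estimates
anova(mod1)         # analysis of variance table
residuals(mod1)     # observed minus fitted values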
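The numbers usually reported for a regression can be pulled out of the summary object directly. The sketch below uses simulated data, so the data frame sim, its columns x and y, and the chosen slope and noise level are all assumptions made for illustration; it says nothing about the caddisfly result.

```r
# Simulated data for illustration only; x and y are hypothetical names
set.seed(1)
sim <- data.frame(x = runif(50, 0.5, 2))
sim$y <- 2 + 3 * sim$x + rnorm(50, sd = 0.5)

fit <- lm(y ~ x, data = sim)
s <- summary(fit)

s$coefficients        # estimates, standard errors, t values, p values
s$r.squared           # proportion of variance explained
fs <- s$fstatistic    # F value, numerator df, denominator df

# p value of the overall F-test, computed from the F distribution
p_f <- pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE)

# In simple regression the F-test and the t-test on the slope are equivalent
all.equal(unname(p_f), s$coefficients["x", "Pr(>|t|)"])
anova(fit)
```

With a single predictor, the F value is the square of the slope's t value, so either test can be reported together with its degrees of freedom.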
Check the model diagnostics (for example with plot(mod1)). Is this an
adequate model? Does it meet the assumptions of the linear model?
# some useful functions
residuals(mod1)
plot(mod1$fitted.values, mod1$residuals) # check if the model is adequate; should not show any trend
hist(mod1$residuals) # check normality of the residuals
qqnorm(mod1$residuals); qqline(mod1$residuals) # check normality of the residuals
plot(mod1$fitted.values, mod1$residuals^2) # check variance homogeneity; should not show any trend
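# Predicted values with confidence and prediction limits
conf <- predict(mod1, interval = "confidence", level = 0.95)
pred <- predict(mod1, interval = "prediction", level = 0.95, newdata = hydrop)
# What is the difference between confidence and prediction intervals?
# perhaps useful (adds to an existing plot; sort by width first so the lines do not zigzag)
ord <- order(hydrop$width)
matlines(hydrop$width[ord], conf[ord, ], lty = c(1, 2, 2), col = "black")
r2 <- round(summary(mod1)$r.squared, 2)
text(locator(1), paste("R2 = ", r2, ", P<0.001", sep = "")) # click on the plot to place the label; adjust the P value to your own result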
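To see how the two interval types behave, here is a minimal sketch on simulated data. The data frame sim, the columns x and y, and the helper names grid, ci_band and pi_band are hypothetical. A confidence interval describes the uncertainty of the estimated mean response at a given x, while a prediction interval also includes the scatter of a single new observation around that mean.

```r
# Simulated data for illustration only; x and y are hypothetical names
set.seed(2)
sim <- data.frame(x = runif(40, 0, 10))
sim$y <- 1 + 0.5 * sim$x + rnorm(40)
fit <- lm(y ~ x, data = sim)

# Predict on an evenly spaced, sorted grid so lines are drawn left to right
grid <- data.frame(x = seq(min(sim$x), max(sim$x), length.out = 100))
ci_band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
pi_band <- predict(fit, newdata = grid, interval = "prediction", level = 0.95)

plot(y ~ x, data = sim)
# fitted line (solid) with confidence limits (dashed)
matlines(grid$x, ci_band, lty = c(1, 2, 2), col = "black")
# prediction limits (dotted)
matlines(grid$x, pi_band[, c("lwr", "upr")], lty = 3, col = "black")

# Compare the widths of the two bands at every grid point
ci_width <- ci_band[, "upr"] - ci_band[, "lwr"]
pi_width <- pi_band[, "upr"] - pi_band[, "lwr"]
all(pi_width > ci_width)
```

Both calls return the same fitted values in the fit column; only the lwr and upr columns differ.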
Bonus: Produce a scatterplot of width (x-axis) and mass (y-axis), on the original scale with no transformations. Add the regression curve from question 4 to your plot with confidence limits.