Mathematics 1040-1: An Introduction to Statistical Thinking
r denotes the correlation between the variables x (the explanatory variable) and y (the response variable).
y in standard units = r times x in standard units.
x = femur length (in cm); y = humerus length (in cm).
Check that the following hold:
average(x) = 58.2 cm
average(y) = 66 cm
SD(x) = 13.20 cm
SD(y) = 15.89 cm.
We have not yet discussed how the correlation
r is computed (nor what it really is). So let me
just tell you that r is 0.994. This is an excellent
correlation, very close to the largest possible value of 1
(what does this mean?). So regression
predictions should be good.
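The first description of regression (y in standard units = r times x in standard units) can be sketched in a few lines of Python. This is only an illustration using the averages, SDs, and r quoted above. The function names and the example femur length (exactly one SD above average) are assumptions made for the sketch and are not part of the original data.

```python
# Summary statistics quoted in the text (all lengths in cm).
AVG_X, SD_X = 58.2, 13.20   # femur
AVG_Y, SD_Y = 66.0, 15.89   # humerus
R = 0.994                   # correlation, as stated in the text


def to_standard_units(value, avg, sd):
    """Number of SDs that `value` lies above (+) or below (-) the average."""
    return (value - avg) / sd


def predict_y_standard_units(x):
    """Regression estimate of y from x, using the standard-units description."""
    z_x = to_standard_units(x, AVG_X, SD_X)
    z_y = R * z_x                  # y in standard units = r times x in standard units
    return AVG_Y + z_y * SD_Y      # convert back from standard units to cm


# Hypothetical input (not from the text): a femur exactly one SD above average.
x_example = AVG_X + SD_X
y_example = predict_y_standard_units(x_example)
print(round(y_example, 2))
```

A femur one SD above average is estimated to go with a humerus only r = 0.994 SDs above average, which works out to about 81.8 cm.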
Let us use the second description of regression to find the equation of the regression line: We have y = a + bx, and we know that b is the slope. I.e.,
b = r SD(y)/SD(x)
= 0.994 times 15.89/13.20
= 1.197, approximately.
So we know that y = a + 1.197 x. What is a?
We know that when x = average(x)=58.2,
then y=average(y)=66. So
66 = a + (1.197 times 58.2).
Solve for a (Math 1030) to get a = -3.66, approximately. That is, the equation of the regression line is
y = -3.66 + 1.197 x.
So, for example, if we find a femur from a member of this species that is 80 cm long, then our regression estimate for the length of its humerus is
y = -3.66 + 1.197 times 80 = 92 cm, approximately.
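The arithmetic above can be checked with a short Python sketch. It uses only the summary statistics quoted in the text. The helper name `regression_line` is an assumption made for the example.

```python
# Summary statistics quoted in the text (all lengths in cm).
AVG_X, SD_X = 58.2, 13.20   # femur
AVG_Y, SD_Y = 66.0, 15.89   # humerus
R = 0.994                   # correlation, as stated in the text


def regression_line(r, avg_x, sd_x, avg_y, sd_y):
    """Return (a, b) for the line y = a + b x that predicts y from x."""
    b = r * sd_y / sd_x        # slope: b = r SD(y)/SD(x)
    a = avg_y - b * avg_x      # the line passes through (average(x), average(y))
    return a, b


a, b = regression_line(R, AVG_X, SD_X, AVG_Y, SD_Y)
estimate = a + b * 80          # femur length of 80 cm
print(f"b = {b:.3f}, a = {a:.2f}, estimate = {estimate:.1f} cm")
```

Because this sketch does not round b before solving for a, its intercept comes out near -3.64 rather than -3.66. The estimate for an 80 cm femur is still about 92 cm.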
Warning. You cannot simply solve this equation for x in order to predict x from y. The regression equation is not an algebraic equation. It is a statistical estimate! To see if you understand this notion, try the following. If you do not succeed, do not be discouraged; it is a subtle notion. Seek help until you understand it.
Question: What is the regression estimate for the femur length of one such fossil whose humerus length turned out to be 42 cm long?
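To see the warning in action without giving away the answer, here is a hedged Python sketch of a round trip. It predicts y from x = 80, then goes back to x in two ways. It relies on the standard fact that when the roles of x and y are swapped, the slope becomes r SD(x)/SD(y). The function names are assumptions made for the example.

```python
# Summary statistics quoted in the text (all lengths in cm).
AVG_X, SD_X = 58.2, 13.20   # femur
AVG_Y, SD_Y = 66.0, 15.89   # humerus
R = 0.994                   # correlation, as stated in the text


def predict_y_from_x(x):
    """Regression estimate of humerus length from femur length."""
    return AVG_Y + (R * SD_Y / SD_X) * (x - AVG_X)


def predict_x_from_y(y):
    """Regression estimate of femur length from humerus length.

    The roles of x and y are swapped, so the slope is r SD(x)/SD(y).
    """
    return AVG_X + (R * SD_X / SD_Y) * (y - AVG_Y)


def solve_line_for_x(y):
    """Algebraic inversion of y = a + b x. This is NOT a regression estimate."""
    b = R * SD_Y / SD_X
    return AVG_X + (y - AVG_Y) / b


y_hat = predict_y_from_x(80)            # about 92 cm, as in the text
x_back = predict_x_from_y(y_hat)        # regression estimate of x from y_hat
x_algebra = solve_line_for_x(y_hat)     # just undoes the line: gives 80 again
print(round(x_back, 2), round(x_algebra, 2))
```

The algebraic inversion returns exactly 80, while the regression estimate is pulled slightly toward the average femur length. The gap is small here only because r is so close to 1. The function `predict_x_from_y` can be used to check your answer to the question above.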
© 2003 by the Dept of Math. University of Utah