*Statistics problem* Help me prove that the linear regression line always passes through the point (mean of x values, mean of y values).

Question

zarkon · Answer

$$y=\hat{\alpha}+\hat{\beta}x$$
$$y=\bar{y}-\hat{\beta}\bar{x}+\hat{\beta}x$$
$$y-\bar{y}=-\hat{\beta}\bar{x}+\hat{\beta}x$$
$$y-\bar{y}=\hat{\beta}(x-\bar{x})$$

This is the point-slope form of a line with slope $\hat{\beta}$ that goes through the point $(\bar{x},\bar{y})$: setting $x=\bar{x}$ gives $y=\bar{y}$.

anonymous · Answer

How did you go from the first to the second step?

anonymous · Answer

How do you know $\bar{y}$ is the $y$-intercept?

zarkon · Answer

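I never said that $\bar{y}$ is the $y$-intercept. Also, see the formulas here: http://en.wikipedia.org/wiki/Simple_linear_regression#Fitting_the_regression_line In particular, look at the formula for $\hat{\alpha}$, which is $\hat{\alpha}=\bar{y}-\hat{\beta}\bar{x}$. Substituting it into the first line gives the second.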
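For anyone wondering where that formula for $\hat{\alpha}$ comes from, here is a short sketch, assuming the usual least-squares setup with $n$ data points $(x_i,y_i)$. The symbol $S$ for the sum of squared residuals is my own notation, not something from the posts above. Minimizing $S$ over the intercept means setting its partial derivative with respect to $\alpha$ to zero:

$$S(\alpha,\beta)=\sum_{i=1}^{n}(y_i-\alpha-\beta x_i)^2$$
$$\frac{\partial S}{\partial \alpha}=-2\sum_{i=1}^{n}(y_i-\hat{\alpha}-\hat{\beta}x_i)=0$$
$$n\bar{y}-n\hat{\alpha}-n\hat{\beta}\bar{x}=0$$
$$\hat{\alpha}=\bar{y}-\hat{\beta}\bar{x}$$

Note that this holds whatever the value of $\hat{\beta}$ turns out to be, so the result does not depend on the slope formula.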

anonymous · Answer

OK, never mind, I see what you did. My teacher says that a line of regression will ALWAYS pass through (xˉ,yˉ). Why would this be true? I see the manipulation of the formula, but I'm just trying to wrap my head around it.

anonymous · Answer

That is supposed to be $(\bar{x},\bar{y})$.

anonymous · Answer

I want to know a logical interpretation/explanation of it.

zarkon · Answer

Consider this. Treat all the points in the scatter plot as weights, all with the same mass. Then $(\bar{x},\bar{y})$ is the center of mass of these points. One would expect the line of best fit to pass through the center of mass, and from what I wrote above, it does.
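If it helps to see this numerically, here is a minimal Python sketch. The data values and the function name `fit_line` are made up for illustration. It solves the least-squares normal equations from raw sums, without using the means, and then checks that the fitted line passes through $(\bar{x},\bar{y})$ and that the residuals sum to zero, which is one way to read the "balance" picture.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = alpha + beta * x from raw sums (normal equations)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Slope and intercept that minimize the sum of squared residuals.
    beta = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha = (sum_y - beta * sum_x) / n
    return alpha, beta


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 7.0, 9.0]
ys = [2.1, 2.9, 5.2, 8.8, 9.7]

alpha, beta = fit_line(xs, ys)
x_bar, y_bar = mean(xs), mean(ys)

# Value of the fitted line at x_bar; it should match y_bar up to rounding.
predicted_at_mean = alpha + beta * x_bar

# Residuals above and below the line cancel out.
residual_sum = sum(y - (alpha + beta * x) for x, y in zip(xs, ys))

print(abs(predicted_at_mean - y_bar) < 1e-9)
print(abs(residual_sum) < 1e-9)
```

Both checks are compared with a small tolerance because floating-point arithmetic will not give exact equality.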