*Statistics problem* Can you help me prove why the linear regression line always passes through the coordinate pair (mean of x values, mean of y values)?
\[y=\hat{\alpha}+\hat{\beta}x\]
\[y=\bar{y}-\hat{\beta}\bar{x}+\hat{\beta}x\]
\[y-\bar{y}=-\hat{\beta}\bar{x}+\hat{\beta}x\]
\[y-\bar{y}=\hat{\beta}(x-\bar{x})\]
This is a line with slope \(\hat{\beta}\) that goes through the point \((\bar{x},\bar{y})\).
How did you go from the first to the second step?
How do you know y bar is the y-intercept?
I never said that \(\bar{y}\) is the \(y\)-intercept. Also, see the formulas here: http://en.wikipedia.org/wiki/Simple_linear_regression#Fitting_the_regression_line In particular, look at the formula for \(\hat{\alpha}\).
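For anyone wondering where that formula for \(\hat{\alpha}\) comes from, here is a short sketch, assuming the usual least-squares setup with \(n\) data points \((x_i, y_i)\) (the symbols \(S\), \(n\), \(x_i\), \(y_i\) are my notation, not from the posts above). The fitted line minimizes the sum of squared vertical errors:

\[S(\alpha,\beta)=\sum_{i=1}^{n}\left(y_i-\alpha-\beta x_i\right)^2\]

Setting the partial derivative with respect to \(\alpha\) to zero at the minimum gives

\[\frac{\partial S}{\partial \alpha}=-2\sum_{i=1}^{n}\left(y_i-\hat{\alpha}-\hat{\beta}x_i\right)=0\]
\[\sum_{i=1}^{n}y_i-n\hat{\alpha}-\hat{\beta}\sum_{i=1}^{n}x_i=0\]
\[\hat{\alpha}=\bar{y}-\hat{\beta}\bar{x}\]

Substituting this \(\hat{\alpha}\) into \(y=\hat{\alpha}+\hat{\beta}x\) is exactly the step from the first line to the second.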
OK, never mind, I see what you did. My teacher says that a line of regression will ALWAYS pass through (xˉ,yˉ). Why would this be true? I see the manipulation of the formula, but I'm just trying to wrap my head around it.
that is supposed to be (xbar, ybar)
I want to know a logical interpretation/explanation of it.
Consider this: treat all the points in the scatter plot as weights, all with the same mass. Then \((\bar{x},\bar{y})\) is the center of mass of these points. One would expect the line of best fit to pass through the center of mass, and from what I wrote above, it does.
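If it helps to see this with numbers, here is a minimal sketch in plain Python. The data values and the helper name `fit_line_from_sums` are made up for illustration. The coefficients are computed from raw sums by solving the two least-squares equations directly, so the means are not used to build the line, and only afterwards is the line evaluated at \(\bar{x}\).

```python
def fit_line_from_sums(xs, ys):
    """Solve the two least-squares (normal) equations using raw sums only."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # zero only if all x values are equal
    beta = (n * sxy - sx * sy) / det
    alpha = (sy * sxx - sx * sxy) / det
    return alpha, beta


# Made-up sample data, purely for illustration.
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [2.5, 3.1, 6.0, 8.2, 15.9]

alpha, beta = fit_line_from_sums(xs, ys)
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Height of the fitted line at x_bar; should match y_bar up to rounding.
predicted_at_mean = alpha + beta * x_bar

# Signed vertical distances from the points to the line; they should cancel.
residual_sum = sum(y - (alpha + beta * x) for x, y in zip(xs, ys))

print(predicted_at_mean, y_bar)
print(residual_sum)
```

The two printed values on the first line should agree up to floating-point rounding, and the residual sum should be essentially zero. That second fact ties in with the center-of-mass picture: the points above the line and the points below it balance each other out vertically.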