OpenStudy (anonymous):

OLS estimator derivation with constant

OpenStudy (anonymous):

\[y_{i}=2+bx_{i}+\epsilon_{i}\]

OpenStudy (anonymous):

Following the normal derivation, I get to the point where I would need to differentiate with respect to two parameters...

OpenStudy (anonymous):

\[\sum_{}^{}\epsilon_{i}^{2}=\sum_{}^{}(y_{i}-2-b x_{i})^{2}\]
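With the intercept fixed at 2, the sum of squared errors becomes a function of the slope \(b\) alone. A minimal pure-Python sketch of that objective (the data here are hypothetical, roughly following \(y = 2 + 2x\)):

```python
# Sum of squared errors for y_i = 2 + b*x_i + e_i, with the
# intercept fixed at 2, viewed as a function of the slope b alone.
def sse(b, x, y):
    return sum((yi - 2 - b * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, roughly y = 2 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [4.1, 5.9, 8.2, 9.8]

print(sse(0.0, x, y))  # slope far from optimal -> large error
print(sse(2.0, x, y))  # slope near optimal -> small error
```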

OpenStudy (kirbykirby):

if you know the number given for \(b_0\) (or maybe you call it \(a\)), then I believe you'd only need to differentiate with respect to \(b\), since \(b\) is the only parameter you need to estimate; you already know the constant's value.

OpenStudy (anonymous):

Ah, nice! I'm also told there will be something special about \[\sum_{i}^{} (\hat \epsilon_{i}) \] Are you aware of any effect that fixing the constant has on the error terms?

OpenStudy (anonymous):

Thanks a lot by the way, had been going around in circles looking for a way out of the differential issue!

OpenStudy (kirbykirby):

Hm, I am not 100% sure. I have to admit I haven't really encountered a situation where the constant was given before. Since OLS aims to minimize the sum of the squared errors, but you were given some "arbitrary" constant (maybe it's given for some reason), my guess is that the sum of the errors won't actually be minimized without using the "optimal" estimate for the constant.

OpenStudy (kirbykirby):

In my head, I imagine they give you the constant, and from that you are trying to fit the best line using the slope only. So imagine fixing the line at the intercept and "swinging" the line up and down to minimize the error terms; you cannot guarantee the usual minimization of the error terms unless 2 is actually the best estimate.
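kirbykirby's guess can be checked numerically. With an *estimated* intercept, the OLS residuals sum to exactly zero (that is the intercept's normal equation); with the intercept fixed at 2, that property is generally lost. A sketch with hypothetical data, using the fixed-intercept slope formula derived later in this thread:

```python
# Hypothetical data, roughly y = 2 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [4.1, 5.9, 8.2, 9.8]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Ordinary OLS: estimate both intercept and slope.
b_free = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
a_free = ybar - b_free * xbar
res_free = [yi - a_free - b_free * xi for xi, yi in zip(x, y)]

# Intercept fixed at 2: slope minimizing the sum of squared errors.
b_fixed = (sum(xi * yi for xi, yi in zip(x, y)) - 2 * sum(x)) / \
          sum(xi ** 2 for xi in x)
res_fixed = [yi - 2 - b_fixed * xi for xi, yi in zip(x, y)]

print(sum(res_free))   # ~0, forced by the intercept's normal equation
print(sum(res_fixed))  # generally nonzero when the intercept is fixed
```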

OpenStudy (anonymous):

That is a really nice visualisation with the swinging - thanks a lot for that!

OpenStudy (phi):

you should take the derivative with respect to "b". You are trying to find the slope that will minimize the error.

OpenStudy (anonymous):

Thanks @phi. I was initially trying to construct the intercept estimate, but that was obviously a little silly since, as @kirbykirby pointed out, what's the use of an estimate when one has the actual value? Thanks for the confirmation!

OpenStudy (phi):

what do you get for the derivative ?

OpenStudy (anonymous):

\[-2\sum_{}^{}(y_{i}-2-bx_{i})x_{i}\]

OpenStudy (phi):

I would distribute the x_i, write out each term as a separate sum, set that equal to 0, and solve for b.

OpenStudy (anonymous):

What does distribute the x_i mean? Surely there could be any number of x's?

OpenStudy (phi):

you have N terms of the form (yi - 2 - b xi)*xi, or equivalently N terms of the form xi*yi - 2 xi - b (xi)^2. Put a summation sign in front of each term to "collect" like terms.

OpenStudy (phi):

in other words, each (xi, yi) pair creates an expression xi*yi - 2 xi - b (xi)^2 and you add them up

OpenStudy (anonymous):

Oh this is interesting; if I've interpreted 'distributing' correctly, I'm finding that the construction of this b term follows the same method as for a constant term.

OpenStudy (phi):

b is the variable that is allowed to vary to minimize the total squared error, so I would not call it a constant. But it does have the same value for each of the N expressions.

OpenStudy (phi):

you have \[ error= f(b)\] You find the derivative with respect to b and set it equal to 0: \[ \frac{df}{db} = 0 \] Solve for the value of b. This is the value that makes the "slope" of f zero, indicating a min or max of f(b). In this case it will be the minimum of f(b), i.e. the minimum squared error. Notice that all the "data" (the x's and y's) are treated as constants.

OpenStudy (phi):

after taking the derivative with respect to b and setting it to zero you have \[ -2\sum_{}^{}(y_{i}-2-bx_{i})x_{i} = 0 \\ \sum \left( y_{i} x_i - 2 x_i - b x_i^2 \right) = 0 \\ \sum y_{i} x_i - 2 \sum x_i - b \sum x_i^2 = 0 \]

OpenStudy (anonymous):

Amazing - thank you for such detailed insight! I did manage to get to that position but without considering moving b outside the summation I was a little stuck. I guess now it's a case of moving the -b(summation)(x_i)^2 over and dividing by the (summation)(x_i)^2?

OpenStudy (phi):

if you divide every term by N (number of terms in the summation) you can say, for example, \[ \frac{\sum x_i}{N} = \bar{x} \] similarly for the other terms

OpenStudy (anonymous):

\[b=\frac{ \sum_{}^{}y_{i}x_{i} -2\sum_{}^{}x_{i} }{ \sum_{}^{}(x_{i})^2 }\]

OpenStudy (phi):

yes. or in terms of averages \[ b = \frac{ \overline{xy} - 2 \bar{x}}{\overline{x^2}} \]
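As a sanity check, both forms of the formula can be computed on hypothetical data and confirmed to agree, and to actually minimize the squared error:

```python
# Hypothetical data, roughly y = 2 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [4.1, 5.9, 8.2, 9.8]
n = len(x)

# Sums form: b = (sum(x_i*y_i) - 2*sum(x_i)) / sum(x_i^2)
b_sums = (sum(xi * yi for xi, yi in zip(x, y)) - 2 * sum(x)) / \
         sum(xi ** 2 for xi in x)

# Averages form: b = (mean(xy) - 2*mean(x)) / mean(x^2)
xy_bar = sum(xi * yi for xi, yi in zip(x, y)) / n
x_bar = sum(x) / n
x2_bar = sum(xi ** 2 for xi in x) / n
b_avgs = (xy_bar - 2 * x_bar) / x2_bar

# Squared error as a function of the slope, intercept fixed at 2.
def sse(b):
    return sum((yi - 2 - b * xi) ** 2 for xi, yi in zip(x, y))

print(b_sums, b_avgs)  # the two forms should agree
```

Nudging the slope either way from `b_sums` should only increase `sse`, confirming it is the minimizer.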

OpenStudy (anonymous):

Thank you so much for your help. Really appreciated!
