@Mathematics Hours Studying and Test Grades

Time (hours)    Grade
1               66
2.5             70
3               81
3.5             92
4               97

Jennifer believes that she can predict her test scores based on how long she studies. She asks 5 of her friends how long they studied and what grade they made on the test and then performs a regression to model the data. Her data is shown in the table. Which function best models the data?
A) y = 59.576(1.108)x
B) y = 63.461(1.105)x
C) y = 2.833x^2 - 3.833x + 67
D) y = 7.733x^2 - 24.4x + 82.667

mathteacher1729 · Answer

I'm gonna make a video about this, I think my students might benefit from it. Give me like 10 minutes. :)

anonymous · Answer

ok

mathteacher1729 · Answer

It is done! :D Note: I changed the numbers up a little, but it's the same thing. http://www.youtube.com/watch?v=nNHfu6XeVS4 Download geogebra here: http://www.geogebra.org/cms/en/installers Hope this helps!

anonymous · Answer

@mathteacher1729 you changing the numbers to the problem I need help on isn't going to benefit me, but thanks!

mathteacher1729 · Answer

You should now be able to do any linear regression problem for any numbers. (including the original ones you posted). :)

jamesj · Answer

@sabrina, what have you learnt about error functions?

The general approach to this problem is the following: 

Suppose you have a set of data points $ D = \{ (x_i, y_i) \mid i = 1, 2, \dots, n \} $.

We want to know if a function f where y = f(x) is a good approximation for these data points.  The way to measure the "fit" of the function f to the set D is to measure the error between the prediction $ f(x_i) $ of the function f at a data point $ x_i $ and the observed value $ y_i $.  The standard error function in this situation is this:
$$ E(f,D) = \sum_{i=1}^n (f(x_i) - y_i)^2 $$

You have a number of such functions f.  The formal way to answer your question is to measure E(f,D) for each function f.  The f which has the minimum value of E is the best fit among the possibilities given.
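
As a worked instance of this error function, take option C, $ f(x) = 2.833x^2 - 3.833x + 67 $, with the five data points from the table. This is only a sketch of the arithmetic, with predictions rounded to three decimals:

$$ f(1) = 66.000, \quad f(2.5) \approx 75.124, \quad f(3) = 80.998, \quad f(3.5) \approx 88.289, \quad f(4) = 96.996 $$
$$ E(f,D) \approx (0)^2 + (5.124)^2 + (-0.002)^2 + (-3.711)^2 + (-0.004)^2 \approx 40.03 $$

The same sum would be computed for each of the other candidates before comparing.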

mathteacher1729 · Answer

James, 

I'm gonna go out on a limb and say that your explanation might be slightly more than what Sabrinay was bargaining for. I've taught a class where questions similar to hers were asked, and we basically showed students how to use the calc to do the nitty-gritty calculations.  

The goal of the lesson was to understand the general idea that "the line of best fit... fits the data best, and since it's a straight line, we can use y = mx + b to represent it, and solve for various values if needed." 

:-p

zarkon · Answer

What James is doing is exactly what you need to do to solve this problem. I might even go a step further and look at
$$\sum_{i=1}^n |f(x_i) - y_i|$$

zarkon · Answer

doing a regression does not help since none of the above functions is the 'best' fitting line under either criterion

anonymous · Answer

I'm not familiar with this material at all @JamesJ, but I have heard that the E-looking symbol ends with whatever it begins with, if I'm not mistaken. I need help doing THIS EXACT problem and not any "nitty gritty tricks" that @mathteacher1729 is trying to display, regardless of his teaching experience.

zarkon · Answer

using the lest squares (or abs) one of the functions is better than the others

zarkon · Answer

*least

jamesj · Answer

In other words, using the error function that I have given and/or the one Zarkon suggests, find the error for each function given as a possible answer.

Then determine which has the minimum error.  

I'd definitely use Excel or another spreadsheet program if you've got it.

mathteacher1729 · Answer

Sabrinay, I was trying to show you how to use your own computer to solve this problem. (inputting your own data points instead of the ones I made up, then finding the line of best fit). 

James and Zarkon are trying to give you a college lecture on regression analysis.  Interesting, but perhaps not appropriate at this point in time. :) 

Excel will also do line of best fit.

zarkon · Answer

and once he finds the 'line' (quadratic/exponential reg) of best fit he will notice that it matches none of the 4 functions he is given ....then what does he do?
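
A minimal sketch of that check for the quadratic case, assuming an ordinary least-squares fit of y = c2*x^2 + c1*x + c0 to the five points in the question. The helper names `solve` and `fit_quadratic` are made up for this illustration; a calculator or spreadsheet regression would normally do the same job.

```python
# Data from the question: (hours studied, grade).
data = [(1, 66), (2.5, 70), (3, 81), (3.5, 92), (4, 97)]

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, so row operations update both sides together.
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - s) / aug[r][r]
    return sol

def fit_quadratic(points):
    """Least-squares fit of y = c2*x**2 + c1*x + c0 via the normal equations."""
    # Power sums: S[k] = sum of x**k, T[k] = sum of x**k * y.
    S = [sum(x ** k for x, _ in points) for k in range(5)]
    T = [sum(x ** k * y for x, y in points) for k in range(3)]
    normal = [[S[i + j] for j in range(3)] for i in range(3)]
    c0, c1, c2 = solve(normal, T)
    return c2, c1, c0

c2, c1, c0 = fit_quadratic(data)
print(f"y = {c2:.3f}x^2 + ({c1:.3f})x + {c0:.3f}")
```

On this data the fit comes out at roughly y = 4.12x^2 - 9.35x + 70.69, which matches neither option C nor option D.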

jamesj · Answer

Exactly.  None of the solutions given is THE best fit regression line.  So that being the case, which is the best fit?

And even if one of them were the best fit regression line, it could be that one of the quadratic approximation functions is a better fit.

Now if the question had asked "which of the following is the best fit regression line?", then I think the approach MT is proposing would be the right one.

But given that that is not the question, it appears to be asking for a more general approach.

anonymous · Answer

Thanks everyone, but I think I'm just going to drop this whole problem because I don't expect any of you to teach or go through working out the whole problem for me to gain a better understanding at this point. I do appreciate your attempts though.

jamesj · Answer

I'll do it, because it's actually a fun exercise and I haven't done this for a long time.

anonymous · Answer

Ok, if you think it's a fun exercise and you're willing, then I would be more than glad for you to assist me. Begin when you're ready.

anonymous · Answer

Are those charts separate or intertwined?

zarkon · Answer

are the first two functions lines or exponential functions...is the x in the exponent?

zarkon · Answer

by the way they are written I'm betting exponential

jamesj · Answer

scrap that.  I found an error.  Give me a minute.

jamesj · Answer

Good point.  Sabrinay: what are the first two functions?  I thought they were straight lines, but are they power functions or exponentials?  Can you write them out exactly?

jamesj · Answer

For example, is the first function
$$ y = 59.576e^{1.108x} $$
or
$$ y = 59.576x^{1.108} $$
or something else?

zarkon · Answer

I'm thinking $$59.576(1.108)^x$$

jamesj · Answer

got it.  Sabrinay, please confirm asap.

zarkon · Answer

$$a\cdot b^x$$ is the standard form of an exponential regression

jamesj · Answer

it _has_ been a while

zarkon · Answer

one can do the following:

$$y=ab^x$$
$$\ln(y)=\ln(a)+x\ln(b)$$

perform a linear regression using the normal equations

$$(A'A)^{-1}A'w$$
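
A rough sketch of that log-linear fit in plain Python, on the five points from the question. It assumes that w stands for the vector of ln(y) values and that A has rows [1, x_i]. The function name `fit_exponential` is made up for this illustration.

```python
import math

# Data from the question: (hours studied, grade).
data = [(1, 66), (2.5, 70), (3, 81), (3.5, 92), (4, 97)]

def fit_exponential(points):
    """Fit y = a * b**x by least squares on ln(y) = ln(a) + x * ln(b)."""
    n = len(points)
    xs = [x for x, _ in points]
    ws = [math.log(y) for _, y in points]  # assumed meaning of w: ln(y)

    # Entries of A'A and A'w for a design matrix A with rows [1, x_i].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sw = sum(ws)
    sxw = sum(x * w for x, w in zip(xs, ws))

    # Solve the 2x2 normal equations (A'A) [c0, c1]' = A'w by Cramer's rule.
    det = n * sxx - sx * sx
    c0 = (sw * sxx - sx * sxw) / det  # estimate of ln(a)
    c1 = (n * sxw - sx * sw) / det    # estimate of ln(b)
    return math.exp(c0), math.exp(c1)

a, b = fit_exponential(data)
print(f"y = {a:.3f} * ({b:.3f})^x")
```

On this data the fit comes out at roughly y = 54.96(1.145)^x. Note that this minimizes the squared error in ln(y) rather than in y itself, so it need not be the best exponential under the error function discussed earlier in the thread.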

jamesj · Answer

Right.

anonymous · Answer

I think I understand the linear regression part, but I'm confused on the exponential regression part, and I think that the answer will be rounded to D)

jamesj · Answer

What is the form of the functions A and B?  Please write them out again exactly.

jamesj · Answer

And quickly, because I want to move on from this problem, as I'm sure you do too.

jamesj · Answer

Ok.  I'm going to assume it is what Zarkon is suggesting.  The first function is
$$ 59.576(1.108)^x $$
Likewise for the second function.

jamesj · Answer

That being the case, here's the data and the value of each of the trial functions:

jamesj · Answer

And here's the calculation of the errors:
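
A minimal Python sketch of this kind of calculation, assuming options A and B are the exponentials a * b^x as supposed above. It evaluates both error functions from the thread (sum of squared errors and sum of absolute errors) for each candidate. The names `candidates`, `squared_error` and `absolute_error` are illustrative only.

```python
# Data from the question: (hours studied, grade).
data = [(1, 66), (2.5, 70), (3, 81), (3.5, 92), (4, 97)]

# Candidate functions; A and B are read as exponentials a * b**x (an assumption).
candidates = {
    "A": lambda x: 59.576 * 1.108 ** x,
    "B": lambda x: 63.461 * 1.105 ** x,
    "C": lambda x: 2.833 * x ** 2 - 3.833 * x + 67,
    "D": lambda x: 7.733 * x ** 2 - 24.4 * x + 82.667,
}

def squared_error(f, points):
    """Sum of (f(x_i) - y_i)^2 over the data points."""
    return sum((f(x) - y) ** 2 for x, y in points)

def absolute_error(f, points):
    """Sum of |f(x_i) - y_i| over the data points."""
    return sum(abs(f(x) - y) for x, y in points)

results = {
    name: (squared_error(f, data), absolute_error(f, data))
    for name, f in candidates.items()
}

for name, (sq, ab) in results.items():
    print(f"{name}: squared error = {sq:8.3f}, absolute error = {ab:7.3f}")

# Pick the candidate with the smallest error under each criterion.
best_sq = min(results, key=lambda k: results[k][0])
best_abs = min(results, key=lambda k: results[k][1])
print("smallest squared error:", best_sq)
print("smallest absolute error:", best_abs)
```

The same table is easy to build in a spreadsheet: one column of predictions per candidate, one column of errors per candidate, and a sum at the bottom of each.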

jamesj · Answer

attachment

jamesj · Answer

With both forms of error function, the third function, option C, has the smallest error.

You should replicate all of this analysis and check carefully for errors.

zarkon · Answer

That's what I got.

jamesj · Answer

Good.

jamesj · Answer

If I ever teach linear regression again, I'm going to begin with error functions.

zarkon · Answer

I hope sabrinay25 remembers to give you a medal for all that work you did.