OpenStudy (anonymous):

The table represents data collected on the time spent studying (in minutes) and the resulting test grade.

Time Spent Studying (min): 52 37 31 9 26 40 22 10 45 34 19 60
Grade on Test: 95 84 72 58 77 86 72 43 90 81 62 98

Part 1: Create a scatter plot with the predicted line of best fit drawn on it. Determine the type of correlation (if any), and predict the model that will be used.

Part 2: Find the line of best fit for the data either by hand or using technology. Explain your method. Find the predicted score for each time listed in the table.

OpenStudy (anonymous):

Part 3: Find the residuals, and decide if your model is a good fit. Explain your method.

OpenStudy (amistre64):

well, what does your scatter plot look like? can you work with excel?

OpenStudy (anonymous):

I already did a scatter plot, I just need help with part 2 and part 3

OpenStudy (amistre64):

a few things you will need to determine the line of best fit: the sum of the y parts, the sum of the x parts, the sum of the xy parts, the sum of the xx parts, and the number of points used. you will also need the average X and Y values.

OpenStudy (anonymous):

I don't understand, do you mind explaining? Like I know the line of best fit is when you draw a line through the points.

OpenStudy (anonymous):

One second, I will try to attach my scatter plot here.

OpenStudy (amistre64):

the line of 'best' fit is the one that has the least amount of total error ... as such, we could either develop the formulas needed for a slope and a point; or, we could just memorize the formulas already stated in the textbooks.

OpenStudy (amistre64):

either way, we need to know those 7 values to play with since:\[y=mx+b\] \[m=\frac{n\sum xy-\sum x\sum y}{n\sum xx-\sum x\sum x}\] \[b=\bar y-m\bar x\]
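
As a minimal sketch of those formulas in Python: the function name `least_squares_line` is made up for illustration, and the data are the twelve points from the question. It accumulates the sums and plugs them straight into m and b.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for y = m*x + b using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(xx) - sum(x)*sum(x))
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    # intercept: average y minus slope times average x
    b = sum_y / n - m * sum_x / n
    return m, b

time_min = [52, 37, 31, 9, 26, 40, 22, 10, 45, 34, 19, 60]
grade = [95, 84, 72, 58, 77, 86, 72, 43, 90, 81, 62, 98]

m, b = least_squares_line(time_min, grade)
print(f"m = {m:.4f}, b = {b:.4f}")
```

The same function works on any paired lists of equal length, as long as the x values are not all identical (otherwise the denominator is zero).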

OpenStudy (amistre64):

i spose they generally name the equation as y hat: \(\hat y\), to distinguish it from the observed y values

OpenStudy (anonymous):

Ok here is the scatter plot

OpenStudy (amistre64):

something that will make life easier if we have to do this by hand is to simply subtract a fixed value (the smallest one, or any convenient number) from all the points to reduce the size of the numbers involved. it's the same line, just moved. but if this is done by a computer program, then that's irrelevant

OpenStudy (amistre64):

x values, shifted by 40:
(52 37 31 9 26 40 22 10 45 34 19 60) - 40
= (12 -3 -9 -31 -14 0 -18 -30 5 -6 -21 20)
12-3-9-31-14+0-18-30+5-6-21+20 = -95, average is -95/12

y values, shifted by 70:
(95 84 72 58 77 86 72 43 90 81 62 98) - 70
= (25 14 2 -12 7 16 2 -27 20 11 -8 28)
25+14+2-12+7+16+2-27+20+11-8+28 = 78, average is 78/12

these values will give us a line shifted by (-40,-70) that we will have to shift back. sum of xx, and sum of xy of course would be more time consuming by hand
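
One way to convince yourself that shifting gives the same line is to fit the shifted points, shift the result back, and compare with a direct fit. A rough sketch in Python, assuming the usual summation formulas; the helper name `fit` is invented here, and 40 and 70 are just the offsets chosen above.

```python
def fit(xs, ys):
    """Least-squares slope and intercept from the summation formulas."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, sy / n - m * sx / n

time_min = [52, 37, 31, 9, 26, 40, 22, 10, 45, 34, 19, 60]
grade = [95, 84, 72, 58, 77, 86, 72, 43, 90, 81, 62, 98]

# Shift every point by (-40, -70) to keep the hand arithmetic small.
xs_shifted = [x - 40 for x in time_min]
ys_shifted = [y - 70 for y in grade]

m_shifted, b_shifted = fit(xs_shifted, ys_shifted)

# Shift back: y - 70 = m*(x - 40) + b_shifted
#         =>  y = m*x + (b_shifted - 40*m + 70)
m_back = m_shifted
b_back = b_shifted - 40 * m_shifted + 70

m_direct, b_direct = fit(time_min, grade)
print(m_back, b_back)      # line recovered from the shifted data
print(m_direct, b_direct)  # line fitted to the original data
```

The slope is untouched by the shift; only the intercept needs correcting when shifting back.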

OpenStudy (amistre64):

do you have excel or some equivalent program?

OpenStudy (anonymous):

I don't have excel, but I still don't understand what you're doing. Like what the lesson tells me on how to find the line of best fit is to use two of the points on the line and the point-slope formula, y−y1=m(x−x1), to write the equation of the line, then convert the equation of the line to slope-intercept form so that I can make predictions or answer questions about the data.

OpenStudy (anonymous):

So the two points i used was (52, 58) and (60,98) i found the slope and got 3/8 or 0.375 then the equation for the line i got was y=0.375x + 75.5

OpenStudy (amistre64):

so you did not create the regression equation, you used a 2 point guesstimation

OpenStudy (anonymous):

those two points were from the data

OpenStudy (amistre64):

\[\hat y=\frac{n\sum xy - \sum x \sum y}{n\sum xx - \sum x \sum x}(x-\bar x)+\bar y\] \[\hat y=\frac{12(32120) - 385(918)}{12(15117) - 385(385)}(x-32.08333)+76.5\] which simplifies to something like: m = 0.9648, b = 45.5471
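
Spelling out the arithmetic behind those two numbers, as a worked check using only the sums already plugged in above:

```latex
\begin{align*}
m &= \frac{12(32120) - 385(918)}{12(15117) - 385(385)}
   = \frac{385440 - 353430}{181404 - 148225}
   = \frac{32010}{33179} \approx 0.96477 \\
b &= \bar y - m\,\bar x
   = 76.5 - \frac{32010}{33179}\cdot\frac{385}{12}
   \approx 76.5 - 30.9529 \approx 45.5471
\end{align*}
```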

OpenStudy (amistre64):

when we guess a line of best fit by using 2 data points, it's generally not going to be a good fit.

OpenStudy (anonymous):

Oh okay

OpenStudy (amistre64):

the slope between the points that you chose is more like 5 ... (60, 98) - (52, 58) = (8, 40) ... 40/8 = 5

OpenStudy (anonymous):

Wont it be 52,95 and 60,98

OpenStudy (amistre64):

yes, i was just checking to see if you had used valid points to start with :)

OpenStudy (amistre64):

i'd use 9,58, since it's relatively the lowest/leftmost point, instead of 52,95

OpenStudy (anonymous):

oh ok

OpenStudy (amistre64):

(60, 98) - (9, 58) = (51, 40), which gives us a more realistic slope, closer to the best fit line

OpenStudy (amistre64):

40/51 = 0.7843

OpenStudy (anonymous):

oh because it is closer to 0

OpenStudy (amistre64):

not that it's closer to 0; it's just a better fit to me, visually that is.

OpenStudy (anonymous):

ok

OpenStudy (anonymous):

now i have to find the equation of the line

OpenStudy (amistre64):

notice your line in red, versus my line in black.

OpenStudy (anonymous):

Right, so yours is more accurate.

OpenStudy (amistre64):

the library im at 'upgraded' to the newer version of excel .... i like the older one better.

OpenStudy (amistre64):

visually, mine is more accurate :)

OpenStudy (anonymous):

So what do i do now? I find the equation, correct?

OpenStudy (anonymous):

I got y=0.7843x +50.9413

OpenStudy (amistre64):

with my points? seems fair, let me double check

OpenStudy (anonymous):

ok

OpenStudy (amistre64):

y = 0.7843x +50.9412 but yeah, thats good
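
For reference, the two-point method from the lesson can be sketched in a few lines of Python. The function name `line_through` is made up for this example; it applies the point-slope formula and rearranges to slope-intercept form.

```python
def line_through(p1, p2):
    """Slope and intercept of the line through two points (point-slope form)."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    # y - y1 = m*(x - x1)  =>  y = m*x + (y1 - m*x1)
    b = y1 - m * x1
    return m, b

# The points suggested in the thread: leftmost and rightmost data points.
m, b = line_through((9, 58), (60, 98))
print(f"y = {m:.4f}x + {b:.4f}")

# The points originally picked: (52, 95) and (60, 98).
m2, b2 = line_through((52, 95), (60, 98))
print(f"y = {m2:.4f}x + {b2:.4f}")
```

Keeping the slope as the unrounded 40/51 gives an intercept of about 50.9412; rounding the slope to 0.7843 first is what produces 50.9413.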

OpenStudy (anonymous):

Okay, so after I got that, I don't know what to do next

OpenStudy (amistre64):

let's call the line yh instead of just y. y represents the actual data values that have been recorded; yh is just a model that we think will predict the values.

OpenStudy (anonymous):

Ok

OpenStudy (amistre64):

well, use your yh equation for each time value to determine how well it predicts it. you are going to create a whole new set of data points which we should call yh.

OpenStudy (amistre64):

we know when x=9, yh=58 since we used that point to create the line to start with, and the last point (x=60) will be exact as well. the others are going to differ to some degree

OpenStudy (anonymous):

so we have to make up an x value?

OpenStudy (amistre64):

no, the x values stay the same, we are trying to predict the outcome of a given time value; not predict the time value itself

OpenStudy (amistre64):

like this

OpenStudy (anonymous):

Oh

OpenStudy (amistre64):

the residuals are just the difference between yh and y

OpenStudy (anonymous):

so i would write 57.9999?

OpenStudy (amistre64):

well, since we used an approximation for the slope fraction, yes ... there are going to be some slight differences when it comes to the points we used.

OpenStudy (amistre64):

how accurate your yh is, is strictly up to you.

OpenStudy (amistre64):

we are "Finding the predicted score for each time listed in the table" by using our yh equation with the given time values

OpenStudy (anonymous):

I get it, but like I don't know what to write on the paper, so far I just have my equation work.

OpenStudy (amistre64):

um, i used excel to find the yh values .... and posted a screenshot of it.

OpenStudy (amistre64):

create a new table, and define your x and yh values.
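
If a spreadsheet isn't available, the same x and yh table can be produced with a short script. This is only a sketch: the function name `predict` is invented here, and it uses the two-point line through (9, 58) and (60, 98) with the unrounded slope 40/51, which is why x = 9 comes out as 58 rather than 57.9999.

```python
time_min = [52, 37, 31, 9, 26, 40, 22, 10, 45, 34, 19, 60]

def predict(t, m=40 / 51, b=58 - 9 * 40 / 51):
    """Predicted grade yh for a study time t, using the two-point line."""
    return m * t + b

# One row per time value from the original table: x, then yh.
print("  x      yh")
for t in time_min:
    print(f"{t:>3}  {predict(t):6.2f}")
```

Passing a different `m` and `b` (for example the regression values) gives the predicted scores for that model instead.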

OpenStudy (anonymous):

the yh is the y with the ^ on top?

OpenStudy (amistre64):

yes :)

OpenStudy (amistre64):

y hat

OpenStudy (anonymous):

so the values in yh is the predicted scores?

OpenStudy (amistre64):

here is what has happened so far, so that you can see the big picture that you might be missing. a data set was given; this formed a scattered set of points on the graph. we used 2 of those points to construct an equation that we can use to predict those points, and others. we are now finding how our equation plays with the time values, so yes, yh is the predicted scores, since we are using our equation to try to determine a given score with it.

OpenStudy (amistre64):

realistically we should have our points stated in terms of (time,grade) or simply g(t). we don't have an equation for g(t); we have a bunch of scattered-about time,grade points, and so we construct a mathematical model that we hope we can use to some effect. so we develop the grade prediction model as: g' = 0.7843(t) +50.9412. we use g' to compare with the known time,grade points to determine how good or bad a fit our model is

OpenStudy (anonymous):

I'm sorry, it's still confusing. I get part of it but not fully, and I don't know what to write on the paper. this assignment is holding me back from doing the others

OpenStudy (amistre64):

i dont know what you need to write either, but my guess is that you need to write up something like the first 3 rows of this. http://assets.openstudy.com/updates/attachments/53c010f6e4b00f624a91300f-amistre64-1405100458010-untitled.jpg

OpenStudy (anonymous):

ok

OpenStudy (amistre64):

the 4th row is the residuals, the difference between our observed and predicted values: yh - y (textbooks usually write it the other way around, y - yh, which only flips the sign)

OpenStudy (amistre64):

the last row is just taking the differences and squaring them ... not sure if that's the process you are expected to determine the goodness of fit with, though
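
As a rough illustration of those last two rows, here is a Python sketch that builds the residuals and their squares for both the two-point line and the regression line. It assumes the residual is taken as observed minus predicted, y - yh (flip the sign for yh - y; the squares are the same either way), and the helper name `residual_table` is hypothetical.

```python
time_min = [52, 37, 31, 9, 26, 40, 22, 10, 45, 34, 19, 60]
grade = [95, 84, 72, 58, 77, 86, 72, 43, 90, 81, 62, 98]

def residual_table(m, b):
    """Rows of (x, y, yh, residual, residual squared), residual = y - yh."""
    rows = []
    for x, y in zip(time_min, grade):
        yh = m * x + b
        r = y - yh
        rows.append((x, y, yh, r, r * r))
    return rows

# Two-point line through (9, 58) and (60, 98).
two_point = residual_table(40 / 51, 58 - 9 * 40 / 51)

# Regression line from the summation formulas (m = 32010/33179).
m_reg = 32010 / 33179
regression = residual_table(m_reg, 76.5 - m_reg * 385 / 12)

for x, y, yh, r, r2 in two_point:
    print(f"{x:>3} {y:>3} {yh:7.2f} {r:7.2f} {r2:8.2f}")

# A smaller sum of squared residuals means a closer overall fit.
sse_two_point = sum(row[4] for row in two_point)
sse_regression = sum(row[4] for row in regression)
print(f"two-point SSE: {sse_two_point:.1f}")
print(f"regression SSE: {sse_regression:.1f}")
```

Comparing the two sums is one common way to judge fit: the regression line is the one that makes this sum as small as possible, so any two-point line can only match it or do worse.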
