how do you know when a linear model is appropriate?
Because a linear regression model is not always appropriate for the data, you should assess the appropriateness of the model by defining residuals and examining residual plots.

Residuals

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ) is called the residual (e). Each data point has one residual.

Residual = Observed value - Predicted value
e = y - ŷ

For a least-squares regression line with an intercept, both the sum and the mean of the residuals are equal to zero. That is, Σ e = 0 and ē = 0.
the residual shows an even scattering, or should i say normal
and the correlation coefficient is 0.981
@campbell_st @dan815
A residual plot is a graph that shows the residuals on the vertical axis and the independent variable on the horizontal axis. If the points in a residual plot are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data; otherwise, a non-linear model is more appropriate.
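To make the residual idea concrete, here is a minimal sketch in plain Python. The data values and the helper name `fit_line` are made up for illustration and are not from the problem being discussed. It fits a least-squares line, computes e = y - ŷ for each point, and checks that the residuals sum to (approximately) zero.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(x, y)

# Residual = observed value - predicted value, one per data point.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

for xi, e in zip(x, residuals):
    print(f"x = {xi}: residual = {e:+.3f}")

# Should be zero up to floating-point rounding.
print(f"sum of residuals = {sum(residuals):.6f}")
```

Plotting `residuals` against `x` gives the residual plot described above. What you are looking for is points scattered around zero with no visible curve or funnel shape.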
so if the residual plot looks normal (evenly distributed) then a linear regression model is appropriate? @Prosper2win
yeah
so the correlation coefficient has nothing to do with it?
@Prosper2win
???/ @Prosper2win
now I'm confused sorry
does the correlation coefficient come into play when deciding if it's a linear model or not?
@jim_thompson5910
Yes I think so. If r = 1 or close to 1, then the points fall on a straight line or are near a straight line, and this line has a positive slope. For r = -1 or close to -1, it's the same story but with a negative slope. If r is close to 0, then the points are NOT going to all fall on or near a line. They are either randomly scattered or follow some other pattern such as a parabola.
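One caveat worth checking alongside r: a high correlation does not by itself rule out a curve, which is why the residual plot still matters. The sketch below uses invented data (y = x² for x = 1 to 10, not from this problem) to show a case where r is high but the residuals follow a clear pattern.

```python
import math

# Invented curved data: y = x^2, so the true relationship is not linear.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line and its residuals.
slope = sxy / sxx
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"r = {r:.4f}")
print("residuals:", [round(e, 2) for e in residuals])
```

Here r comes out at about 0.97, yet the residuals are positive at both ends and negative in the middle. That U shape in the residual plot is the sign that a straight line is missing something, even with r close to 1. So a high r together with a patternless residual plot is a stronger argument than either one alone.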
so if the correlation coefficient is 0.981, it's close to one, meaning the points fall on a straight line or somewhat close, aka a linear model would be appropriate? (in addition to the fact that the residual plot shows they're evenly and randomly distributed)
yeah so it sounds like the points are close to a straight line with a positive slope ie we have positive correlation
okay so answering the question "Is a linear model appropriate for modeling these data" with my answer above would be okay? @jim_thompson5910
yeah I think so
okay, would you mind helping me with a few other problems? @jim_thompson5910
sure, 2 more
Animal-waste lagoons and spray fields near aquatic environments may significantly degrade water quality and endanger health. The National Atmospheric Deposition Program has monitored atmospheric ammonia at swine farms since 1978. The data on the swine population size (in thousands) and atmospheric ammonia (in parts per million) for one decade are given below:
a) Construct a scatterplot for these data. (Find out from your instructor how to submit your scatterplot.)
b) The value of the correlation coefficient for these data is 0.85. Interpret this value.
c) Based on the scatterplot in part (a) and the value of the correlation coefficient in part (b), does it appear that the amount of atmospheric ammonia is linearly related to the swine population size? Explain.
d) What percent of the variability in atmospheric ammonia can be explained by swine population size?
what do you have for this one?
i have the scatterplot made (for a), and for b, i started to interpret the correlation coefficient of 0.85 saying that it is a strong positive linear relationship
I got a scatter plot with a line nearly going through or very close to all the points, so I agree
so for c would i be able to say yes, the atmospheric ammonia appears to be linearly related to the swine population because you see a scatterplot with a line nearly going through or very close to all the points
yes and it makes sense: the swine are creating the waste, so the more pigs, the more waste (and more ammonia)
okay, how would we do d?
compute r^2 to answer that question
.7225 so 72.25%?
correct, 72.25% of the variability is explained by the population of pigs while the remaining 27.75% is explained by other unknown factors
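The arithmetic for part (d) can be checked with a short sketch. The only input is r = 0.85 from the problem statement; the variable names are just illustrative.

```python
r = 0.85  # correlation coefficient given in part (b)

# Coefficient of determination: share of variability in y explained by x.
r_squared = r ** 2
explained_pct = 100 * r_squared
unexplained_pct = 100 - explained_pct

print(f"r^2 = {r_squared:.4f}")
print(f"explained by swine population: {explained_pct:.2f}%")
print(f"not explained by the model: {unexplained_pct:.2f}%")
```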
okay, last one
When children and adolescents are discharged from the hospital the parents may still provide substantial care, such as the insertion of a feeding tube through the nose and down the esophagus into the stomach. It is difficult for parents to know how far to insert the tube, especially with rapidly growing infants. It may be possible for parents to measure their child's height and from that calculate the appropriate insertion length using a regression equation. At a major children's hospital, children and adolescents' heights and esophageal lengths were measured and a regression analysis performed. The data from this analysis is summarized below: a) For a child with a height one standard deviation above the mean, what would be the predicted esophageal length? b) What proportion of the variability in esophageal length is accounted for by the height of the children and adolescents? c) From the information presented above, does it appear that the esophagus length can be accurately predicted from the height of young patients? Provide statistical evidence for your response.
"For a child with a height one standard deviation above the mean" what is H in this case?
143.5
good
you plug that height into the regression equation given to find the predicted esophageal length
37.4495
got the same
ok b?
b) What proportion of the variability in esophageal length is accounted for by the height of the children and adolescents? compute r^2
.990025
yep
okay c?
c) From the information presented above, does it appear that the esophagus length can be accurately predicted from the height of young patients? Provide statistical evidence for your response. look at how close r is to 1
r is very close to one (only .005 away)
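As a quick check of the ".005 away" figure, r can be recovered from the r^2 value in part (b). This sketch assumes the slope is positive (taller children have longer esophagi), so the positive square root is taken.

```python
import math

r_squared = 0.990025  # value computed in part (b)

# Assumption: positive relationship, so take the positive square root.
r = math.sqrt(r_squared)
gap = 1 - r

print(f"r = {r:.3f}, distance from 1 = {gap:.3f}")
```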
yes, so it's very strongly linearly correlated. Let's hope so since we're dealing with a medical procedure that could be dangerous/deadly. Having the right info is crucial.
wait what
you don't want to be making random guesses, you want to be accurate in medical stuff like this
okay
nvm I mentioned that
wait so what would i put for c
what you had is good
that it's really close?
but it doesn't answer the question "does it appear that the esophagus length can be accurately predicted from the height of young patients?"
yeah it's close to 1, so the data points pretty much fall on the line or really close to it so you can use that regression equation to determine the esophagus length based on the height pretty accurately
oh okay
thank you so much!!!
np