# John puts his change in a container every night. If he has $11.09 at the end of Jan, $22.27 at the end of Feb, $44.35 in April, $75.82 in July and $89 in Aug and $114.76 in Oct., perform a...

John puts his change in a container every night. If he has $11.09 at the end of Jan, $22.27 at the end of Feb, $44.35 in April, $75.82 in July, $89 in Aug and $114.76 in Oct., perform a linear regression on this data to answer the following:

What does the value of the correlation coefficient tell you about the correlation of the data?

What is the equation of the best-fitting line? Round to the nearest thousandth.

Show all steps to solve.


### 1 Answer

Linear regression predicts the values of a dependent variable Y from the values of an independent variable X. Correlation analysis measures the strength of the linear association between X and Y. The closer the correlation coefficient is to +1 or -1, the stronger the association between the two variables; a value near 0 indicates little or no linear association.

Generally, the method of *least squares* is used to find the best fit line. The method uses the vertical deviation of each data point from the line, that is, the difference `Y - hatY` between the observed value Y and the value `hatY` predicted by the line. The best fit line is the one with the smallest sum of the squares of these deviations. That is, we want to minimize the sum

`sum_(i=1)^n(Y_i-hatY_i)^2`

where n is the number of data points.

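Here we put the amount accumulated by John (in dollars) on the Y-axis and the number of days since the beginning of the year on the X-axis. Taking each reading at the end of its month in a non-leap year, the end of January is day 31, the end of February is day 59, the end of April is day 120, and so on. The steps of the regression analysis follow: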

| | X | Y | X^2 | Y^2 | XY |
|---|---|---|---|---|---|
| | 31 | 11.09 | 961 | 122.9881 | 343.79 |
| | 59 | 22.27 | 3481 | 495.9529 | 1313.93 |
| | 120 | 44.35 | 14400 | 1966.923 | 5322 |
| | 212 | 75.82 | 44944 | 5748.672 | 16073.84 |
| | 243 | 89 | 59049 | 7921 | 21627 |
| | 304 | 114.76 | 92416 | 13169.86 | 34887.04 |
| **Sum** | **969** | **357.29** | **215251** | **29425.39** | **79567.6** |

With n = 6 data points, the means are `barX=969/6=161.5` and `barY=357.29/6=59.548`.
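The column sums and means above can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`days`, `dollars`, `sum_x`, and so on) are illustrative choices, and the day numbers assume end-of-month readings in a non-leap year, as in the table.

```python
# Data from the question: day of the year (X) and dollars accumulated (Y).
# Day numbers assume end-of-month readings in a non-leap year.
days = [31, 59, 120, 212, 243, 304]
dollars = [11.09, 22.27, 44.35, 75.82, 89.00, 114.76]

n = len(days)

# Column sums used by the regression formulas.
sum_x = sum(days)
sum_y = sum(dollars)
sum_x2 = sum(x * x for x in days)
sum_y2 = sum(y * y for y in dollars)
sum_xy = sum(x * y for x, y in zip(days, dollars))

# Means of X and Y.
mean_x = sum_x / n
mean_y = sum_y / n

print(f"n = {n}, sum X = {sum_x}, sum Y = {sum_y:.2f}")
print(f"sum X^2 = {sum_x2}, sum Y^2 = {sum_y2:.2f}, sum XY = {sum_xy:.2f}")
print(f"mean X = {mean_x:.1f}, mean Y = {mean_y:.3f}")
```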

The slope of the best fit line is

`b=(sum(XY)-(sumXsumY)/n)/(sumX^2-(sumX)^2/n)`

`=(79567.6-969*357.29/6)/(215251-969^2/6)`

`=21865.265/58757.5`

`=0.372`

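The intercept is computed with the unrounded slope `b=0.372127`, so that rounding error does not carry over:

`a=barY-b*barX=59.5483-0.372127*161.5=-0.550`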

Therefore, the equation of the best fit line is `Y=0.372X-0.550`
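As a rough check of the slope and intercept, the same formulas can be sketched in Python. The helper names `fit_line` and `sum_squared_deviations` are hypothetical, introduced here only for illustration. The last lines nudge the slope slightly to show that the sum of squared deviations grows when the line moves away from the least-squares fit.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line, using the sum formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


def sum_squared_deviations(slope, intercept, xs, ys):
    """Sum of (Y - Y_hat)^2 for the line Y_hat = slope * X + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Day of the year (end of month, non-leap year) and dollars accumulated.
days = [31, 59, 120, 212, 243, 304]
dollars = [11.09, 22.27, 44.35, 75.82, 89.00, 114.76]

slope, intercept = fit_line(days, dollars)
print(f"Y = {slope:.3f}X + ({intercept:.3f})")

# Any other line should give a larger sum of squared deviations.
best = sum_squared_deviations(slope, intercept, days, dollars)
nudged = sum_squared_deviations(slope + 0.01, intercept, days, dollars)
print(f"fitted line: {best:.2f}, nudged slope: {nudged:.2f}")
```

Keeping the slope unrounded until the intercept has been computed avoids the small error that appears if 0.372 is substituted directly.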

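The value of the correlation coefficient, computed from deviations about the means, is:

`r=(sum(XY)-(sumXsumY)/n)/sqrt((sumX^2-(sumX)^2/n)(sumY^2-(sumY)^2/n))`

`=(79567.6-969*357.29/6)/sqrt((215251-969^2/6)(29425.39-357.29^2/6))`

`=21865.265/sqrt(58757.5*8149.37)`

`=0.9992`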

So, there is a very strong positive linear correlation between the independent variable (number of days) and the dependent variable (amount accumulated in dollars): the amount grows at a nearly constant rate per day.
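The correlation coefficient can be sketched in Python as well. The function name `pearson_r` is an illustrative choice, and the code uses the centered (Pearson) definition, built from deviations about the means. For comparison it also evaluates the uncentered ratio `sum(XY)/sqrt(sum(X^2)*sum(Y^2))`, which skips the mean subtraction and therefore gives a slightly different number.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Day of the year (end of month, non-leap year) and dollars accumulated.
days = [31, 59, 120, 212, 243, 304]
dollars = [11.09, 22.27, 44.35, 75.82, 89.00, 114.76]

r = pearson_r(days, dollars)

# Uncentered ratio, shown only for comparison; it is not Pearson's r.
uncentered = sum(x * y for x, y in zip(days, dollars)) / math.sqrt(
    sum(x * x for x in days) * sum(y * y for y in dollars)
)

print(f"Pearson r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"uncentered ratio = {uncentered:.4f}")
```

Both numbers are close to 1 for this data, so the conclusion of a very strong positive correlation does not change.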