# John puts his change in a container every night.  If he has \$11.09 at the end of Jan, \$22.27 at the end of Feb, \$44.35 in April, \$75.82 in July and \$89 in Aug and \$114.76 in Oct., perform a...

John puts his change in a container every night. If he has \$11.09 at the end of Jan, \$22.27 at the end of Feb, \$44.35 in April, \$75.82 in July, \$89 in Aug and \$114.76 in Oct., perform a linear regression on this data to answer the following:

What does the value of the correlation coefficient tell you about the correlation of the data?

What is the equation of the best-fitting line? Round to the nearest thousandth.

Show all steps to solve.

llltkl | College Teacher | (Level 3) Valedictorian


Linear regression predicts the values of a dependent variable Y from the values of an independent variable X. Correlation analysis measures the strength of the linear association between X and Y: the closer the correlation coefficient is to +1 or -1, the stronger the association.

The best-fit line is usually found by the method of least squares. This method uses the vertical deviation of each data point from the line, `Y - hatY`, where `hatY` is the value the line predicts for that point. The best-fit line is the one that gives the smallest sum of the squared deviations between Y and `hatY`. That is, over the n data points (here n = 6), we want to minimize

`sum_(i=1)^n(Y_i-hatY_i)^2`
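To make the least-squares criterion concrete, here is a minimal Python sketch that evaluates the sum of squared deviations for two candidate lines on John's data. The function name `sum_squared_deviations` and the steeper comparison line (slope 0.400) are illustrative assumptions, not part of the original answer; the first line uses the rounded coefficients this answer arrives at.

```python
# John's data: day of the year (X) and dollars accumulated (Y).
xs = [31, 59, 120, 212, 243, 304]
ys = [11.09, 22.27, 44.35, 75.82, 89.00, 114.76]


def sum_squared_deviations(a, b, xs, ys):
    """Return the sum of (Y - Y_hat)^2 for the line Y_hat = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Line from this answer (rounded coefficients).
fitted = sum_squared_deviations(-0.550, 0.372, xs, ys)

# Hypothetical alternative with a steeper slope, for comparison only.
steeper = sum_squared_deviations(-0.550, 0.400, xs, ys)

print(fitted < steeper)  # True
```

The least-squares line is the choice of intercept and slope for which this sum is as small as possible, so any other line should give a larger value.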

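Here we put the amount John has accumulated (in dollars) on the Y-axis and the number of days since the beginning of the year on the X-axis, taking each amount as recorded at the end of its month in a non-leap year (so January is day 31, February is day 59, and so on). The steps of the regression analysis follow: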

| Month | X | Y | `X^2` | `Y^2` | XY |
|-------|-----|--------|--------|----------|----------|
| Jan | 31 | 11.09 | 961 | 122.9881 | 343.79 |
| Feb | 59 | 22.27 | 3481 | 495.9529 | 1313.93 |
| Apr | 120 | 44.35 | 14400 | 1966.923 | 5322 |
| Jul | 212 | 75.82 | 44944 | 5748.672 | 16073.84 |
| Aug | 243 | 89 | 59049 | 7921 | 21627 |
| Oct | 304 | 114.76 | 92416 | 13169.86 | 34887.04 |
| Sum | 969 | 357.29 | 215251 | 29425.39 | 79567.6 |

(`barX` =161.5, `barY` =59.548)
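The column totals above can be checked with a short Python sketch. It assumes a non-leap year (the constant `YEAR = 2023` is an arbitrary choice made here) and that each balance is recorded on the last day of its month; the helper name `day_of_year_at_month_end` is hypothetical.

```python
import calendar

# Month number -> dollars accumulated by the end of that month.
balances = {1: 11.09, 2: 22.27, 4: 44.35, 7: 75.82, 8: 89.00, 10: 114.76}

YEAR = 2023  # assumption: any non-leap year gives the same day numbers


def day_of_year_at_month_end(month, year=YEAR):
    """Number of days from January 1 through the last day of `month`."""
    # calendar.monthrange returns (weekday of first day, days in month).
    return sum(calendar.monthrange(year, m)[1] for m in range(1, month + 1))


xs = [day_of_year_at_month_end(m) for m in balances]
ys = list(balances.values())

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

print(xs)                               # [31, 59, 120, 212, 243, 304]
print(sum_x, sum_x2)                    # 969 215251
print(round(sum_y, 2), round(sum_xy, 2))  # 357.29 79567.6
print(sum_x / n, round(sum_y / n, 3))   # 161.5 59.548
```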

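`b=(sum(XY)-(sumX*sumY)/n)/(sum(X^2)-(sumX)^2/n)`

`=(79567.6-969*357.29/6)/(215251-969^2/6)`

`=21865.265/58757.5`

`=0.372127~~0.372`

Using the unrounded slope to avoid rounding error in the intercept:

`a=barY-b*barX=59.5483-0.372127*161.5=-0.550`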

Therefore, the equation of the best fit line is `Y=0.372X-0.550`
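As a cross-check on the hand calculation, the following Python sketch applies the same slope and intercept formulas to the data. The function name `fit_line` is an illustrative choice, not something from the original answer.

```python
# John's data: day of the year (X) and dollars accumulated (Y).
xs = [31, 59, 120, 212, 243, 304]
ys = [11.09, 22.27, 44.35, 75.82, 89.00, 114.76]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Slope: b = (sum(XY) - sum(X)sum(Y)/n) / (sum(X^2) - (sum(X))^2/n)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: a = mean(Y) - b * mean(X), using the unrounded slope.
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = fit_line(xs, ys)
print(f"b = {b:.3f}")  # b = 0.372
print(f"a = {a:.3f}")  # a = -0.550
```

Rounding is left to the final print so that the intercept is computed from the full-precision slope.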

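The value of the correlation coefficient is:

`r=(sum(XY)-(sumX*sumY)/n)/sqrt((sum(X^2)-(sumX)^2/n)(sum(Y^2)-(sumY)^2/n))`

`=(79567.6-969*357.29/6)/sqrt((215251-969^2/6)(29425.39-357.29^2/6))`

`=21865.265/sqrt(58757.5*8149.37)`

`=0.9992`

Since r is very close to +1, there is a very strong positive linear correlation between the independent variable (number of days) and the dependent variable (amount accumulated in dollars).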
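The correlation coefficient can also be checked numerically. This is a minimal Python sketch of the standard Pearson formula, computed from deviations about the means; the function name `pearson_r` is an assumption made for illustration.

```python
import math

# John's data: day of the year (X) and dollars accumulated (Y).
xs = [31, 59, 120, 212, 243, 304]
ys = [11.09, 22.27, 44.35, 75.82, 89.00, 114.76]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations about the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(xs, ys)
print(f"r = {r:.4f}")        # r = 0.9992
print(f"r^2 = {r * r:.4f}")  # r^2 = 0.9984
```

The squared value suggests that roughly 99.8% of the variation in the amount saved is accounted for by the linear relationship with the day number.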