How does using a logarithm help achieve linearity in a distribution? Please explain; a diagram would be cool :D.
I think this is what your question refers to:
When you have a bunch of data points, and you would like to model them with an equation, one way to do that is to use the "least squares" method to find a straight line that looks like it works.
For example, from the Wikipedia page "Least squares" comes this picture:
The blue dots are data points, and the red line is a best fit line.
Some relationships aren't linear, however. Take `y=5e^(2x)`, for example.
[Graph of `y=5e^(2x)`: a curve that rises more and more steeply as x increases]
Now, suppose we didn't know the equation, but we just had a whole bunch of data points scattered around the curve `y=5e^(2x)`, and we wanted to figure out that the equation was `y=5e^(2x)`.
If you did a least squares analysis on these hypothetical points, you would get a straight line, which wouldn't really match the data.
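To see how badly a straight line does here, this is a rough sketch in Python using NumPy. The data points are made up for illustration (30 points with a little random scatter around `y=5e^(2x)`), and the variable names are just my own choices.

```python
import numpy as np

# Hypothetical data: 30 points scattered around y = 5e^(2x).
# The x-range, noise level and seed are assumptions made for this demo.
rng = np.random.default_rng(0)
x = np.linspace(0, 2, 30)
y = 5 * np.exp(2 * x) * np.exp(rng.normal(0, 0.05, x.size))

# Ordinary least squares straight line fitted directly to (x, y).
slope, intercept = np.polyfit(x, y, 1)
line = slope * x + intercept

# How far the line misses the data at its worst point.
worst_miss = np.max(np.abs(y - line))
print("slope:", slope, "intercept:", intercept, "worst miss:", worst_miss)
```

The line ends up far from the points at both ends of the range, and its intercept is negative even though every y-value is positive.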
Try taking the logarithm (here "log" means the natural logarithm, base e):
`"log" y = "log" (5 e^(2x)) = "log" 5 + "log" (e^(2x)) = "log" 5 + 2x`
Let's call Y = log(y).
And log(5) is approximately 1.6.
So, what you get is:
`Y = 2x + 1.6`
This is a straight line!
What that means is, if your collection of data points (dots on a graph) looks exponential, then you can take the logarithm of all the y-values, and do least squares on (x, log y) of all the points. You wind up with a straight line (in this example, you will have figured out that your slope is 2 and your y-intercept is about 1.6, which is log 5). You can convert this back to the exponential relationship by reversing the steps:
`"log" y = 2x+1.6`
`y = e^(2x+1.6) = e^(1.6) e^(2x) = 5e^(2x)`
So you can use a straight line to help you model exponential relationships, by using a logarithm.
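The steps above could be sketched in Python roughly like this. The data are again made up (scattered around `y=5e^(2x)`), and the names `a` and `b` for the recovered constants in `y = a e^(bx)` are just labels I picked.

```python
import numpy as np

# Hypothetical data scattered around y = 5e^(2x) (assumed for this demo).
rng = np.random.default_rng(0)
x = np.linspace(0, 2, 30)
y = 5 * np.exp(2 * x) * np.exp(rng.normal(0, 0.05, x.size))

# Least squares on (x, log y). np.log is the natural logarithm.
slope, intercept = np.polyfit(x, np.log(y), 1)

# Reverse the steps: log y = slope*x + intercept
# becomes y = e^(intercept) * e^(slope*x).
a = np.exp(intercept)
b = slope
print("fitted model: y =", a, "* e^(", b, "* x)")
```

With this data the slope should come out close to 2 and `e^(intercept)` close to 5, which recovers the original equation.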
Or, suppose your data points were all near the curve `y=1.5 x^2.5`.
But suppose we didn't have the actual equation `y=1.5 x^2.5`; we just had a bunch of dots that were near the curve on the graph. How could we use logarithms to figure out the equation?
`"log" y = "log" 1.5 + "log" (x^2.5) = "log" 1.5 + 2.5 "log" x`
If we write Y = log y, X = log x, and note that log 1.5 is approximately 0.4, we have:
`Y = 0.4 + 2.5X`
Again, this is a straight line. What this means is, if we took the logarithms of the x and y coordinates of all our data points, and plotted those instead, we would get data points that resembled a straight line, and we could do a least squares analysis of it. Working backwards, we could get an equation that modeled the original data, even though the original data weren't in a straight line.
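Here is a similar sketch for the power relationship, fitting a line to (log x, log y). The data are made up again (scattered around `y=1.5 x^2.5`), the x-values are kept positive so the logarithm is defined, and `c` and `p` are just my names for the constants in `y = c x^p`.

```python
import numpy as np

# Hypothetical data scattered around y = 1.5 x^2.5 (assumed for this demo).
# x must be positive so that log x exists.
rng = np.random.default_rng(1)
x = np.linspace(1, 10, 30)
y = 1.5 * x**2.5 * np.exp(rng.normal(0, 0.05, x.size))

# Least squares on (X, Y) = (log x, log y).
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)

# Work backwards: Y = intercept + slope*X means y = e^(intercept) * x^slope.
c = np.exp(intercept)
p = slope
print("fitted model: y =", c, "* x^", p)
```

The slope of the fitted line is the exponent (about 2.5 here), and `e^(intercept)` is the coefficient in front (about 1.5).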
If your data seems to show an exponential relationship, or a power relationship, then you can use logarithms to transform your data into a straight line, use "least squares" to figure out that line (its slope and intercept), and then produce an equation that still models your data.