The *standard deviation* of a set of data points is a measure of the spread of the data about their (sample) mean. It is the square root of the *variance* of the data about their mean. The variance, in turn, is (very close to) the average squared distance of the data from the mean.

Mathematically speaking, if we define our set of data to be `{x_1,x_2,...,x_n}` then

the sample mean of the data is given by

`bar(x) = (sum_(i=1)^n x_i)/n`

and the sample variance is given by

`s^2 = (sum_(i=1)^n (x_i - bar(x))^2)/(n-1)`

The sample standard deviation is the square root of the sample variance, i.e. `s`. As a rough rule of thumb, about 95% of future data collected should lie within `bar(x) pm 2s`, provided `bar(x)` and `s` are good estimates of the *true* population mean and standard deviation, and the distribution of the random variable `X` is approximately Normal (Gaussian).
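As a rough illustration of these formulas, here is a sketch using Python's standard library; the data values are made up for the example:

```python
import statistics

# Hypothetical sample data
data = [4.1, 5.0, 3.8, 4.6, 5.3, 4.9, 4.4, 5.1]

n = len(data)
mean = sum(data) / n                                     # bar(x)
variance = sum((x - mean) ** 2 for x in data) / (n - 1)  # s^2
s = variance ** 0.5                                      # sample standard deviation

# The hand-rolled versions match the library's implementations
assert abs(mean - statistics.mean(data)) < 1e-12
assert abs(s - statistics.stdev(data)) < 1e-12

# Rough rule of thumb: about 95% of future data should fall in this interval
low, high = mean - 2 * s, mean + 2 * s
print(f"mean = {mean:.3f}, s = {s:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```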

The reason we don't take the sample variance as the straight average of the squared distances of the data about their mean is that this would be a biased estimate of the true population variance. If we took the average squared distance about the *true mean* (usually denoted `mu`) rather than the sample mean (which is only an estimate of the true mean), the estimate would not be biased. But since we only have the sample mean to work with, we compensate by decreasing the denominator of the estimate for the *true variance* (usually denoted `sigma^2`) from `n` (the number of data points) to `n-1`.
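The difference between dividing by `n` and `n-1` can be seen directly in Python's standard library, where `statistics.pvariance` is the population (divide-by-`n`) estimate and `statistics.variance` is the sample (divide-by-`n-1`) estimate; the data values here are made up:

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = statistics.mean(data)

ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations about bar(x)

biased = ss / n          # divides by n: biased estimate of sigma^2
unbiased = ss / (n - 1)  # divides by n-1: the sample variance s^2

assert abs(biased - statistics.pvariance(data)) < 1e-12
assert abs(unbiased - statistics.variance(data)) < 1e-12
print(biased, unbiased)  # the n-1 version is always a little larger
```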

Considering *squared* deviation about the mean of a population gives rise to the Normal or Gaussian distribution, which has all sorts of nice mathematical properties and is seen often in natural phenomena such as the height and weight of an adult male or female. Another possibility is to consider *absolute* deviation about the mean, which is a more robust measure of spread because it is less sensitive to outlying data. Erroneous data points that happen to lie a long way from the mean have a big influence on the estimate of the true standard deviation `sigma` when squared deviation is used, whereas they have less influence on a measure of spread based on absolute deviation. Unlike the standard deviation, however, the absolute deviation does not give rise to the Normal distribution and does not have such nice mathematical properties to work with.
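A small sketch of this sensitivity, using made-up data and a hand-rolled helper (`mean_abs_dev` is not a library function):

```python
import statistics

clean = [10.0, 11.0, 9.0, 10.5, 9.5, 10.2, 9.8, 10.0]
with_outlier = clean + [50.0]  # one erroneous point far from the mean

def mean_abs_dev(data):
    """Average absolute deviation of the data about their sample mean."""
    m = statistics.mean(data)
    return sum(abs(x - m) for x in data) / len(data)

# Squaring inflates the influence of the single outlier more than
# taking absolute values does
for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(label, round(statistics.stdev(data), 2), round(mean_abs_dev(data), 2))
```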

The standard deviation describes the typical distance of your data points from the average of the data set. If you graph the data (in a histogram, a box plot, etc.), this will often be visible in how spread out the data appear. When the standard deviation is large, the typical distance from the mean is large as well, so the data points will be more spread out, leading to a wider, flatter histogram or to a longer box plot (assuming no outliers). On the other hand, if the standard deviation is small, the typical distance from the mean is also small, so the data points will not be very spread out: the box plot and histogram are both narrower, and, assuming the number of data points stays the same, the histogram is taller.

Standard deviation gives an indication of how precise your data are, since precision refers to how similar your data points are to one another, without regard to the "true" value of whatever you are measuring. Standard deviation is also a relevant value when checking for the presence of outliers.

The standard deviation is a measure of how spread out the numbers are. Deviation simply means how far from the norm your numbers/data are. To find the standard deviation, you take the square root of the variance, which is the average of the squared differences from the mean. To find the variance, you first find the mean; then, for each of your numbers, you subtract the mean and square the result. After that is done, you find the mean of the squared results.
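Following those steps literally in Python, with made-up data (note that averaging the squared results over `n`, as described here, gives the population variance; the first answer above divides by `n-1` instead):

```python
data = [6.0, 2.0, 3.0, 1.0]  # hypothetical numbers

mean = sum(data) / len(data)                        # step 1: find the mean
squared_diffs = [(x - mean) ** 2 for x in data]     # step 2: subtract the mean, square it
variance = sum(squared_diffs) / len(squared_diffs)  # step 3: mean of the squared results
std_dev = variance ** 0.5                           # square root of the variance

print(mean, variance, std_dev)  # 3.0 3.5 1.87...
```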

Once you have found the standard deviation you now have your "standard" for your set of data. You can now use this number to determine which numbers in your data are "normal" and which ones fall much higher or much lower than your standard deviation.
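One way to sketch this "normal vs unusual" check in Python, using a made-up data set and the common (but arbitrary) cutoff of 2 standard deviations:

```python
import statistics

# Hypothetical measurements; 25.0 is deliberately far from the rest
data = [12.0, 14.0, 13.5, 12.8, 13.2, 25.0, 13.0, 12.5]

mean = statistics.mean(data)
s = statistics.stdev(data)

# Flag points that fall more than 2 standard deviations from the mean
unusual = [x for x in data if abs(x - mean) > 2 * s]
print(f"mean = {mean:.2f}, s = {s:.2f}, unusual points: {unusual}")
```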

Hope this helps!

The standard deviation tells you how likely a data point is to fall within a given range of values.

For example, if you have a normal data distribution with a mean of 20 and a standard deviation of 2, you can expect that about 68% of the time your data points will fall between 18 and 22 (one standard deviation away from the mean). About 95% of the time, your data points will fall between 16 and 24 (two standard deviations). And about 99.7% of the time, your data points will fall between 14 and 26 (three standard deviations).
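These percentages can be checked with `statistics.NormalDist` from Python's standard library (available in Python 3.8+):

```python
from statistics import NormalDist

dist = NormalDist(mu=20, sigma=2)  # mean 20, standard deviation 2

for k in (1, 2, 3):
    low, high = 20 - k * 2, 20 + k * 2
    # Probability mass between low and high under this Normal distribution
    prob = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} standard deviation(s), i.e. {low} to {high}: {prob:.1%}")
```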

Knowing the standard deviation tells you what shape your distribution curve will be, and helps you predict whether a data point is statistically likely. If a hypothetical data point is outside of three standard deviations, it's a pretty safe bet to say it's statistically unlikely to occur.