## Normal & Binomial Distributions

(Research Starters)

The normal distribution is a family of idealized bell-shaped curves derived from a mathematical equation. Normal distributions are unimodal, symmetrical about the mean, and have a total area under the curve that is always equal to 1. In addition, normal distributions are continuous rather than discrete and are asymptotic to the horizontal axis. Distributions can vary on several factors including central location, variation, skewness, and kurtosis. Another well-known distribution is the binomial distribution. This is a discrete distribution that arises when each trial of an experiment has only two possible outcomes: success or failure. In some situations, the normal distribution can be used to approximate the binomial distribution. When an underlying distribution approximates the normal distribution, inferential statistics can be applied to the data in order to test hypotheses.

In both business and research, one is constantly bombarded with data. The question in both areas, however, is how best to interpret these data. Although it is nice to know that one received a rating of 85 on a 100-point scale, this statistic alone does not provide much information. Other questions need to be asked, including what the mean score and the standard deviation for the sample were. Further, it would be helpful to know how many of those evaluated received a score of 85 or above. If 90 percent of the people or companies evaluated received a score of 85 or above, the score is not so remarkable. If, on the other hand, only 2 percent received a score of 85 or above, a different interpretation should be put on this information. To understand where one's score falls within the larger group of scores from all the people rated or tested, one needs to understand the underlying distribution (a set of numbers collected from data and their associated frequencies) within which the score is situated.
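The point about interpreting a score of 85 can be sketched in a few lines of Python. This is a minimal illustration, assuming the scores are approximately normally distributed; the two groups, their means, and their standard deviations are hypothetical values chosen for the example, and `share_at_or_above` is a helper name introduced here.

```python
from statistics import NormalDist


def share_at_or_above(score, mean, sd):
    """Fraction of a normally distributed group expected to score at or above `score`."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(score)


# Hypothetical groups: the same score of 85, but different means and spreads.
lenient = share_at_or_above(85, mean=92, sd=5)  # most ratings are high
strict = share_at_or_above(85, mean=70, sd=7)   # high ratings are rare

print(f"lenient group: {lenient:.1%} score 85 or above")
print(f"strict group:  {strict:.1%} score 85 or above")
```

Under these assumed parameters, roughly nine in ten members of the first group reach 85, while fewer than 2 percent of the second group do, so the same raw score carries very different meanings.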

### Normal Distribution

Although there are as many distributions as there are individual collections of data, there also exists the concept of a "normal" distribution that describes the population from which the sample distributions are drawn. The normal distribution is an idealized bell-shaped curve that is derived from a mathematical equation (Figure 1). Although "the" normal distribution is hypothetical, the family of normal distributions describes a wide variety of characteristics occurring in nature as well as in business and industry. For example, many human characteristics, including height, weight, speed, life expectancy, IQ, and scholastic achievement, fit within the paradigm of the normal distribution. Similarly, many variables more directly related to business concerns also have a normal distribution. For example, the cost of household insurance, rental cost per square foot of warehouse space, employee satisfaction, performance appraisal ratings, and percentage of defects on a production line can all take the shape of a normal distribution. On a more practical level, the normal distribution provides the basis for many aspects of inferential statistics and hypothesis testing. In addition, it is used by human factors engineers in designing equipment (e.g., so that it is usable by a wide range of people, such as those from the 5th percentile woman through the 95th percentile man) and by quality control engineers to determine whether or not a process is within quality standards. The normal distribution is also referred to as the Gaussian distribution after Carl Friedrich Gauss, a mathematician and astronomer of the early nineteenth century. Gauss observed that when objects are repeatedly measured, the measurement errors are typically distributed normally. For this reason, the normal distribution is also sometimes referred to as the normal curve of errors.

### Characteristics of Normal Distributions

Normal distributions have several distinguishing characteristics. Normal distributions are continuous rather than discrete and are asymptotic to the horizontal axis (i.e., they never cross or touch the axis, but continue into infinity becoming ever closer to the axis). Normal distributions are also unimodal (i.e., have only one mound), such that the mound is in the center and the graph of the distribution is symmetrical about its mean (i.e., the two halves are mirror images of each other). Another property of normal distributions is that the area under the curve is always equal to 1.
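These properties can be checked numerically. The sketch below uses the standard normal density formula with a simple trapezoidal sum; the mean of 100 and standard deviation of 15 are illustrative values, and the function names are introduced here for the example.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_under_curve(mu, sigma, half_width=10.0, steps=100_000):
    """Trapezoidal estimate of the area within half_width standard deviations of mu."""
    lo = mu - half_width * sigma
    hi = mu + half_width * sigma
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h


mu, sigma = 100.0, 15.0  # illustrative values only

# Area under the curve: should be very close to 1.
area = area_under_curve(mu, sigma)

# Symmetry: points equally far from the mean have the same height.
symmetric = math.isclose(normal_pdf(mu - 20, mu, sigma), normal_pdf(mu + 20, mu, sigma))

# Unimodal: the curve is higher at the mean than on either side of it.
peak_at_mean = normal_pdf(mu, mu, sigma) > max(
    normal_pdf(mu - 1, mu, sigma), normal_pdf(mu + 1, mu, sigma)
)

# Asymptotic: far out in the tail the curve is tiny but still above the axis.
tail_positive = normal_pdf(mu + 8 * sigma, mu, sigma) > 0.0

print(f"area = {area:.6f}, symmetric = {symmetric}, "
      f"peak at mean = {peak_at_mean}, tail above axis = {tail_positive}")
```

The integration range of ten standard deviations on each side is a practical cutoff; the true curve extends indefinitely, but the area beyond that range is negligible.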

### Variations in Normal Distributions

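Specific distributions can vary on a number of different characteristics, particularly central location, variation, skewness, and kurtosis. Strictly speaking, members of the normal family differ only in central location and variation; skewness and kurtosis describe how an observed distribution departs from the normal shape.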

- Central location is the value of the midpoint of the distribution. Measures of central tendency for a distribution include the median (the number in the middle of the distribution), the mode (the number occurring most often in the distribution), and the mean (a mathematically derived measure in which the sum of all data in the distribution is divided by the number of data points in the distribution).
- Variation is the degree to which the values cluster around the central value. If the values are clustered closely together there will be less variation than if the values are spread further apart.
- Skewness refers to whether or not the distribution is symmetrical around its central value. Skew refers to the end of the distribution where there is the least concentration of values. As shown in Figure 2, distributions that are asymmetrical and whose values cluster to the left of the central value (i.e., the majority of the values are lower than the central value) are said to have positive skew because the longer tail of the distribution is on the positive side of the central value. Distributions whose values cluster on the right of the central value are said to have negative skew (i.e., the longer tail is on the negative side of the central value). Distributions that are symmetrical about the central value are not skewed. Skewness can be an important consideration in statistics. For example, the mean is by definition influenced by the skew. The median, although more stable than the mean, is also influenced by the direction of the skew.
- Kurtosis refers to the degree to which a distribution is peaked (i.e., whether it is flat or tall in comparison with the normal distribution) near the central point of the distribution. As shown in Figure 3, a distribution is said to be leptokurtic if it is tall and thin in comparison with the normal distribution or platykurtic if it is flat in comparison with the normal distribution. Normally shaped distributions are said to be mesokurtic.
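The four characteristics above can be computed for any set of numbers. The following is a minimal sketch using Python's standard library; the sample data are made up for illustration, `shape_summary` is a name introduced here, and the moment-based formulas for skewness and excess kurtosis are one common convention among several.

```python
from statistics import mean, median, mode, pstdev


def shape_summary(values):
    """Central location, variation, skewness, and excess kurtosis of a data set."""
    m = mean(values)
    sd = pstdev(values)  # population standard deviation
    n = len(values)
    # Average cubed and fourth-power standardized deviations from the mean.
    skewness = sum(((x - m) / sd) ** 3 for x in values) / n
    # Subtracting 3 makes a normal (mesokurtic) distribution score 0.
    excess_kurtosis = sum(((x - m) / sd) ** 4 for x in values) / n - 3.0
    return {
        "mean": m,
        "median": median(values),
        "mode": mode(values),
        "std_dev": sd,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Made-up sample: most values are low, with a long tail toward high values.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 12]
summary = shape_summary(right_tailed)

for name, value in summary.items():
    print(f"{name:>16}: {value:.3f}")
```

For this sample the skewness is positive, matching the long right tail, and the mean lies above the median, which in turn lies above the mode. Under this convention, a positive excess kurtosis corresponds to a leptokurtic shape and a negative value to a platykurtic one.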

As mentioned above, the normal distribution is not a single distribution, but is actually a...
