## Business Statistics

Business statistics is the application of mathematical statistical techniques to the real problems of the business world. In addition to helping the business analyst to organize and describe data, business statistics allows meaningful comparisons to be made between and among complex sets of data. Business statistics can be applied to a wide range of business problems including marketing, operations, quality control, and forecasting. The usefulness of statistical analysis, however, depends on the quality of the hypothesis being tested. There are a number of considerations for developing a testable hypothesis that will yield meaningful results when analyzed and for designing a research study that will control extraneous variables while emulating the real world situation to which the results will be extrapolated.

Every day, business persons are faced with a multitude of questions the answers to which can determine not only the course but the very success of the business. Will the new logo design better connect our product in the customer's mind than the current logo does? Will the new widget design attract people's attention and make them want to buy it? Are men or women more likely to buy a gizmo, and how do we best advertise it to them? How do we turn prospective customers into established customers? Will oil prices continue to rise and will people be open to alternative fuel sources? If the business answers these questions correctly, it can be on the leading edge of its industry. However, if the business answers these questions incorrectly, it can potentially lose money, its market share, or even its viability.

Mathematical statistics is a branch of mathematics that deals with the analysis and interpretation of data. Mathematical statistics provides the theoretical underpinnings for various applied statistical disciplines, including business statistics, in which data are analyzed to find answers to quantifiable questions. Business statistics is the application of these tools and techniques to the analysis of real world problems for the purpose of business decision making.

There are two general classes of statistics that are used by the business analyst. Descriptive statistics are used to describe and summarize data so that they can be more easily comprehended and studied. Among the tools of descriptive statistics are various graphing techniques, measures of central tendency, and measures of variability. Graphing techniques help the analyst aggregate and visually portray data so that they can be better understood. Included in this category are histograms, frequency distributions, and stem and leaf plots. Measures of central tendency estimate the midpoint of a distribution. These measures include the median (the number in the middle of the distribution), the mode (the number occurring most often in the distribution), and the mean (a mathematically derived measure in which the sum of all data in the distribution is divided by the number of data points in the distribution). Measures of variability summarize how widely dispersed the data are over the distribution. The range is the difference between the highest and lowest scores in the distribution. The standard deviation is a mathematically derived index of the degree to which scores differ from the mean of the distribution.

Descriptive statistics are helpful for taking large amounts of data and describing them in ways that are easily comprehendible. Pie charts, histograms, and frequency polygons are frequently used in business presentations and are all examples of ways that descriptive statistics can be used in business. Although such descriptive statistics are useful in summarizing and describing data, business statistics is an applied form of mathematics and is a valuable tool for helping analyze and interpret data. This can be done through the use of inferential statistics, a collection of techniques that allow one to make inferences about the data, including drawing conclusions about a population from a sample.

In general, inferential statistics are used to test hypotheses to determine if the results of a study occur at a rate that is unlikely to be due to chance (i.e., have statistical significance). A hypothesis is an empirically testable declarative statement that the independent and dependent variables and their corresponding measures are related to in a specific way as proposed by the theory. The independent variable is the variable that is being manipulated by the researcher. For example, a market researcher might be trying to determine which new breakfast cereal the organization should bring to market. The independent variable is the type of breakfast cereal. The dependent variable (so called because its value depends on which level of the independent variable the subject received) is the subject's response to the independent variable (e.g., whether or not the people like the breakfast cereal they are given to try). Examples of hypotheses include "the new red widget logo is better remembered than the old blue logo," "grade school children prefer the taste of new, improved Super Crunchies to original Crunchies cereal," or "Widget Corporation stores in the western states are more profitable than those in the East."

For purposes of statistical tests, the hypothesis is stated in two ways. The null hypothesis (H0) is the statement that there is no statistical difference between the status quo and the experimental condition. In other words, the treatment being studied made no difference on the end result. For example, a null hypothesis about the effectiveness of the two possible logos for Widget Corporation would be that there is no difference in the way that people react to the old logo versus the new logo. This null hypothesis states that there is no relationship between the variables of old/new logo (independent variable) and whether or not people like it (dependent variable). The alternative hypothesis (H1) states that there is a relationship between the two variables (e.g., people prefer the new logo).

Following the formulation of the null hypothesis, an experimental design is developed that allows the researcher to empirically test the hypothesis. Typically, this design will have a control group that that does not receive the experimental conditions (e.g., the group sees only the old logo) and an experimental group that does receive the experimental condition (e.g., the group sees the new logo). The analyst then collects data from people in the study to determine whether or not the experimental condition had any effect on the outcome. After the data have been collected, they are statistically analyzed to determine whether the null hypothesis should be accepted (i.e., there is no difference between the control and experimental groups) or rejected (i.e., there is a difference between the two groups). As shown in Figure 1, accepting the null hypothesis means that if the data in the population are normally distributed, the results are more than likely due to chance. This is illustrated in the figure as the unshaded portion of the distribution. By accepting the null hypothesis, the analyst is concluding that it is likely that people do not react any differently to the red logo than they do to the blue logo. For the null hypothesis to be rejected and the alternative hypothesis to be accepted, the results must lie in the shaded portion of the graph. This means that there is a statistical significance that the difference observed between the two groups is probably not due to chance but to a real underlying difference in people's attitudes toward the two logos.

Part of the process of designing an experiment is determining how the data will be analyzed. There are a number of different statistical methods for testing hypotheses, each of which is appropriate to a different type of experimental design. One class of statistical tests is the t-tests. This type of statistical technique is used to analyze the mean of a population or compare the means of two different populations. In other situations where one wishes to compare the means of two populations, a z statistic may be used.

Another frequently used technique for analyzing data in applied settings is analysis of variance (ANOVA). This family...

(The entire section is 3605 words.)