## Statistics & Data Analysis

(Research Starters)

Statistical data analysis is a tool to help us better understand the world around us and make sense of the infinite data with which we are constantly bombarded. In the business world, statistical data analysis can be used across the organization to help managers and others make better decisions and optimize the effectiveness and profitability of the organization. In particular, inferential statistical techniques are used to test hypotheses to determine if the results of a study occur at a rate that is unlikely to be due to chance. Statistical techniques are applied to the analysis of data collected through various types of research paradigms. However, although it would be comforting to assume that the application of statistical tools to the analysis of empirical data would yield unequivocal answers to aid in decision making, it does not. Without understanding the principles behind statistical methods, it is difficult to analyze data or to correctly interpret the results.

Mathematical statistics is a branch of mathematics that deals with the analysis and interpretation of data and provides the theoretical underpinnings for various applied statistical disciplines. Although statistical manipulation for the sake of learning more about stochastic processes or expanding the understanding of statistical principles is fine for theorists or the classroom, most people use statistics as a tool, a means rather than an end. In particular, statistics is a way to help us better understand the world around us and make sense of the infinite data with which we are constantly bombarded. To that end, in the business world, statistics tends to be used to organize and analyze data so that it can be interpreted and applied to solving business problems. Through statistical data analysis, marketing analysts can better predict future trends in the marketplace or understand how best to market to specific market segments. Through statistical data analysis, logisticians can better understand how to manage the supply chain so that it is both more effective and efficient, with supplies, raw materials, and components being received just before they are needed and products finished just before they are to be delivered in order to cut down on wasted time, money, and storage. Through statistical data analysis, engineers can determine ways to better control the quality of manufacturing processes or design products that will meet the needs of the marketplace while lowering costs for the organization.

The best way to perform these and other tasks is determined through the application of inferential statistics to the analysis of empirical data. Inferential statistics is a collection of techniques that allow one to make inferences about data, including drawing conclusions about a population from a sample. Inferential statistics is used to test hypotheses to determine if the results of a study have statistical significance, meaning they occur at a rate that is unlikely to be due to chance. A hypothesis is an empirically testable declarative statement that the independent and dependent variables and their corresponding measures are related in a specific way as proposed by the theory. The independent variable is manipulated by the researcher. For example, an organization might want to determine which of two new designs it should bring to market. The independent variable is the design of the product. The dependent variable, so called because its value depends on which level of the independent variable the subject receives, is the subject's response to the independent variable -- in this case, whether people prefer Design A or Design B. The researcher may set up an experiment to test the hypothesis that one design is preferred over the other. The results of the analysis would give the company support for making an empirically based decision about which product to bring to market.

For purposes of data analysis, a hypothesis is stated in two ways. The null hypothesis (H0) is a statement that there is no statistical difference between the status quo and the experimental condition. For example, a null hypothesis about people's preference for the two new product designs would be that there is no preference for one design over the other. The alternative hypothesis (H1) would be that there is, in fact, a preference for one design over the other. After the hypothesis has been formulated, an experimental design is developed that allows the hypothesis to be empirically tested. Data is then collected and statistically analyzed to determine whether the null hypothesis should be accepted or rejected.

### Statistical Methods

There are a number of different statistical methods for testing hypotheses, each appropriate for a different type of experimental design. One frequently used technique is the t-test, which is used to analyze the mean of a population or compare the means of two different populations. When one wishes to compare the means of two populations, a z statistic may be used. Another useful technique is analysis of variance (ANOVA), a family of techniques used to analyze the joint and separate effects of multiple independent variables on a single dependent variable to determine the statistical significance of the effect. Other statistical tools allow the prediction of one variable from the knowledge of another variable. Correlation coefficients allow analysts to determine whether two variables are positively related (e.g., the older people become, the more they prefer a certain brand of cereal), negatively related (e.g., the older people become, the less they prefer that brand cereal), or not related at all. Regression is a family of techniques that are used to develop mathematical models for use in predicting one variable from the knowledge of another variable. In general, statistical techniques can be applied to a wide range of business problems, including marketing research, quality control, prediction of marketplace trends or sales volume, and comparing the relative efficiency of the various operations in a multinational organization.

It would be comforting to assume that the application of statistical tools to the analysis of empirical data would yield definitive answers that would unequivocally indicate what decision should be made. Unfortunately, it does not. Without understanding the principles behind statistical methods, it is difficult to analyze data or to correctly interpret the results.

### Limitations to Real-World Statistical Data Analysis

Even if these limitations could be overcome, there are also practical limitations to real-world statistical data analysis that need to be taken into account. As complicated as human behavior is and as confounding as extraneous variables can be, the data that is collected in a laboratory is pristine compared with the data that can be collected in...

(The entire section is 2984 words.)