## Advanced Statistical Methods

Applied statistics is concerned with the analysis of real world problems. However, real world situations can be complex and messy, and do not always lend themselves to analysis using simple inferential statistics. This is not to say that the workhorse procedures such as t-tests, the Pearson coefficient of correlation, and analysis of variance are unimportant or less valuable. However, there are situations where these techniques are not appropriate. Multivariate statistics is used to summarize, represent, and analyze multiple quantitative measurements obtained on a number of individuals or objects. Nonparametric statistics is used in situations where it is not possible to estimate or test the values of the parameters of the distribution or where the shape of the underlying distribution is unknown. In addition, techniques are available to analyze time series data — data gathered on a specific characteristic over a period of time — in order to develop meaningful business forecasts.

Keywords Distribution; Factor Analysis; Inferential Statistics; Model; Multivariate Analysis of Variance (MANOVA); Multivariate Statistics; Nonparametric Statistics; Population; Regression; Sample; Statistical Significance; Statistics; Time Series Data; Variable

### Statistics: Advanced Statistical Methods

### Overview

Knowledge of statistical methods is becoming increasingly important for success in business and management in many sectors in the 21st century. One of the reasons for this requirement is the ever increasing influx of data enabled by the proliferation of information systems. Technology today allows people to communicate greater amounts of data and information faster than ever before. This trend has enabled businesses to better manage customer accounts, forecast marketplace needs, and in general more successfully proactive and effective in the marketplace. Statistical methods can be very useful for analyzing real world problems and providing decision makers with better information.

### Limitations to Inferential Statistics

However, the fact is that real world situations can be complex and messy, and do not always yield the kind of neat data that are easily analyzed by descriptive or simple inferential statistics. The power of more advanced statistical procedures is readily available through advances in computer science that take the drudgery out of complex and repetitive calculations, enabling business analysts to better analyze the multitude of data that crosses their desks.

This is not to say that the workhorse procedures such as *t*-tests, the Pearson coefficient of correlation, and analysis of variance are unimportant or less valuable. The analyst must choose the right tool for the job, and these techniques are workhorses because they are appropriate and powerful for so many situations. However, there are situations where these techniques are not appropriate. Sometimes the real world situation is so complex that it cannot easily be fit into a simple 2x2 or 3x3 matrix. The analyst may need to know about the influence of multiple independent variables on multiple dependent variables. In other cases, the data are not clean enough to meet the assumptions required for standard parametric statistics. There may be no reason to assume, for example, that the underlying distribution is normal or no way to estimate parameters of the population such as the mean and standard deviation. At other times, the data are not static and do not hold still for a neat analysis of the past in order to forecast the future. Changes in data over time must be acknowledged in order to forecast the needs of the future. In situations such as these, other methods are required.

### Statistics for Complex Computations

Multivariate statistics is a branch of statistics that is used to summarize, represent, and analyze multiple quantitative measurements obtained on a number of individuals or objects. Examples of multivariate statistics include factor analysis, cluster analysis, and multivariate analysis of variance (MANOVA). These powerful tools help the analyst derive important information from complex data sets such as where it is important to determine the joint and separate effects of multiple independent variables on multiple dependent variables and determine the statistical significance of the effect. Nonparametric statistics is another class of statistical procedures that is used in situations where it is not possible to estimate or test the values of the parameters (e.g., mean, standard deviation) of the distribution or where the shape of the underlying distribution is unknown. Although not as powerful as standard parametric techniques, nonparametric statistics allow the analyst to derive meaningful information from a less than perfect data set. In addition, techniques are available to analyze time series data — data gathered on a specific characteristic over a period of time — in order to develop meaningful business forecasts.

### Applications

There are many statistical tools available to the analyst who needs advanced tools to analyze real world situations. These approaches include:

- Multivariate analyses
- Nonparametric statistics
- Time series analysis.

Multiple techniques are available in each of the categories.

### Multivariate Analysis

### Reasons for Use

Multivariate analysis can be very useful for the analysis of complex real world systems and situations. There are a number of reasons for using multivariate analysis. First, any treatment or independent variable can affect subjects in multiple ways. For example, a downturn in the market may not only decrease consumer spending but also increase savings or the way that investments are made. The introduction of an innovative new product may make a task easier to perform, but may or may not change consumer buying habits based on how the new product performs *vis a vis* the old technology, how satisfied people are with the other technology, and so forth. In addition, the use of multiple criteria can give the analyst a better understanding of the characteristics under investigation. Behavior is influenced by many factors. Because of this, the results of laboratory experiments often translate poorly to the real world. Multivariate analysis allows the researcher to consider a more complex situation that is closer to the real world and to better explain and understand observed phenomena. In the research situation, multivariate analysis also enables researchers to cut down on the cost of research by allowing the data from several independent variables to be analyzed using a relatively small sample size compared with running multiple sequential experiments. Further, the use of powerful multivariate techniques helps the analyst avoid false positive results that can occur when running multiple univariate techniques.

### Multivariate Analysis of Variance

A complete discussion of multivariate techniques is well beyond the scope of this document. However, several techniques are worthy of notice. Multivariate analysis of variance is a multivariate extension evaluation of variance for use in situations where analysis is needed for the joint and separate effects of multiple independent variables on multiple dependent variables and determine the statistical significance of the effect. Like univariate analysis of variance, multivariate analysis of variance attempts to determine whether or not changes in the independent variables affect the dependent variables. In addition, multivariate analysis of variance attempts to identify any interactions among the independent variables and associations between the dependent variables. Multivariate analysis of variance is particularly of interest in research and in complex situations that require an understanding of the effects of multiple independent variables on multiple dependent variables.

### Factor Analysis

Another multivariate technique that can be useful in business settings is factor analysis. This multivariate technique analyzes interrelationships between variables and attempts to articulate their common underlying factors. Factor analysis is used in seemingly random situations where it is assumed that the nature of a domain is not actually chaotic, but it is attributed to multiple underlying factors. Multidimensional mathematical techniques are applied to the data to illustrate how they cluster together into "factors." In many ways, factor analysis is more a logical procedure than a statistical one although it is based on the analysis of Pearson correlation coefficients between data. Factor analysis performs a causal analysis to determine the likelihood of the same underlying processes resulting in...

(The entire section is 3861 words.)