## Mathematical Statistics

(Research Starters)

Human beings are constantly bombarded with data and, as a result, continually look for ways to organize and analyze them in an attempt to better understand the world in which they live. Mathematical statistics is a branch of mathematics that deals with the analysis and interpretation of data. There are two basic kinds of statistics. Descriptive statistics, which comprise measures of central tendency and measures of variability, help one organize and describe data. Inferential statistics are used in the analysis and interpretation of data and enable one to draw conclusions about the data and to judge the statistical significance of a result. Statistics, however, do not give black-and-white answers, and their interpretation requires an understanding of their meaning and limitations.

No matter where we are or what we are doing, we are constantly being bombarded with data. As I sit at my desk, I hear the radio in the background, the noises from outside my window, the clacking of the keys on the computer keyboard as I type this sentence, and even the quiet beating of my heart. I feel the heat of the halogen lamp on one arm and the cool breeze from the open window on the other. I see the words on the computer screen as I type, while my peripheral vision catches sight of the brightly colored Post-It notes set around the perimeter of the screen as reminders to do various tasks. I also see the hastily scratched outline for this article on a notepad as well as an open book with mathematical formulae and a large pile of other books waiting to be opened. The list, of course, goes on, but I cannot process it all. Even most of the sensory inputs I just mentioned recede into the background unnoticed. I hear nothing except the next words that I shall write in my head. I ignore every visual image except the computer screen in my direct line of sight. Unless I become uncomfortably hot or cold, I do not even notice the temperature. Human beings are constantly receiving more data than they can meaningfully process. In some cases we ignore the data or let them recede into the background. In other cases, we attempt to organize, analyze, and interpret them.

Statistics is a branch of mathematics that helps human beings make sense of the data that they receive. Although statistics is not needed to make sense of the inputs that my senses receive as I sit at my desk, it can help make sense of the grade distribution from the class final so that I can better understand whether there are topics that I need to emphasize more in my lectures or questions that need to be more clearly worded on the test. Statistics can also help me make sense of raw data on community demographics so that I can help one of my clients determine how the local area is changing and how they can focus their marketing strategy to better reach prospective local customers. Statistics can also help me interpret the data on trends in consulting for my target market so that I can keep up with my professional studies and be better prepared to help my clients. Although I could just use my best guess to answer any of these questions, the appropriate application of statistical techniques helps me to base my interpretation of the data on science and be more confident in my predictions.

These things are all done through the practical application of mathematical statistics, the branch of mathematics that deals with the analysis and interpretation of data. Mathematical statistics provides the theoretical underpinnings for various applied statistical disciplines, including business statistics, in which data are analyzed to find answers to quantifiable questions. There are two types of statistics. Descriptive statistics is a set of mathematical tools used to organize and describe data. Inferential statistics is a subset of mathematical statistics used to analyze and interpret data, both to make inferences (such as drawing conclusions about a population from a sample) and to support decision making.

One of the most basic tools used in descriptive statistics is the graphing of data in order to organize and summarize them so that the data are more easily comprehensible. There are many ways to do this, but one of the most frequently used methods is the development of a frequency distribution. This is a graph in which the data are divided into intervals, typically of equal length, and the number of observations falling in each interval is plotted. This process decreases the number of points on the graph, organizing the data and making them easier to comprehend. For example, if one had height data from a sample of 1,000 people, a graph showing all 1,000 points would be difficult to process meaningfully. However, if the data were aggregated instead into ranges within the span of scores (e.g., 61 inches to 61.99 inches, 62 inches to 62.99 inches), the number of points on the graph would decrease and larger patterns would emerge. People who were 61, 61.25, and 61.50 inches tall would all be grouped together, so some detail would be lost; for many purposes, however, this level of precision would be sufficient. In fact, since height is an analog (continuous) measurement, it would never be possible to get someone's exact height, so no matter how the data were placed on the graph, they would still be aggregated into intervals: virtually every measurement one makes is an approximation. Figure 1 shows a sample frequency distribution for the heights of a group of people.
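The grouping step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_distribution`, the one-inch interval width, and the nine height values are assumptions invented for the example and do not come from the article or its Figure 1.

```python
import math
from collections import Counter

def frequency_distribution(values, width=1.0):
    """Count how many values fall in each interval [k*width, (k+1)*width)."""
    # The lower edge of each value's interval serves as the interval label.
    counts = Counter(math.floor(v / width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical heights in inches, invented purely for illustration.
heights = [61.0, 61.25, 61.5, 62.1, 62.9, 63.4, 63.0, 63.75, 64.2]

table = frequency_distribution(heights)
for lower, count in table.items():
    print(f"{lower:.0f} to {lower + 0.99:.2f} inches: {count}")
```

Here the heights 61, 61.25, and 61.50 inches all land in the same interval, which is exactly the loss of detail the paragraph describes. A narrower `width` would preserve more detail at the cost of more intervals.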

Although graphing the data in this way is helpful, there are other pieces of information about the data that can further help the observer better understand them. Descriptive statistics are used to describe the central tendency and the variability of a sample. Measures of central tendency are used to estimate the midpoint of a distribution. These measures include the median (the number in the middle of the distribution when the data points are arranged in order), the mode (the number occurring most often in the distribution), and the mean (a mathematically derived measure in which the sum of all data in the distribution is divided by the number of data points in the distribution). For example, as shown in Figure 2, for the distribution 2, 3, 3, 7, 9, 14, 17, the mode is 3 (there are two 3s in the distribution, but only one of each of the other numbers), the median is 7 (when the seven numbers in the distribution are arranged numerically, 7 is the number that occurs in the middle), and the mean (or arithmetic mean) is approximately 7.857 (the sum of the seven numbers is 55; 55 / 7 ≈ 7.857).
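The three measures for this example distribution can be checked with Python's standard `statistics` module. The sketch below uses the article's own seven numbers; the variable names are chosen only for illustration.

```python
import statistics

# The example distribution from the text.
data = [2, 3, 3, 7, 9, 14, 17]

mode = statistics.mode(data)      # most frequent value
median = statistics.median(data)  # middle value of the sorted data
mean = statistics.mean(data)      # sum of the values divided by their count

print(f"mode = {mode}, median = {median}, mean = {mean:.3f}")
```

For a distribution with an even number of data points, `statistics.median` returns the average of the two middle values, a case the seven-number example does not exercise.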

Although measures of central tendency are helpful in organizing and understanding large amounts of data, they are only one part of the puzzle. Without seeing the graph of the distribution, knowing that a sample of data has a mean of 10 does not give one much information about the data. To better understand what the data signify, one also needs to know how the data are distributed. Measures of variability summarize how widely dispersed the data are over the distribution. The range is the difference between the highest and lowest scores. So, for example, one would draw different conclusions about how well a class did on a test with a total possible score of 100 where the mean score was 60 if one knew that some of the people taking the test missed all the questions and some got them all right than if one knew that the lowest grade received was 50 and the highest grade received was 70. In the first case, it would appear that more people got over half of the questions correct (because otherwise the mean would be less than 50), while in the second case it would appear that no one understood the material well enough to get most of the questions correct (or that the test had some badly worded questions) but that everyone had learned some of the material or that the questions...
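The point that two classes can share a mean yet differ in spread can be illustrated with a short Python sketch. Both score lists and the helper name `score_range` are hypothetical, constructed only so that each class has a mean of 60 with the extremes mentioned in the text.

```python
import statistics

def score_range(scores):
    """Range: the difference between the highest and lowest scores."""
    return max(scores) - min(scores)

# Hypothetical test scores out of 100, invented for illustration.
wide_class = [0, 40, 60, 70, 90, 100]     # some missed everything, some got everything
narrow_class = [50, 55, 60, 60, 65, 70]   # lowest grade 50, highest grade 70

for name, scores in [("wide", wide_class), ("narrow", narrow_class)]:
    print(f"{name}: mean = {statistics.mean(scores)}, range = {score_range(scores)}")
```

Both lists have the same mean, so only the range (100 versus 20) distinguishes them.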
