It is frequently impossible to gather data from every member of a population of interest. Therefore, sociologists and other researchers typically base their studies on samples of individuals that are drawn from the population of interest. In order to be useful for research purposes, these samples need to be drawn in such a way as to minimize the probability of introducing bias into the selection process so that the resulting sample truly represents the underlying population. There are a number of ways to do this; however, the comparative efficacy of various sampling approaches remains a matter of debate.
Keywords Bias; Data; Population; Probability; Sample; Sampling; Sampling Error; Skewed; Validity; Variable
Why Do Researchers Take Samples?
To better understand the behavior of people within society, sociologists develop theories and collect and analyze data to test the validity of those theories. In some situations, it is relatively easy to gather data about opinions, behavior, or other characteristics of interest from every member of a population. For example, if I wish to know whether my class prefers to write one 10-page paper or two 5-page papers, I can simply ask my students for their preferences and make the assignment based on the majority opinion. This is easy to do because the class size is relatively small. I can easily collect the data and, since their motivation to respond is relatively high, the students are likely to participate in my survey, giving me the data I need to make my decision.
If, on the other hand, I want to determine the same preference for all students in the university, or all students in all universities across the country or across the globe, the activity becomes more complicated. First, the sheer number of students in those larger populations makes the task of collecting information from all of them both costly and time-consuming. Second, although the students in my class may be motivated to answer my question because they are directly affected by the outcome, the students in these larger populations have no such motivation. As a result, the probability of collecting data from them all is rather low. However, I cannot in good conscience extrapolate the answer of my class to university students in general, to students in all my courses, or even to university students taking this particular course from other professors at my university. There are too many differences between my class and those other, larger groups to make an accurate extrapolation. Although the members of my class have characteristics in common with the members of the larger groups (e.g., general age range, education level), it cannot be assumed that no other variables (e.g., workload, expectations) affect the responses of members of those groups and make their answers different from those of this particular class.
To develop theories and build our knowledge of human behavior in society, however, it is often necessary to collect data about groups of people too large to poll individually. Rather than collecting data from a manageable group that has a low likelihood of representing the population that we wish to test, we instead take a sample of individuals from the larger group using a methodology that we believe will allow us to draw a sample that reflects the characteristics of the larger population. For example, although it may be impossible to collect data from every university student across the country, it may be possible to gather data from a representative sample of the population (e.g., university students taking introductory sociology courses in several universities that have the characteristics in which we are interested). The sample that is selected can then be used in research based on the assumption that the sample has the same characteristics as the population as a whole.
It is very important that the method used to draw the sample gives us a sample that is representative of the characteristics of the population in which we are interested. Otherwise, our sample will be biased and the results of our study will not represent the results that would have been obtained from the population in general.
Selection bias occurs when the sample asked to participate in a study is selected in a way that is not representative of the underlying population. One of the classic examples of a biased sample that led to an erroneous conclusion is the Gallup Poll conducted before the 1948 presidential election, which predicted that Thomas Dewey would beat Harry Truman. Results obtained from biased samples cannot be meaningfully extrapolated to the population at large.
Defining Sample Characteristics
Before determining the best way to draw a sample, we must first operationally define which characteristics are important in the target population. Although in some cases it is of value simply to interview every 15th person who walks into a shopping mall, in most cases the target population needs to be better defined. For example, if we are interested in the opinions of shoppers on whether or not they would play the latest video game, it would be better to draw our sample from those shoppers who are more likely to play the game than from shoppers in general.
Once the population in which one is interested has been operationally defined, a sample needs to be drawn from the population. There are two general approaches used to select a representative sample. The first is random sampling, in which a subset of the population is randomly chosen for the sample. Choosing names out of a hat or using a random number generator on a list of names are examples of this approach. However, although this is a widely used technique and may in many cases accurately represent the larger population because it is based on random probability, a random sample may also, by chance, overrepresent or underrepresent some characteristic. As a result, in some situations it is important to use a stratified random sample. This technique takes into account the known characteristics of the population. For example, if it is known that half the sociology students in the country are women, a researcher might randomly select 100 women and 100 men for a research study so that both genders are represented in the sample in the same proportion that they appear in the population.
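The two approaches described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the population of 1,000 students, the "gender" field, and the function names simple_random_sample and stratified_sample are all hypothetical, and the standard-library random module stands in for the hat or random number generator.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k members at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, key, per_stratum, seed=None):
    """Draw a fixed number of members at random from each stratum.

    key         -- function returning the stratum label of a member
    per_stratum -- dict mapping stratum label to the number to draw
    """
    rng = random.Random(seed)
    # Group the population by stratum label.
    strata = {}
    for member in population:
        strata.setdefault(key(member), []).append(member)
    # Sample independently within each stratum.
    sample = []
    for label, count in per_stratum.items():
        sample.extend(rng.sample(strata[label], count))
    return sample


# Hypothetical population: 1,000 students, half women and half men.
population = [
    {"id": i, "gender": "woman" if i % 2 == 0 else "man"}
    for i in range(1000)
]

srs = simple_random_sample(population, 200, seed=1)
strat = stratified_sample(
    population,
    key=lambda s: s["gender"],
    per_stratum={"woman": 100, "man": 100},
    seed=1,
)

women_in_srs = sum(s["gender"] == "woman" for s in srs)
women_in_strat = sum(s["gender"] == "woman" for s in strat)
print("women in simple random sample:", women_in_srs)
print("women in stratified sample:", women_in_strat)
```

In this sketch the simple random sample will usually contain close to, but not necessarily exactly, 100 women, whereas the stratified sample contains exactly 100 by construction.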
Representative samples can be drawn in a number of different ways. The simplest approach to sampling is to randomly select people from the population through such methods as having a computer pick names at random from a list or drawing names from a hat. These randomly chosen individuals are then assigned to the sample. Based on the laws of probability, a sample drawn this way will more than likely be representative of the underlying population. However, in practice, achieving a truly random sample can be more difficult than it sounds. Written surveys, for example, tend to have notoriously low return rates, and people are frequently loath to give out information over the phone. As a result, many of the people from whom one would like to collect data take themselves out of the sample. This self-selection means that the resultant sample is not truly random. Further, the characteristics that are common to the individuals who opt out of participating in the research may be less frequently observed in the rest of the sample. This means that the sample may not represent a significant segment of the underlying population.
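The effect of self-selection can be illustrated with a small simulation. The sketch below is only an illustration: the population size, the 50 percent share holding a given opinion, and the response rates of 60 and 20 percent are invented for the example, and simulate_nonresponse is a hypothetical name. It draws a genuinely random list of invitees and then lets people who hold the opinion respond more often than those who do not.

```python
import random


def simulate_nonresponse(n=10000, invited_n=1000, seed=0):
    """Compare the opinion share in the population, among the people
    invited at random, and among those who actually respond."""
    rng = random.Random(seed)
    # Hypothetical population: True means the person holds opinion A.
    # Each person holds it with an assumed probability of 0.5.
    population = [rng.random() < 0.5 for _ in range(n)]

    # Step 1: a simple random sample of invitees.
    invited = rng.sample(population, invited_n)

    # Step 2: self-selection. Assumed response rates: 60% for people
    # who hold opinion A, 20% for everyone else.
    respondents = [
        holds_a for holds_a in invited
        if rng.random() < (0.6 if holds_a else 0.2)
    ]

    def share(group):
        return sum(group) / len(group)

    return share(population), share(invited), share(respondents)


pop_share, invited_share, respondent_share = simulate_nonresponse()
print("population share:", round(pop_share, 3))
print("invited share:   ", round(invited_share, 3))
print("respondent share:", round(respondent_share, 3))
```

Under these assumed rates, the invited group tracks the population closely, while the respondents overstate the share holding opinion A, even though the invitations themselves were random.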
Another way to select samples is through systematic sampling, which determines who will be included or excluded from the sample on the basis of an a priori rule. For example, the researcher could select every nth person who walks in the door of a mall to participate in the survey. Although it is easier to select the participants using this...