Survey Research Methods (Encyclopedia of Public Health)
A survey is a method of collecting information about a human population. In a survey, direct (or indirect) contact is made with the units of the study (e.g., individuals, organizations, communities) by using systematic methods of measurement such as questionnaires and interviews.
Many surveys are conducted around the world each year. While the purpose, topics, and size of these surveys vary, similar steps are followed in the planning, development, and implementation of each. These steps are described below, with an example from an existing survey, the Youth Risk Behavior Survey, which is conducted every two years by the Centers for Disease Control and Prevention (CDC).
Identify the Purpose. To determine the purpose of a survey, two questions must be asked: (1) what information is wanted or needed, and (2) where can this information be found. A researcher may want to describe a population or program, plan a new program, or evaluate an existing one. Survey questions might address scientific issues (e.g., "What is the prevalence of cigarette smoking in the United States?"), social marketing issues (e.g., "How do adolescents respond to a new public service announcement?"), or broad public opinion (e.g., "Should schools teach sex education?").
Information may be obtained from a specific group (e.g., high school students in a city or women of childbearing age in a state) or a broader group (e.g., adults in the United States). The population of interest (unit of study) can be identified in a country, state, city, or local area. For example, the purpose of the Youth Risk Behavior Survey (YRBS) is to provide information on priority health-risk behaviors among students in grades nine through twelve throughout the United States.
Develop the Questionnaire. Once the purpose of the survey and population of interest are determined, a questionnaire must be developed. The questionnaire should be designed to provide the information being sought. It is important to determine which topics are essential, and previous questionnaires can be reviewed to identify questions that can be used for each topic. Reviewing previous questionnaires will also help determine the best format (e.g., multiple choice, open-ended) and question order. Questionnaires should start with easy questions rather than sensitive or hard-to-remember questions. If it is necessary to translate the questionnaire into multiple languages, the quality of the translation can be checked by having it "back-translated" into the original language.
Pilot testing the questionnaire using focus groups or small samples of respondents in the population of interest will determine the acceptability of the questionnaire to typical respondents and how long it takes them to complete it. A cognitive lab test of the questionnaire is also useful for identifying problems with question comprehension, flow, and understanding of response options. Institutional review board or other approvals should be obtained prior to survey administration.
The questionnaire for the YRBS contains eighty-six multiple-choice questions measuring six categories of behaviors: tobacco use, dietary behaviors, physical activity, alcohol and other drug use, sexual behaviors, and behaviors that may result in unintentional injuries or violence.
Identify the Setting. Surveys are usually conducted in households, schools, health care facilities, or worksites. It is important to pick the location where the population of interest can be accessed most easily and where it can be most fully represented. Because of the need to obtain information from students in grades nine through twelve, the YRBS is conducted in schools.
Identify the Mode. Within each setting, data can be obtained in three ways: through personal interviews, through self-administered questionnaires, or by reviewing records. Personal interviews are conducted by an interviewer who records the respondent's answers on a questionnaire or directly into a computer. Personal interviews are done in person or by telephone.
Self-administered questionnaires can be "paper and pencil" or electronic in nature. Paper and pencil questionnaires are brought to the respondent by a data collector or mailed to the respondent. Electronic questionnaires use computer-assisted self-interviewing technology. The questions are answered either on a laptop computer that is brought to the respondent or on a web site that the respondent can access. Record review is typically done on-site, but also can be done electronically if the records are stored on a web site or local area network.
Selecting the appropriate combination of setting and mode is important and should be based on the survey topic and population of interest, as well as answers to the following questions:
- Which approach will produce the most valid and reliable data? For youth, sensitive topics are often best measured in a school setting using a paper and pencil self-administered questionnaire.
- Which approach will yield the highest response rate? Household surveys often produce the highest response rate for general population surveys.
- How much will the survey cost to conduct? Household surveys are generally the most expensive.
- How long will the survey take to complete? Telephone surveys, such as public opinion polls, often are the fastest.
The YRBS uses a paper and pencil self-administered questionnaire, which is provided to students by a data collector.
Select the Sample. The quality of the sample often determines the quality of the data. Samples of convenience or volunteer samples produce data representative only of persons who participate in the survey. Scientifically selected samples can be representative of a larger population and are used to generalize findings to persons beyond those who participate in the survey.
To identify the appropriate sample for a survey, the survey topic, population of interest, setting, and mode must all be considered. Once this is done, the next step is to select an appropriate sampling frame from which to draw the sample. The sampling frame is a list of all the members of the population of interest. It should be as current and inclusive as possible. Existing databases may be available, or it may be necessary to construct a sampling frame. For the sample design, many possibilities exist (e.g., simple random sample, stratified sample, cluster sample). The YRBS uses a sampling frame of all public and private high schools in the United States and a three-stage cluster sample design to produce a nationally representative sample of students in grades nine through twelve.
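The sample designs named above can be illustrated with a short Python sketch. It is a minimal illustration on a made-up frame of twelve schools, not the YRBS procedure: the frame, the `region` strata, the student rosters, and all function names are assumptions introduced here, and a two-stage design stands in for the more general multistage case.

```python
import random

# Hypothetical sampling frame: 12 schools, each tagged with a region (stratum).
frame = [{"school_id": i, "region": "north" if i < 6 else "south"}
         for i in range(12)]

def simple_random_sample(units, n, rng):
    """Draw n units without replacement, each with equal probability."""
    return rng.sample(units, n)

def stratified_sample(units, key, n_per_stratum, rng):
    """Draw a simple random sample of fixed size within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

def students_in(school):
    """Hypothetical roster: 30 student identifiers per school."""
    return [f"{school['school_id']}-{k}" for k in range(30)]

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1 samples schools; stage 2 samples students within each school."""
    chosen = rng.sample(clusters, n_clusters)
    return [(school["school_id"], student)
            for school in chosen
            for student in rng.sample(students_in(school), n_per_cluster)]

rng = random.Random(0)  # fixed seed so the draws are reproducible
srs = simple_random_sample(frame, 4, rng)
strat = stratified_sample(frame, "region", 2, rng)
clus = two_stage_cluster_sample(frame, 3, 5, rng)
```

The stratified draw guarantees that every region is represented, while the cluster draw concentrates fieldwork in a few schools, which is usually the practical reason for choosing it.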
Conduct the Fieldwork. Fieldwork begins with obtaining clearance or approval to conduct the survey. It may be necessary to seek clearance or approval not only from respondents (e.g., students), but also from gatekeepers to the respondent (e.g., school administrators and parents). Data collection protocols must also be developed. The goal is to standardize data collection as much as possible to ensure quality control throughout the fieldwork, to obtain a high response rate, and, often, to protect the privacy of respondents.
Selection of data collectors or field staff is also important. It is best to select persons appropriate for the content of the survey and the demographic characteristics of the population of interest (e.g., female interviewers for surveys on reproductive health issues among women). Formal training of data collectors or field staff will help them become familiar with the questionnaire format, content, mode of data collection, data collection protocol, and quality control procedures.
Before the YRBS is conducted, clearance is obtained from school administrators and parents. Then, trained data collectors are sent to each school to collect data according to the survey protocol. Because of the sensitive nature of the questionnaire, special procedures are used to protect student privacy.
Enter, Edit, and Prepare Data for Analysis. Since the 1980s, data entry has become easier due to advances in electronic data input. Previously, most survey data were entered manually by key punching and then reentered to assure accuracy. Today, most survey data collected using questionnaires are scanned electronically into a data set. Data collected via computer-assisted interviewing are automatically entered into a data set. Once entered, the data are edited for out-of-range responses, simple consistency, and logic errors. Then the sample is weighted to adjust for nonresponse, varying probabilities of selection, and sample characteristics. Weighting is necessary to ensure that the data are representative of the entire population of interest.
YRBS questionnaire booklets are scanned to produce the data file. Basic out-of-range and consistency edits are run and a weighting factor is applied to each student record to adjust for nonresponse and for varying probabilities of selection.
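The editing and weighting steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the YRBS processing code: the record fields (`grade`, `ever_smoked`, `days_smoked_past_30`), the specific edit rules, and the function names are hypothetical, and the adjustment for sample characteristics (post-stratification) is left out.

```python
def edit_record(record):
    """Return a list of edit flags for one hypothetical student record."""
    flags = []
    # Out-of-range check: grade must be 9 through 12.
    if record["grade"] not in (9, 10, 11, 12):
        flags.append("grade_out_of_range")
    # Out-of-range check: days smoked in the past 30 days must be 0 to 30.
    if not 0 <= record["days_smoked_past_30"] <= 30:
        flags.append("days_out_of_range")
    # Consistency check: a respondent who never smoked cannot report smoking days.
    if record["ever_smoked"] == "no" and record["days_smoked_past_30"] > 0:
        flags.append("smoking_inconsistent")
    return flags

def final_weight(selection_prob, n_sampled, n_responded):
    """Inverse selection probability times a simple nonresponse adjustment."""
    base_weight = 1.0 / selection_prob
    nonresponse_adjustment = n_sampled / n_responded
    return base_weight * nonresponse_adjustment

# Example record that fails one range check and the consistency check.
record = {"grade": 13, "ever_smoked": "no", "days_smoked_past_30": 3}
flags = edit_record(record)

# A student selected with probability 0.02, in a class where 80 of 100
# sampled students responded.
weight = final_weight(selection_prob=0.02, n_sampled=100, n_responded=80)
```

Each respondent's weight can be read as the number of population members that respondent represents, which is why lower selection probabilities and lower response rates both push the weight up.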
Conduct Analyses. Analyses are done using a software package that incorporates the sample design. Several software packages have been developed to analyze complex survey data (e.g., Stata, SUDAAN, and WesVar). Without software that accounts for the sample design, accurate standard errors cannot be produced. Analyses are used to answer the questions that were originally developed to identify the purpose and population of interest for the survey. They should be kept as simple as possible to enhance the usefulness of the data for multiple audiences. To analyze YRBS data, SAS and SUDAAN are used to compute prevalence estimates and 95 percent confidence intervals for priority health risk behaviors among high school students.
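To show why the sample design matters for standard errors, the following Python sketch computes a weighted prevalence estimate and an approximate 95 percent confidence interval. It is not the SUDAAN or YRBS method: it assumes a single stratum, treats the primary sampling units (PSUs) as if sampled with replacement, uses Taylor linearization for the variance, and uses a normal multiplier of 1.96 where survey packages typically use a t distribution. The function name and record layout are hypothetical.

```python
import math
from collections import defaultdict

def weighted_prevalence_ci(records, z=1.96):
    """Estimate prevalence and a confidence interval from clustered data.

    records: iterable of (psu_id, weight, y) tuples, with y equal to 0 or 1.
    """
    records = list(records)
    total_weight = sum(w for _, w, _ in records)
    p = sum(w * y for _, w, y in records) / total_weight

    # Linearized residuals, summed within each PSU so that the correlation
    # among students in the same cluster is reflected in the variance.
    psu_totals = defaultdict(float)
    for psu, w, y in records:
        psu_totals[psu] += w * (y - p) / total_weight

    n_psu = len(psu_totals)
    if n_psu < 2:
        raise ValueError("at least two PSUs are needed to estimate variance")

    variance = n_psu / (n_psu - 1) * sum(t * t for t in psu_totals.values())
    se = math.sqrt(variance)
    lower = max(0.0, p - z * se)
    upper = min(1.0, p + z * se)
    return p, se, (lower, upper)

# Hypothetical data: three schools (PSUs), two students each.
data = [("a", 1, 1), ("a", 1, 0),
        ("b", 1, 1), ("b", 1, 1),
        ("c", 2, 0), ("c", 2, 0)]
prevalence, std_error, interval = weighted_prevalence_ci(data)
```

Because answers within a school tend to be alike, the cluster-based standard error is usually larger than one computed as if the students were a simple random sample, which is the inaccuracy that design-aware software avoids.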
Write and Disseminate Reports. Survey data are often used to improve policies and programs. Consequently, key decision makers need to be able to access and understand the data. In addition to formal research papers, other methods can be used to share survey results, such as press releases, fact sheets or pamphlets, or Internet materials. Considering the needs and interests of the target audience increases the likelihood that the audience will act on the results. YRBS data are disseminated in MMWR Surveillance Summaries, fact sheets, a CD-ROM (YOUTH '99), and on the CDC web site.
CHARLES W. WARREN
(SEE ALSO: Behavioral Risk Factor Surveillance System; Census; Cohort Study; Data Sources and Collection Methods; National Health Surveys; Sampling; Statistics for Public Health; Surveillance; Surveys)
Centers for Disease Control and Prevention (2000). Youth Risk Behavior Surveillance System At-A-Glance. Atlanta, GA: Author.
Centers for Disease Control and Prevention (2000). "CDC Surveillance Summaries, June 9, 2000." Morbidity and Mortality Weekly Report 49(SS-5)-16.