Criterion-referenced tests are used to determine what students can do and what they know based on a predetermined, specific set of educational outcomes. These outcomes can be determined by the instructor, school, district, or state based on the curriculum standards that are set. Criterion-referenced tests do not compare students to other students, which is the purpose of norm-referenced tests. As long as the criterion-referenced test is properly aligned to the expected educational outcomes, it can give detailed information about how well a student has performed on each outcome included on the test (Bond, 1995).
Keywords Assessment; Bell Curve; Content Validity; Criterion-Referenced Tests; Cut Scores; Evaluation; High School Exit Exams; High-Stakes Tests; No Child Left Behind Act of 2001 (NCLB); Norm Group; Norm-Referenced Tests; Rubric; Percentile Rank; Standardized Tests
Criterion-referenced test scores can indicate whether students meet, do not meet, or exceed the predetermined acceptable standards that were assessed (Taylor & Walton, 2001).
For certain criterion-referenced tests, such as high school exit or grade advancement exams, “tests have cut scores, which are scores that determine whether a student passes or fails” (Bracey, 2000, p. 8). This cut score, which can also be referred to as the criterion, determines success or failure with the only concern being whether or not the student has attained the cut score. For example, on a high school exit exam, if the cut score is determined to be 65, all that matters is whether or not students get a 65 or better and not what their exact score is. Those students who score 65 or better pass, and those students who score 64 or below fail (Bracey, 2000). Those who achieve that score will receive their diploma; those who do not may be given another chance to take the assessment, may be referred for remediation, may have to take additional courses, or may have to repeat the school year.
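The pass/fail logic of a cut score can be sketched in a few lines of Python. This is only an illustration of the exit-exam example above: the cut score of 65 comes from the text, while the function name `passes` and the student scores are made up for the sketch.

```python
CUT_SCORE = 65  # the example cut score from the text


def passes(score: int, cut_score: int = CUT_SCORE) -> bool:
    """Return True when the score meets or exceeds the cut score."""
    return score >= cut_score


# Hypothetical scores, chosen to sit on either side of the cut score.
scores = {"Student A": 64, "Student B": 65, "Student C": 98}
results = {name: ("pass" if passes(s) else "fail") for name, s in scores.items()}
print(results)
# {'Student A': 'fail', 'Student B': 'pass', 'Student C': 'pass'}
```

Students B and C receive the same result even though their scores differ by 33 points, which is the sense in which only attaining the cut score matters.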
Criterion-referenced tests are used to meet the requirements of the No Child Left Behind Act of 2001 (NCLB), one goal of which is to raise instructional standards by mandating that states challenge their students in mathematics, reading/English language arts, and, by the end of the 2007-2008 school year, science. NCLB added mandatory testing and federal reporting, with potentially serious consequences for states and districts that do not demonstrate 'adequate yearly progress,' making this type of criterion-referenced testing a form of high-stakes testing.
NCLB stipulates that states are “to set challenging academic content standards and that the assessments must be aligned with those standards” (Linn, 2005, p. 81). However, NCLB does not define content standards, set the performance standards for each state, or detail the type of assessments and cut scores that should be used, leaving these determinations up to each individual state. NCLB does specify that states must set their annual measurable objectives “based on the percentage of students performing at or above proficiency. These standards are used to determine if schools, districts and states make adequate yearly progress, and the progress targets must be set so all students will be at or above the proficient level by 2014” (Linn, 2005, p. 91). If the percentage of students passing state tests falls short of the target, the school has not made adequate yearly progress, and sanctions can be imposed.
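The "percentage of students performing at or above proficiency" is simple arithmetic, sketched below. Every number here is hypothetical: the scores, the proficiency cut score, and the annual target are invented for illustration, since each state set its own values, and the function names are not taken from any official methodology.

```python
def percent_proficient(scores, proficiency_cut):
    """Percentage of students scoring at or above the proficiency cut score."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_or_above = sum(1 for s in scores if s >= proficiency_cut)
    return 100.0 * at_or_above / len(scores)


def meets_target(scores, proficiency_cut, target_percent):
    """True when the percentage proficient reaches the annual target."""
    return percent_proficient(scores, proficiency_cut) >= target_percent


# All values below are made up for the sketch.
school_scores = [72, 58, 91, 65, 80, 49, 77, 66, 85, 60]
PROFICIENCY_CUT = 65   # hypothetical state-defined cut score
ANNUAL_TARGET = 75.0   # hypothetical annual measurable objective, in percent

print(percent_proficient(school_scores, PROFICIENCY_CUT))           # 70.0
print(meets_target(school_scores, PROFICIENCY_CUT, ANNUAL_TARGET))  # False
```

In this made-up case 7 of 10 students are proficient, so the school falls short of the hypothetical 75 percent target.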
Criterion-Referenced Tests vs. Norm-Referenced Tests
It is possible for students in states that use both norm-referenced and criterion-referenced tests to score well on one test and not as well on the other because the tests differ. For example, the criterion-referenced test is not timed and the norm-referenced test is; the criterion-referenced test may give credit for demonstrating the proper technique even if the final answer is not correct; and a norm-referenced test will split the test takers so that half fall above the 50th percentile and half fall below it (Taylor & Walton, 2001). This means that a student who did well on a norm-referenced test, getting 85 percent of the questions correct, will still fall below the 50th percentile if a majority of the other students taking the test correctly answered 86 percent or more of the questions. On a criterion-referenced test, correctly answering 85 percent of the questions would normally earn the student a grade of B.
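The contrast in the 85 percent example can be worked through in code. The sketch below rests on several assumptions: the norm group is made up, percentile rank is taken to be the percentage of the norm group scoring strictly below the student (one common definition among several), and the 90/80/70/60 letter-grade scale is a hypothetical one that is consistent with 85 percent earning a B.

```python
def percentile_rank(score, norm_group):
    """Percentage of the norm group scoring strictly below the given score."""
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)


def letter_grade(percent_correct):
    """Map percent correct to a letter grade on a hypothetical fixed scale."""
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percent_correct >= cutoff:
            return grade
    return "F"


# Made-up norm group in which most test takers scored 86 percent or more.
norm_group = [86, 86, 90, 86, 88, 80, 92, 86, 70, 87]
student_score = 85

print(percentile_rank(student_score, norm_group))  # 20.0
print(letter_grade(student_score))                 # B
```

The same raw score lands well below the 50th percentile when it is compared with this norm group, yet earns a B when it is compared with a fixed criterion.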
Criterion-referenced tests are developed by reviewing a set of objectives or a curriculum and then composing the test questions with a goal of having the test “determine how well the students have mastered the identified objectives or curriculum. As with teacher-made tests, a criterion-referenced test can contain words that are unusual or rare in everyday speech and reading, as long as they occur in the curriculum and as long as the students have had an opportunity to learn them. With a criterion-referenced test, we are not much interested in differentiating students by their scores” (Bracey, 2000, p. 9). Ideally, all students would earn a passing score. Whereas norm-referenced tests normally rank students by percentile, criterion-referenced tests tend to differentiate students in terms of whether or not they have mastered the material or exceeded expectations. As with “norm-referenced tests, criterion-referenced tests should be evaluated in terms of reliability and content validity” before being used (Bracey, 2000, p. 9).
Criterion-referenced tests have many advantages and uses. A properly aligned criterion-referenced test can give detailed information about how well a student has performed on each educational outcome or goal included on a test (Bond, 1995). For example, a criterion-referenced mathematics test focusing on fractions can pinpoint each student's strengths and weaknesses because instructors will be able to see whether their students have mastered adding, subtracting, multiplying, and dividing fractions based on the questions they answer correctly. In addition to the primary competencies, instructors will also be able to determine whether their students know basic mathematical concepts by how they work out each problem. Instructors can use criterion-referenced tests to determine whether their students are learning the curriculum and how well the instructors themselves are teaching it.
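The per-outcome reporting described in the fractions example might be sketched as follows. The item responses, the four items per skill, and the 75 percent mastery threshold are all hypothetical, as is the function name `mastery_by_objective`; the point is only that each question is tied to an objective so results can be reported objective by objective.

```python
from collections import defaultdict

MASTERY_THRESHOLD = 0.75  # hypothetical share of items needed to count as mastery


def mastery_by_objective(responses):
    """Share of items answered correctly for each objective.

    responses: iterable of (objective, is_correct) pairs, one per test item.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for objective, is_correct in responses:
        totals[objective] += 1
        if is_correct:
            correct[objective] += 1
    return {obj: correct[obj] / totals[obj] for obj in totals}


# Made-up responses for one student: four items per fraction skill.
responses = [
    ("adding", True), ("adding", True), ("adding", True), ("adding", True),
    ("subtracting", True), ("subtracting", True),
    ("subtracting", True), ("subtracting", False),
    ("multiplying", True), ("multiplying", False),
    ("multiplying", False), ("multiplying", True),
    ("dividing", False), ("dividing", False),
    ("dividing", True), ("dividing", False),
]

report = mastery_by_objective(responses)
needs_work = [obj for obj, share in report.items() if share < MASTERY_THRESHOLD]
print(report)
print(needs_work)  # ['multiplying', 'dividing']
```

A single total score (10 of 16 here) would hide the fact that this student has mastered addition and subtraction of fractions but not multiplication or division.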
If the subject matter of the tests is properly coordinated with the content of the instruction, criterion-referenced tests can also give students, their parents, and instructors more information about exactly what students have learned, which can help everyone focus on which competencies have been mastered and which still need work (Bond, 1995). Many proponents of using criterion-referenced tests instead of norm-referenced tests believe that the grades produced by a norm-referenced test are devalued because they are assigned regardless of what the test scores actually are. Thus, receiving the highest grade in the class is not much of an achievement if the high score was 45 out of 100 and the student received an A nonetheless. On a norm-referenced test, half of any class scores above the average and half scores below it; the other grades are assigned according to predetermined distribution patterns that form a traditional bell curve. Advocates of using norm-referenced tests instead of criterion-referenced tests contend that the grades resulting from criterion-referenced tests are cheapened because more "A" grades can be awarded than a norm-referenced test would produce: in theory, all students in the classroom can correctly answer almost all the questions on a criterion-referenced test and be given a grade of "A". This can lead to a debate about whether a grade of "A" is more meaningful when fewer of them are earned, as is the case with a norm-referenced test, or whether it is more important to show that more students have mastered the...