Keywords Analytical Scoring; Assessment; Concurrent Validity; Construct Validity; Contrastive Rhetoric; Critical Language Testing; High-Stakes Tests; Holistic Scoring; Interrater Reliability; Second Language (L2) Learners; Writing Assessment; Writing Portfolio
English as a Second Language: Writing Assessment for Second Language Learners
In the evaluation of second language (L2) learners' writing abilities, many factors can influence the fairness and accuracy of the assessment. As in evaluating native speakers, the assessment results may vary depending on what kind of test is given, who scores the test and what criteria are applied to the final product. Because L2 writing assessments are frequently used not just to measure writing ability, but also language proficiency, an additional complicating factor is present. This article examines the most important factors that must be considered when appraising L2 student writing.
Before a writing assessment is designed or administered, its purpose must be clearly identified. Three common purposes for evaluating L2 writers are:
• Diagnostic or entrance;
• Developmental or progress; and
• Promotional or exit.

Diagnostic or entrance exams are given in order to determine placement within language programs. Developmental or progress assessments give an indication of a writer's growth over time. Promotional or exit tests assess whether the writer is ready to advance to the next level or graduate from a program.
The assessment's purpose influences the choice of rating scale used to evaluate the test. Typically, assessments are evaluated using either holistic or analytical scales. Holistic scoring involves assigning a single score based on a rater's overall impression of a particular piece of work. The rater is usually given a rubric of criteria outlining expected norms for each level of writing. However, scoring relies heavily on a rater's training and expertise to intuitively weight the criteria before producing a final mark. Holistic scoring is frequently used for diagnostic testing. It is also popular for large-scale assessments such as the Test of English as a Foreign Language (TOEFL) because it saves time and money.
Analytic scoring, on the other hand, is often used when the assessment is meant to provide feedback regarding a student's progress or readiness for promotion. Analytic scales identify one or more traits of writing to be assessed. Primary trait scales evaluate one main aspect of a text. Multiple trait scales focus on more than one component of a work (Bacha, 2001).
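The contrast between the two approaches can be made concrete with a small sketch. The trait names, weights, and band range below are invented for illustration only; they are not taken from the article or from any published scale.

```python
# Illustrative sketch (hypothetical traits and weights): a holistic rater
# assigns one overall band score, while a multiple-trait analytic scale
# reports a weighted composite of separately judged traits.

def holistic_score(impression: int) -> int:
    """Return the rater's single overall band score (assumed 1-6 here)."""
    if not 1 <= impression <= 6:
        raise ValueError("band score must be between 1 and 6")
    return impression

def analytic_score(trait_scores: dict[str, int],
                   weights: dict[str, float]) -> float:
    """Combine separate trait scores into one weighted composite."""
    if set(trait_scores) != set(weights):
        raise ValueError("traits and weights must match")
    return sum(trait_scores[t] * weights[t] for t in trait_scores)

# Hypothetical rubric: three traits, weights summing to 1.0.
weights = {"organization": 0.40, "grammar": 0.35, "vocabulary": 0.25}
scores = {"organization": 5, "grammar": 3, "vocabulary": 4}
composite = analytic_score(scores, weights)  # weighted composite, about 4.05
```

Unlike the holistic band, the composite preserves the per-trait scores, which is why analytic scales suit feedback on progress or promotion.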
An issue in both analytic and holistic scoring is determining what aspects of the writing should be assessed. In other words, what are the components of good writing? Investigations in both the field of composition and second language learning have been conducted to answer this question (Casanave, 2004; Cumming, Kantor & Powers, 2002). However, because writing is a complex mental activity that involves creativity, because people write for many purposes, and because format affects style, no single set of criteria can be said to definitively capture the best qualities of writing. Moreover, no one has yet identified a definite developmental sequence in writing. Rather, students' writing performance appears to be variable: a student may perform well in one area, such as writing a complex sentence, while doing poorly in another, such as organizing a longer text (Bacha, 2001).
What Should Be Assessed?
Despite the difficulties in defining the best qualities of writing, criteria must be established in order to make judgments. Studies have been conducted to determine what native speakers and experienced raters consider to be good writing. Cumming et al. (2002) compared the evaluation processes of experienced English mother-tongue composition raters and experienced ESL/EFL raters and found that both groups tended to report the following qualities as being particularly effective in writing for a composition exam:
• Rhetorical organization: including introductory statements, development, cohesion, fulfillment of the writing task;
• Expression of ideas: including logic, argumentation, clarity, uniqueness, and supporting points;
• Accuracy and fluency of English grammar and vocabulary; and
• The amount of written text produced (p. 72).
Other researchers have developed frameworks for evaluation that include specific grammatical and discourse features. Chiang (1999), in a study that looked at the relative importance of these features to raters, evaluated 35 textual features under the categories of morphology, syntax, cohesion and coherence. Haan & Van Esch's (2004) framework considered overall quality, linguistic accuracy, syntactic complexity, lexical features, content, mechanics, coherence & discourse, fluency and revision. Connor and Mbaye (2002), in an attempt to apply a model of communicative competence to writing assessment, suggest the following breakdown of linguistic skills:
• Grammatical competence - spelling, punctuation, knowledge of words and structures;
• Discourse competence - knowledge of discourse, organization of genre, cohesion and coherence;
• Sociolinguistic competence - written genre appropriateness, audience awareness and appeals to audience, pertinence of claim and tone; and
• Strategic competence - use of transitions and metatextual markers.
Finally, in what is only a brief list of ways to categorize writing components, the ESL Composition Profile, which has been frequently adopted or modified for research and assessment, consists of five categories:
• Content;
• Organization;
• Vocabulary;
• Language use; and
• Mechanics (Jacobs, Zinkgraf, Wormuth, Hartfiel, & Hughey, 1981).
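A profile of this kind is typically operationalized by giving each category a maximum point value and summing the awarded points into a total out of 100. The sketch below follows the commonly cited weighting of the Jacobs et al. profile, but the per-band cut-offs of the published instrument are omitted, and the sample essay scores are invented.

```python
# Sketch of a weighted composite in the spirit of the ESL Composition
# Profile (Jacobs et al., 1981). Category maxima follow the commonly
# cited distribution; band descriptors and cut-offs are not reproduced.

PROFILE_MAX = {
    "content": 30,
    "organization": 20,
    "vocabulary": 20,
    "language use": 25,
    "mechanics": 5,
}  # maxima sum to 100

def profile_total(awarded: dict[str, int]) -> int:
    """Sum category scores, checking each stays within its maximum."""
    for category, points in awarded.items():
        if not 0 <= points <= PROFILE_MAX[category]:
            raise ValueError(f"{category}: {points} outside 0-{PROFILE_MAX[category]}")
    return sum(awarded.values())

# Invented scores for one essay.
essay = {"content": 24, "organization": 15, "vocabulary": 16,
         "language use": 18, "mechanics": 4}
total = profile_total(essay)  # 77 out of 100
```

The unequal maxima make the scale's priorities explicit: content and language use carry far more weight than mechanics.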
While having a clearly defined purpose and assessment scale are important, these two factors alone cannot ensure that tests are scored fairly and accurately. Into the testing mix are thrown several other variables that influence the outcome of any assessment. Kroll (1998) summarizes these critical variables as follows:
• The writer whose work is to be assessed;
• The writing task(s) that writers have been asked to complete;
• The written product(s) subject to assessment;
• The reader(s) who score or rank the product;
• The scoring procedures, which can be subdivided into the scoring guidelines and the actual reading procedures (p. 223).
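One standard check on the reader and scoring-procedure variables in Kroll's list is interrater reliability. The simplest index is the proportion of essays on which two raters assign identical scores; the sketch below uses invented scores and is illustrative only (operational programs typically use more robust statistics, such as correlation or kappa coefficients).

```python
# Illustrative sketch: percent exact agreement between two raters who
# each scored the same set of essays. The score lists are invented.

def exact_agreement(rater_a: list[int], rater_b: list[int]) -> float:
    """Proportion of essays given identical scores by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical band scores from two raters for eight essays.
rater_1 = [4, 3, 5, 2, 4, 3, 5, 4]
rater_2 = [4, 3, 4, 2, 4, 2, 5, 4]
agreement = exact_agreement(rater_1, rater_2)  # 6 of 8 essays match: 0.75
```

Low agreement signals that the scoring guidelines or rater training, rather than the writers, may be driving score differences.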
The writer, naturally, has a central role in the outcome of the writing assessment. The writer brings to the task a unique background comprising several variables that affect writing performance. Level of language proficiency, cultural background, familiarity with testing situations, motivation to complete the task and prior educational experiences can all influence how well the writer performs in a novel testing situation (Kroll, 1998).
Creating a writing task that is valid and reliable is important in any kind of assessment. In writing, tests are said to have content validity when they ask test takers to perform the same kinds of tasks that they would in the classroom (Bacha, 2001). Yet ESL writers in a language program typically represent multiple disciplines. Creating prompts that are general enough to be fair to all but specific enough to allow individuals to draw upon prior knowledge can be difficult. The wording of the prompt, its mode of discourse, rhetorical specifications and subject matter can all affect results (Tedick, 1990). For instance, Tedick (1990) compared the impact on writing performance of a field-specific topic versus a general writing topic. Drawing on research in cognitive psychology and reading research that shows comprehension and understanding of a...