Inputs and outcomes – how we wound up with two systems for grading students in English language programs

Assigning final grades to students has been done in various ways over the years. In some contexts, everything rested on a final exam – this was the case with the O-level and A-level exams I took in a British high school ‘back in the day.’ Then ‘continuous assessment’ became popular, making the final grade a composite of grades for assignments completed during the course, either with or without a final exam. This approach became popular in U.S. intensive English programs, where the final grade might be made up of homework assignments, projects, tests and quizzes, and the usually ill-defined ‘participation’ by the student.

But English language programs, like all other schools in the U.S., became caught up in larger forces that had an enormous impact on how students were evaluated. Following the successful launch of the Soviet satellite Sputnik in 1957, there was much nail-biting over the quality of American education, culminating in the ‘A Nation at Risk’ report of 1983, which painted an anxiety-inducing picture of failing U.S. public schools.

In the years following the report’s publication, the way quality in education was defined came into question. Quality had tended to be defined by inputs – the number of hours in class (this is where the ‘credit-hour’ system came from), teacher qualifications, teaching methods, and so on. Many schools in competitive environments still make such inputs the basis of their quality claims – “highly qualified teachers!” “innovative teaching methods!”

But those raising the red flag about school quality were less concerned with inputs and more concerned with what students came out of their education with – that is, the outcomes, or as they have come to be known, student learning outcomes (SLOs). No matter how great the inputs, if students were not learning knowledge and skills useful for the job market, the education they were receiving was not valuable. The solution was to turn the traditional curriculum-planning process around: start at the end by defining the desired outcomes, then design courses to lead students to achieve them.

This enabled education bureaucracies to hold schools and teachers accountable: school and teacher quality could be judged not by the qualifications of the teachers or the hours spent in class, but by the extent to which students were meeting the defined learning outcomes. In the public schools, those outcomes were assessed by standardized tests, and schools were judged and ranked by how well students scored on those tests. (The downside of all this was that aspects of school quality such as adequate breaks between classes and time for the arts, music, and sports suffered as schools homed in on efforts to increase standardized test scores in math, science, and English.)

Back to grading. ESL teachers have for many years been used to giving final grades based on a combination of tests and quizzes, homework assignments, projects, participation, and final exams. But with the shift toward accreditation of English language programs – mandatory in many cases, voluntary in others – teachers in those programs must now define learning outcomes at the outset, and assess and evaluate students with sole reference to the students’ achievement of those outcomes. This has to be done at the school level, and it results in greater standardization of curricula, syllabi, and assessments. Schools are required to record and analyze the data arising from the assessment of SLO achievement. Decisions about whether a student may progress to the next level of study or complete the program successfully must be made solely on the basis of whether the student achieved the learning outcomes.

The result is that schools have to take a mixed approach to grading students. Schools may still assign a traditional grade based on continuous assessment and participation, but they must also maintain a system that isolates achievement of the student learning outcomes and makes promotion and completion decisions based on that. What’s certain is that choosing one or the other of these two systems is not possible – both are needed. Yes, we can agree that it’s important for students and their sponsors to understand what the expected outcome of a course or program was and whether the student achieved it. This kind of accountability is needed when many are questioning the dollar value of their education. But as educators we also want to know whether students engaged with the educational process – collaborated with peers, challenged themselves on difficult projects or assignments, sought help and advice and gave them to others in turn. How the students got there is important to us, and still largely defines the benefit of studying at one school rather than another. 
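To make the two systems concrete, here is a minimal sketch of how a program might keep both side by side. This is a hedged illustration only: the weights, outcome names, and 70-point mastery threshold are all hypothetical, and every program defines its own.

```python
# Minimal sketch of a dual-track grading record (all numbers hypothetical).
# Track 1: traditional composite grade from continuous assessment.
# Track 2: promotion decision based solely on SLO achievement.

GRADE_WEIGHTS = {"homework": 0.2, "projects": 0.2, "quizzes": 0.2,
                 "participation": 0.1, "final_exam": 0.3}

def traditional_grade(scores: dict) -> float:
    """Weighted composite of coursework and participation (0-100)."""
    return sum(GRADE_WEIGHTS[cat] * scores[cat] for cat in GRADE_WEIGHTS)

def outcomes_decision(slo_scores: dict, threshold: float = 70.0) -> bool:
    """True only if every learning outcome meets the mastery threshold."""
    return all(score >= threshold for score in slo_scores.values())

scores = {"homework": 85, "projects": 78, "quizzes": 90,
          "participation": 95, "final_exam": 72}
slos = {"narrate a past event": 74, "summarize a short lecture": 68}

print(f"Course grade: {traditional_grade(scores):.1f}")     # 81.7
print(f"Promote to next level: {outcomes_decision(slos)}")  # False
```

Note how the two tracks can diverge: this hypothetical student earns a respectable composite grade but is not promoted, because one outcome falls short of the mastery threshold – precisely the separation that accreditors require.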

And so we ended up with two types of student assessment and evaluation, one based on inputs into the process and ongoing or continuous assessment, the other on outcomes. Both systems are here to stay, and educators need to be familiar with the rationale and procedures for each of them. 

Background photo created by pressfoto – www.freepik.com

How SWBATs and can-do statements shortchange language learners

“Can keep up with an animated discussion, identifying accurately arguments supporting and opposing points of view.” “Can tell a story or describe something in a simple list of points.” If your program is using Common European Framework of Reference (CEFR) descriptors as its outcomes statements, you’ll be familiar with ‘can-do’ statements like these.

The CEFR was developed as a means to assess and describe language proficiency. It was built on the European tradition of communicative language teaching (CLT), which emphasized the performance of language tasks. Since language performance can be observed, the CEFR’s can-do statements were a perfect match for the measurable-outcomes-based accountability initiatives that came in the wake of No Child Left Behind. Many teachers have been trained, encouraged, or badgered to plan their lessons and courses around SWBAT (‘students will be able to’) or can-do statements.

There is a persuasive case to be made that CEFR (and similar) performance statements are a useful way to describe language proficiency. Employers, for example, want to know what a potential employee can do in a language – what practical uses the employee can put the language to. Language educators are not employers, though. What language educators need to know is whether and to what extent learning has taken place, and here’s the problem.

Broadly speaking, two educational traditions have informed language teaching: the behavioral and the cognitive. Behaviorists see learning as a change in behavior, one that can be observed or measured. Cognitivists see learning as acquiring and understanding knowledge. The cognitivist tradition fell out of fashion with the demise of the grammar-translation method and the rise of behavior-based approaches to language teaching. These days, we can probably all agree that language learning involves both: the acquisition or construction of a mental representation of the language, and the skill required to use it in practice. When our outcomes are can-do statements, we focus on observable or measurable behaviors – whether the learner ‘can tell a story’ or ‘keep up with an animated discussion,’ for example – but tend to pay less attention to acquired or constructed knowledge.

If you have taught students from various countries, you know that some are great performers even if they lack a solid language base – somehow, they manage to draw on sparse linguistic resources to communicate. On the other hand, you know that some learners have extensive language knowledge, especially of grammar and vocabulary, but have a great deal of difficulty ‘performing.’ Hence Chomsky’s observation about language proficiency: “behavior is only one kind of evidence, sometimes not the best, and surely no criterion for knowledge” (as cited in Widdowson, 1990). The one is not necessarily indicative of the other.

If you are an educator (as opposed to an employer), you are interested in student learning in any form. You want to know what progress a learner has made. From a cognitive point of view, that includes changes in the learner’s mental representation of the language – a clearer understanding of the form, meaning, and use of the present perfect, for example – even if that has not yet resulted in a change in behavior, such as the ability to use that tense easily in conversation. A learner who has made great strides in his or her mental representation of the language but is still speaking in telegraphic speech may be of little interest to an employer, but should be of great interest to an educator, because learning has taken place that is a basis for future teaching. Assessment and description of the learner’s language should address this type of progress. The behavioral tradition, with its can-do outcomes statements, has no interest in such cognitive development – it takes no notice until there is a change of behavior, an observable, measurable performance.

This approach to assessment shortchanges learners who may have made real progress on the cognitive side. So, I’m calling on language educators not to accept uncritically the use of CEFR and similar performance-based descriptors as measures of language learning.

Reference
Widdowson, H. G. (1990). Aspects of Language Teaching. Oxford University Press.

The Accreditation-Ready Program

Few obligations for faculty and staff cause more knots in the stomach and more departmental wrangling than preparing the accreditation self-study. It is often viewed as a burden, a distraction from everyone’s ‘real’ work, and a process of bureaucratic box-checking or of trying to fit the round peg of the program into the square hole of accreditation requirements.

In Five Dimensions of Quality, Linda Suskie draws on years of experience with accreditation, institutional and program assessment, and accountability to re-frame the role of accreditors as “low-cost consultants who can offer excellent collegial advice” (p. 245) to schools and programs seeking to demonstrate their value to stakeholders in an increasingly competitive market. Accreditation should be viewed not as an imposition of alien practices on an established program, but as a way for a school or program to gain external affirmation of already-existing quality. The challenge is not to make the program ‘fit’ accreditation standards, but actually to be a quality program and demonstrate that quality.

Accreditation success, then, flows naturally from the pursuit of quality, and is not an end in itself. But what is quality? Suskie breaks it down into five dimensions or ‘cultures’:

A Culture of Relevance
Deploying resources effectively to put students first and to understand and meet stakeholders’ needs.

A Culture of Community
Fostering trust among faculty, students, and staff, communicating openly and honestly, and encouraging collaboration.

A Culture of Focus and Aspiration
Being clear about school or program purpose, values, and goals.

A Culture of Evidence
Collecting evidence to gauge student learning and program or school effectiveness.

A Culture of Betterment
Using evidence to make improvements and deploy resources effectively.

Fostering these cultures is the work of leadership, since they require widespread buy-in from all stakeholders. The challenge in many institutions is institutional inertia, as Suskie points out in her chapter, “Why is this so hard?” Faculty, staff, and governing boards may feel satisfied that the school’s reputation is sufficient for future success; resources – especially money and people’s time – may not be forthcoming; faculty and staff may live in comfortable isolation from the real-world needs of students; there may be an ingrained reluctance to communicate successes; there is frequently resistance to change; and siloed departments in programs and institutions make across-the-board cultural change difficult to pull off.

The question administrators and faculty should ask themselves is, “Do we put our efforts into pursuing quality, or into maintaining our accreditation?” Suskie’s book presents a convincing case that working on the former will make the latter much easier and will result in quality rather than box-checking. For its straightforward prose (including jargon alerts scattered throughout), its sound advice, and its call for schools to demonstrate quality in a highly competitive environment, Five Dimensions of Quality should be a go-to resource on the reference bookshelf of decision-makers and leaders in higher education programs.

Suskie, L. (2015). Five Dimensions of Quality. Jossey-Bass.

More of my education-related book reviews are at Amazon.

Why language is best assessed by real people


“Classroom decoration 18” by Cal America is licensed under CC BY 2.0

What is the most effective way to assess English learners’ proficiency?

It has become accepted in the field to rely on psychometric tests such as the TOEFL iBT (internet-based TOEFL) and the IELTS for college and university admissions. Yet these and most other language tests are an artifice, a device placed between the student’s actual proficiency and direct observation of that proficiency by a real human being. Students complete the limited set of tasks on the test, and based on the results, an algorithm extrapolates to their broader language abilities.

When you look at a TOEFL score report, it does not tell you the student’s English language ability; what it tells you is what a learner with that set of scores can typically do. And in the case of the TOEFL, this description is an evaluation based largely on multiple-choice answers, involving not one single encounter with an actual human being. On this basis, university admissions officers are expected to make an assumption about the student’s ability to handle the demands of extensive academic reading and writing, classroom participation, social interaction, written and spoken communication with university faculty and staff, SEVIS regulations, and the multiple other demands of the U.S. college environment. (Although the IELTS includes interaction with the examiner and another student, these interactions are highly structured and not very natural. TOEFL writing and speaking tasks are limited, artificial, and assessed by a grader who has only a text or sound file to work with.)

Contrast that with regular, direct observation of students’ language proficiency by a trained and experienced instructor over a period of time. The instructor can set up a variety of language situations involving variation in interlocutors, contexts, vocabulary, levels of formality, and communication goals. In an ACCET- or CEA-accredited intensive English program, such tasks are linked to documented learning objectives. By directly observing students’ performance, instructors can build a rich picture of each student’s proficiency and comment specifically on each student’s strengths and weaknesses.

Consider this a call, then, for colleges and universities to enter into agreements with accredited intensive English programs to waive the need for a standardized test such as the TOEFL. Just as those colleges and universities don’t use a standardized test to measure the learning of their graduates, they should be open to accepting the good judgment of teachers in intensive English programs – judgment based on direct observation of individual learners rather than the proxy scores obtained by impersonal, artificial tests.

Aligning assessment and IEP culture

Since the passage of the Accreditation of English Language Training Programs Act in 2010, intensive English programs (IEPs) have been under pressure to justify their quality claims by recording and reporting on student achievement. This has meant devising program-wide systems for assessing and evaluating students, which has been a challenge for many IEPs.

The type of system a program develops is influenced by its culture. A more managerial (top-down, administratively driven) culture, typical of proprietary English schools, tends to favor standardization of assessment, including program-wide level-end tests. Many university IEPs have a more collegial culture (faculty-driven, with a degree of shared governance) in which individual faculty decision-making and autonomy are valued. In the latter type, an attempt to introduce or impose standardized testing can grate against the culture. It may be more agreeable to retain faculty autonomy in assessment but introduce checks to ensure that assessments are aligned with course objectives and outcomes, as in the sketch below.
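To illustrate that second approach, here is a minimal sketch of such an alignment check. The course outcomes and assessment names are invented for illustration; a real program would draw them from its own curriculum documents.

```python
# Hypothetical alignment check: flag any course outcome that no
# instructor-designed assessment claims to measure. Names are invented.

course_outcomes = {
    "SLO1: narrate a past event",
    "SLO2: summarize a short lecture",
    "SLO3: write a compare/contrast paragraph",
}

# Each assessment declares the outcomes it is aligned with.
assessments = {
    "midterm oral interview": {"SLO1: narrate a past event"},
    "listening journal": {"SLO2: summarize a short lecture"},
}

covered = set().union(*assessments.values())
uncovered = course_outcomes - covered
if uncovered:
    print("Outcomes with no aligned assessment:", sorted(uncovered))
```

A check like this preserves faculty autonomy over how each outcome is assessed while giving the program documented evidence that every outcome is assessed somewhere.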

Both approaches (and blends of the two) are used by CEA-accredited programs and are able to meet the CEA standards. There is no need to create standard assessments across a program if they do not fit the culture. On the other hand, the imperative to assess students in a more consistent way can be a catalyst for culture change. This will need leadership, persuasion, and buy-in from faculty.

I’ve designed and overseen assessment and evaluation systems in proprietary and university programs, and can support programs in determining and developing the right approach. Get in touch if I can help!

Have a great weekend!

(Learn more about academic cultures in Engaging the Six Cultures of the Academy by William Bergquist and Kenneth Pawlak. I highly recommend it.)