Inputs and outcomes – how we wound up with two systems for grading students in English language programs

Assigning final grades to students has been done in various ways over the years. In some contexts, everything rested on a final exam – this was the case with the O-level and A-level exams I took in a British high school ‘back in the day.’ Then ‘continuous assessment’ became popular, making the final grade a composite of grades for assignments completed during the course, either with or without a final exam. This approach became popular in U.S. intensive English programs, where the final grade might be made up of homework assignments, projects, tests and quizzes, and the usually ill-defined ‘participation’ by the student.

But English language programs, like all other schools in the U.S., became caught up in larger forces that had an enormous impact on how students were evaluated. Following the successful launch of the Soviet satellite Sputnik in 1957, there was much nail-biting over the quality of American education, culminating in the ‘A Nation at Risk’ report of 1983, which painted an anxiety-inducing picture of failing U.S. public schools.

In the years following the report’s publication, the way quality in education was defined came into question. Quality had tended to be defined by inputs – the number of hours in class (this is where the ‘credit-hour’ system came from), teacher qualifications, teaching methods, and so on. Many schools in competitive environments still make such inputs the basis of their quality claims – “highly qualified teachers!” “innovative teaching methods!”

But those raising the red flag about school quality were less concerned with inputs and more concerned with what students came out of their education with – that is, the outcomes, or as they have come to be known, student learning outcomes (SLOs). No matter how great the inputs, if students were not learning knowledge and skills useful for the job market, the education they were receiving was not valuable. The solution was to turn the traditional curriculum-planning process around: start at the end by defining the desired outcomes, then design courses to lead students to achieve them.

This enabled education bureaucracies to hold schools and teachers accountable: school and teacher quality could be judged not by the qualifications of the teachers or the hours spent in class, but by the extent to which students were meeting the defined learning outcomes. In the public schools, those outcomes were assessed by standardized tests, and schools were judged and ranked by how well students scored on those tests. (The downside of all this was that aspects of school quality such as adequate breaks between classes and time for the arts, music, and sports suffered as schools homed in on efforts to increase standardized test scores in math, science, and English.)

Back to grading. ESL teachers have for many years been used to giving final grades based on a combination of tests and quizzes, homework assignments, projects, participation, and final exams. But with the shift toward accreditation of English language programs – mandatory in many cases, voluntary in others – teachers in those programs must now define learning outcomes at the outset, and assess and evaluate students with sole reference to the students’ achievement of those outcomes. This has to be done at the school level, and it results in greater standardization of curricula, syllabi, and assessments. Schools are required to record and analyze the data arising from the assessment of SLO achievement. Decisions about whether a student may progress to the next level of study or complete the program successfully must be made solely on the basis of whether the student achieved the learning outcomes.

The result is that schools have to take a mixed approach to grading students. Schools may still assign a traditional grade based on continuous assessment and participation, but they must also maintain a system that isolates achievement of the student learning outcomes and makes promotion and completion decisions based on that. What’s certain is that choosing one or the other of these two systems is not possible – both are needed. Yes, we can agree that it’s important for students and their sponsors to understand what the expected outcome of a course or program was and whether the student achieved it. This kind of accountability is needed when many are questioning the dollar value of their education. But as educators we also want to know whether students engaged with the educational process – collaborated with peers, challenged themselves on difficult projects or assignments, sought help and advice and gave them to others in turn. How the students got there is important to us, and still largely defines the benefit of studying at one school rather than another. 
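To make the two systems concrete, here is a minimal sketch of how a program might keep both side by side. This is a hedged illustration only: the weights, outcome names, and 70-point mastery threshold are all hypothetical, and every program defines its own.

```python
# Minimal sketch of a dual-track grading record (all numbers hypothetical).
# Track 1: traditional composite grade from continuous assessment.
# Track 2: promotion decision based solely on SLO achievement.

GRADE_WEIGHTS = {"homework": 0.2, "projects": 0.2, "quizzes": 0.2,
                 "participation": 0.1, "final_exam": 0.3}

def traditional_grade(scores: dict) -> float:
    """Weighted composite of coursework and participation (0-100)."""
    return sum(GRADE_WEIGHTS[cat] * scores[cat] for cat in GRADE_WEIGHTS)

def outcomes_decision(slo_scores: dict, threshold: float = 70.0) -> bool:
    """True only if every learning outcome meets the mastery threshold."""
    return all(score >= threshold for score in slo_scores.values())

scores = {"homework": 85, "projects": 78, "quizzes": 90,
          "participation": 95, "final_exam": 72}
slos = {"narrate a past event": 74, "summarize a short lecture": 68}

print(f"Course grade: {traditional_grade(scores):.1f}")     # 81.7
print(f"Promote to next level: {outcomes_decision(slos)}")  # False
```

Note how the two tracks can diverge: this hypothetical student earns a respectable composite grade but is not promoted, because one outcome falls short of the mastery threshold – precisely the separation that accreditors require.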

And so we ended up with two types of student assessment and evaluation, one based on inputs into the process and ongoing or continuous assessment, the other on outcomes. Both systems are here to stay, and educators need to be familiar with the rationale and procedures for each of them. 

Background photo created by pressfoto – www.freepik.com

How SWBATs and can-do statements shortchange language learners

“Can keep up with an animated discussion, identifying accurately arguments supporting and opposing points of view.” “Can tell a story or describe something in a simple list of points.” If your program is using Common European Framework of Reference (CEFR) descriptors as its outcomes statements, you’ll be familiar with ‘can-do’ statements like these.

The CEFR was developed as a means to assess and describe language proficiency. It was built on the European tradition of communicative language teaching (CLT), which emphasized the performance of language tasks. Since language performance can be observed, the CEFR’s can-do statements were a perfect match for the measurable-outcomes-based accountability initiatives that came in the wake of No Child Left Behind. Many teachers have been trained, encouraged, or badgered to plan their lessons and courses around SWBAT (‘students will be able to’) or can-do statements.

There is a persuasive case to be made that CEFR (and similar) performance statements are a useful way to describe language proficiency. Employers, for example, want to know what a potential employee can do in a language – what practical uses the employee can put the language to. Language educators are not employers, though. What language educators need to know is whether and to what extent learning has taken place, and here’s the problem.

Broadly speaking, two educational traditions have informed language teaching: the behavioral and the cognitive. Behaviorists see learning as a change in behavior, one that can be observed or measured. Cognitivists see learning as acquiring and understanding knowledge. The cognitivist tradition fell out of fashion with the demise of the grammar-translation method and the rise of behavior-based approaches to language teaching. These days, we can probably all agree that language learning involves both: the acquisition or construction of a mental representation of the language, and the skill required to use it in practice. When our outcomes are can-do statements, we focus on observable or measurable behaviors – whether the learner ‘can tell a story’ or ‘keep up with an animated discussion,’ for example – but tend to pay less attention to acquired or constructed knowledge.

If you have taught students from various countries, you know that some are great performers even if they lack a solid language base – somehow, they manage to draw on sparse linguistic resources to communicate. On the other hand, you know that some learners have extensive language knowledge, especially of grammar and vocabulary, but have a great deal of difficulty ‘performing.’ Hence Chomsky’s observation about language proficiency: “behavior is only one kind of evidence, sometimes not the best, and surely no criterion for knowledge” (as cited in Widdowson, 1990). The one is not necessarily indicative of the other.

If you are an educator (as opposed to an employer), you are interested in student learning in any form. You want to know what progress a learner has made. From a cognitive point of view, that includes changes in the learner’s mental representation of the language – a clearer understanding of the form, meaning, and use of the present perfect, for example – even if that has not yet resulted in a change in behavior, such as the ability to use that tense easily in conversation. A learner who has made great strides in his or her mental representation of the language but is still speaking in telegraphic speech may be of little interest to an employer, but should be of great interest to an educator, because learning has taken place that is a basis for future teaching. Assessment and description of the learner’s language should address this type of progress. The behavioral tradition, with its can-do outcomes statements, has no interest in such cognitive development – it takes no notice until there is a change of behavior, an observable, measurable performance.

This approach to assessment shortchanges learners who may have made real progress on the cognitive side. So, I’m calling on language educators not to accept uncritically the use of CEFR and similar performance-based descriptors as measures of language learning.

Reference
Widdowson, H. G. (1990). Aspects of Language Teaching. Oxford University Press.

The Accreditation-Ready Program

Few obligations for faculty and staff cause more knots in the stomach and more departmental wrangling than preparing the accreditation self-study. It is often viewed as a burden, a distraction from everyone’s ‘real’ work, and a process of bureaucratic box-checking or of trying to fit the round peg of the program into the square hole of accreditation requirements.

In Five Dimensions of Quality, Linda Suskie draws on years of experience with accreditation, institutional and program assessment, and accountability to re-frame the role of accreditors as “low-cost consultants who can offer excellent collegial advice” (p. 245) to schools and programs seeking to demonstrate their value to stakeholders in an increasingly competitive market. Accreditation should be viewed not as an imposition of alien practices on an established program, but as a way for a school or program to gain external affirmation of already-existing quality. The challenge is not to make the program ‘fit’ accreditation standards, but actually to be a quality program and demonstrate that quality.

Accreditation success, then, flows naturally from the pursuit of quality, and is not an end in itself. But what is quality? Suskie breaks it down into five dimensions or ‘cultures’:

A Culture of Relevance
Deploying resources effectively to put students first and to understand and meet stakeholders’ needs.

A Culture of Community
Fostering trust among faculty, students, and staff, communicating openly and honestly, and encouraging collaboration.

A Culture of Focus and Aspiration
Being clear about school or program purpose, values, and goals.

A Culture of Evidence
Collecting evidence to gauge student learning and program or school effectiveness.

A Culture of Betterment
Using evidence to make improvements and deploy resources effectively.

Fostering these cultures is the work of leadership, since they require widespread buy-in from all stakeholders. The challenge in many institutions is institutional inertia, as Suskie points out in her chapter, “Why is this so hard?” Faculty, staff, and governing boards may feel satisfied that the school’s reputation is sufficient for future success; resources – especially money and people’s time – may not be forthcoming; faculty and staff may live in comfortable isolation from the real-world needs of students; there may be an ingrained reluctance to communicate successes; there is frequently resistance to change; and siloed departments in programs and institutions make across-the-board cultural change difficult to pull off.

The question administrators and faculty should ask themselves is, “Do we put our efforts into pursuing quality, or into maintaining our accreditation?” Suskie’s book presents a convincing case that working on the former will make the latter much easier and will result in quality rather than box-checking. For its straightforward prose (including jargon alerts scattered throughout), its sound advice, and its call for schools to demonstrate quality in a highly competitive environment, Five Dimensions of Quality should be a go-to resource on the reference bookshelf of decision-makers and leaders in higher education programs.

Suskie, L. (2015). Five Dimensions of Quality. Jossey-Bass.

More of my education-related book reviews are at Amazon.

Why language is best assessed by real people


“Classroom decoration 18” by Cal America is licensed under CC BY 2.0

What is the most effective way to assess English learners’ proficiency?

It has become accepted in the field to rely on psychometric tests such as the TOEFL iBT (internet-based TOEFL) and the IELTS for college and university admissions. Yet these and most other language tests are an artifice, a device placed between the student’s actual proficiency and direct observation of that proficiency by a real human being. Students complete the limited set of tasks on the test, and based on the results, an algorithm extrapolates to their broader language abilities.

When you look at a TOEFL score report, it does not tell you the student’s English language ability; what it tells you is what a learner with that set of scores can typically do. And in the case of the TOEFL, this description is an evaluation based largely on multiple-choice answers, involving not one single encounter with an actual human being. On this basis, university admissions officers are expected to make an assumption about the student’s ability to handle the demands of extensive academic reading and writing, classroom participation, social interaction, written and spoken communication with university faculty and staff, SEVIS regulations, and the multiple other demands of the U.S. college environment. (Although the IELTS includes interaction with the examiner and another student, these interactions are highly structured and not very natural. TOEFL writing and speaking tasks are limited, artificial, and assessed by a grader who has only a text or sound file to work with.)

Contrast that with regular, direct observation of students’ language proficiency by a trained and experienced instructor over a period of time. The instructor can set up a variety of language situations involving variation in interlocutors, contexts, vocabulary, levels of formality, and communication goals. In an ACCET- or CEA-accredited intensive English program, such tasks are linked to documented learning objectives. By directly observing students’ performance, instructors can build a rich picture of each student’s proficiency and comment specifically on each student’s strengths and weaknesses.

Consider this a call, then, for colleges and universities to enter into agreements with accredited intensive English programs to waive the need for a standardized test such as the TOEFL. Just as those colleges and universities don’t use a standardized test to measure the learning of their graduates, they should be open to accepting the good judgment of teachers in intensive English programs – judgment based on direct observation of individual learners rather than the proxy scores obtained by impersonal, artificial tests.

Aligning assessment and IEP culture

Since the passage of the Accreditation of English Language Training Programs Act in 2010, intensive English programs (IEPs) have been under pressure to justify their quality claims by recording and reporting on student achievement. This has meant devising program-wide systems for assessing and evaluating students, which has been a challenge for many IEPs.

The type of system a program develops is influenced by its culture. A more managerial (top-down, administratively driven) culture, typical of proprietary English schools, tends to favor standardization of assessment, including program-wide level-end tests. Many university IEPs have a more collegial culture (faculty-driven, with a degree of shared governance) in which individual faculty decision-making and autonomy are valued. In the latter type, an attempt to introduce or impose standardized testing can grate against the culture. It may be more agreeable to retain faculty autonomy in assessment but introduce checks to ensure that assessments are aligned with course objectives and outcomes, as in the sketch below.
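To illustrate that second approach, here is a minimal sketch of such an alignment check. The course outcomes and assessment names are invented for illustration; a real program would draw them from its own curriculum documents.

```python
# Hypothetical alignment check: flag any course outcome that no
# instructor-designed assessment claims to measure. Names are invented.

course_outcomes = {
    "SLO1: narrate a past event",
    "SLO2: summarize a short lecture",
    "SLO3: write a compare/contrast paragraph",
}

# Each assessment declares the outcomes it is aligned with.
assessments = {
    "midterm oral interview": {"SLO1: narrate a past event"},
    "listening journal": {"SLO2: summarize a short lecture"},
}

covered = set().union(*assessments.values())
uncovered = course_outcomes - covered
if uncovered:
    print("Outcomes with no aligned assessment:", sorted(uncovered))
```

A check like this preserves faculty autonomy over how each outcome is assessed while giving the program documented evidence that every outcome is assessed somewhere.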

Both approaches (and blends of the two) are used by CEA-accredited programs and are able to meet the CEA standards. There is no need to create standard assessments across a program if they do not fit the culture. On the other hand, the imperative to assess students in a more consistent way can be a catalyst for culture change. This will need leadership, persuasion, and buy-in from faculty.

I’ve designed and overseen assessment and evaluation systems in proprietary and university programs, and can support programs in determining and developing the right approach. Get in touch if I can help!

Have a great weekend!

(Learn more about academic cultures in Engaging the Six Cultures of the Academy by William Bergquist and Kenneth Pawlak. I highly recommend it.)