Hey prof, grade my essay? There’s (kind of) an app for that.
The e-Rater, an automated essay-grading system developed by the Educational Testing Service, can grade up to 16,000 essays a minute. For educators across America, such a creation could mean a far easier job or even spell disaster in the form of "u-n-e-m-p-l-o-y-m-e-n-t."
But we’re talking essays here, not math problems. When it comes to composition, right and wrong answers aren’t always objective, nor do they always exist. So how effective are these robo-graders? And should they be trusted?
Incorporating robo-grading into academia will, in time, alter the way students write: they will be taught to fool a machine rather than to build a compelling argument in a creative way.
A recent study by the University of Akron College of Education compared the ratings of man and machine for some 22,000 short essays and found little difference in the final grades awarded.
"In terms of being able to replicate the mean [ratings] and standard deviation of human readers, the automated scoring engines did remarkably well," Mark Shermis, the study’s lead author, said in an interview with Inside Higher Ed.
Are these robo-graders really this smart? Or has the general ability of students to write well become so predictably shallow that grading is a formulaic cakewalk? Both questions deserve scrutiny.
The latter cynicism may be unanswerable, and perhaps unfair. But Les Perelman, a director of writing at the Massachusetts Institute of Technology, has tackled the first question, and his answer is no.
The e-Rater, Perelman claimed in an article in the New York Times, is easy to fool once a writer understands its biases. It seems to reward certain things and penalize others: Short sentences, for instance, are bad. Big words? Good.
Essentially, the robo-graders do a less-than-stellar job of evaluating and interpreting content, concentrating instead on superficial proxies for intelligence: length and lexical complexity. If you can sound like an academic, the argument itself seems to matter little.
The e-Rater seems to think a bit like an idiot, fascinated and impressed by big words, long sentences, and syntactic flourishes, but utterly oblivious to content. It likes flashy, shiny things.
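To make that point concrete, consider a toy scorer built on nothing but those surface features. The sketch below is purely illustrative: the features, weights, and the toy_grade function are assumptions drawn from Perelman's critique, not anything resembling ETS's actual, far more sophisticated model.

```python
# Toy "robo-grader" sketch: scores an essay purely on surface features.
# Illustrative only -- these features and weights are assumptions drawn
# from Perelman's critique, not ETS's actual e-Rater model.

import re

def toy_grade(essay: str) -> float:
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    if not words or not sentences:
        return 0.0

    word_count = len(words)                                     # longer essays score higher
    avg_word_len = sum(len(w) for w in words) / word_count      # "big words" score higher
    avg_sent_len = word_count / len(sentences)                  # short sentences are penalized
    vocab_ratio = len({w.lower() for w in words}) / word_count  # lexical variety

    # Arbitrary weights; note that nothing here reads the argument itself.
    score = (0.01 * word_count
             + 0.5 * avg_word_len
             + 0.1 * avg_sent_len
             + 2.0 * vocab_ratio)
    return round(min(score, 6.0), 2)  # capped at 6, like many essay-scoring scales

print(toy_grade("Utilizing multifarious polysyllabic terminology invariably "
                "engenders elevated automated evaluations notwithstanding "
                "vacuous argumentation."))
```

Feed this toy a sentence of pompous gibberish and it maxes out the scale, which is precisely the failure mode Perelman described in the real system.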
Yet even if the system is a flawed one, what's disturbing is its existence in the first place. Writing has long stood up to technological imposition. It is a stubborn thing, something that refuses to be tied down or reduced to order. For the essay, the possibilities are infinite; there are endless ways to arrive at a point and endless points at which to arrive.
The development of robo-graders for written work has far-reaching implications.
Not only does it reduce writing from an art, which it most certainly is, to a science; it also stifles creativity.
If these e-Raters, once "perfected" and broadly accepted by academia, take over the grading of essays, the educational system will have to adapt accordingly. Instructors will be forced to teach their students to write according to the mandates and criteria of a machine, making writing no different from mathematics: There will be a formula, a set methodology for arriving at a predetermined conclusion.
The e-Rater essentially poses a threat to originality. There is a constant struggle between a discipline's integrity and its efficiency, and in the push to cut labor and simplify the evaluation of writing, the essence of why a student is taught to write will be lost. Standardization works well in education, but only within certain disciplines.
The attempt to objectify and formalize something dynamic and inherently unbound will be damaging.
A machine that operates in a consciousness dictated by algorithms cannot be expected — or allowed — to evaluate the legitimacy or quality of a product that is subjective by nature.