Nearly Half of Students Prefer AI Feedback Over Human Teacher Comments

The Students Chose the Bot

A few months ago, a group of university students learning English as a new language received feedback on their essays from two sources. One was a human tutor: a trained, experienced educator who knew their names, their struggles, and their progress. The other was a machine: ChatGPT, powered by GPT-4. Then they were asked which one they preferred.

Nearly half of them picked the machine.

This is not a story about students being lazy or antisocial. It is a story about what feedback is actually for. And it suggests that we may have been misunderstanding the entire enterprise of giving students comments on their writing.

The study, published in the International Journal of Educational Technology in Higher Education by Juan Escalante, Austin Pack, and Alex Barrett (Escalante et al., 2023), followed 91 university students across two separate experiments. In the first, the researchers measured actual learning outcomes: did students who got AI feedback improve their writing more than students who got human feedback? In the second, they asked a different group of students to compare both types of feedback and state a preference.

The results are uncomfortable for anyone who believes that teaching is an irreplaceably human craft. Not because the AI was better. But because the AI was not measurably worse, and nearly half the students preferred it.

The Experiment That Should Have Settled Things

Study 1: Who Actually Got Better?

The first study was a six-week repeated-measures design with 48 English as a New Language (ENL) learners at a university. The students were split into two groups. Both wrote essays, submitted them, and received feedback. The control group got feedback from their human tutor. The experimental group got feedback from ChatGPT (GPT-4), with the prompts carefully designed to match the kind of comments a tutor would give.

Both groups then revised their essays. Then they wrote another essay. Then revised again. For six weeks.

The researchers measured improvement in writing quality across multiple dimensions: organization, content, grammar, vocabulary, and mechanics. They used blind raters who did not know which group each essay came from.

The result: no statistically significant difference between the two groups. Students who received AI feedback improved about as much as students who received human feedback. The machine did not outperform the human. But it also did not underperform.

This is not a small finding. It suggests that, for the specific task of providing written feedback on ENL student essays over a six-week course, an AI chatbot can substitute for a human tutor without measurably hurting learning outcomes. That is a direct challenge to the assumption that the human touch is essential for effective writing instruction.
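To make "no statistically significant difference" concrete, here is a minimal sketch of one way to compare improvement between two groups. It uses a permutation test, which is a stand-in chosen for simplicity and not the analysis Escalante et al. report. The gain scores are invented purely for illustration, and the names `permutation_test`, `human_gains`, and `ai_gains` are hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference (mean of a minus mean of b) and the
    share of random relabelings whose absolute difference is at least as
    large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = _mean(group_a) - _mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # pretend group labels were assigned at random
        diff = _mean(pooled[:n_a]) - _mean(pooled[n_a:])
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_resamples


# Hypothetical rubric-point gains per student; NOT data from the study.
human_gains = [4, 6, 5, 7, 3, 5, 6, 4]
ai_gains = [5, 6, 4, 7, 4, 5, 6, 5]

observed, p_value = permutation_test(ai_gains, human_gains)
print(f"mean difference: {observed:.2f}, p-value: {p_value:.3f}")
```

For these made-up numbers the group means differ by a quarter of a rubric point, and most random relabelings produce a gap at least that large. That is what a non-significant result looks like: the observed gap is the kind of gap chance alone produces easily. It does not show that the two kinds of feedback are identical, only that this comparison could not tell them apart.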

Study 2: Who Did the Students Prefer?

The second study had a different group of 43 ENL students receive feedback from both ChatGPT and their human tutor on the same essay. Then the researchers asked: Which one did you like more?

The split was nearly even. Some students preferred the human. Some preferred the AI. And their reasons revealed something important about what students actually want from feedback.

Students who preferred the human tutor said things like: the tutor understood their personal context, could explain things in a way that matched their learning style, and offered encouragement. The human felt attuned to them as individuals.

Students who preferred the AI said things like: the feedback was more detailed, more specific, more actionable. The AI did not get tired. It did not rush. It wrote long, paragraph-by-paragraph breakdowns of what needed fixing. It did not hold back criticism to spare their feelings.

One student described the difference this way: "The tutor tells me what is wrong and then moves on. ChatGPT tells me what is wrong, why it is wrong, and how to fix it."
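How lopsided would the preference vote have to be before "nearly even" stopped being a fair description? The sketch below does not invent any vote counts. It takes only the group size given above (43) and asks how many students would need to prefer the same source before an exact two-sided binomial test rejects a 50/50 split. The 0.05 cutoff is a conventional assumption, and the function names are hypothetical.

```python
from math import comb


def two_sided_even_split_p(k, n):
    """Exact two-sided p-value for k of n choosing one option under a 50/50 null.

    Sums the probability of every outcome that is no more likely than k.
    With p = 0.5 each outcome i has probability comb(n, i) / 2**n, so the
    comparison can be done on the integer binomial coefficients.
    """
    observed = comb(n, k)
    at_least_as_extreme = sum(
        comb(n, i) for i in range(n + 1) if comb(n, i) <= observed
    )
    return at_least_as_extreme / 2**n


def smallest_lopsided_count(n, alpha=0.05):
    """Smallest majority count at which an even split is rejected at level alpha."""
    for k in range((n + 1) // 2, n + 1):
        if two_sided_even_split_p(k, n) < alpha:
            return k
    return None  # no count is lopsided enough for very small groups


n_students = 43  # group size reported for Study 2
threshold = smallest_lopsided_count(n_students)
print(threshold)  # 29
```

Under these assumptions, anything short of 29 of the 43 students agreeing is statistically compatible with an even split. A result like 22 to 21 or even 26 to 17 would not, on its own, show that students as a group lean one way.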

What the AI Did That Humans Don't

The researchers coded the feedback from both sources. The differences were striking.

Human tutors, constrained by time and energy, tended to give feedback that was broad and evaluative. They made global comments: "Your thesis needs work" or "This paragraph is confusing." They focused on higher-order concerns like argument structure and clarity. They also, importantly, offered praise and encouragement.

ChatGPT, by contrast, gave feedback that was granular and prescriptive. It pointed to specific sentences. It offered rewrites. It explained why a particular verb tense was wrong and how to correct it. It did not praise. It did not encourage. It just kept working through the text, line by line, until it was done.

This is the paradox at the heart of the study. The human tutors, trying to be good teachers, gave feedback that was holistic and supportive. The AI, with no concept of teaching or support, gave feedback that was relentless and specific. And for many students, the relentless specificity was more useful.

Escalante and his colleagues note that this may be a function of the ENL context. Students learning a new language often need granular feedback on grammar, vocabulary, and sentence structure. A human tutor, trained to focus on higher-order concerns, may skip over these details. The AI does not skip anything.

But the finding raises a deeper question: What if the most useful feedback is not the most human? What if the ideal writing tutor is not a mentor who understands you, but a machine that simply refuses to stop editing?

The Blended Approach That Actually Works

Neither group of students in the study was wrong. The students who preferred the human tutor valued something real: the feeling of being seen and understood. The students who preferred the AI valued something equally real: the feeling of getting concrete, usable instructions.

The authors argue for a blended approach. Not because it is the safe middle ground, but because the two forms of feedback serve different functions.

Human feedback is good for motivation, for building relationships, for understanding the big picture of a student's development over time. AI feedback is good for precision, for volume, for the kind of line-by-line editing that humans find tedious.

Imagine a writing classroom where the AI handles the first pass: grammar, sentence structure, word choice, basic organization. The human teacher then reads the AI-edited draft and focuses on argument, voice, creativity, and the personal development of the writer. The student gets both the granular correction they need and the human recognition they deserve.

This is not a dystopian vision. It is a practical division of labor. And it is supported by the data.

What This Study Does Not Prove

It is important to be precise about what Escalante, Pack, and Barrett actually found.

They did not find that AI feedback is better than human feedback. The learning outcomes were the same. The preference was a split. The machine did not outperform the person.

They did not find that AI feedback works for every type of writing or every student population. The study used university-level ENL learners writing academic essays. The results may not generalize to native speakers, to creative writing, to elementary school students, or to advanced graduate-level work.

They did not find that AI feedback is free of problems. The researchers note that ChatGPT occasionally made errors: it flagged correct grammar as incorrect, or offered suggestions that were technically right but contextually wrong. A human tutor would have caught these mistakes. A student using AI feedback alone might not.

They also did not study long-term effects. Six weeks is enough to measure short-term improvement. It is not enough to know whether students who rely on AI feedback develop the same writing skills over years of practice.

And crucially, they did not study what happens when students become dependent on AI feedback. If a student knows that a machine will catch every error, do they stop learning to catch errors themselves? That question remains open.

What the Students Were Really Saying

The most interesting finding in the study may be the one the authors did not emphasize. When students explained their preference for AI feedback, they kept using the same word: specific.

"I like that ChatGPT tells me exactly what to change."

"The tutor says 'work on grammar' but ChatGPT shows me which sentences are wrong."

"With ChatGPT, I know what to do next."

These students were not rejecting human connection. They were rejecting vagueness. They had experienced years of teachers writing "good point" or "needs development" in the margins, and they had learned that these comments did not actually help them improve. The AI, by contrast, gave them a clear path forward.

This is a critique of human feedback practices, not a celebration of AI. Teachers are busy. They have too many students and too little time. They write short comments because they cannot write long ones for every paper. The AI is not more intelligent than the teacher. It is simply more available.

The students understood this. They were not choosing the machine over the person. They were choosing the detailed feedback over the brief feedback. It just so happened that the detailed feedback came from a machine.

The Teacher's New Job

If AI can handle the granular work of editing grammar and sentence structure, what is left for the human teacher?

The answer, according to the study and to the students themselves, is everything that matters most.

The human teacher can ask: What are you trying to say here? Why did you choose this example? How does this argument connect to your life? The human teacher can see patterns across a student's work over time: the same confusion about paragraph structure appearing in essay after essay, the same fear of using the first person, the same tendency to avoid the hard argument. The human teacher can offer encouragement, push back gently, and tell a student that their writing is getting better even when they cannot see it themselves.

The AI cannot do any of these things. Not yet. Maybe not ever.

But the human teacher, freed from the obligation to mark every comma splice, can finally focus on the parts of teaching that require a human. That is the real promise of the study. Not that AI will replace teachers. But that AI will force teachers to become more fully human in their work.

What This Actually Means

▸ Give students a choice. The study found that preference for AI vs. human feedback split almost 50/50. Some students genuinely prefer the machine. Let them use it. Others prefer the person. Let them have that. A one-size-fits-all feedback policy ignores what the students themselves are telling you.

▸ Use AI for the boring parts. Grammar, sentence structure, word choice, basic organization. These are tasks that AI does well and humans find tedious. Let the machine handle the first pass. Then the human teacher can focus on argument, voice, and development.

▸ Be specific or be replaced. The main advantage students saw in AI feedback was its specificity. If human teachers give vague comments like "work on this" or "needs improvement," students will rightly prefer the machine. The solution is not to become more like the machine. It is to be specific in ways that only a human can: "I see you are trying to argue X, but your example actually supports Y. Let's talk about why."

▸ Watch for dependency. The study did not measure what happens when students rely on AI for every draft. Do they improve faster because they get more feedback? Or do they improve slower because they stop thinking for themselves? This is the open question that needs urgent research.

▸ Do not fear the split. The near-even preference split in the study is not a failure of either method. It is evidence that different students need different things. A good writing program offers both. A good teacher knows when to step in and when to let the machine do its work.

The students in this study were not confused. They were clear-eyed about what they wanted. Some wanted a teacher who knew them. Some wanted a machine that would not stop editing. Both groups were right.

The question is whether the rest of us are ready to admit that both forms of feedback have a place in the classroom. The research says yes. The students have already decided.

References

[1] Juan Escalante, Austin Pack, Alex Barrett (2023). AI-generated feedback on writing: insights into efficacy and ENL student preference. International Journal of Educational Technology in Higher Education.