The Day the Classroom Stopped Being a Secret

On November 30, 2022, something slipped into the world that would quietly rewrite the rules of how we teach and learn. It wasn't a new curriculum or a government policy. It was a chatbot. Within one week, more than a million people had signed up for ChatGPT, and among them were students and teachers who had no idea what they were about to unleash (Baidoo-Anu & Owusu Ansah, 2023).
The paper by David Baidoo-Anu and Leticia Owusu Ansah, published in the Journal of AI, is not a breathless ode to a shiny new tool. It is an exploratory study, a careful sifting of the early literature, trying to understand what happens when a machine that can write essays, solve math problems, and generate lesson plans suddenly sits in every classroom. The authors are not prophets. They are mapmakers, sketching a territory that is still being formed.
What Did the Paper Actually Find?

Baidoo-Anu and Owusu Ansah (2023) conducted a synthesis of recent extant literature. This is not a controlled experiment with 500 students and a control group. It is an exploratory review of what researchers and educators had already published about ChatGPT's potential in education, as of early 2023. The authors looked at benefits and drawbacks, then offered recommendations.
The core finding is a paradox. ChatGPT can do things that would have been science fiction five years ago. It can generate personalized practice problems for a struggling student, offer instant feedback on a draft, and act as a Socratic tutor that never gets tired. But it can also produce confident nonsense, embed racial and gender biases from its training data, and raise privacy questions that no school district has adequately answered.
The Benefits: What the Machine Does Well

Personalized Learning at Scale
Every teacher knows the dream: a classroom where each student gets exactly what they need, when they need it. In practice, that dream dies under the weight of 30 students, 45 minutes, and a pile of grading. Baidoo-Anu and Owusu Ansah (2023) found that ChatGPT can generate prompts for formative assessments that provide ongoing feedback. This is not about replacing the teacher. It is about giving the teacher a tool that can produce a thousand variations of a math problem, each one calibrated to a specific student's level, in seconds.
The authors argue that this promotes "personalized and interactive learning." The machine does not get bored. It does not judge. It can explain a concept five different ways without sighing. For a student who needs extra time or a different explanation, that is not a luxury. It is a lifeline.
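To make "a thousand variations, each calibrated to a level" concrete, here is a minimal sketch in Python. It does not call ChatGPT and is not the authors' method. It is a deterministic stand-in in which the level names, the number ranges, and the function names `make_problem` and `make_variations` are all illustrative assumptions.

```python
import random

# Hypothetical difficulty bands: each level maps to the operand range used
# when building a problem. These values are illustrative assumptions.
LEVELS = {
    "beginner": (1, 10),
    "intermediate": (10, 100),
    "advanced": (100, 1000),
}


def make_problem(level, rng):
    """Return (question, answer) for one addition problem at the given level."""
    low, high = LEVELS[level]
    a = rng.randint(low, high)
    b = rng.randint(low, high)
    return f"What is {a} + {b}?", a + b


def make_variations(level, count, seed=0):
    """Generate `count` problems for one level, reproducibly via `seed`."""
    rng = random.Random(seed)
    return [make_problem(level, rng) for _ in range(count)]


if __name__ == "__main__":
    # Print a few beginner-level problems with their answers.
    for question, answer in make_variations("beginner", 3):
        print(question, "->", answer)
```

A generator like this only varies numbers. What a language model adds is variation in wording, context, and explanation, which is also where a teacher's review of the output becomes necessary.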
The Eternal Teaching Assistant
Consider what happens when a student is stuck on an essay at 10 p.m. The teacher is asleep. The textbook is unhelpful. The student's only option is to stare at the screen or give up. ChatGPT changes that. The authors note that the tool can serve as a "conversational partner" that helps students refine their thinking by asking questions and offering counterarguments.
This is not plagiarism. It is scaffolding. A student can ask ChatGPT to "argue against my thesis" or "explain why this source might be unreliable." The machine's response is not the final product. It is the raw material for thinking. The student still has to decide what to believe, what to cut, what to revise. But they no longer have to do it alone.
Formative Assessment Without the Paperwork
Teachers spend a staggering amount of time creating quizzes, grading them, and then trying to figure out what the results mean. Baidoo-Anu and Owusu Ansah (2023) highlight that ChatGPT can generate "prompts for formative assessment activities." A teacher can type a topic, and the machine produces a set of questions that test different levels of understanding. The teacher can then use those questions, adapt them, or throw them out. The point is speed. The point is that the teacher's energy goes into interpretation, not creation.
The Drawbacks: What the Machine Gets Wrong
The Hallucination Problem
Here is the thing about generative AI. It does not know what it does not know. It has no internal model of truth. It produces text that looks like a correct answer, but it is only a statistical prediction of what words should follow other words. Baidoo-Anu and Owusu Ansah (2023) explicitly warn that ChatGPT can generate "wrong information." The authors call this a "limitation," but that is too polite. It is a feature of the technology. The machine is designed to sound plausible, not to be correct.
For a student who is already struggling to distinguish reliable sources from unreliable ones, this is a disaster. The machine can produce a perfectly written explanation of the War of 1812 that gets the year wrong, the causes wrong, and the outcome wrong. The student copies it, submits it, and learns nothing. Worse, they learn that sounding right is the same as being right.
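The idea that the machine predicts which words follow other words, with no check on truth, can be illustrated with a toy bigram model. This is a deliberately tiny sketch, not how ChatGPT is built: real language models use large neural networks, and the corpus below is invented for the example and contains a false statement on purpose.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus. The second sentence is deliberately false.
CORPUS = (
    "the war of 1812 began in 1812 . "
    "the war of 1812 began in 1815 . "
    "the war of 1812 ended in 1815 ."
)


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length, seed=0):
    """Build a word sequence by sampling each next word from observed counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:
            break  # no observed continuation for this word
        words, counts = zip(*options.items())
        out.append(rng.choices(words, weights=counts, k=1)[0])
    return " ".join(out)


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(dict(model["in"]))  # how often each year follows "in"
    print(generate(model, "the", 8))
```

In this corpus "1815" follows "in" twice and "1812" only once, so the model prefers "1815" after "in" whether or not the resulting sentence is true. Nothing in the procedure distinguishes a fluent sentence from a correct one.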
The Bias Embedded in the Training Data
ChatGPT was trained on the internet. The internet is full of human biases. Baidoo-Anu and Owusu Ansah (2023) note that the tool may "augment existing biases" in its training data. This is not a bug that can be patched. It is a structural problem. If the training data overrepresents certain perspectives, the machine will reproduce those perspectives. A student asking about historical figures might get a list that is overwhelmingly male and white. A student asking about career advice might get suggestions that subtly reinforce gender stereotypes.
The authors do not offer a solution. They simply point out that the tool is not neutral. It carries the fingerprints of its creators and its data sources. Any classroom that uses ChatGPT without addressing this is not using the tool critically. It is using it naively.
Privacy: The Question Nobody Is Asking
When a student types a question into ChatGPT, that question goes to a server. The company behind the tool collects data. Baidoo-Anu and Owusu Ansah (2023) mention "privacy issues" as a limitation, but this deserves more attention than the authors give it. In a classroom setting, students might ask questions about their personal struggles, their family situations, or their confusion about sensitive topics. All of that becomes data. Who owns it? Who can access it? What happens if the company is sold or hacked?
The authors call for "policy makers, researchers, educators and technology experts" to work together on this. That is a polite way of saying that nobody has figured it out yet.
How the Study Was Done: A Quick Look Under the Hood
This is not a randomized controlled trial. Baidoo-Anu and Owusu Ansah (2023) conducted an exploratory study that synthesized recent extant literature. They searched for academic papers and reports published on ChatGPT's use in education, then extracted themes, benefits, and drawbacks. The method is qualitative. It is a literature review with a thesis.
This matters because it means the paper's claims are not proven in the way that a clinical trial proves a drug works. The authors are not saying "this intervention caused a 20% improvement in test scores." They are saying "here is what the early evidence suggests, and here is what we should watch out for." That is a different kind of authority. It is the authority of a careful observer, not a controlled experiment.
What This Means for a Teacher Right Now
If you are a teacher reading this, you might feel a knot in your stomach. The machine is here. It is not going away. Baidoo-Anu and Owusu Ansah (2023) offer recommendations that are surprisingly practical for an academic paper.
- Use ChatGPT as a starting point, not an ending point. Generate discussion questions, then have students critique them. Ask the machine to write a bad essay, then have students fix it.
- Teach students to fact-check the machine. If ChatGPT says something, the student's job is to verify it. This is a skill that transfers directly to evaluating any source.
- Be transparent about when and how you use it. If you use ChatGPT to generate a quiz, tell the students. Model the critical mindset you want them to adopt.
- Do not assume the machine is unbiased. Every output should be examined for assumptions, omissions, and stereotypes.
The authors argue that the goal is not to ban the tool. The goal is to "maximize teaching and learning" by using it constructively. That is a hard ask. It requires teachers to learn a new skill, to think differently about assessment, and to trust that students can be taught to use a powerful tool responsibly.
What the Research Does Not Prove
Let me be clear about what this paper does not claim. It does not prove that ChatGPT improves test scores. It does not prove that students learn more with AI than without it. It does not prove that the benefits outweigh the risks. Those questions will take years to answer, and the answers will depend on how the tool is implemented.
The paper also does not address a deeper question: What happens to motivation when a machine can do the work? If a student knows that ChatGPT can write a perfectly acceptable essay, why would they struggle through the messy, frustrating process of writing their own? Baidoo-Anu and Owusu Ansah (2023) do not answer this. They only hint at it when they mention that the tool could "promote personalized and interactive learning." But personalization is not the same as engagement. A student can interact with a machine without ever being challenged, without ever being uncomfortable, without ever growing.
This is the open question that haunts the paper. The authors are optimistic. They see potential. But they are honest enough to admit that the evidence is thin and the risks are real.
The Structural Problem No One Is Talking About
Here is something the paper does not say, but that every teacher will feel in their bones. The arrival of generative AI exposes a pre-existing weakness in education. For decades, we have assessed students by asking them to produce text. Write an essay. Answer a short-answer question. Solve a problem and show your work. Those assessments assume that the student is the one doing the producing. That assumption is now broken.
Baidoo-Anu and Owusu Ansah (2023) recommend "generating prompts for formative assessment activities that provide ongoing feedback." That is a good start. But it does not solve the fundamental problem. If a machine can produce the answer, what are we actually measuring when we grade a student's output? The paper does not answer this. It cannot. The question is too big for a single study.
What the paper does do is force us to ask it. That is its real value. It is not a solution. It is a diagnosis.
What This Actually Means
- Redesign assessment around process, not product. If a machine can produce the final answer, then the grade should be based on how the student got there. Require students to show their revisions, their dead ends, their conversations with the AI. The output is the least interesting part.
- Teach AI literacy as a core skill. Students need to know how to prompt a model, how to evaluate its output, and how to recognize when it is wrong. This is not a tech elective. It is as fundamental as reading and writing.
- Use the machine to do what humans cannot. ChatGPT can generate a thousand variations of a math problem in seconds. A teacher can then pick the best ones. The teacher's judgment is the scarce resource. The machine's speed is the abundant one. Use the machine for speed. Use the human for judgment.
- Do not trust the machine with sensitive student data. If a student types a question about their mental health, their family, or their identity, that data leaves the classroom. Schools need clear policies about what data can be shared with AI tools and what cannot.
- Prepare for the long game. The paper by Baidoo-Anu and Owusu Ansah (2023) is a snapshot of early 2023. The technology will improve. The biases will shift. The privacy questions will become more urgent. The conversation is not over. It has barely begun.
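
On the point about sensitive student data, one small piece of such a policy could be scrubbing obvious identifiers from text before it leaves the classroom. The sketch below is an assumption-laden illustration, not something the paper proposes: the function name `redact`, the placeholder labels, and the two patterns (email addresses and one common phone number format) are hypothetical.

```python
import re

# Illustrative patterns only. They cover email addresses and phone numbers
# written as three digits, three digits, four digits (e.g. 555-123-4567).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text):
    """Replace matched emails and phone numbers with placeholder labels."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(redact("Email me at jo@example.com or call 555-123-4567."))
```

Pattern matching of this kind misses names, addresses, and anything sensitive that is described in ordinary words, such as a health or family situation. It can support a policy about what is shared with AI tools, but it cannot replace one.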
The classroom was never a sealed box. It was always connected to the world outside. But now the connection is live, two-way, and capable of generating answers that look exactly like the ones a student would write. That is not a crisis. It is a new condition. The question is not whether we adapt. It is whether we adapt well.
References
- [1] David Baidoo-Anu, Leticia Owusu Ansah (2023). Education in the Era of Generative Artificial Intelligence (AI): Understanding the Potential Benefits of ChatGPT in Promoting Teaching and Learning. Journal of AI.
