
The challenges posed by AI tools in education

Artificial intelligence tools designed to enhance productivity are being developed across multiple sectors, but are they reliable enough for use in education, or do they create problems of their own?

Teaching can be an incredibly rewarding but demanding profession, and teachers' workloads are ever increasing. The progress of students needs to be assessed, usually through tests, exams, essays and coursework, all of which need to be reviewed and marked.

The workload of students is also increasing, as they struggle to revise for their exams and meet all the deadlines for their assessments. Unfortunately, a minority of students give in to the obvious temptation of using generative artificial intelligence (AI) to write their essays.

Generative AI tools, such as Gemini, have been heavily promoted as assistants that can help users with their work. In one recent advert, a busy scientist asks their AI tool to create a slideshow for a presentation, which earns them applause at the end.

“We have Copilot in everything for work and there are huge debates about how to incorporate AI into teaching,” says David Waldron, an associate professor of history at Federation University. “The AI is good to do data processing, such as sorting data and condensing key information into a summary, but it is no good for creative material or for trusted sources.”

To detect student essays written with generative AI (GenAI), a variety of AI detection tools, such as Scribbr and ZeroGPT, have been made freely available online.

My daughter recently wrote an essay for her A-Level coursework and, out of curiosity, submitted it to an online AI checker. She was shocked to find that, despite having written the essay herself, the checker concluded that much of her work had been created using AI. Further investigation revealed that deliberately using simple words and grammatical errors made an essay less likely to be flagged.

Social media personality Vivian Jenna Wilson highlighted a similar issue in a recent post, observing that she now has to minimise her use of dashes, otherwise people assume her writing is AI-generated.

As these examples demonstrate, online AI detection tools are not only insufficiently reliable at identifying AI-generated content, but they can also push writers into introducing deliberate mistakes.
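
Part of the problem lies in how such detectors often work. Many rely on statistical signals such as perplexity, a measure of how predictable a piece of text is to a language model: smooth, fluent prose scores as highly predictable and so looks machine-generated, while errors and unusual word choices raise the score. The sketch below illustrates the general idea in Python using GPT-2 via the Hugging Face transformers library. The model choice and the threshold are assumptions made purely for this example; it is not a description of how Scribbr, ZeroGPT or any other specific product operates.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Load a small public language model. GPT-2 is an arbitrary choice here;
# real detectors use their own (undisclosed) models and heuristics.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Perplexity measures how predictable the text is to the model:
    # lower means more predictable.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return torch.exp(loss).item()

essay = "The Industrial Revolution transformed British society in profound ways."
score = perplexity(essay)

# Hypothetical threshold: very predictable text is treated as machine-written.
THRESHOLD = 60.0
verdict = "flagged as AI-generated" if score < THRESHOLD else "looks human-written"
print(f"perplexity = {score:.1f} -> {verdict}")

A crude check like this penalises exactly the kind of writing the anecdotes above describe: fluent, grammatical prose scores as predictable, while deliberate mistakes raise the perplexity and make text look more “human”.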

Other AI tools have also recently proven to be unreliable. Earlier this year, a rogue coding agent deleted an entire production database during a coding session and then lied about it. Meanwhile, in 2023, an AI-powered recruiting tool broke employment laws by automatically rejecting female applicants over the age of 55 and male applicants over the age of 60.

“Generative AI presents exciting opportunities to improve education, but we recognise that its use must be carefully managed to protect learning and uphold high standards,” said a spokesperson for the Department for Education. “In June, we published guidance shaped by the latest evidence, which advises using pupil-facing AI tools with caution, and provides support for teachers on how AI can be used safely and responsibly in schools and colleges, with a strong emphasis on academic integrity, safeguarding and legal compliance.”

Relying on AI detection tools risks students being unfairly accused of plagiarism, and potentially teaches them that poor-quality work is less likely to get them into trouble than high-quality work that might wrongly be flagged as AI-generated. English literature and language students are disproportionately affected, as they typically have strong essay-writing skills and excellent grammar and punctuation, all of which are markers that detectors associate with GenAI-created content.

Most AI tools operate over the internet and require user data, raising significant concerns about data protection: written work that is uploaded could feed back into the AI model. This is especially concerning for academic research commissioned by private companies. “The data-sharing element has a lot of people worried,” says Waldron.

A metaphorical arms race

There is an added complication for students who use grammar tools, such as Grammarly, which apply AI to fix grammar and punctuation in their writing. The rules regarding their use are unclear, especially given how commonplace spellcheckers have become. Although grammar tools do not suggest ideas or create content, AI detection tools could still flag their output as evidence that generative AI has been used.

Students, meanwhile, are becoming aware of these detection tools and are taking their own steps to avoid being caught by them.

“Interestingly, students often use multiple AI systems to get around detection,” says Waldron. “They create an essay with ChatGPT and then [run it] via Copilot and DeepSeek.” There is now a metaphorical arms race between generative AI systems and AI detection tools, as each seeks to counter the other.

One of the key challenges of generative AI tools is that they are effectively a “black box”: they produce content based on a series of text prompts from the user, but we never understand how they reach their conclusions. Each AI is trained separately, making it almost impossible to understand how these tools arrive at their answers, since they do not provide supporting information that would allow users to interrogate the result.

“Proving its use is problematic and time consuming, but it’s a critical problem in knowing if students understand results,” says Waldron. “There are calls to go back to invigilated exams and handwritten essays. However, most solutions like this require a return to face-to-face learning and more intensive use of staff, which university management opposes due to increased costs and lack of flexibility for the international student market.”

In many ways, the challenge of detecting essays created using GenAI reflects a wider ongoing discussion about the role of AI in academia.

There now needs to be a realignment of how we approach assessments, with alternative methods to essays developed for determining a student's knowledge and understanding of a subject. One example could be asking students to give a presentation defending their essay, although this might not be appropriate for in-depth topics. Attempting to fight AI would be akin to confronting a tsunami; instead, there needs to be a discussion about how we can prepare for and adapt to the technology.

Ultimately, the AI sector remains largely unregulated, despite the widespread deployment of AI across a variety of industries. Without appropriate oversight, students could soon face being unfairly accused of plagiarism unless steps are taken to ensure the results provided by AI detection tools are sufficiently reliable.

“We need to rethink the pedagogy to teach critical use of AI in research and writing,” concludes Waldron.

The National Education Union and the National Union of Students were both approached for comment, but neither had responded at the time of publication.
