Exams Written by ChatGPT: 'Virtually Undetectable' and Capable of Outperforming Students

9News


Authors (1)

Description

<p>University exams written by artificial intelligence are "virtually undetectable" and capable of outperforming the exams completed by humans, a new UK-based study has found.</p>

Summary

A UK-based study, published in PLOS ONE, highlights the concerning ability of AI, specifically ChatGPT, to generate exam responses that are nearly indistinguishable from those written by students. This raises significant questions about academic integrity and the capability of AI systems to outperform human students in exams. The study found that AI-generated responses, which were undetected by university markers, often received higher grades than human submissions. These findings reflect rapid advances in AI's ability to mimic human-like responses, posing challenges for educational institutions in maintaining academic standards. While focused on educational implications, these advancements in frontier AI capability have broader relevance to global AI safety and governance, particularly in developing robust systems to detect AI-generated content.

Body

University exams written by artificial intelligence are "virtually undetectable" and capable of outperforming the exams completed by humans, a new UK-based study has found.

The findings will likely fuel fears within the education sector around academic integrity, amid the ever-increasing ability of AI tools to mimic the writings of real people.

In the study published in PLOS ONE today, 33 fake students at the University of Reading submitted exam responses that were 100 per cent written by the AI bot ChatGPT-4 for five different subjects in an undergraduate psychology degree.

The AI-written essays made up five per cent of the total papers submitted and the subject coordinators and exam markers were unaware of the experiment.

All but two of the exams (94 per cent) went undetected as forgery by the markers.

Moreover, the chatbot's exam papers received higher marks on average than those of real students, with 83.4 per cent receiving higher grades than a randomly selected group of the same number of submissions from students.

The results demonstrate the rapid advancements that AI tools such as ChatGPT have made in their ability to mimic a real human voice and outsmart devices used to detect them.

"From a perspective of academic integrity, 100 per cent AI written exam submissions being virtually undetectable is extremely concerning," the study authors note.

"Especially so as we left the content of the AI-generated answers unmodified and simply used the 'regenerate' button to produce multiple AI answers to the same question."

They note that the ability of AI to go undetected will only increase over time, as its responses improve in both complexity and abstract reasoning.

"Clearly, we have no way to estimate the proportion of students in our sample who used AI to complete their submissions. However, with the huge media coverage of AI such as GPT-4... we struggle to conclude that its use would be anything other than widespread."