Even our own AI detector is not perfect, and it can produce false positives. These false positives can be very painful for anyone who creates original content. Whether you are a student who has been wrongly accused of using ChatGPT by TurnItIn or GPTZero or a writer who has received a false positive with Originality.ai, this article is meant to help.
Discover how AI detection works, the accuracy rates, what to do if you have been wrongly accused, and tips on how to avoid it in the future.
This article provides a brief overview of common questions about AI content detector false positives that may help if you have been accused of using AI. For a deeper discussion on each topic, view the recommended reading links within the sections.
If you genuinely wrote a piece of text without any AI involvement and are being accused of creating it with AI, here are steps you can take.
If you are a writer and you have been falsely accused of having AI create content, you can:
Here are tips to help prevent and resolve false positives.
False positives in AI detection are a BIG deal, and they will not go away as the use of generative AI continues to climb. Our hope is that this article will help people understand the limitations of detection tools, share strategies for their appropriate use, and explain how to prove your work's originality.
An AI content detector is an artificial intelligence trained to tell the increasingly subtle difference between AI-generated and human-generated text.
Most Common Misunderstanding:
A detection score of 60% AI and 40% Original should be read as “there is a 60% chance that the content was AI-generated,” and NOT that 60% of the article is AI-generated and 40% is Original.
A score of 60% Original and 40% AI, if you know the content was 100% created by you, is not a false positive. It correctly identified the content as Original.
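To make the distinction above concrete, here is a minimal sketch of how a detection score should be read. The function name and thresholds are illustrative assumptions, not Originality.ai's actual API; the point is that the score is a confidence level, not a proportion of the text.

```python
def interpret_score(ai_score: float) -> str:
    """Interpret an AI-detection score as a likelihood, not a proportion.

    ai_score is the detector's estimated probability (0.0-1.0) that the
    *whole* text was AI-generated. A 0.6 means "60% chance this is AI",
    NOT "60% of this article is AI". (Hypothetical helper for illustration.)
    """
    if not 0.0 <= ai_score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if ai_score >= 0.5:
        return f"{ai_score:.0%} chance the text was AI-generated"
    return f"{1 - ai_score:.0%} chance the text was human-written"
```

For example, a 60% AI / 40% Original result would read as "60% chance the text was AI-generated", while a 40% AI / 60% Original result reads as "60% chance the text was human-written".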
AI Detectors are NOT 100% accurate and never will be. We have done extensive testing on Originality.ai’s accuracy rate, and it varies based on which Generative AI tool and large language model (LLM) is used to create the content.
With our latest model, Lite 1.0.0, released in July 2024, the accuracy is 98% with a 1% false positive rate. Lite also permits light AI editing such as Grammarly’s spelling and grammar tools, which makes it an excellent choice for web publishers, marketers, and education professionals.
For further information, see the complete AI Detection Accuracy Study.
Then, review a meta-analysis of six third-party studies that demonstrate Originality.ai's exceptional AI detection accuracy.
Try our AI content checker here.
See our complete AI detector accuracy tests here.
A quick overview showing Originality.ai’s accuracy on GPT-4 in comparison to other AI detection tools (note this was completed with a previous model).
We used a confusion matrix to test the accuracy of an AI detector against a set of AI-generated and human-written text.
Below are the results of a study comparing the accuracy and false positive rates of Originality.ai vs. other AI detectors.
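The confusion-matrix approach mentioned above can be sketched in a few lines. This is an illustrative implementation of the standard metrics, not Originality.ai's test code; the example counts below are made up to show the arithmetic, not taken from the study.

```python
def detector_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy and false positive rate from a confusion matrix.

    tp: AI-generated text correctly flagged as AI (true positive)
    fp: human-written text wrongly flagged as AI (a false positive)
    tn: human-written text correctly identified as human (true negative)
    fn: AI-generated text the detector missed (false negative)
    """
    total = tp + fp + tn + fn
    return {
        # Share of all samples the detector classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of *human* samples wrongly flagged as AI.
        "false_positive_rate": fp / (fp + tn),
    }

# Hypothetical run: 100 AI samples, 100 human samples.
metrics = detector_metrics(tp=98, fp=1, tn=99, fn=2)
```

With these made-up counts, accuracy works out to 98.5% and the false positive rate to 1%, which shows why both numbers must be reported: a detector can score high accuracy overall while still flagging some genuine human writing.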
Many existing AI detectors are very easy to trick: simply swapping out some words with a paraphrasing tool like Quillbot is enough to bypass detection. Originality.ai is the exception; it can detect whether content is AI-generated, original, or paraphrased.
A false positive is when an AI detector incorrectly identifies human-created content as being likely generated by an AI.
There is often some misunderstanding when it comes to false positives. For clarity, here is how Originality.ai aims to identify AI vs Original content under different content creation scenarios…
So if an AI outlined the content, a human then wrote some of it, and an AI finally edited or expanded on it, Originality.ai aims to identify this as AI-generated. This is not a false positive.
Similarly, and more obviously, if ChatGPT creates a piece of content and someone then painstakingly edits it, but it is still identified as AI-generated, this is not a false positive.
This is a tricky question and one that is still being debated as the use of AI continues to increase. The launch of our Lite 1.0.0 model permits light AI editing such as using popular tools like Grammarly’s spelling and grammar suggestions while editing.
Overall, it’s best to maintain transparency when using AI, whether it’s for content outlines or editing.
So, in a world where the rate of false positives isn’t (nor will ever be) 0%, how should AI detectors be used?
AI detection does not provide a provable output for each piece of text, meaning no AI detector can say with provable certainty, "this is how we KNOW your content was created with AI."
Therefore, it is best…
Many Originality.ai customers using this strategy have been able to successfully identify writers who were using AI (even though they had been asked not to). Additionally, the Originality.ai customers using this approach were confident they could safely ignore a suspected false positive.
Read case studies of how customers have found success with Originality.ai.
However, for academic disciplinary action, AI detection scores alone are simply not enough.
Some might question whether it is responsible to offer an AI detection tool that is not perfect. At Originality.ai, we are confident in the detection rates shown by our tests and in the additional steps we have taken to reduce and manage false positives, including the free tools we offer.
In a world where AI content is allowed to run wild unchecked, the impact on many of us would be significant.
ChatGPT and other AI tools like HuggingChat are here — there is no going back. Everyone involved with writing needs to adapt to this new world.
The rise of ChatGPT and AI detectors has led to some unfortunate situations in academia.
Here are some examples:
To be clear: at Originality.ai, we don't believe that an AI detection score alone is enough for disciplinary action. If a report indicates the potential use of AI, always review the content carefully and on a case-by-case basis.