
AI Detection Case Study: Are GPT Detectors Biased Against Non-Native English Speakers?

Recently, author Janelle Shane published a case study on AI Weirdness highlighting AI detectors like Originality.AI and how they misclassify human-written text as AI-written, particularly when the text is written by non-native speakers. This was based on a data study conducted by university students.

Jonathan Gillham

First off: we welcome any and all AI detection accuracy tests, since they help educate users on the limitations of AI detectors and how to use them properly.

This article digs into the study and updates it based on an analysis of the very same data against Originality.AI’s latest version (1.4). The results are clear: Originality.AI 1.4 was by far the most accurate detector and had the fewest false positives of all the AI content detectors tested.

AI Weirdness Article

Taking a step back, the AI Weirdness article is meant to take a deeper look at false positives, their apparent bias against English-as-a-second-language students, and the resulting issues with academic integrity. False positives do happen, and the claim that they may affect some students more than others is alarming and should be investigated.

The author compares her own writing and flowery Seuss-style poetry, and draws on a study that compared essays by U.S. 8th graders with TOEFL (Test of English as a Foreign Language) essays when run through a variety of AI detection platforms, including Originality.AI.

She mentioned that AI detectors had flagged her own writing from her book as “very likely to be written by AI”, but when we scanned that same excerpt with Originality.AI’s platform, it came back as 99% Original. You can see the scan with the excerpt here:

Next, she took her own writing and had it paraphrased by an AI tool. She claimed that when she ran the result through an AI content detector, it “was now rated likely to be written entirely by a human”. Again, we took that excerpt and ran it through Originality.AI’s platform, and it was correctly flagged as 100% AI. See report here:

Lastly, she said that she also got a “likely written entirely by a human” rating when she had paraphrasing tools rewrite the paragraphs in the metered rhyme of Dr. Seuss, in Old English, or as a startup pitch. We tested this against Originality.AI as well, which again correctly classified it as AI-generated. See report here:

These reports show that the updated version of Originality.AI (1.4) accurately classified the sections of text on which the author had previously received incorrect results, showing large improvements.

The Case Study 

AI content generation is, without a doubt, disrupting and revolutionizing entire industries online. Many experts, professors and publishers rely on AI content detectors to distinguish human-written content from AI-written content, both online and academically.

With the launch and ongoing development of groundbreaking tools like Originality.AI, there are going to be numerous case studies conducted to evaluate the efficacy of these types of tools – as there should be. 

We will be showcasing our tool’s capabilities within publicly available data sets in the weeks to come. 

However, the “Don’t Use AI Detectors for Anything Important” case study was based on version 1.1 of Originality.AI. We have made significant changes and are currently on version 1.4 of our tool. 

We ran our tool against this exact data set and feel that the case study should be modified to reflect that the latest Originality.AI model (1.4) detected AI-written content at 100% for all AI data sets.

Had our latest model been used for the case study, the charts would have looked a bit (a lot) different:

Marketing Use Vs Academic Use 

It’s also important to note the differences in false positives in terms of marketing versus academic use. We have repeatedly emphasized that Originality.AI is not for academic use. 

The data that our AI has been trained on is closely tied to online content written to rank on search engines, NOT academic papers. Our signup page specifically says that the tool is built for publishers, agencies and writers, not for students. If we deploy an academic-focused solution, we will train our AI detection on more TOEFL essays to avoid this problem. But for now our stance is that we are not for academic use and are built specifically for serious content marketers and SEOs.

So What About Non-Native English Speaker Texts Being Flagged as False Positives? 

As one of the most popular AI writing detection services, we are continually developing our services to have the lowest false positive detection rate of any AI content detector. With our most recent update, we now have this number at less than 2.5%. See the study and comparison with other tools along with our latest GPT-4-trained detection model by clicking here.
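For readers unfamiliar with the metric, here is a minimal sketch of how a false positive rate is typically calculated. The `Scan` record, the function names and the sample numbers are all made up for illustration; this is not Originality.AI's evaluation code or data.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """One scanned text (hypothetical record type for this sketch)."""
    is_ai_written: bool  # ground truth: was the text actually AI-written?
    flagged_as_ai: bool  # detector verdict


def false_positive_rate(scans):
    """Share of human-written texts that the detector flagged as AI."""
    human = [s for s in scans if not s.is_ai_written]
    if not human:
        raise ValueError("need at least one human-written text")
    return sum(s.flagged_as_ai for s in human) / len(human)


def detection_rate(scans):
    """Share of AI-written texts that the detector flagged as AI."""
    ai = [s for s in scans if s.is_ai_written]
    if not ai:
        raise ValueError("need at least one AI-written text")
    return sum(s.flagged_as_ai for s in ai) / len(ai)


# Made-up sample: 40 human-written texts (1 wrongly flagged)
# and 10 AI-written texts (all flagged).
sample = (
    [Scan(False, False)] * 39
    + [Scan(False, True)] * 1
    + [Scan(True, True)] * 10
)

print(f"false positive rate: {false_positive_rate(sample):.1%}")
print(f"detection rate: {detection_rate(sample):.1%}")
```

Running the same calculation separately on each group of writers (for example, native and non-native English speakers) is one way to check whether false positives fall more heavily on one group.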

When it comes to non-native English speaker texts being erroneously flagged as AI-written, we’re looking deeply into the correlations to find the underlying causes. One of the hypotheses we are investigating is what we call “cyborg writing”.

Cyborg writing happens when a writer uses too many writing assistant tools (many of which are powered by AI). For example, if a writer uses autocorrect while relying extensively on a grammar tool and runs their content through an outlining or content optimization tool, all of these tools leverage AI to some extent and that could be the underlying reason for these false positives.

Even if the student or content creator writes the words themselves, submitting it through different degrees of filtering and assistance can leave tell-tale AI “tracks” that systems trained to detect these tracks (like Originality.AI) will pick up.

But is this more of an issue among non-native English speakers? Or is it simply a more nuanced question of “What level of computer-aided assistance is allowed before something is no longer a writer’s original work?” We believe there is no easy, one-size-fits-all answer to this and that the answer will depend on the situation.

There Is No Perfect Solution (And There Never Will Be)

For this reason, even over the long-term, AI detectors will never be able to provide a perfectly clean 100% solid track record with zero false positives. However, being able to combine AI detection with the ability to visualize the content creation process is one of the main reasons why we built our free Google Chrome AI Detection Extension. This allows someone to see the creation of a Google Doc in order to prove that the writer did indeed create the content, rather than copying and pasting it from ChatGPT or another AI writing service.  

With all of these points in mind, and considering that the 1.4 version of Originality.AI accurately predicted AI-written (and human-written) content at nearly 100% across all data sets, we would like to invite the creators of this case study to rerun their data on our updated version and see the results firsthand.

Jonathan Gillham

Founder / CEO of Originality.AI. I have been involved in the SEO and Content Marketing world for over a decade. My career started with a portfolio of content sites. Recently I sold 2 content marketing agencies, and I am the Co-Founder of, the leading place to buy and sell content websites. Through these experiences I understand what web publishers need when it comes to verifying content is original. I am not for or against AI content; I think it has a place in everyone’s content strategy. However, I believe you as the publisher should be the one making the decision on when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!