After OpenAI's failed attempt at building a ChatGPT text detector, for the reasons discussed below, they have focused on a watermarking solution. Watermarking is the process of leaving a fingerprint in the text based on the word choices the LLM makes while generating it; this fingerprint can then be identified afterward.
However, it has been reported that despite building this watermarking solution, they have been reluctant to release it for the following reasons:
We have some news and updates to share about this review of OpenAI’s Text Classifier.
OpenAI has made the tough call to discontinue their text classifier due to its “poor rate of accuracy,” according to TechCrunch. You can see how OpenAI’s Text Classifier performed in our accuracy post here: https://originality.ai/blog/ai-content-detection-accuracy
This industry is challenging, and we absolutely empathize with the team at OpenAI. We can imagine the considerable pressure they faced as the creators of ChatGPT, being expected to serve as the source of "truth" about text produced by their own application.
Analyzing the situation as a third party, we believe that OpenAI's position at the forefront of AI technology, as essentially the creator of AI-generated content, meant that its approach to AI detection had to be more conservative and cautious. As the creators of ChatGPT, they would have faced scrutiny over false positives that was exceptionally difficult to navigate.
Developing an AI detection tool to identify ChatGPT-generated text is not an easy task! It requires working through complex challenges and acknowledging that there will never be a 100% perfectly accurate solution, no matter what! Balancing the need for accuracy while avoiding the backlash of false positives is a delicate tightrope to walk, even more so for OpenAI than anyone else.
We believe that both transparency and accountability are crucial in this industry, and we commend OpenAI for making the tough call to shut it down when its performance fell short of the desired standard.
Take a look at the video we did on OpenAI’s accuracy:
You may already know that AI content writing tools enable people to produce engaging and informative content quickly. However, since AI is not perfect, there are often quality issues and the potential for plagiarism. This is when AI detection tools come into play.
AI content detectors are capable of detecting content generated by AI-powered software, even if it’s free of plagiarism. This service is especially useful now that Google penalizes websites that contain unnatural or artificially generated text.
From marketers and businesses buying content to educators concerned about the originality of their students’ work, AI content detection tools provide a straightforward way to learn whether a piece of text has been crafted with the aid of these specialized tools.
So today, we’re here to look at OpenAI Text Classifier, OpenAI’s own AI detector. OpenAI is well known for developing the transformer-based models GPT-3 and ChatGPT. Let’s see how well its detection software performs and how its capabilities stack up against other platforms.
With its intuitive user interface and powerful algorithms, OpenAI Text Classifier simplifies the process of identifying AI-generated text in large collections of documents or conversations. Its key features include:
Keep in mind that the OpenAI Text Classifier will not work on all texts. The minimum number of characters it needs is 1,000, which translates to roughly 150-250 words.
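As a quick illustration of that constraint, here is a minimal sketch in Python (our own example, not part of the tool); the only assumption carried over from above is the 1,000-character minimum:

```python
def meets_minimum_length(text: str, min_chars: int = 1000) -> bool:
    """Return True if the text is long enough for OpenAI Text Classifier,
    which requires at least 1,000 characters (roughly 150-250 words)."""
    return len(text) >= min_chars

draft = "Some draft paragraph text... " * 40
print(meets_minimum_length(draft))  # True once the draft reaches 1,000 characters
```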
Unfortunately, despite being built on a text-generating model, this tool cannot detect plagiarism.
Here is where we get into the nitty-gritty. Let’s compare the performance of OpenAI Text Classifier and Originality.ai in an AI detection test using our sample texts generated with Jasper, a popular and powerful GPT-based content generator.
As a top-ranking AI-detection tool, Originality.ai can identify and flag GPT-2, GPT-3, GPT-3.5, and even ChatGPT material. It will be interesting to see how well these two platforms perform in detecting 100% AI-generated content.
OpenAI Text Classifier employs a different probability structure from other AI content detection tools. It labels text based on the likelihood that it was created by artificial intelligence, ranging from “very unlikely” (less than a 10% probability) through “unlikely” (10%-45%), “unclear if it is” (45%-90%), and “possibly” (90%-98%), to “likely” (above 98%).
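To make those buckets concrete, here is a minimal Python sketch (our own illustration, not OpenAI's code) of how an AI-probability score would map onto the five labels described above:

```python
def classify_probability(p_ai: float) -> str:
    """Map an AI-probability score (0.0-1.0) onto OpenAI Text Classifier's
    five reported labels, using the thresholds described above."""
    if p_ai < 0.10:
        return "very unlikely"
    elif p_ai < 0.45:
        return "unlikely"
    elif p_ai < 0.90:
        return "unclear if it is"
    elif p_ai < 0.98:
        return "possibly"
    return "likely"

print(classify_probability(0.93))  # -> "possibly"
print(classify_probability(0.05))  # -> "very unlikely"
```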
Originality.ai, however, provides results in terms of percentages that indicate how much of a text has been generated by AI and how much has been plagiarized.
Below is a side-by-side comparison of the OpenAI Text Classifier and Originality.ai.
From the get-go, OpenAI Text Classifier appears unable to accurately assess our samples as having been AI-generated.
OpenAI has been candid about the poor accuracy rate of their detection tool, reporting that it only correctly identified 26% of AI-written text when they used it on their testable data set. In the same test, false positives occurred 9% of the time. Among the seven samples used in our test, the tool was completely incorrect twice.
Meanwhile, Originality.ai confidently detected AI content in five out of seven samples, with only one error and one uncertain outcome.
It should be noted, however, that OpenAI determines its accuracy by how many times it predicted that the generated content was AI “with a very high degree of certainty.” In contrast, Originality.ai defines it as how many times it correctly predicted that the generated content was AI, explaining its high level of accuracy.
While comparable, OpenAI appears to be “less accurate,” not because its algorithm is inferior but because it is more cautious.
To give OpenAI Text Classifier some credit, OpenAI has made it clear that it should not be used as a primary decision-making tool but rather as a complement to other methods of determining text sources.
This makes Originality.ai, according to our results, the obvious choice for those seeking a precise and dependable AI content recognition program.
We took an even deeper look at the OpenAI Text Classifier’s performance vs. Originality.ai’s detection AI.
“Each document is labeled as either very unlikely, unlikely, unclear if it is, possibly, or likely AI-generated.” – OpenAI
Since we use two classes, AI and Human, while OpenAI uses five graded classes, we map the “unclear if it is,” “possibly,” and “likely” labels to AI, and the remaining “very unlikely” and “unlikely” labels to Human, as sketched below.
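The mapping itself is simple; a minimal Python sketch (label strings assumed to match OpenAI's wording quoted above) looks like this:

```python
# Collapse OpenAI's five graded labels into the two classes used in our test.
AI_LABELS = {"unclear if it is", "possibly", "likely"}
HUMAN_LABELS = {"very unlikely", "unlikely"}

def binarize(label: str) -> str:
    """Map an OpenAI Text Classifier label to 'AI' or 'Human'."""
    if label in AI_LABELS:
        return "AI"
    if label in HUMAN_LABELS:
        return "Human"
    raise ValueError(f"Unexpected label: {label!r}")

print(binarize("possibly"))       # -> "AI"
print(binarize("very unlikely"))  # -> "Human"
```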
In the images below, the Y-axis shows the actual class and the X-axis shows the class the model predicted. (A minimal sketch of how such a confusion matrix is computed follows the list below.)
1. All samples
2. 20 samples generated by GPT-3 and 20 samples written by humans
3. 20 samples generated by GPT-J and 20 samples written by humans
4. 20 samples generated by GPT-Neo and 20 samples written by humans
5. 20 samples generated by GPT-2 and 20 samples written by humans
6. 20 samples paraphrased from original human-written content and 20 samples written by humans
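For readers who want to reproduce this kind of analysis, here is a minimal sketch (with made-up labels, not our actual sample data) of how each confusion matrix is computed, with rows as the actual class and columns as the predicted class:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted classes for six documents.
actual    = ["AI", "AI", "AI", "Human", "Human", "Human"]
predicted = ["AI", "Human", "AI", "Human", "Human", "AI"]

# `labels` fixes the row/column order: [AI, Human].
cm = confusion_matrix(actual, predicted, labels=["AI", "Human"])
print(cm)
# [[2 1]    row 0: actual AI    -> 2 predicted AI, 1 predicted Human
#  [1 2]]   row 1: actual Human -> 1 predicted AI, 2 predicted Human
```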
If we compare OpenAI Text Classifier to Originality.ai, we can observe a few distinctions in their performance.
If you’re simply interested in detecting AI-generated content using an accessible interface that provides fast, easy-to-understand results, OpenAI is certainly worth checking out.
As an added advantage, this tool was developed by the company that created ChatGPT, ensuring that it has considerable potential to grow and improve. As a complement to other AI-detection software, the classifier is also free to use.
If you’re looking for a quick and easy AI-detection tool that you can easily whip up on a browser, then OpenAI Text Classifier might well fit the bill. It’s essentially a finely tuned GPT model designed to determine whether the text has been generated by AI from a variety of sources, in particular from ChatGPT.
Using Originality.ai, however, will give you a more reliable and accurate analysis. The pay-as-you-go option gives users more flexibility when it comes to pricing, while its accuracy rate and advanced features set it apart from other options.