After OpenAI's failed attempt at building a ChatGPT text detector, for the reasons discussed below, they have focused on a watermarking solution. Watermarking is the process of leaving a fingerprint in the text based on the word choices the LLM makes while generating it; this fingerprint can then be identified afterward.
However, it has been reported that despite building this watermarking solution, they have been reluctant to release it for the following reasons:
We have some news and updates to share about this review of OpenAI’s Text Classifier.
OpenAI has made the tough call to discontinue their text classifier due to its “poor rate of accuracy,” according to TechCrunch. You can see how OpenAI’s Text Classifier performed in our accuracy post here: https://originality.ai/blog/ai-content-detection-accuracy
This industry is challenging, and we absolutely empathize with the team at OpenAI. We can imagine the considerable pressure they faced as the creators of ChatGPT, being expected to serve as the source of "truth" about text produced by their own application.
Analyzing the situation as a third party, we believe that OpenAI's position at the forefront of AI technology, as essentially the creator of AI-generated content, meant that its approach to AI detection had to be more conservative and cautious. As the creators of ChatGPT, they would have faced scrutiny over false positives that was exceptionally difficult to navigate.
Developing an AI detection tool to identify ChatGPT-generated text is not an easy task! It requires working through complex challenges and acknowledging that there will never be a 100% perfectly accurate solution, no matter what! Balancing the need for accuracy while avoiding the backlash of false positives is a delicate tightrope to walk, even more so for OpenAI than anyone else.
We believe that both transparency and accountability are crucial in this industry, and we commend OpenAI for making the tough call to shut it down when its performance fell short of the desired standard.
Take a look at the video we did on OpenAI’s accuracy:
You may already know that AI content writing tools enable people to produce engaging and informative content quickly. However, since AI is not perfect, there are often quality issues and the potential for plagiarism. This is when AI detection tools come into play.
AI content detectors are capable of detecting content generated by AI-powered software, even if it’s free of plagiarism. This service is especially useful now that Google penalizes websites that contain unnatural or artificially generated text.
From marketers and businesses buying content to educators concerned about the originality of their students’ work, AI content detection tools provide a straightforward way to learn whether a piece of text has been crafted with the aid of these specialized tools.
So today, we’re here to look at OpenAI Text Classifier, OpenAI’s own AI detector. OpenAI is well known for developing the transformer-based models GPT-3 and ChatGPT. Let’s see how well its detection software performs and how its capabilities stack up against other platforms.
With its intuitive user interface and powerful algorithms, OpenAI Text Classifier simplifies the process of identifying AI-generated text in large collections of documents or conversations. Its key features include:
Keep in mind that the OpenAI Text Classifier will not work on all texts. The minimum number of characters it needs is 1,000, which translates to roughly 150-250 words.
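As a quick illustration of that constraint, here is a minimal sketch in Python (our own example, not part of the tool); the only assumption carried over from above is the 1,000-character minimum:

```python
def meets_minimum_length(text: str, min_chars: int = 1000) -> bool:
    """Return True if the text is long enough for OpenAI Text Classifier,
    which requires at least 1,000 characters (roughly 150-250 words)."""
    return len(text) >= min_chars

draft = "Some draft paragraph text... " * 40
print(meets_minimum_length(draft))  # True once the draft reaches 1,000 characters
```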
Unfortunately, despite being built on a text-generating model, this tool cannot detect plagiarism.
Here is where we get into the nitty-gritty. Let’s compare the performance of OpenAI Text Classifier and Originality.ai in an AI detection test using our sample texts generated with Jasper, a popular and powerful GPT-based content generator.
As a top-ranking AI-detection tool, Originality.ai can identify and flag GPT-2, GPT-3, GPT-3.5, and even ChatGPT material. It will be interesting to see how well these two platforms perform in detecting 100% AI-generated content.
OpenAI Text Classifier employs a different probability structure from other AI content detection tools. It labels text based on the likelihood that it was created by artificial intelligence, ranging from “very unlikely” (less than a 10% probability) through “unlikely” (10%-45%), “unclear if it is” (45%-90%), and “possibly” (90%-98%), to “likely” (above 98%).
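To make those buckets concrete, here is a minimal Python sketch (our own illustration, not OpenAI's code) of how an AI-probability score would map onto the five labels described above:

```python
def classify_probability(p_ai: float) -> str:
    """Map an AI-probability score (0.0-1.0) onto OpenAI Text Classifier's
    five reported labels, using the thresholds described above."""
    if p_ai < 0.10:
        return "very unlikely"
    elif p_ai < 0.45:
        return "unlikely"
    elif p_ai < 0.90:
        return "unclear if it is"
    elif p_ai < 0.98:
        return "possibly"
    return "likely"

print(classify_probability(0.93))  # -> "possibly"
print(classify_probability(0.05))  # -> "very unlikely"
```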
Originality.ai, however, provides results in terms of percentages that indicate how much of a text has been generated by AI and how much has been plagiarized.
Below is a side-by-side comparison of the OpenAI Text Classifier and Originality.ai.
From the get-go, OpenAI Text Classifier appears unable to accurately assess our samples as having been AI-generated.
OpenAI has been candid about the poor accuracy rate of their detection tool, reporting that it only correctly identified 26% of AI-written text when they used it on their testable data set. In the same test, false positives occurred 9% of the time. Among the seven samples used in our test, the tool was completely incorrect twice.
Meanwhile, Originality.ai confidently detected AI content in five out of seven samples, with only one error and one uncertain outcome.
It should be noted, however, that OpenAI determines its accuracy by how many times it predicted that the generated content was AI “with a very high degree of certainty.” In contrast, Originality.ai defines it as how many times it correctly predicted that the generated content was AI, explaining its high level of accuracy.
While comparable, OpenAI appears to be “less accurate,” not because its algorithm is inferior but because it is more cautious.
To give OpenAI Text Classifier some credit, OpenAI has made it clear that it should not be used as a primary decision-making tool but rather as a complement to other methods of determining text sources.
This makes Originality.ai, according to our results, the obvious choice for those seeking a precise and dependable AI content recognition program.
We took an even deeper look at the OpenAI Text Classifier’s performance vs. Originality.ai’s detection AI.
“Each document is labeled as either very unlikely, unlikely, unclear if it is, possibly, or likely AI-generated.” – OpenAI
Since we use two classes, AI and Human, while OpenAI uses five graded classes, we map the “unclear if it is,” “possibly,” and “likely” labels to AI, and the remaining “very unlikely” and “unlikely” labels to Human, as sketched below.
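The mapping itself is simple; a minimal Python sketch (label strings assumed to match OpenAI's wording quoted above) looks like this:

```python
# Collapse OpenAI's five graded labels into the two classes used in our test.
AI_LABELS = {"unclear if it is", "possibly", "likely"}
HUMAN_LABELS = {"very unlikely", "unlikely"}

def binarize(label: str) -> str:
    """Map an OpenAI Text Classifier label to 'AI' or 'Human'."""
    if label in AI_LABELS:
        return "AI"
    if label in HUMAN_LABELS:
        return "Human"
    raise ValueError(f"Unexpected label: {label!r}")

print(binarize("possibly"))       # -> "AI"
print(binarize("very unlikely"))  # -> "Human"
```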
In the images below, the Y-axis shows the actual class and the X-axis shows the class the model predicted. (A minimal sketch of how such a confusion matrix is computed follows the list below.)
1. All samples
2. 20 samples generated by GPT-3 and 20 samples written by humans
3. 20 samples generated by GPT-J and 20 samples written by humans
4. 20 samples generated by GPT-Neo and 20 samples written by humans
5. 20 samples generated by GPT-2 and 20 samples written by humans
6. 20 samples paraphrased from original human-written content and 20 samples written by humans
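For readers who want to reproduce this kind of analysis, here is a minimal sketch (with made-up labels, not our actual sample data) of how each confusion matrix is computed, with rows as the actual class and columns as the predicted class:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted classes for six documents.
actual    = ["AI", "AI", "AI", "Human", "Human", "Human"]
predicted = ["AI", "Human", "AI", "Human", "Human", "AI"]

# `labels` fixes the row/column order: [AI, Human].
cm = confusion_matrix(actual, predicted, labels=["AI", "Human"])
print(cm)
# [[2 1]    row 0: actual AI    -> 2 predicted AI, 1 predicted Human
#  [1 2]]   row 1: actual Human -> 1 predicted AI, 2 predicted Human
```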
If we compare OpenAI Text Classifier to Originality.ai, we can observe a few distinctions in their performance.
If you’re simply interested in detecting AI-generated content using an accessible interface that provides fast, easy-to-understand results, OpenAI is certainly worth checking out.
As an added advantage, this tool was developed by the company that created ChatGPT, ensuring that it has considerable potential to grow and improve. As a complement to other AI-detection software, the classifier is also free to use.
If you’re looking for a quick and easy AI-detection tool that you can easily whip up on a browser, then OpenAI Text Classifier might well fit the bill. It’s essentially a finely tuned GPT model designed to determine whether the text has been generated by AI from a variety of sources, in particular from ChatGPT.
Using Originality.ai, however, will give you a more reliable and accurate analysis. The pay-as-you-go option gives users more flexibility when it comes to pricing, while its accuracy rate and advanced features set it apart from other options.