
by Jonathan Gillham



A lot of questions have come up about how our AI content detection works and achieves 94%+ accuracy across all modern AI text generation tools.

Here are the nitty-gritty details from our engineering team on how they trained the most accurate AI content detection tool.

1. How Does the Tool Work?

The Problem

  • The tool’s core is a modified version of BERT (Bidirectional Encoder Representations from Transformers), a Google-developed artificial intelligence system for processing natural language.
  • A lot of research has shown that it is increasingly difficult for humans to tell whether a piece of text was created by a writer or generated by AI: as the generating models get bigger, the quality of the generated text gets better. Detection with a Machine Learning (ML) model is therefore an urgent problem.
  • However, detecting synthetic text is still a difficult task. An adversary can generate text from datasets outside the training distribution, use different sampling techniques, and apply many other evasion tricks. We did a lot of research to deliver an optimal model in terms of architecture, data, and training techniques.


  • We used a modified version of the BERT model for classification. We tested multiple model architectures and concluded that, at equal capacity, a discriminative model with a more flexible architecture than the generative model (e.g., bidirectional attention) is more powerful at detection. We also learned that the larger the model, the more accurate and robust its predictions. Our production model is large, yet still meets response-time requirements thanks to modern deployment techniques.
  • This is one of the reasons our AI detection tool can outperform any alternative.
  • Of course, we did not use BERT itself to train the detector; we trained our own pre-trained language model with a new architecture, although it is still Transformer-based.
    • We first pre-trained this new language model on 160GB of text data, using a technique similar to ELECTRA: two models are trained together, a generator and a discriminator, and after training the discriminator is kept as our language model.
    • Second, we fine-tuned that model on a training dataset we built up to millions of samples.
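The ELECTRA-style pre-training objective described above can be sketched in miniature. Everything below (the tiny vocabulary, the random replacement “generator”) is illustrative only; the real system uses a small masked language model as the generator and a large Transformer as the discriminator:

```python
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran"]

def corrupt(tokens, mask_rate=0.3):
    """ELECTRA-style corruption: a 'generator' replaces some tokens.
    Here the generator is just random sampling from the vocabulary."""
    corrupted, labels = [], []
    for tok in tokens:
        if random.random() < mask_rate:
            replacement = random.choice(VOCAB)
            corrupted.append(replacement)
            # Label 1 = "replaced", but only if the token actually changed.
            labels.append(1 if replacement != tok else 0)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

# The discriminator is then trained to predict, for every position,
# whether the token is original (0) or replaced (1).
sentence = ["the", "cat", "sat", "on", "the", "mat"]
corrupted, labels = corrupt(sentence)
```

After this replaced-token-detection pre-training, the discriminator (not the generator) is kept and fine-tuned for the human-vs-AI classification task.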

Training data

  • Obviously, the quality of the model depends heavily on the input data. By collecting and processing a large amount of real text from many sources, and running it through a cleaning process, we obtained a large corpus of human-written text for training.
  • For generated text, in addition to using the best pre-trained generation models available today (…), we fine-tuned them on our real-text dataset so that their output is more natural and closer to the distribution of real text. This lets the model learn a boundary good enough to generalize at inference time.
  • Our training dataset is generated with a variety of sampling methods and is extensively reviewed by human annotators.

Other techniques

  • There are many sampling techniques for generating text, such as temperature, top-k, and nucleus (top-p) sampling. We expect more advanced generation strategies (such as nucleus sampling) to make detection harder than text produced via top-k truncation. Using a mixed dataset drawn from different sampling techniques also increases the model’s accuracy, so when generating our training data we sampled in different ways and from many different models, making the data diverse and helping the model better understand the kinds of text AI produces.
  • One of the challenges for a detection model is very short texts. We have improved the model’s accuracy in these cases with several techniques; in our tests, the new model improves accuracy by more than 15% over the previous one. The current model is therefore more accurate on much shorter texts, although in our experiments it reaches acceptable reliability at lengths of 50 tokens or longer.
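As a rough illustration of the sampling strategies named above, here is a toy implementation of temperature scaling, top-k filtering, and nucleus (top-p) filtering over a four-token vocabulary. The logits are made up for the example; this is not the production generation code:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]

def nucleus_filter(probs, p=0.9):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = set(), 0.0
    for i in order:
        keep.add(i)
        cum += probs[i]
        if cum >= p:
            break
    kept = [probs[i] if i in keep else 0.0 for i in range(len(probs))]
    total = sum(kept)
    return [x / total for x in kept]

logits = [2.0, 1.0, 0.5, 0.1]   # toy vocabulary of 4 tokens
probs = softmax(logits)
```

Top-k always truncates to a fixed number of tokens, while the nucleus set grows or shrinks with the shape of the distribution, which is why nucleus-sampled text tends to look more natural and is harder to detect.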

Future work

  • Given the steady release of new, more complex text-generation models, this problem requires regular updates. The model needs to be retested and re-evaluated before we decide whether to keep using its results or retrain it on data from those new text-generation models. However, in tests against the most recent SOTA models, our model still does its job very well.
  • Taking our proven model training method and applying it to other languages is on our roadmap.
  • One line of future research is a model that understands each sentence and paragraph of a text deeply enough to show which sentences and paragraphs were written by AI and which by a human. We call this model the AI Highlight Detector.

Simplified Version of What is Under Development:

Mockup: text scored around 1% is highlighted light green to indicate originality, while text scored around 99% is marked red to show it was produced by AI writing tools.

2. How accurate is Originality.AI?

  • The model was trained on a million pieces of text labeled “human-generated” or “AI-generated”. During testing, we evaluated it on documents generated by artificial intelligence models including GPT-3, GPT-J, and GPT-Neo (20 thousand samples each). Our model correctly identified 94.06% of the text created by GPT-3, 94.14% of the text written by GPT-J, and 95.64% of the text generated by GPT-Neo.
  • The results show that the more powerful the generation model (e.g., GPT-3 or GPT-J), the harder it is for the detector to tell whether a human or an AI wrote the text.

3. What Does the Score Mean?

  • This is a binary classification problem: the objective is to determine whether a text was produced by AI. After training, the model takes a text as input and outputs the probability that it was generated by AI. To turn that probability into a final verdict, we first select a threshold; if the predicted probability that the input is AI-generated exceeds the threshold, the output is “fake” (AI-generated).
    • For instance, suppose the model estimates a 0.6 probability of AI generation and a 0.4 probability of human authorship. With a threshold of 0.5, the input text is classified as AI-generated because 0.6 > 0.5.
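The thresholding step is just a comparison. A minimal sketch using the 0.5 threshold and 0.6 probability from the example (the function name is ours for illustration, not the product’s API):

```python
def classify(ai_probability, threshold=0.5):
    """Binary decision: 'AI-generated' if the predicted probability
    of AI authorship exceeds the threshold, else 'human-written'."""
    return "AI-generated" if ai_probability > threshold else "human-written"

print(classify(0.6))  # 0.6 > 0.5, so this prints "AI-generated"
```

Raising the threshold trades false positives (human text flagged as AI) for false negatives (AI text passed as human), so the choice of threshold depends on which error is more costly.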

Diagram: the binary classification model takes a text as input and outputs the probability that it is fake (AI-generated) or real (human-written).

  • To evaluate the model, we compare its predicted labels against the true labels of the data and measure accuracy. Informally, accuracy is the fraction of predictions our model got right.

Accuracy = (number of correct predictions) / (total number of predictions)
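That formula in code, with hypothetical labels (1 = AI-generated, 0 = human-written):

```python
def accuracy(predictions, true_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, t in zip(predictions, true_labels) if p == t)
    return correct / len(true_labels)

preds = [1, 0, 1, 1, 0]  # model's predicted labels (made up for the example)
truth = [1, 0, 0, 1, 0]  # true labels
print(accuracy(preds, truth))  # 4 of 5 correct -> 0.8
```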

  • We also track many other metrics to test and evaluate the model in more detail.

About the Author: 

Jonathan Gillham

Founder / CEO of Originality.AI

I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites; I recently sold two content marketing agencies, and I am the Co-Founder of, the leading place to buy and sell content websites. Through these experiences I understand what web publishers need when it comes to verifying that content is original. I am neither for nor against AI content; I think it has a place in everyone’s content strategy. However, I believe you, as the publisher, should be the one deciding when to use AI content. Our originality-checking tool has been built with serious web publishers in mind!
