With the release of OpenAI’s latest AI model, the GPT-4o “omnimodel”, we need to understand how accurate our AI detector is on its output.
This quick study looks at 1,000 GPT-4o-generated text samples to answer whether GPT-4o content can be detected.
Try our AI Detector here.
To evaluate the detectability of GPT-4o, we prepared a dataset of 1,000 GPT-4o-generated text samples.
To generate the AI text, we used GPT-4o with the three approaches given below:
To evaluate efficacy, we used the open-source AI Detection Efficacy tool that we have released:
Originality.ai has two models for AI text detection: Model 3.0 Turbo and Model 2.0 Standard.
The open-source testing tool returns a variety of metrics for each detector you test, each of which reports on a different aspect of that detector's performance, including:
If you'd like a detailed discussion of these metrics, what they mean, how they're calculated, and why we chose them, check out our blog post on AI detector evaluation. For a succinct upshot, though, we think the confusion matrix is an excellent representation of a model's performance.
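As a rough illustration of what a confusion matrix records, here is a minimal Python sketch. It is not the code of our open-source testing tool; the function name `confusion_matrix` and the toy labels are assumptions made up for this example, with `True` standing for "AI-generated".

```python
from collections import Counter


def confusion_matrix(labels, predictions):
    """Count the four outcomes of a binary detector.

    labels      -- ground truth, True if the text is AI-generated
    predictions -- detector verdicts, True if flagged as AI
    """
    counts = Counter(zip(labels, predictions))
    return {
        "tp": counts[(True, True)],    # AI text flagged as AI
        "fn": counts[(True, False)],   # AI text missed
        "fp": counts[(False, True)],   # human text flagged as AI
        "tn": counts[(False, False)],  # human text passed as human
    }


# Hypothetical toy data, not results from this study.
labels = [True, True, True, False, False]
predictions = [True, True, False, False, True]

cm = confusion_matrix(labels, predictions)
print(cm)
```

Every other metric the tool reports can be derived from these four counts, which is why the matrix works well as a one-glance summary.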
Below is an evaluation of both models on the dataset described above.
Because this smaller test only measures whether Originality.ai’s AI detector can identify GPT-4o content, we look at the True Positive Rate: the percentage of the 1,000 GPT-4o samples that the model correctly identified as AI.
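To make the metric concrete, the sketch below computes a True Positive Rate for a dataset in which every sample is AI-written. The 0.5 decision threshold, the function name `true_positive_rate`, and the sample scores are all assumptions for illustration; they are not values from this study or from our detector.

```python
def true_positive_rate(ai_scores, threshold=0.5):
    """Fraction of AI-written samples the detector flags as AI.

    ai_scores -- detector scores in [0, 1] for samples known to be AI-written
    threshold -- assumed cut-off: a score at or above it counts as "AI"
    """
    if not ai_scores:
        raise ValueError("need at least one sample")
    flagged = sum(score >= threshold for score in ai_scores)
    return flagged / len(ai_scores)


# Hypothetical scores for four AI-written samples.
sample_scores = [0.97, 0.88, 0.42, 0.99]

tpr = true_positive_rate(sample_scores)
print(f"True Positive Rate: {tpr:.1%}")
```

Because the dataset contains no human-written text, the True Positive Rate is the only rate it can support; a false positive rate would need human samples as well.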
Model 2.0 Standard:
Model 3.0 Turbo:
In the constantly evolving realm of AI-generated content, the veracity of information is of utmost importance. With a couple of fact-checking solutions available, discerning their efficacy becomes crucial. Originality.ai, known for its transparency and accuracy in AI content detection, has recently ventured into the domain of fact-checking. How does our solution stack up against well-established giants like ChatGPT or emerging contenders like Llama-2? This study aims to answer this question.
We believe that it is crucial for AI content detectors' reported accuracy to be open, transparent, and accountable. Each person seeking AI-detection services deserves to know which detector is the most accurate for their specific use case.