We used this open-source AI detector accuracy tool to ensure each API received the exact same articles for consistency. The tool also automatically calculates efficacy and a complete statistical analysis of the results, giving us in-depth insight into each detector's performance.
Results are presented below, and the full test data is available above.
We use machine learning best practices to evaluate how well a classifier works. Here's a guide that covers the process in greater detail, including more on the accuracy of AI detectors as a whole.
When assessing the performance of an AI detector, the best way to do so is by looking at the confusion matrix (which you will see outlined for each detector in this article) and the F1 score. The F1 score is frequently used as a metric to condense the overall confusion matrix into a single figure.
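For readers who want to reproduce the metric themselves, here is a minimal sketch of how an F1 score is derived from a confusion matrix. The counts below are hypothetical placeholders for illustration, not the study's actual data:

```python
# Hypothetical confusion-matrix counts (illustrative only, not this study's data).
# Positive class = "AI-generated".
tp = 180  # AI articles correctly flagged as AI (true positives)
fn = 20   # AI articles missed, i.e. labeled human (false negatives)
fp = 10   # human articles wrongly flagged as AI (false positives)
tn = 190  # human articles correctly labeled human (true negatives)

precision = tp / (tp + fp)  # of everything flagged AI, how much really was AI
recall = tp / (tp + fn)     # of all AI articles, how many were caught

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"Precision: {precision:.3f}, Recall: {recall:.3f}, F1: {f1:.3f}")
```

Because F1 balances precision against recall, a detector that catches most AI content but also produces many false positives (or vice versa) is penalized, which is why it is a more informative single number than raw accuracy alone.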
Originality correctly identified 94.3% of the AI-written content as AI-generated, mistakenly classifying it as human-written 5.7% of the time.
F1 score: 0.76
GPTZero did not perform as well as Originality.ai in this test, correctly identifying only 61.7% of the content as AI-generated and mistakenly classifying it as human-written 38.3% of the time.
F1 score: 0.78
CopyLeaks fared slightly better than GPTZero (+1.5% correct detections), but it was still successful less than two-thirds of the time, with 36.8% of articles incorrectly identified as human-written.
F1 score: 0.86
Like GPTZero and CopyLeaks, Sapling was only able to correctly identify about two-thirds of the AI-generated articles in this test (67.5%).
Based on the results of this 200-article test, Mixtral's AI detectability appears to be in keeping with that of other LLMs, such as Google Bard and ChatGPT.
Overall, Originality.ai significantly outperformed GPTZero, CopyLeaks, and Sapling at detecting AI-generated content. While those three tools all sat between 62%-68% accuracy, Originality boasted a 95% accuracy rate.
It's important to bear in mind that AI detectors all have flaws, and there are instances of false positives and missed AI content. However, this study further demonstrates the reliability of the Originality.ai tool.
It also highlights the importance of studies like this in maintaining confidence in the content we consume, furthering the value of transparency. If you are interested in running your own study, please reach out; we are happy to offer research credits.
We are also still looking for a participant in our charity challenge (do AI detectors work?). If you'd like to get involved, please get in touch.
Founder / CEO of Originality.AI. I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites; I recently sold two content marketing agencies, and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences, I understand what web publishers need when it comes to verifying that content is original. I am neither for nor against AI content; I think it has a place in everyone's content strategy. However, I believe you, as the publisher, should be the one making the decision on when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!