
Can Mixtral AI Content be Detected

Can AI tools spot Mixtral's content? New study reveals detection success rate! See Mixtral AI Content Detection findings now.

Jonathan Gillham

We're back with another study, this time focusing on whether Mixtral AI content can be detected by market-leading AI content detection tools, including Originality.ai.

In this study, we focus on the Mixtral AI model and Originality.ai's ability to detect its content as AI-generated.

Below are the top-level findings from this study, including a breakdown of how we executed the testing and a short analysis of the results.

You can also access the dataset here, as well as the open-source AI detection testing/statistical analysis tool used for this study.

We have made and open-sourced an AI detector efficacy research tool that makes it even easier for researchers to test AI detector efficacy against a dataset.

Note: this is ONLY a 200-sample test, which is far too small to support conclusive answers. For a more complete AI detection accuracy study, see this study.

Short on time? Here are the key findings:

Mixtral AI content can be identified with accuracy similar to content from other LLMs:

  • Originality.ai - 94.3% True Positive Rate
  • GPTZero - 61.7% True Positive Rate
  • CopyLeaks - 63.2% True Positive Rate
  • Sapling - 67.5% True Positive Rate

Method:

  • We created a 200-sample text dataset using Mixtral on Jan 17
  • Using each detector's API, we tested the dataset against multiple detectors
  • We used this open-source AI detector accuracy tool to ensure each API received the exact same articles for consistency. The tool also automatically calculates the efficacy metrics and a complete statistical analysis of the results, giving us in-depth insight into each tool's performance.
  • Results are presented below, and the test data is available above
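The fan-out step above can be sketched as follows. This is a minimal illustration, not the open-source tool's actual code, and the detector callables are hypothetical stand-ins for each vendor's API client, not their real interfaces:

```python
# Sketch: send the identical article set to several detectors so every tool
# is scored on the same data, then compute each tool's true positive rate.
# The detector functions below are placeholders, not real vendor APIs.

from typing import Callable, Dict, List

def run_detectors(articles: List[str],
                  detectors: Dict[str, Callable[[str], bool]]) -> Dict[str, float]:
    """Return each detector's true-positive rate on an all-AI dataset."""
    rates = {}
    for name, detect in detectors.items():
        flagged = sum(1 for text in articles if detect(text))
        rates[name] = flagged / len(articles)
    return rates

# Toy stand-ins: one "detector" that flags everything, one that flags nothing.
articles = ["sample AI-generated text"] * 200
rates = run_detectors(articles, {
    "always_ai": lambda text: True,
    "never_ai": lambda text: False,
})
print(rates)  # {'always_ai': 1.0, 'never_ai': 0.0}
```

Because every detector sees the same 200 articles, differences in the resulting rates reflect the detectors themselves rather than variation in the input data.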

Analysis:

We use machine learning best practices to evaluate how well a classifier works. Here's a guide that goes into the process in greater detail, including more on the accuracy of AI detectors as a whole.

When assessing an AI detector's performance, the best approach is to look at the confusion matrix (which you will see outlined for each detector in this article) and the F1 score, a metric frequently used to condense the overall confusion matrix into a single figure.
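As a quick illustration of how these metrics relate, the sketch below derives precision, recall, accuracy, and F1 from raw confusion-matrix counts. The counts used in the example are hypothetical, not this study's data:

```python
# Illustrative sketch: computing standard classifier metrics from
# confusion-matrix counts (true/false positives and negatives).

def metrics(tp: int, fp: int, fn: int, tn: int):
    """Return (precision, recall, accuracy, f1) for the given counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, accuracy, f1

# Hypothetical example: 180 of 200 AI-written samples flagged as AI,
# 20 missed, and no human-written samples in the dataset.
p, r, acc, f1 = metrics(tp=180, fp=0, fn=20, tn=0)
print(f"precision={p:.2f} recall={r:.2f} accuracy={acc:.2f} f1={f1:.2f}")
# prints: precision=1.00 recall=0.90 accuracy=0.90 f1=0.95
```

Note that on a dataset containing only AI-written samples, recall and accuracy coincide, which is why those two figures match for each detector below.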

Mixtral AI Detection Results:

Originality.ai

Originality.ai detected 94.3% of the AI-written content.

F1 score: 0.97

Recall: 0.94

Accuracy: 0.94

Originality.ai detected that 94.3% of the AI-written content was, in fact, AI-generated, mistakenly identifying it as human-written 5.7% of the time.

GPTZero

GPTZero detected 61.7% of the AI-written content.

F1 score: 0.76

Recall: 0.62

Accuracy: 0.62

GPTZero did not perform as well as Originality.ai in this test, correctly identifying only 61.7% of the content as AI-generated and mistakenly classifying it as human-written 38.3% of the time.

CopyLeaks

CopyLeaks detected 63.2% of the AI-written content.

F1 score: 0.78

Recall: 0.63

Accuracy: 0.63

CopyLeaks fared slightly better than GPTZero (+1.5% correct detections), but it was still successful less than two-thirds of the time, incorrectly identifying 36.8% of the articles as human-written.

Sapling

Sapling detected 67.5% of the AI-written content.

F1 score: 0.86

Recall: 0.68

Accuracy: 0.68

Like GPTZero and CopyLeaks, Sapling was only able to correctly identify about two-thirds of the AI-generated articles in this test (67.5%).

Summary

Based on the results of this 200-article test, Mixtral AI detectability appears to be in keeping with that of other LLMs, such as Google Bard and ChatGPT.

Overall, Originality.ai significantly outperformed GPTZero, CopyLeaks, and Sapling at detecting AI-generated content. While those three tools all sat between 62% and 68% accuracy, Originality.ai achieved a 94.3% accuracy rate.

It's important to bear in mind that all AI detectors have flaws, and there are instances of false positives and missed AI content. However, this study further demonstrates the reliability of the Originality.ai tool.

It also highlights the importance of studies like this in maintaining confidence in the content we consume and furthering the value of transparency. If you are interested in running your own study, please reach out; we are happy to offer research credits.

We are also still looking for a participant in our challenge for charity (do AI detectors work?). If you'd like to get involved, please get in touch.

Jonathan Gillham

Founder / CEO of Originality.AI. I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites; more recently, I sold two content marketing agencies, and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences, I understand what web publishers need when it comes to verifying that content is original. I am neither for nor against AI content; I think it has a place in everyone's content strategy. However, I believe you, as the publisher, should be the one deciding when to use AI content. Our originality checking tool has been built with serious web publishers in mind!
