With the release of Alibaba’s Qwen2.5-Max, a cutting-edge Mixture-of-Experts (MoE) AI model, the boundaries of language generation and understanding continue to evolve. This development is particularly exciting for industries that rely on nuanced language understanding, like translation services, customer support, and content creation.
As AI-generated content becomes more advanced, the question arises: can we still detect it?
This study examines 1,000 samples from Qwen2.5-Max to assess their detectability using our Turbo and Lite AI content detectors, alongside GPTZero and RapidAPI’s AI Content Detector.
To evaluate the detectability of Qwen2.5-Max, we prepared a dataset of 1,000 text samples generated by Qwen2.5-Max.
To generate the AI text, we used Qwen2.5-Max with the three approaches given below:
To evaluate efficacy, we used the Open Source AI Detection Efficacy tool that we released:
Originality.ai offers two models for AI text detection: Model 3.0.1 Turbo and Model 1.0.0 Lite.
The open-source testing tool returns a variety of metrics for each detector you test, each of which reports on a different aspect of that detector’s performance, including:
If you'd like a detailed discussion of these metrics, what they mean, how they're calculated, and why we chose them, check out our blog post on AI detector evaluation. For a succinct snapshot, though, we think the confusion matrix is an excellent representation of a model's performance.
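As a rough illustration of how a confusion matrix yields a metric such as recall (True Positive Rate), here is a minimal Python sketch. It is not the code of the open-source efficacy tool. The function names, the label encoding (1 = AI, 0 = human), and the toy data are all assumptions made for this example.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (assumed: 1 = AI, 0 = human)."""
    pairs = Counter(zip(y_true, y_pred))
    return {
        "tp": pairs[(1, 1)],  # AI text flagged as AI
        "fn": pairs[(1, 0)],  # AI text missed
        "fp": pairs[(0, 1)],  # human text flagged as AI
        "tn": pairs[(0, 0)],  # human text passed as human
    }


def recall(counts):
    """True Positive Rate: share of AI samples correctly flagged as AI."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


# Toy labels for illustration only; these are not results from the study.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(f"recall = {recall(counts):.2f}")
```

When every sample in a dataset is AI-generated, as in this study, only the true-positive and false-negative cells are populated, so recall is the metric that matters.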
Below is an evaluation of both the models on the above dataset.
For this smaller test of how well Originality.ai’s AI detector identifies Qwen2.5-Max content, we look at the True Positive Rate: the percentage of the 1,000 Qwen2.5-Max samples that the model correctly identified as AI.
Model 1.0.0 Lite:
Model 3.0.1 Turbo:
GPTZero:
RapidAPI (AI Content Detector | AI/GPT):
Our study confirms that Qwen2.5-Max AI-generated text is highly detectable using our AI content detectors. Model 3.0.1 Turbo achieved an outstanding 99.8% recall, while Model 1.0.0 Lite followed closely with 96.9% recall.
Model 3.0.1 Turbo, developed by Originality.ai, surpassed GPTZero (97.9%) and significantly outperformed RapidAPI’s AI Content Detector (29.2%).
These results demonstrate the superior accuracy and reliability of our AI detection models in identifying AI-generated content, reinforcing their effectiveness as industry-leading tools for AI content detection.
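To make the reported recall figures concrete, the short Python sketch below converts each percentage into an implied number of detected and missed samples. It assumes all 1,000 samples were scored by every detector. The counts are derived arithmetically from the percentages above rather than taken from the raw results, and the helper name is hypothetical.

```python
TOTAL_SAMPLES = 1000  # size of the Qwen2.5-Max dataset used in the study

# Recall values as reported in this article.
reported_recall = {
    "Model 3.0.1 Turbo": 0.998,
    "Model 1.0.0 Lite": 0.969,
    "GPTZero": 0.979,
    "RapidAPI AI Content Detector": 0.292,
}


def implied_counts(recall, total=TOTAL_SAMPLES):
    """Return (detected, missed) counts implied by a recall value.

    Assumes every one of the `total` samples is AI-generated and was scored.
    """
    detected = round(recall * total)
    return detected, total - detected


for name, value in reported_recall.items():
    detected, missed = implied_counts(value)
    print(f"{name}: {detected} detected, {missed} missed")
```

Under these assumptions, Model 3.0.1 Turbo would have missed 2 of the 1,000 samples, while the RapidAPI detector would have missed 708.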
To learn more about AI detection and its efficacy, read our AI detection accuracy study and a meta-analysis of AI detection studies conducted by third parties.