
Is GPT-4.1 Content Detectable?

Using our proprietary Originality.ai AI detection tool, we analyzed OpenAI’s GPT-4.1 AI model to determine if it is detectable. These are our findings.

OpenAI has unveiled GPT-4.1, a powerful new AI model optimized for coding and instruction following. 

According to OpenAI, this model “outperforms GPT‑4o and GPT‑4o mini across the board.” 

With support for a one-million-token context window, GPT-4.1 marks a major leap in processing capacity, far beyond the 128,000-token limit of the previous GPT‑4o models.

In light of this release, we evaluated GPT-4.1 to test the accuracy of our AI Detector.

This quick study looks at 1000 GPT-4.1-generated text samples to answer whether GPT-4.1 content can be detected.

Is GPT-4.1 AI Content Detectable?

  • Yes. GPT-4.1 text is detectable, with 97.9% accuracy for our 3.0.1 Turbo model and 94.5% accuracy for our 1.0.0 Lite model.


Dataset

To evaluate the detectability of GPT-4.1, we prepared a dataset of 1000 GPT-4.1-generated text samples.

AI-Generated Text Data

To generate the AI text, we used GPT-4.1 with the following three approaches:

  1. Rewrite prompts: Generating content by giving the model a customized prompt along with reference articles (probably LLM-generated) to rewrite. (450 samples)
  2. Rewrite human-written text: Generating content by prompting the model to rewrite human-written text so that it bypasses AI detection. The human-written text came from an open-source dataset. (325 samples)
    1. One-Class Learning for AI-Generated Essay Detection
      1. Paper: https://www.mdpi.com/2076-3417/13/13/7901
      2. Dataset: https://github.com/rcorizzo/one-class-essay-detection
  3. Write articles from scratch: Generating articles from scratch on given topics, both fiction and non-fiction, across diverse domains such as history, medicine, mental health, content marketing, social media, literature, robots, and the future. (225 samples)

Evaluation

To evaluate the efficacy, we used our Open Source AI Detection Efficacy tool.

Originality.ai has three models — Model 3.0.1 Turbo, 1.0.0 Lite, and Multi Language for AI text detection.

  • Version 3.0.1 Turbo — If your risk tolerance for AI is ZERO! It is designed to identify any use of AI, even light AI.
  • Version 1.0.0 Lite — If you are okay with slight use of AI (i.e., AI editing).
  • Multi-Language — Detect AI content across 30 languages.

Learn more about which AI detection model is best for you and your use case.

The open-source testing tool returns a variety of metrics for each detector tested, each of which reports on a different aspect of that detector’s performance, including:

  • Sensitivity (True Positive Rate): The percentage of the time the detector identifies AI text correctly.
  • Specificity (True Negative Rate): The percentage of the time the detector identifies human-written text correctly.
  • Accuracy: The percentage of the detector’s predictions that were correct.
  • F1: The harmonic mean of Recall (Sensitivity) and Precision, often used as a single aggregate metric when ranking the performance of multiple detectors.
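As a minimal sketch of how these four metrics relate to the cells of a confusion matrix, the Python below computes them from raw counts. The names `ConfusionCounts` and `detector_metrics` are hypothetical and chosen for illustration only; this is not the code of the open-source testing tool, and the counts in the example are made up rather than taken from this study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # AI text classified as AI
    fn: int  # AI text classified as human
    tn: int  # human text classified as human
    fp: int  # human text classified as AI


def safe_div(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def detector_metrics(c):
    """Compute the standard metrics from one set of confusion-matrix counts."""
    sensitivity = safe_div(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = safe_div(c.tn, c.tn + c.fp)  # true negative rate
    precision = safe_div(c.tp, c.tp + c.fp)
    accuracy = safe_div(c.tp + c.tn, c.tp + c.fn + c.tn + c.fp)
    if precision is None or sensitivity is None or precision + sensitivity == 0:
        f1 = None
    else:
        # Harmonic mean of precision and recall
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Illustrative counts only (not results from this study)
example = ConfusionCounts(tp=90, fn=10, tn=80, fp=20)
for name, value in detector_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

Returning `None` for an undefined ratio is a design choice of this sketch: on a dataset with no human-written samples, for example, Specificity has a zero denominator and cannot be reported.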

If you'd like a detailed discussion of these metrics, what they mean, how they're calculated, and why we chose them, check out our blog post on AI detector evaluation. For a succinct snapshot, the confusion matrix is an excellent representation of a model's performance.

Below is an evaluation of both models on the above dataset. 

Confusion Matrix:

Figure 1. Confusion Matrix on AI-only dataset with Model 3.0.1 Turbo
Figure 2. Confusion Matrix on AI-only dataset with Model 1.0.0 Lite

Evaluation Metrics:

Because this small test is meant to reflect the Originality.ai AI Detector’s ability to identify GPT-4.1 content, we looked at the True Positive Rate: the percentage of the 1000 GPT-4.1 samples that the model correctly identified as AI.

Model 3.0.1 Turbo:

  • Recall (True Positive Rate) = 97.9%

Model 1.0.0 Lite:

  • Recall (True Positive Rate) = 94.5%
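Because the dataset contains only AI-generated samples, every sample is a positive, so Recall and Accuracy coincide here. The short Python sketch below makes that arithmetic explicit. The per-model counts are not reported in this study; they are derived from the reported rates and the 1000-sample dataset size, and the function name `implied_counts` is hypothetical.

```python
SAMPLES = 1000  # size of the GPT-4.1 dataset described above

# Recall values reported in this study
reported_recall = {"3.0.1 Turbo": 0.979, "1.0.0 Lite": 0.945}


def implied_counts(recall, n):
    """Derive (true positives, false negatives) implied by a recall over n AI samples."""
    tp = round(recall * n)
    return tp, n - tp


for model, recall in reported_recall.items():
    tp, fn = implied_counts(recall, SAMPLES)
    # With no human-written samples, TN and FP are both zero,
    # so accuracy reduces to tp / (tp + fn), which is the recall.
    accuracy = tp / SAMPLES
    print(f"Model {model}: {tp} detected, {fn} missed, accuracy = {accuracy:.1%}")
```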

Conclusion

Our study confirms that text generated by GPT-4.1 is highly detectable with our AI detector. Model 3.0.1 Turbo exhibited strong performance, correctly identifying 97.9% of the samples, while Model 1.0.0 Lite followed closely with 94.5%.

These results highlight the effectiveness of the Originality.ai AI detector in identifying AI-generated content, even with the latest releases of popular AI models, like GPT-4.1, ensuring reliable detection across various text generation approaches.


Jonathan Gillham


Founder / CEO of Originality.ai. I have been involved in the SEO and Content Marketing world for over a decade. My career started with a portfolio of content sites. More recently, I sold 2 content marketing agencies, and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences, I understand what web publishers need when it comes to verifying that content is original. I am not For or Against AI content; I think it has a place in everyone's content strategy. However, I believe you as the publisher should be the one making the decision on when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!
