Originality.ai Stands Out From the Competition in Detecting AI-Generated Text on the Esperanto Dataset

A recent study, “ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination,” accessible via Cornell University, highlights the vulnerabilities of existing detectors to evasion techniques such as back-translation and provides a dataset for evaluating their robustness.

In this context, we tested Originality.ai’s AI Detector using the same dataset and methods to validate its capabilities. Here's how it stands out as a leader in the domain.

Key Findings (TL;DR)

  • Originality.ai achieved an average true positive rate (TPR) of 99.7% on unaltered AI-generated texts, outperforming many tools.
  • Originality.ai showed strong resilience to back-translation, maintaining an average TPR of 85.6% on back-translated texts.

Learn more about Originality.ai’s efficacy in AI detection in our AI Detection Accuracy Study and a Meta-Analysis of Third-Party AI Detection Studies.

Study Details

The study introduces back-translation as a method to manipulate AI-generated texts – a technique where AI-generated text is translated into multiple languages and then back into the original language. This process retains the original meaning but can alter the text enough to evade detection.
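
The back-translation loop described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the `translate` callable, the `toy_translate` stand-in, the language codes, and the choice to chain the pivot languages one after another are all assumptions made for the example.

```python
from typing import Callable, Sequence

# Hypothetical signature: translate(text, source_lang, target_lang) -> str.
# Any real machine-translation client would need to be wrapped to match it.
Translator = Callable[[str, str, str], str]


def back_translate(text: str, pivots: Sequence[str], translate: Translator,
                   origin: str = "en") -> str:
    """Route text through each pivot language in turn, then back to origin."""
    current, current_lang = text, origin
    for lang in pivots:
        current = translate(current, current_lang, lang)
        current_lang = lang
    return translate(current, current_lang, origin)


def toy_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator: it only tags each hop and does no real translation."""
    return f"{text} [{source}->{target}]"


# Shows the route a text takes: en -> pt -> ja -> en
print(back_translate("The model wrote this.", ["pt", "ja"], toy_translate))
```

With a real translator in place of `toy_translate`, the returned text would keep the meaning of the input while differing in wording, which is the property that can cause detectors to miss it.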

Nine detection tools were evaluated on a newly built dataset of 720k texts. The results revealed significant performance gaps in many tools, especially when tested against back-translated texts. 

AI Detection Tools

Open-source tools: RADAR, LLMDet, Likelihood, Rank, Log-Rank, ESAS

Commercial tools: Pangram, GPTZero, ZeroGPT

Dataset Information

The dataset used in the study, named ESPERANTO, comprises:

  • Total Instances: 720,000 text samples. We used 72,000 of these samples for our evaluation.
  • Text Categories:
    • News Articles: Reflecting journalistic writing styles.
    • Paper Abstracts: Exemplifying scientific writing.
    • Reddit Q&A (Explain Like I’m Five, ELI5): Capturing everyday conversational language.
    • Product Reviews: Showcasing informative and opinionated content.
  • Sources: Both human-authored and AI-generated texts.
  • AI Models Used: Eight different LLMs, including GPT-3.5 Turbo, GPT-4 variants, Llama models, Mistral-7b, Phi-3, and others.
  • Back-Translation Languages: Ten languages were used for back-translation: Portuguese, Spanish, French, Italian, Chinese, Dutch, Danish, Japanese, German, and Korean.

Evaluation Criteria

True Positive Rate (TPR): the share of AI-generated texts that a detector correctly flags as AI-generated.
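
As a rough illustration of the metric, TPR is computed as TP / (TP + FN), where TP counts AI-generated texts flagged as AI and FN counts AI-generated texts that were missed. The function name and the toy labels below are assumptions for the example and are not data from the study.

```python
def true_positive_rate(is_ai: list[bool], flagged_ai: list[bool]) -> float:
    """TPR = TP / (TP + FN), computed over the AI-generated texts only."""
    if len(is_ai) != len(flagged_ai):
        raise ValueError("labels and predictions must have the same length")
    pairs = list(zip(is_ai, flagged_ai))
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    if tp + fn == 0:
        raise ValueError("no AI-generated texts in the sample")
    return tp / (tp + fn)


# Toy labels (hypothetical): four AI-generated texts and one human-written text.
is_ai = [True, True, True, True, False]
flagged = [True, True, True, False, True]
print(true_positive_rate(is_ai, flagged))  # 0.75
```

Note that the human-written text in the toy sample does not affect the result: TPR only measures how many AI-generated texts are caught.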

Originality.ai's Performance

Finding 1: Originality.ai Showed Outstanding Accuracy on Unaltered Texts

Originality.ai consistently delivered near-perfect TPR scores (99.7% on average) when detecting AI-generated texts across all categories. In other words, almost no AI-generated text in the evaluation sample went undetected.

Finding 2: Originality.ai Is Robust Against Back-Translation

Despite the challenge posed by back-translated texts, Originality.ai maintained an average TPR of 85.6%, outperforming competitors such as GPTZero and ZeroGPT.

Finding 3: Originality.ai Detects AI Text Effectively Across Various Domains

Across categories such as news, Reddit Q&A, and scientific abstracts, Originality.ai demonstrated superior performance, achieving high accuracy even in nuanced or paraphrased text styles.
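
A per-domain comparison like this one can be sketched by computing TPR separately for each text category and then averaging the category scores. The function name, the category labels, and the toy records below are hypothetical, and the unweighted average across categories is an assumption about how such an average could be formed.

```python
from collections import defaultdict


def tpr_by_category(records):
    """records: iterable of (category, flagged_as_ai) pairs for AI-generated texts only."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for category, flagged in records:
        totals[category] += 1
        hits[category] += int(flagged)
    return {category: hits[category] / totals[category] for category in totals}


# Toy records (hypothetical), one pair per AI-generated text.
records = [
    ("news", True), ("news", True),
    ("abstracts", True), ("abstracts", False),
    ("reddit_qa", True),
    ("reviews", True),
]
per_category = tpr_by_category(records)
average_tpr = sum(per_category.values()) / len(per_category)
print(per_category)
print(average_tpr)  # 0.875
```

Averaging per category rather than over all texts keeps a large category from masking weak detection in a smaller one.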

Final Thoughts

In a world where detecting AI-generated content is becoming more critical than ever, Originality.ai is paving the way for reliable and effective solutions. Even under the challenging conditions of back-translation manipulation, its consistently high TPR, adaptability across various domains, and strong robustness highlight its superiority over existing tools. 

Jonathan Gillham

Founder / CEO of Originality.ai. I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites. More recently, I sold two content marketing agencies, and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences, I understand what web publishers need when it comes to verifying that content is original. I am not for or against AI content; I think it has a place in everyone's content strategy. However, I believe you, as the publisher, should be the one deciding when to use AI content. Our originality checking tool has been built with serious web publishers in mind!
