As artificial intelligence becomes increasingly integrated into education, publishing, and digital communications, reliably detecting AI-generated content — especially in underrepresented languages like Arabic — has become crucial.
A major study, “The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text,” rigorously tested the performance of state-of-the-art AI content detectors across a range of Arabic datasets.
Here’s how Originality.ai’s multilingual AI detector stacked up.
Learn more about AI detection and AI detection accuracy in our AI Detection Accuracy Review and a Meta-Analysis of Third-Party AI Detection Studies. Then, get further insight into our Multilingual AI Detector.
The study, conducted by researchers at King Fahd University of Petroleum and Minerals, set out to answer a pressing question: Can current AI detectors distinguish between human-written and AI-generated Arabic text?
It evaluated content from both academic and social media sources, generated by leading large language models (LLMs), using a suite of stylometric and machine learning methods.
The study and our benchmark used standard metrics for evaluating AI detection accuracy: accuracy, precision, recall, F1-score, and false positive rate (FPR).
All metrics were calculated separately for each text source (Human, ALLaM, Jais, LLaMA, OpenAI).
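As a rough illustration of how per-source precision, recall, and F1 can be computed, here is a minimal sketch using scikit-learn. The label names match the study's text sources, but the variable names and sample data below are hypothetical and not taken from the paper.

```python
# Minimal sketch (assumed, not from the paper): per-label metrics with scikit-learn.
from sklearn.metrics import classification_report

labels = ["Human", "ALLaM", "Jais", "LLaMA", "OpenAI"]

# y_true / y_pred would hold one label per text sample (hypothetical data here).
y_true = ["Human", "ALLaM", "Jais", "LLaMA", "OpenAI", "Human"]
y_pred = ["Human", "ALLaM", "Jais", "OpenAI", "OpenAI", "ALLaM"]

# classification_report prints precision, recall, and F1 for each label separately.
print(classification_report(y_true, y_pred, labels=labels, zero_division=0))
```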
In the research paper, the authors trained their detector as a multi-class classifier: for each input text, the model predicts whether it was written by a human or by one of the AI models (ALLaM, Jais, LLaMA, OpenAI).
This allows them to calculate per-class Precision, Recall, and F1-score for each label — including “Human” — since the classifier can make various types of mistakes (e.g., calling a human text “ALLaM” or “Jais”).
However, for fair comparison with Originality.ai (which only distinguishes “AI” vs “Human”), it makes sense to simplify the evaluation for human data:
Definition: FPR (False Positive Rate) – For the human dataset, if the model predicts any label other than “Human” for a sample, it is counted as a false positive; the FPR is the number of false positives divided by the total number of human samples.
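Under that simplification, the calculation reduces to a few lines. The sketch below assumes the classifier's predictions on human-written samples have already been collected in a list; the function and variable names are hypothetical, not the study's code.

```python
# Minimal sketch (assumed) of the simplification described above: any non-"Human"
# prediction on a human-written sample counts as a false positive, so
# FPR = false positives / total human samples.
def human_false_positive_rate(predictions):
    """predictions: labels the classifier assigned to human-written samples."""
    false_positives = sum(1 for label in predictions if label != "Human")
    return false_positives / len(predictions) if predictions else 0.0

# Example: 2 of 100 human samples flagged as an AI model -> FPR = 2.00%.
preds_on_human_texts = ["Human"] * 98 + ["ALLaM", "Jais"]
print(f"FPR: {human_false_positive_rate(preds_on_human_texts):.2%}")
```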

Originality.ai achieved perfect (100%) or near-perfect accuracy and F1-score on all AI-generated academic abstract datasets, outperforming the research’s own fine-tuned detectors.

For OpenAI-generated social media posts, Originality.ai reached F1-scores over 99% — higher than the baseline reported in the study.

Across both academic and social datasets, Originality.ai kept the false positive rate extremely low (as low as 1.09% in academic abstracts and 4.37% in social media), ensuring human writing is rarely misclassified.

Originality.ai’s multilingual AI detection tool isn’t just a contender — it’s a leader in the detection of Arabic AI-generated text. Its results consistently match or exceed the best academic models, achieving industry-leading accuracy with minimal false positives.
For educators, publishers, and institutions looking to maintain content integrity by detecting AI-generated Arabic text, Originality.ai's AI detector delivered the most accurate results in this benchmark.
Further Reading:

In 2025, the US Department of Education held a public hearing, accepting comments and feedback about changes to the Higher Education Act for Federal student financial assistance programs. At Originality.ai, we analyzed how much of that feedback was Likely AI.