Many writers, students, publishers, and SEO teams wonder whether translating AI-generated text makes it harder for AI detectors to identify, or whether translating human-written content makes it look more AI-generated.
Here at Originality.ai, we took a dataset of 498 human-written and 498 GPT-4o-written samples, all originally created in English, and translated them into Spanish and Portuguese. We then translated each version back into English as a round trip.
Afterward, we scanned each version of the samples with Originality.ai AI detection to measure the false positive rate (human-written text incorrectly detected as AI) and the true positive rate (GPT-4o-written text correctly detected as AI).
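As a minimal sketch of how these two rates are defined, assume each sample carries a ground-truth label and a binary detector verdict. The function names and the toy data below are illustrative only and are not part of Originality.ai's tooling.

```python
def false_positive_rate(is_ai_truth, flagged_as_ai):
    """Share of human-written samples that the detector flagged as AI."""
    human_flags = [flag for truth, flag in zip(is_ai_truth, flagged_as_ai) if not truth]
    return sum(human_flags) / len(human_flags)


def true_positive_rate(is_ai_truth, flagged_as_ai):
    """Share of AI-written samples that the detector flagged as AI."""
    ai_flags = [flag for truth, flag in zip(is_ai_truth, flagged_as_ai) if truth]
    return sum(ai_flags) / len(ai_flags)


# Toy data (made up for illustration): 4 human samples, then 3 AI samples.
truth = [False, False, False, False, True, True, True]
flags = [False, True, False, False, True, True, True]

fpr = false_positive_rate(truth, flags)  # 1 of 4 human samples flagged
tpr = true_positive_rate(truth, flags)   # 3 of 3 AI samples flagged
print(f"FPR = {fpr:.2%}, TPR = {tpr:.2%}")
```

A low false positive rate and a high true positive rate are both desirable, and they are measured on separate halves of the dataset: the human-written samples for the first, the GPT-4o-written samples for the second.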
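For context, prior research has studied multilingual AI detection and back-translation as a detector bypass method. This study focuses on a practical localization workflow: English content translated into Spanish and Portuguese with Google Translate, plus a round trip back into English.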

Translation is a critical part of many business workflows. It helps with website localization, SEO content expansion, publisher/editorial review of translated submissions, and more.
All of these scenarios share the same approach: using automated translation to deliver information to a large multilingual audience in languages the content owners are not proficient in.
At the same time, AI-generated content raises its own questions. Content teams, educators, and publishers need to know whether translating human-written content makes it more likely to be flagged as AI, and whether translating AI-written content makes it harder to detect.
The English human-written and GPT-4o-written content is a portion of the samples in the dataset from a study called A Comprehensive Dataset for Human vs. AI Generated Text Detection.
Of the original dataset, 498 samples of human-written content and 498 samples of GPT-4o-written content were collected and then processed through our translation/AI scan pipeline.
These initial samples were translated using Google Sheets’ Google Translate feature, from English into Portuguese and Spanish, and then back into English.
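The translation paths described above can be sketched as follows. This is only an illustration: the study used the translation feature in Google Sheets, whereas `translate` here is a hypothetical callable supplied by the reader, and `fake_translate` is a stand-in so the example runs without any external service.

```python
from typing import Callable, Dict, List

# Hypothetical signature: translate(text, source_lang, target_lang) -> str
Translate = Callable[[str, str, str], str]


def build_conditions(english_texts: List[str], translate: Translate) -> Dict[str, List[str]]:
    """Return the five versions of the dataset: original, two direct, two round-trip."""
    conditions = {"en_original": list(english_texts)}
    for lang in ("es", "pt"):
        direct = [translate(text, "en", lang) for text in english_texts]
        conditions[f"{lang}_direct"] = direct
        # Round trip: translate the already-translated text back into English.
        conditions[f"en_roundtrip_via_{lang}"] = [translate(text, lang, "en") for text in direct]
    return conditions


def fake_translate(text: str, source_lang: str, target_lang: str) -> str:
    """Stand-in translator that only tags the text; it does not translate anything."""
    return f"[{source_lang}->{target_lang}] {text}"


conditions = build_conditions(["Hello world."], fake_translate)
for name, texts in conditions.items():
    print(name, texts)
```

Keeping the translator as an injected function means the same structure applies whichever translation tool is used, which matters because results may differ between tools.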
An AI detection rate was then calculated for each version of the dataset.
English samples were scanned for AI using our Lite 1.0.2 model, and non-English samples were scanned for AI using our multi-language model.
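A minimal sketch of that scanning step, assuming a hypothetical `detectors` mapping from a model name to a function that returns `True` when a text is flagged as AI. The model keys and the toy detector below are placeholders for illustration and do not represent the Originality.ai API.

```python
from typing import Callable, Dict, List

Detector = Callable[[str], bool]  # hypothetical: returns True if the text is flagged as AI


def pick_model(language: str) -> str:
    """English goes to the English model; every other language to the multi-language model."""
    return "lite-1.0.2" if language == "en" else "multi-language"


def detection_rate(texts: List[str], language: str, detectors: Dict[str, Detector]) -> float:
    """Fraction of texts flagged as AI by the model chosen for this language."""
    detect = detectors[pick_model(language)]
    return sum(detect(text) for text in texts) / len(texts)


def toy_detector(text: str) -> bool:
    """Placeholder detector: flags only texts carrying an explicit '[ai]' marker."""
    return "[ai]" in text


toy_detectors = {"lite-1.0.2": toy_detector, "multi-language": toy_detector}

rate = detection_rate(["[ai] sample", "note", "memo", "letter"], "en", toy_detectors)
print(f"{rate:.2%}")
```

On human-written samples this fraction is the false positive rate, and on GPT-4o-written samples it is the true positive rate.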
The results showed two clear patterns.
For human-written content, the original English samples had a false positive rate of 0.40%.
After direct translation, the false positive rate remained low: 2.21% for Spanish and 0.40% for Portuguese. This suggests that direct translation into these languages did not, on its own, cause a major increase in human-written content being misclassified as AI-generated.
The more noticeable change happened after round-trip translation.
When the human-written English samples were translated into Spanish or Portuguese and then translated back into English, the false positive rate increased to 28.02% for the Spanish round-trip workflow and 27.84% for the Portuguese round-trip workflow. This suggests that repeated automated translation can introduce wording changes, sentence restructuring, or stylistic smoothing that makes some human-written text appear more machine-like to an AI detector.
This distinction is important. A direct translation workflow, such as translating an English website page into Spanish or Portuguese, is different from a round-trip workflow where the text is translated and then translated back into English.
In this study, the direct translation workflow had a low false positive rate, while the round-trip workflow added more noise to the text.
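To put the human-written numbers side by side, the short calculation below takes the false positive rates reported above and computes the change from the original English baseline in percentage points. The rates are copied from this study; the variable names are illustrative.

```python
# False positive rates (in percent) for human-written samples, as reported above.
BASELINE_EN = 0.40
reported_fpr = {
    "Spanish (direct)": 2.21,
    "Portuguese (direct)": 0.40,
    "English (round trip via Spanish)": 28.02,
    "English (round trip via Portuguese)": 27.84,
}

# Change versus the original English samples, in percentage points.
change_pp = {name: round(rate - BASELINE_EN, 2) for name, rate in reported_fpr.items()}

for name, delta in change_pp.items():
    print(f"{name}: {delta:+.2f} percentage points")
```

The direct translations move the rate by less than two percentage points, while both round trips add more than 27.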
For GPT-4o-written content, the results were consistent across every tested translation path.
The original English samples were identified as AI-written at a 100% true positive rate, and the translated Spanish and Portuguese versions were also identified at a 100% true positive rate.
The same was true after the samples were translated back into English. In other words, translation did NOT function as an effective bypass method for GPT-4o-written content in this study.
The charts below summarize these findings.
Human-written false positives remained low after direct translation but increased after round-trip translation.

By contrast, GPT-4o-written true positives remained consistent across the original English, direct translation, and round-trip English conditions.

Overall, the results suggest that AI translation and AI generation should be interpreted differently.
Translating human-written content can affect AI detection scores, especially after multiple automated translation steps, but direct translation into Spanish and Portuguese remained low-risk in this test.
At the same time, translating GPT-4o-written content did not make it harder for Originality.ai’s models to identify it as AI-written.
The biggest takeaway from this study is that translation did not function as an effective AI detection bypass for GPT-4o-written content.
Across the tested Spanish and Portuguese workflows, AI-written samples remained consistently flagged as AI-written, including after direct translation and round-trip translation back into English.
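For companies using translation as part of normal content workflows, the results are also encouraging: directly translated human-written content kept a low false positive rate (2.21% for Spanish and 0.40% for Portuguese).
However, the round-trip results are an important caveat, since false positives rose to roughly 28% once the translated text was translated back into English.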
A human-written article translated into another language has a different content history than an article generated from scratch by an AI model. AI detection scores on translated content should be interpreted with that context in mind, especially when the content has gone through multiple rounds of automated translation.
For publishers, educators, businesses, and content teams, the best approach is to treat AI detection as one signal in a broader review process.
Directly translated human content should not be assumed to be AI-generated simply because a translation tool was used. At the same time, translation should not be assumed to hide AI-written content, as the GPT-4o-written samples in this study remained detectable throughout the tested translation workflows.
Our study found that translating GPT-4o-written content into Spanish or Portuguese did not reduce AI detection performance. AI-written samples remained consistently flagged as AI-written after direct translation and after round-trip translation back into English.
For human-written content, the initial translation into Spanish and Portuguese produced low false positive rates.
However, round-trip translation back into English increased false positives, suggesting that repeated automated translation can add noise or machine-like phrases to otherwise human-written text.
The main takeaway is that AI translation and AI authorship should be treated differently.
Translating human-written content can change the style of the text, especially after multiple translation steps, but direct translation is not the same as generating new content with AI.
For content teams, publishers, educators, and businesses, translated content should be reviewed with context, while AI-written content should not be expected to bypass detection simply because it has been translated.
Previous research has already looked at the relationship between translation and AI detection, especially in the context of bypassing detectors.
One paper, ESPERANTO, tested whether back-translation could make AI-generated text harder to detect.
Back-translation means translating text into another language, or multiple languages, and then translating it back into the original language.
The researchers found that this type of translation-based rewriting could reduce detection performance for several detectors tested at the time. However, those results should be read in context: the study did not test Originality.ai’s current models, and detector performance can change significantly as models are updated.
Further, when the Originality.ai research team ran an extension of that study on our own models, our AI detector continued to show robust performance.
Other research has focused on the broader challenge of multilingual AI detection. The MULTITuDE benchmark was created to test AI-generated text detection across multiple languages, including Spanish and Portuguese.
Its main relevance for this study is that AI detection should not be assumed to work the same way in every language.
Multilingual detection needs to be tested directly, especially as both generators and detectors continue to improve. Read more about Originality.ai’s multilingual accuracy.
A related area of research also separates fully AI-generated text from other types of machine-assisted writing, such as machine-polished or machine-translated text.
For example, the HERO paper approached machine-translated human writing as a different category than text generated from scratch by an AI model.
That distinction matters for publishers, educators, and businesses reviewing translated content.
A human-written article translated with a tool like Google Translate has a different content history than an article created directly by GPT-4o.
Taken together, previous research suggests that translation can affect AI detection results, but the impact depends on the detector, language, translation process, and model being tested.
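In contrast to the risks identified in earlier research, this Originality.ai study found consistent results for AI-written content across the tested translation workflows: GPT-4o-written samples were flagged at a 100% true positive rate in the original English, after direct translation, and after round-trip translation.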
This study focused specifically on English content translated into Spanish and Portuguese using Google Translate. Results may differ with other languages, other translation tools, or more complex translation workflows.
The AI-written samples in this study were generated with GPT-4o. Other AI models may produce different writing patterns, and future models may behave differently as generation quality continues to improve.
The study also used different Originality.ai models depending on the language being scanned. English samples were scanned with Lite 1.0.2, while Spanish and Portuguese samples were scanned with the multi-language model. This reflects a practical real-world usage pattern, but it also means the English and non-English scans are not a one-to-one comparison of the same detector model.
Another limitation is that the dataset was based on a specific type of written content. Results may vary for short-form content, student essays, technical documentation, marketing copy, product reviews, or highly edited publisher content.
Finally, round-trip translation is not the same as a typical localization workflow. Most businesses translating website content from English into Spanish or Portuguese would not translate that content back into English before publishing. For that reason, the round-trip results are most useful for understanding how repeated automated translation can change text, rather than as a direct reflection of normal website translation.
