Originality.AI is thrilled to take a step towards a more inclusive platform with the launch of our Multilanguage Release, which supports 15 languages in total. This release aims to close some of the gaps that have stood between diverse languages and robust AI detection technology.
In this blog post, we dive into how our Multilanguage Release was built, how we tested it, and what future work we plan to do to enhance this feature.
Key Takeaways:
The foundation of our Multilingual AI Detector is an extensive dataset of millions of samples curated for our analysis. The dataset is split into two categories:
1. Content penned by humans
2. Content generated by some of the most prominent AI language models, including text-davinci (GPT-3), GPT-3.5-turbo, ChatGPT, and the state-of-the-art GPT-4.
Spanning fifteen different languages, this dataset presented a unique challenge in multilingual AI detection.
2. Experiments: Evaluating Performance
To thoroughly evaluate the Multilingual AI Detector’s performance, we ran experiments on a benchmark dataset containing millions of samples. The confusion matrix below summarizes the results.
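For readers unfamiliar with the term, a confusion matrix counts how often each true label (human or AI) was paired with each predicted label. The following is a minimal Python sketch of that tally. The function name, the label strings "ai" and "human", and the toy data are assumptions made for illustration; this is not the evaluation code or benchmark data used for the release.

```python
from collections import Counter

def confusion_counts(y_true, y_pred, positive="ai"):
    """Tally TP, FP, TN, FN, treating `positive` as the AI-generated class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted AI: correct if the text really was AI-generated.
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # Predicted human: correct if the text really was human-written.
            counts["fn" if truth == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "tn", "fn")}

# Toy labels for illustration only (hypothetical, not benchmark data).
y_true = ["ai", "ai", "human", "human", "ai", "human"]
y_pred = ["ai", "human", "human", "ai", "ai", "human"]

matrix = confusion_counts(y_true, y_pred)
print(matrix)
```

In this toy run, one human-written sample is flagged as AI (a false positive) and one AI-generated sample is missed (a false negative).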
3. Results Per Language
The Multilingual AI Detector performs well across all languages, consistently surpassing a 90% accuracy rate. The graph below shows the accuracy achieved for each language:
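A per-language breakdown like this can be produced by grouping predictions by language before scoring them. Here is a minimal sketch in Python, assuming each evaluation record is a (language, true label, predicted label) tuple. The record layout, the function name, and the placeholder language codes are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def accuracy_by_language(records):
    """Return {language: accuracy} for (language, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, truth, pred in records:
        total[language] += 1
        correct[language] += int(truth == pred)
    return {language: correct[language] / total[language] for language in total}

# Toy records with placeholder language codes (hypothetical, not benchmark data).
records = [
    ("lang_a", "ai", "ai"),
    ("lang_a", "human", "human"),
    ("lang_b", "ai", "human"),
    ("lang_b", "human", "human"),
]

per_language = accuracy_by_language(records)
for language, acc in sorted(per_language.items()):
    print(f"{language}: {acc:.0%}")
```

Scoring each language separately matters because a single pooled accuracy figure can hide a language that performs noticeably worse than the rest.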
4. Conclusion: A Win For Languages
After extensive experimentation and testing, our Multilingual AI Detector - Version 1.0 emerges with accuracy consistently above 90% in every language. Overall accuracy is approximately 95%, while the false positive rate hovers around 4.5%. There is still room for improvement, but we believe this is a step in the right direction, bringing together different languages and smart detection methods.
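To make the two headline metrics concrete, here is a short Python sketch using the standard definitions: accuracy is the share of all samples classified correctly, and the false positive rate is the share of human-written samples flagged as AI. The post does not spell out its exact formulas, so these definitions are an assumption, and the counts below are hypothetical numbers picked only so the arithmetic lands on figures of the same size as those quoted above.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def false_positive_rate(fp, tn):
    """Share of human-written samples that were wrongly flagged as AI."""
    return fp / (fp + tn)

# Hypothetical counts for 1,000 AI samples and 1,000 human samples.
# These are NOT the real benchmark counts.
tp, fn = 945, 55   # AI-generated text: detected vs. missed
tn, fp = 955, 45   # human-written text: passed vs. wrongly flagged

acc = accuracy(tp, fp, tn, fn)
fpr = false_positive_rate(fp, tn)
print(f"accuracy = {acc:.1%}, false positive rate = {fpr:.1%}")
```

Note that the false positive rate is computed over human-written samples only, which is why it is not simply 100% minus the accuracy.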
5. How to Access the Model
The existing Originality.AI workflow does not change, and no action is required from the user. The system automatically detects when the submitted text is not in English and switches to the Multilanguage Model, delivering a seamless user experience.
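Conceptually, this kind of automatic switch is a routing step: identify the language first, then pick the model. The sketch below shows the idea in Python. It is not the Originality.AI implementation. The model identifiers, the function names, and the toy stopword-based detector are all hypothetical stand-ins, and a real system would use a proper language identification model.

```python
from typing import Callable

# Hypothetical model identifiers, used only for this example.
ENGLISH_MODEL = "english-model"
MULTILANGUAGE_MODEL = "multilanguage-model"

def choose_model(text: str, detect_language: Callable[[str], str]) -> str:
    """Route English text to the English model and everything else to the multilanguage model."""
    language = detect_language(text)
    return ENGLISH_MODEL if language == "en" else MULTILANGUAGE_MODEL

def toy_detect_language(text: str) -> str:
    """Stand-in detector that counts common English stopwords. Not a real language identifier."""
    english_stopwords = {"the", "and", "is", "of", "to", "in"}
    words = text.lower().split()
    hits = sum(word in english_stopwords for word in words)
    return "en" if words and hits / len(words) >= 0.2 else "other"

english_choice = choose_model("The cat is in the garden", toy_detect_language)
spanish_choice = choose_model("El gato está en el jardín", toy_detect_language)
print(english_choice, spanish_choice)
```

Passing the detector in as a parameter keeps the routing rule separate from the detection method, so the stand-in can be swapped for a real detector without touching the routing logic.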
6. A Roadmap Ahead
Our development plans for the Multilingual AI Detector don’t stop here. Here’s what’s planned ahead:
We hope you enjoy our latest release!