
10.4% of AI Overview Citations are AI-Generated

We studied AI Overview citations to find out how many AIO citations are AI-generated within and outside of the top-100 SERPs. These are our findings.

AI Overviews are reshaping how we find and consume information online. In short, they select a handful of key sources and synthesize a short overview of the main facts about a user’s search. 

Typically, linked citations are included along with the overview, where users can then click to access more information about the cited source.

However, AI’s tendency to produce hallucinations, as seen in the AI book scandal, raises important questions:

  • To what extent are AI Overview sources AI-generated?
  • How much of the citation pool is made up of machine-written content?

In this study, we focused specifically on Google’s AI Overviews.

2 Key Findings

  1. 10.4% of citations are AI-generated
  2. 48% of citations come from the top 100 organic results

Methodology: how we conducted our research

To answer these questions, we built a dataset and analysis pipeline in three steps:

Selecting queries

We started with the MS MARCO Web Search dataset, which contains real search queries typed by Bing users. 

Using OpenAI’s gpt-4.1-nano, we classified each query as either YMYL or non-YMYL, and then divided the YMYL queries into four sub-categories: Health & Safety, Finance, Legal, and Politics. 

From this set, we randomly sampled 29,000 YMYL queries for analysis.
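The selection step above can be sketched as follows. Here a toy keyword lookup stands in for the gpt-4.1-nano classifier (the keyword lists, function names, and sampling seed are illustrative, not the study's actual prompt or labels pipeline):

```python
import random

# Stand-in for the gpt-4.1-nano classifier: a toy keyword lookup that maps
# a query to one of the four YMYL sub-categories or "non-YMYL".
YMYL_KEYWORDS = {
    "Health & Safety": ["symptom", "medication", "vaccine"],
    "Finance": ["mortgage", "invest", "tax"],
    "Legal": ["lawsuit", "visa", "contract"],
    "Politics": ["election", "senator", "policy"],
}

def classify_query(query: str) -> str:
    """Return a YMYL sub-category, or 'non-YMYL' if no category matches."""
    q = query.lower()
    for category, keywords in YMYL_KEYWORDS.items():
        if any(k in q for k in keywords):
            return category
    return "non-YMYL"

def sample_ymyl(queries: list[str], n: int, seed: int = 0) -> list[str]:
    """Classify, keep the YMYL queries, then randomly sample n of them."""
    ymyl = [q for q in queries if classify_query(q) != "non-YMYL"]
    rng = random.Random(seed)
    return rng.sample(ymyl, min(n, len(ymyl)))
```

In the study the sample size `n` would be 29,000; the toy classifier here only illustrates the classify-filter-sample flow.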

Collecting results

We used SerpAPI, which provides structured access to Google SERPs, to retrieve AI Overview presence and citations, as well as the top-100 organic results for each query.
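A minimal sketch of that collection step, assuming SerpAPI's Google engine parameters (`engine`, `q`, `num`, `api_key`) and the `ai_overview` / `organic_results` keys of its JSON response; field names should be verified against SerpAPI's current documentation, and the API key is a placeholder:

```python
def serp_params(query: str, api_key: str = "YOUR_SERPAPI_KEY") -> dict:
    """Request parameters for one query: AI Overview plus top-100 organic results."""
    return {
        "engine": "google",
        "q": query,
        "num": 100,          # ask for up to the top-100 organic results
        "api_key": api_key,
    }

def extract_urls(response: dict) -> tuple[list[str], list[str]]:
    """Pull AI Overview citation links and organic-result URLs from one response."""
    aio = response.get("ai_overview") or {}
    citations = [ref["link"] for ref in aio.get("references", []) if "link" in ref]
    organic = [r["link"] for r in response.get("organic_results", []) if "link" in r]
    return citations, organic
```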

Classifying documents

Every citation and organic URL was then classified as AI-generated or human-written using the Originality.ai AI Detection Lite 1.0.1 model. 

Documents that could not be confidently classified (for example, broken links, PDFs, videos, or pages with too little text) were grouped into an unclassifiable category.
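The labeling and tallying logic for this step might look like the sketch below; `detect` stands in for the Originality.ai AI Detection Lite 1.0.1 model, and the word-count threshold and function names are illustrative assumptions:

```python
from collections import Counter
from typing import Callable, Optional

MIN_WORDS = 50  # illustrative threshold for "too little text"

def label_document(text: Optional[str], detect: Callable[[str], float]) -> str:
    """Label one fetched document as AI-generated, human-written, or unclassifiable."""
    if text is None:
        return "unclassifiable: broken link"
    if len(text.split()) < MIN_WORDS:
        return "unclassifiable: too little text"
    # detect() stands in for the AI detection model, returning the
    # probability that the text is AI-generated.
    return "AI-generated" if detect(text) >= 0.5 else "human-written"

def tally(labels: list[str]) -> Counter:
    # Collapse the unclassifiable sub-reasons into one top-level bucket.
    return Counter(label.split(":")[0] for label in labels)
```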

The Analysis: Our Findings

10.4% of Cited Documents are AI-Generated

We began with the overall distribution of cited documents. 

  • 10.4% are AI-generated
  • 74.4% are Human-written
  • 15.2% fall into the unclassifiable category (more on that below)

Most Unclassifiable Documents Had Too Little Text or Broken Links

Within the unclassifiable group, the most common reasons were:

  • 44.1% too little text
  • 20.0% broken links
  • 19.1% video pages
  • 16.8% PDFs

Citations in the Top-100 SERPs vs. Outside the Top 100

We then analyzed how many citations appear within the top-100 organic results versus outside of (or absent from) them.

  • 48% of citations were present in the Top-100 SERPs
  • 52% came from outside the Top-100 SERPs
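That split can be computed by checking each citation URL against the top-100 organic URLs for the same query; a sketch (URL normalization is simplified here to illustrate the idea):

```python
def split_by_serp_presence(citations: list[str],
                           organic_top100: list[str]) -> dict:
    """Share of AI Overview citations inside vs. outside the top-100 organic results."""
    # Strip trailing slashes so trivially different forms of a URL still match.
    top100 = {u.rstrip("/") for u in organic_top100}
    inside = sum(1 for c in citations if c.rstrip("/") in top100)
    total = len(citations)
    return {
        "in_top100": inside / total,
        "outside_top100": (total - inside) / total,
    }
```

In practice, URL matching would also need to handle scheme, query-string, and tracking-parameter differences before comparing.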

The next key finding was that citations from outside the Top-100 SERPs also contained a higher proportion of AI-generated content.

                   Top-100 SERPs   Outside the Top-100 SERPs
AI-Generated            7.7%               12.8%
Human-Written          78.9%               70.4%
Unclassifiable         13.4%               16.8%

Source: Originality.ai Analysis of SERP Rankings

Final Thoughts

About 1 in 10 citations in Google’s AI Overviews is AI-generated. 

More than half of the citations come from outside the top-100 search results, and these sources are more likely to include AI-written material.

Since this study looked only at YMYL queries (health, finance, law, and politics), areas where accuracy matters most, these findings are especially concerning.

Most citations are still human-written, but even a small share of AI-generated content in these areas raises concerns about reliability and trust.

How could this impact AI models?

Beyond immediate citation quality, there is a longer-term risk of model collapse.

AI Overviews themselves are not part of training data, but by surfacing AI-generated sources, they boost those sources’ visibility and credibility. 

This, in turn, increases the likelihood that such material is crawled into future training sets. 

Over time, models risk learning from outputs of earlier models rather than from human-authored knowledge. That recursive feedback loop can amplify errors, reduce diversity of perspectives, and ultimately degrade the reliability of online information.

Maintain transparency in the age of AI with the Originality.ai AI Checker.

Does Google ranking correlate with the probability of an AI Citation? Find out in our study on Rankings and AI Citations.

Then, learn more about AI Search and AI Detection Accuracy.

Madeleine Lambert

Madeleine Lambert is the Director of Marketing and Sales at Originality.ai, with over a decade of experience in SEO and content creation. She previously owned and operated a successful content marketing agency, which she scaled and exited. Madeleine specializes in digital PR—contact her for media inquiries and story collaborations.
