Most Commonly Used ChatGPT Words & Phrases

Are humans as good at detecting AI content as they think? We analyzed 10 million ChatGPT terms to find out! Discover ChatGPT's most common words and phrases.

With the release of ChatGPT in November 2022, many content marketers and experts have come up with ways to easily identify AI content. Most recently, numerous online discussions and blogs have cropped up, claiming to have captured its most commonly used words. But can humans accurately identify ChatGPT-generated content? Or do we just think we can?

To see how helpful and accurate ‘ChatGPT’s Most Common Words/Phrases’ roundups are, we collected articles, blogs, and Reddit posts that listed ChatGPT’s most common outputs and compared them to our extensive dataset of AI-generated text. 

Summary (TLDR)

  • Current posts and studies showing “common” ChatGPT words and phrases are inaccurate due to insufficient data.
  • We gathered and analyzed over 10 million words/phrases to form an accurate dataset of commonly used ChatGPT terms.
  • Humans cannot use common words and/or phrases as classifiers to identify ChatGPT-generated content.
  • Some words/phrases do show up more frequently than others; however, they are not easily distinguishable from human writing. Below are some unique phrases that we found to be more common than others:

Overconfidence in AI-Detecting Abilities: Humans are not good at identifying AI

Multiple studies have shown that humans are not as good at detecting AI content as they think they are.

A number of reviews, blog posts, and social posts claim that ChatGPT-generated content is easy to identify through certain words and phrases. For example, many suggest that the word “delve” appears in almost every ChatGPT-generated text.

But can we really use these lists to help detect AI-generated text?  

While you may be tempted to answer ‘yes,’ there are multiple data points that suggest otherwise. For example, a study by the University of Washington, and this similar one, suggest that humans are likely overconfident in their AI-detecting abilities. In the first study, researchers found “that [human] evaluators were unable to distinguish between GPT-3 and human-authored text across… domains”.

AI-Generated Text Detection Accuracy Dataset
Source: https://arxiv.org/pdf/2107.00061

Ironically, the human evaluators’ confidence, measured as the percentage of “Definitely” responses, remained high even as their accuracy scores plummeted across various testing conditions.

Human and Machine Generated Contents
Credits: https://arxiv.org/pdf/2107.00061

Above is an excerpt of two evaluators’ reasoning for why a text is human-written (left) or GPT-generated (right). The evaluators reached opposite conclusions from the same observations, which suggests that humans judge AI authorship subjectively and aren’t as good as they think they are at identifying AI-written content.

Datasets for words and phrases

To build a comprehensive dataset of the words and phrases ChatGPT actually uses, we collected and cleaned seven datasets of AI-generated text, then tallied up the most commonly used words and phrases in those outputs. Here is where our compiled dataset was gathered:

What are the most commonly used ChatGPT words and phrases?

Commonly used ChatGPT words

Below is a list of all the most commonly used ChatGPT words, from highest (most used) to lowest (least used).

Commonly used ChatGPT phrases

Below is a list of all the most commonly used ChatGPT phrases, from highest (most used) to lowest (least used).

What Does This Tell Us? … Not Much

There aren’t many words that stand out or are unique when compared to human-generated content. For example, some of the most common words are: 

  • Which
  • Such
  • More
  • Create
  • Their
  • Example 

These aren’t distinctive or easily identifiable; most of them are also common in human-written content.

When we look at the most common phrases, we see: 

  • of the
  • in the
  • to the
  • is a
  • you can
  • on the
  • such as
  • and the
  • can be
  • it is

Again, these are very common phrases that are not easily identified as ChatGPT text.

Moving down the list (to the less common words), we can see some more distinctive words, such as ‘designed,’ ‘reflect,’ ‘confident,’ and ‘healthier.’ We also see more distinctive phrases, such as “perform operations,” “benefits and drawbacks,” and “nature of reality.” However, these did not appear in our datasets very often and therefore aren’t nearly as common in ChatGPT outputs.

At first glance, this dataset doesn’t show us any major differences between the words ChatGPT uses and the words used in human writing. In other words, ChatGPT does not use many words that can easily be spotted by humans.

This conclusion is supported by both our dataset and other studies, such as Elizabeth Clark’s study of how likely humans are to detect AI-generated content. That study found that even after training, humans were only about 55% accurate in detecting AI content, which is basically a coin flip. The common words in our dataset are consistent with that result: ChatGPT doesn’t tend to use unique or easily identifiable words in its outputs, which makes it difficult to identify.

Digging deeper into the data

Are there any words or phrases that are unique to ChatGPT, as so many blogs and experts claim?

According to several blogs and social posts, ChatGPT-produced text is easy to identify because it commonly uses unusual words. To see if this is indeed true, we gathered all these words and phrases from publicly available articles and posts online. Here is a list of the so-called commonly used words and phrases that experts claim to be easily identifiable:

To verify whether or not they were actually being used in ChatGPT outputs, we compared them to our extensive dataset. 
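The comparison described here can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used for this study: the function name `check_claimed_words`, the toy sentence, and the threshold value are all assumptions made for the example.

```python
from collections import Counter

def check_claimed_words(claimed, counts, threshold=1000):
    """Sort claimed words into frequent / rare / absent by dataset count.

    `counts` maps each lowercase word to how often it appears in the
    dataset. All names and the default threshold are illustrative.
    """
    frequent, rare, absent = {}, {}, []
    for word in claimed:
        n = counts.get(word.lower(), 0)
        if n == 0:
            absent.append(word)      # never seen in the dataset
        elif n > threshold:
            frequent[word] = n       # used more than `threshold` times
        else:
            rare[word] = n           # present, but not often
    return frequent, rare, absent

# Toy stand-in for the cleaned dataset (made-up text, not real data).
tokens = "we delve into a unique journey with a unique array".split()
counts = Counter(tokens)

frequent, rare, absent = check_claimed_words(
    ["unique", "delve", "tapestry"], counts, threshold=1
)
print(frequent, rare, absent)
```

With a real dataset, `counts` would be built once from all cleaned tokens and every word from the online roundups would be looked up against it.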

Most common ChatGPT words according to blogs

Here are the results:

Results and Findings:

The findings from this table are interesting — there was a lot of overlap between the online lists we found and our dataset. In fact, almost every word was used in our dataset. Here’s the caveat — most words were NOT frequently used by ChatGPT. 

  • The words “unique,” “additionally,” “finally,” “array,” “deep,” “conclusion,” “journey,” “difference,” “certainly,” “crucial,” “navigate,” “beyond,” and “enter” were the only words used over 1,000 times in our dataset (between 1,074 and 4,479 times).
    • Some of these could be considered distinctive, such as “journey,” “navigate,” or “array.” However, most are fairly common, such as “additionally” or “difference.”
    • Note: which words count as unique is subjective and up for debate. Even so, nothing major stood out to us here.
  • Many other unique words that experts claim are commonly used were actually NOT used much at all. Here are some that stood out:
    • “Delve” - only used 146 times
    • “Embark” - only used 139 times
    • “Nuances” - only used 109 times
    • “Imperative” - only used 85 times
    • “Beacon” - only used 85 times
    • “Endeavor” - only used 82 times
    • “Whimsical” - only used 63 times
    • “Unleash” - only used 60 times
    • “Elevated” - only used 53 times
    • “Game-changer” - only used 49 times
    • “Paramount” - only used 44 times
    • “Plethora” - only used 38 times
    • “Myriad” - only used 36 times
    • “Trivial” - only used 32 times
    • “Meticulously” - only used 16 times
    • “Dazzle” - only used 2 times

The most frequently cited word, “delve,” which according to multiple sources is the most easily identifiable marker of ChatGPT-generated text, was used only 146 times. This is minuscule compared to the more than 10 million words we analyzed in our dataset.
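To put that count in proportion, a raw count can be converted into a rate per million words. The short Python sketch below does that arithmetic using the cleaned word total reported in the workflow section of this article (10,784,010); the helper name `per_million` is just an illustrative choice.

```python
DATASET_WORDS = 10_784_010  # cleaned word total reported in this article

def per_million(count: int, total: int = DATASET_WORDS) -> float:
    """Convert a raw occurrence count into occurrences per million words."""
    return count / total * 1_000_000

# Counts taken from the list above.
for word, count in [("delve", 146), ("embark", 139), ("dazzle", 2)]:
    print(f"{word}: {per_million(count):.1f} per million words")
```

By this measure, “delve” appears roughly 13.5 times per million words, or about once in every 74,000 words of the dataset.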

In other words, although there are some unique identifying words that might be used commonly by ChatGPT, from this list we can see that many aren’t — and even those that are can be difficult to identify and distinguish from human-written text. 

Most commonly used ChatGPT phrases

Here are the results:

Results and Findings:

  • The phrases “but a,” “on the other hand,” “in summary,” “in the world,” “not just,” “for instance,” “informed decision,” “in the end,” “its essential,” “in the world of,” “in this article,” “is key to,” and “dive in” were the only phrases used over 100 times in our dataset (between 109 and 2,319 times).
    • Some of these could be considered distinctive phrases, such as “in the world” or “informed decision.” However, most are fairly common, such as “but a” or “for instance.”

  • Many other unique phrases that experts claim are commonly used were actually NOT used often at all. Here are some that stood out:
    • “Plays a crucial role” - only used 91 times
    • “Delve into” - only used 80 times
    • “Treasure trove” - only used 16 times
    • “First and foremost” - only used 16 times
    • “In a nutshell” - only used 14 times
    • “Shedding light” - only used 7 times
    • “Unsung hero” - only used 6 times
    • “I hope you are doing well” - only used 5 times
    • “In the era of” - only used 3 times
    • “Top tier” - only used 3 times
    • “In this digital age” - only used 3 times
    • “Ever evolving” - only used 2 times
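Phrases span several words, so counting them means sliding a window over the cleaned token stream rather than looking up single tokens. The sketch below shows one way to do that in Python; the function name `count_phrases` and the sample sentence are assumptions for illustration, and the approach is not necessarily the one used to produce the numbers above.

```python
from collections import Counter

def count_phrases(tokens, phrases):
    """Count occurrences of each multi-word phrase in a list of tokens."""
    targets = {tuple(p.lower().split()) for p in phrases}
    lengths = {len(t) for t in targets}
    hits = Counter()
    for n in lengths:
        # Slide a window of n tokens across the text.
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if gram in targets:
                hits[" ".join(gram)] += 1
    return hits

# Made-up sample text, already lowercased and stripped of punctuation.
tokens = "let us delve into the data and delve into it in a nutshell".split()
hits = count_phrases(tokens, ["Delve into", "in a nutshell", "treasure trove"])
print(hits["delve into"], hits["in a nutshell"], hits["treasure trove"])
```

Because `Counter` returns 0 for missing keys, phrases that never occur simply report a count of zero.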

Similar to the words dataset, our research on phrases shows that what experts claim are common ChatGPT-generated phrases aren’t actually that common at all!

The datasets demonstrate that there aren’t any words or phrases that help humans easily identify ChatGPT-generated content.

Conclusion: ChatGPT is Harder to Identify Than We Think

Our initial findings show that, in contrast to their confidence levels, humans are actually NOT good at identifying AI-generated content.

Humans tend to be overconfident regarding their ability to correctly spot AI-generated content, and lists containing ‘ChatGPT’s Top Words and Phrases’ are more or less useless.

AI detectors like Originality.ai — which are significantly more accurate than humans at identifying AI content — should be used to help verify when and if AI is being used. 

Frequently Asked Questions

Are there any common words or phrases used by ChatGPT?

In short, no. Surprisingly, ChatGPT rarely uses words from the ‘most-common’ roundups. On the contrary, its most frequently used words and phrases, according to our extensive dataset, are generic and typical of human writing.

While it is true that many of the words from the blog roundups showed up in our dataset, they did not appear nearly as many times as you would expect, given how often they were mentioned across multiple articles and blogs. 

Can humans detect AI-generated content themselves?

No. According to the University of Washington study of how likely humans are to detect AI-generated content, even after training, people correctly guessed the origin of a text only about 55% of the time, which is basically a coin flip. Without training, human accuracy scores were even lower, and this was for GPT-3, the less sophisticated predecessor of GPT-4.

Are roundups of ChatGPT’s most common words helpful?

Not really. They are not backed by reliable data; they usually represent the author's personal experience using ChatGPT. The ‘most-common’ words depend on:

  • The context of ChatGPT’s response: different contexts call for different words.
  • The specific ChatGPT model: different models use different words.
  • The version of ChatGPT: later versions are better trained to seem human.

If I come across a word listed in one of these roundups in a text, does that mean it is AI-generated?

The short answer is no. Humans use many of these words too! If, however, you see an unusual word that humans rarely use, like “tapestry,” it is more likely that your text was produced by ChatGPT. Since you can never be sure, we recommend you scan the text with a trusted AI detector, such as Originality.ai.

Our workflow

  • In order to find the most common phrases, we collected datasets containing ChatGPT responses to a variety of questions.
  • After exploring them, the ChatGPT-generated responses were extensively cleaned. The datasets originally totaled 12,517,475 words; after cleaning them to remove non-English text and punctuation, we had 10,784,010 words of ChatGPT text to analyze.

  • Lastly, the pre-processed text was analyzed to find the most common words and phrases. For each phrase, we calculated its PMI (Pointwise Mutual Information) score and recorded the number of times it appeared across all datasets. For each word, we simply recorded its count in the dataset.
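The steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the code behind this study: the article does not specify its exact cleaning rules or PMI variant, so the sketch uses a crude regex that keeps only the letters a to z, and the standard definition PMI(x, y) = log2(p(x, y) / (p(x) · p(y))) applied to adjacent word pairs. The function names `clean` and `pmi_table` are hypothetical.

```python
import math
import re
from collections import Counter

def clean(text: str) -> list[str]:
    """Lowercase the text, drop everything except a-z and whitespace, tokenize.

    Assumption: a crude stand-in for "remove non-English text and punctuation".
    """
    return re.sub(r"[^a-z\s]", "", text.lower()).split()

def pmi_table(tokens: list[str]) -> dict[tuple[str, str], float]:
    """Return PMI (log base 2) for every adjacent word pair in `tokens`."""
    word_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    n_words = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    scores = {}
    for (x, y), c in bigram_counts.items():
        p_xy = c / n_bigrams                # probability of the pair
        p_x = word_counts[x] / n_words      # probability of the first word
        p_y = word_counts[y] / n_words      # probability of the second word
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Made-up sample text, for illustration only.
tokens = clean("Such as this. Such as that!")
word_counts = Counter(tokens)   # plain per-word counts
scores = pmi_table(tokens)      # PMI per two-word phrase
print(word_counts.most_common(2))
print(round(scores[("such", "as")], 3))
```

A high PMI means two words appear together more often than their individual frequencies would predict, which is why it is a common way to separate genuine phrases from pairs that co-occur simply because both words are frequent.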

Jonathan Gillham

Founder / CEO of Originality.ai. I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites; more recently, I sold two content marketing agencies, and I am the co-founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences I understand what web publishers need when it comes to verifying that content is original. I am not for or against AI content; I think it has a place in everyone’s content strategy. However, I believe you, as the publisher, should be the one making the decision on when to use AI content. Our originality-checking tool has been built with serious web publishers in mind!
