Reddit is a place where users can share their thoughts, seek advice from peers, tell their stories, and receive judgment. As AI becomes part of daily life, it has the potential to shape the conversations and interactions that unfold on the site. Here we analyze AI usage in three popular subreddits, TrueOffMyChest, RelationshipAdvice, and AmITheAsshole, looking at top-filtered posts from 2019 to 2024.
Our aim is to explore the trends, patterns, and impact of AI within these subreddit communities. Is it TrueOffMyChest or TrueOffMyCircuits, AmITheAsshole or AITheAsshole, RelationshipAdvice or Relationship Advice from an AI? (I’ve run out of puns.)
-> AI Content is highest in 2020, and not present in 2021.
-> AI Content begins to rise again in 2022 and by 2023 AI Content accounts for 1.69% of the dataset.
-> Thus far into 2024 AI Content accounts for 1.60% of the dataset.
-> Within the daily top posts, AI Content accounts for 2.26% of the dataset.
-> In weekly top posts, AI Content drops to 0.54% of the dataset. As this is the lowest percentage of any filter, stories chosen from the weekly filter have the highest chance of being human generated.
-> Within the top monthly posts we see the highest amount of AI content where it accounts for 2.90% of the dataset.
-> Within the yearly top posts it decreases once again to 1.02% of the data.
-> Finally within the Top All Time posts, AI Content accounts for 1.53% of the dataset.
-> AI Content is highest in the year 2022, where 7.52% of the dataset is AI Generated. This coincides with the launch of ChatGPT 3.5.
-> AI Content is higher in 2019 than it was in 2020 and 2021.
-> AI Content dropped to 2.33% of the dataset in 2023.
-> Thus far into 2024 AI Content accounts for 4.60% of the dataset.
-> In 2020, AI Content is only detected in the month of March.
-> In 2021, AI Content is only detected in the month of May.
-> AI Content was not detected in the dataset before May.
-> AI Content was not detected in the datasets for August and October.
-> In months where AI Content was detected, it ranges from 9.09% to 27.27% of the dataset.
-> In most of those months, AI Content accounts for around 10% of the dataset; the exceptions are July and November, where it spikes to 27.27%.
-> The November spike in AI Content corresponds with the launch of ChatGPT 3.5.
-> AI Content is not present within the monthly datasets for January, March, April, and from October onwards.
-> There is a spike in AI Content in February which coincides with the launch of ChatGPT Plus.
-> There is a spike in AI Content in May which coincides with upgrades to ChatGPT.
-> There is a spike in AI Content in September which coincides with the introduction of ChatGPT language support.
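The percentages in the lists above are shares of AI-flagged posts within a group (a year, a month, or a Top filter). As a minimal sketch of that arithmetic, assuming each post is a record with a group field and a boolean flag from the detector, the calculation could look like this. The field names `year`, `top_filter`, and `is_ai`, the function name, and the toy records are all hypothetical and are not taken from the dataset used here.

```python
from collections import defaultdict


def ai_share_by_group(posts, key):
    """Return {group: percent of posts flagged as AI}, rounded to 2 decimals."""
    totals = defaultdict(int)   # posts seen per group
    flagged = defaultdict(int)  # AI-flagged posts per group
    for post in posts:
        group = post[key]
        totals[group] += 1
        if post["is_ai"]:
            flagged[group] += 1
    return {g: round(100 * flagged[g] / totals[g], 2) for g in totals}


# Hypothetical toy records, NOT the article's data.
sample_posts = [
    {"year": 2023, "top_filter": "day", "is_ai": True},
    {"year": 2023, "top_filter": "day", "is_ai": False},
    {"year": 2023, "top_filter": "week", "is_ai": False},
    {"year": 2024, "top_filter": "week", "is_ai": False},
]

print(ai_share_by_group(sample_posts, "year"))
print(ai_share_by_group(sample_posts, "top_filter"))
```

Grouping by a different key (month, subreddit) only changes the `key` argument. Note that with small groups a single flagged post moves the percentage a lot, which is worth keeping in mind when reading the monthly figures.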
To analyze what the AI generated posts are about, we separate them into the following categories.
-> Family Based Advice - This corresponds to disagreements between family members.
-> Friendship Based Advice - This corresponds to disagreements between friends.
-> Dating Based Advice - This corresponds to advice for trying to start a relationship, disagreements/issues within relationships, or recently broken up relationships.
-> Engagement Based Advice - This corresponds to disagreements/issues between couples who are engaged.
-> Marital Based Advice - This corresponds to disagreements/issues between married couples.
While analyzing the distribution of categories we can notice the following.
-> Within the 2019 dataset, the potentially AI generated posts are either Friendship or Marriage related, with Marriage related posts holding the majority at around 67%.
-> In 2020, 100% of the AI generated content is Engagement related.
-> In 2021, 100% of the AI generated content is Dating related.
-> In 2022, Dating related posts make up 60% of the AI generated content, Marriage related posts account for 30%, and Friendship related posts account for the remaining 10%.
-> In 2023, Dating and Engagement related posts are the only AI generated content within the dataset, with Engagement related posts holding the majority at around 67%.
-> In 2024, we begin to see more variety within the dataset with the appearance of Family related posts. These account for only around 4.35% of the dataset, the same share as Engagement related content. Friendship related content is also present this year, making up 13% of the dataset. Dating related posts still hold the majority, at around 78%.
-> AI Content is most present in the top daily posts, and least present in top all time posts.
-> AI Content in top weekly, monthly, and yearly posts is approximately 3% of the total dataset in each case.
-> AI Content accounts for 8% of the dataset in 2019.
-> AI Content isn’t present in the dataset from 2020 to 2022.
-> AI Content spikes to 12.54% of the dataset in 2023.
-> Thus far into 2024 AI Content accounts for 4.50% of the dataset.
-> AI Content was not detected in the January or December datasets.
-> AI Content ranges from 4.55% to 30% of the dataset in months where it was present.
-> AI Content was higher during the months of November, August, and June.
-> AI Content was lower during the months of May, July, and September.
-> As there was no AI Content detected in 2021 and 2022, those years are omitted from the graph.
-> In 2019, posts where OP was judged Not the Asshole (NTA) make up the largest share of the data at 50%, posts where OP was judged to be an Asshole take 37.50%, and Unknown judgements account for the remaining 12.50% for that year.
-> In 2023, posts where OP was judged to be an Asshole shrink to 9.30% of the data, while NTA judgements grow to 86%. We also see the inclusion of ESH (Everyone Sucks Here), where the Reddit community has judged that everyone mentioned in the post is kind of an asshole; this accounts for 2.33% of the data. Finally, we also see the inclusion of Updates, which have no judgement.
-> In 2024, NTA judgements shrink slightly to 78% of the dataset, and Asshole judgements shrink as well, to 8.70%. We also see the inclusion of NAH (No Assholes Here), where the Reddit community has judged that no parties mentioned in the post were behaving like assholes.
-> AI Content is most present when filtering by top year, where it accounts for 16.26% of the dataset.
-> AI Content is least present when filtering by top all time, where it accounts for 0.81% of the dataset.
-> AI Content ranges from 1.82% - 2.58% of the dataset when filtering by top daily, weekly, and monthly.
-> The majority of judgements lean towards Not the Asshole.
-> In Top Month and Top Today, the judgements are either NTA or NAH.
-> In Top Week, the judgements are split between Asshole and NTA, with NTA taking 1/3 of the dataset. This is also the only Top filter that includes Unknown judgements.
-> When filtering by Top Year, the large majority of posts are NTA. We can also note the addition of ESH (Everyone Sucks Here) and Updates to previous posts.
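The category and judgement breakdowns above are tallies of labels among the AI-flagged posts in a given year or filter. As a rough sketch under that assumption, the shares can be computed from a plain list of labels. The function name and the label counts below are hypothetical; the counts were simply chosen so that they reproduce the 50% / 37.50% / 12.50% split reported for 2019, and they are not the actual number of posts in the dataset.

```python
from collections import Counter


def label_distribution(labels):
    """Return {label: percent of all labels}, rounded to 2 decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    # An empty input yields an empty Counter, so no division by zero occurs.
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Hypothetical counts chosen to match the reported 2019 split.
judgements_2019 = ["NTA"] * 4 + ["Asshole"] * 3 + ["Unknown"] * 1

print(label_distribution(judgements_2019))
```

The same function applies to the relationship categories (Family, Friendship, Dating, Engagement, Marital) by passing category labels instead of judgements.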
In conclusion, while AI content varies in nature, presence, and percentage, we can take the following home with us. All in all, AI content seems to be on the rise, judging by the percentages thus far into 2024. Within the analyzed subreddits we can conclude the following: RelationshipAdvice played matchmaker with users and AI, TrueOffMyChest has a love-hate relationship with AI, and most assholes from AITA are in fact human, so at least artificial intelligence can’t take that away from us.
We believe that it is crucial for AI content detectors’ reported accuracy to be open, transparent, and accountable. The reality is, each person seeking AI-detection services deserves to know which detector is the most accurate for their specific use case.