Every day you read claims and might need to cite that information in your work. But how can you know the information is accurate? The rise of generative AI adds a new potential source of misinformation. What can happen when AI programs give you “facts” that are more artificial than intelligent?
In this article, we look at cases where the use of generative AI (such as ChatGPT) led to serious problems for the user because the AI hallucinated or provided factually inaccurate information.
In the past year, there have been high-profile examples of false AI-generated information causing embarrassment, possible injury, and, in one case, the threat of legal sanctions.
AI-generated writing was suspected when Microsoft Start’s travel pages published a guide to places to visit in the Canadian capital of Ottawa. While there were errors in the details about some locations, most of the commentary focused on the article’s inclusion of the Ottawa Food Bank as a “tourist hotspot” that encouraged readers to visit on “an empty stomach.”
Consequences: heightened awareness of the 50 reporters recently laid off as Microsoft News moved to generative AI for its articles, public embarrassment, weakened trust.
Originality.ai’s Fact Checker flagged the claim as inaccurate and inappropriate.
A Texas A&M University-Commerce teacher gave his entire class a grade of “Incomplete” after asking ChatGPT whether the students’ final essays were AI-generated; the tool told him they all were, even though detecting such text is outside ChatGPT’s abilities and intended use.
Students protested that they were innocent, and the university investigated both the students and the teacher. The university has issued a number of policies in response.
Originality.ai’s Fact Checker notes several of ChatGPT’s limitations that suggest the teacher’s use of it was inappropriate.
In February, Google learned how its own Bard generative AI could produce errors during the program’s first public demo, in which Bard stated that the James Webb Space Telescope “took the very first pictures of a planet outside of our own solar system.” In fact, the first such photo was taken 16 years before the JWST was launched.
Once the error became known, Google’s stock lost as much as 7.7% of its value, roughly $100 billion, in the next day of trading.
The day after Bard debuted, Microsoft’s Bing Chat AI had a similar public demo, complete with factual errors: Bing Chat gave inaccurate figures from the Gap’s recent earnings report and misstated Lululemon’s financial data.
Consequences: public embarrassment, weakened trust.
ChatGPT invented a number of court cases that lawyer Steven A. Schwartz cited as legal precedents in a brief he submitted. When the judge tried to find the cited cases, they turned out not to exist.
Schwartz, another lawyer, and his law firm were fined $5,000 by the court. As his legal team noted, “Mr. Schwartz and the Firm have already become the poster children for the perils of dabbling with new technology; their lesson has been learned.”
A Bloomberg reporter asked both Bard and Bing Chat about the ongoing conflict between Israel and Gaza, and both falsely claimed a ceasefire had been declared, likely based on news from May 2023. When the reporter asked a follow-up question, Bard did backtrack, saying, “No, I am not sure that is right. I apologize for my previous response,” but it also made up casualty numbers for two days into the future.
Consequences: public embarrassment, weakened trust.
Amazon’s Kindle Direct Publishing sold likely AI-written guides to foraging for edible mushrooms. One e-book encouraged gathering and eating species that are protected by law. Another gave identification instructions at odds with accepted best practices for determining which mushrooms are safe to eat.
Consequences: public embarrassment, weakened trust.
The Chronicle of Higher Education reported that a university librarian was asked to track down the articles on a list of references a professor provided. When she concluded the articles did not exist, the professor revealed that ChatGPT had supplied them. Researchers in academia are finding that generative AI understands the form of a good reference, but that doesn’t mean the cited articles exist: ChatGPT can make up convincing references, attaching coherent titles to authors prominent in the field of interest. Studies by the National Institutes of Health have found that up to 47% of ChatGPT references are inaccurate.
Consequences: public embarrassment, loss of trust, loss of potential market.
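Hallucinated references of this kind can often be caught with a quick lookup in a bibliographic database. The sketch below is a minimal illustration, not part of Originality.ai’s tooling: it queries the public Crossref API for a cited title, and the function name and matching heuristic are assumptions made for demonstration.

```python
# A minimal, illustrative sketch (not Originality.ai's tooling): check whether a
# cited title matches any published work indexed by the public Crossref API.
import requests


def reference_exists(title: str) -> bool:
    """Return True if Crossref lists a work whose title closely matches `title`."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 3},
        timeout=10,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        found = " ".join(item.get("title", [])).lower()
        # Crude heuristic: treat a near-identical title as confirmation.
        if found and (found in title.lower() or title.lower() in found):
            return True
    return False


if __name__ == "__main__":
    # A made-up, chatbot-style citation is unlikely to match anything.
    print(reference_exists("A Longitudinal Study of Imaginary Results in Example Science"))
```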
The Fact Checker app offers a system to assess whether a claim is potentially false. Fact Checker highlights individual passages, then provides links to sources that support or counter the claim in each passage, along with the likelihood that a statement is true or false. Future public embarrassments and legal troubles could be avoided with this kind of diligence. AI has the potential to aid many tasks, but users need to understand its limitations and potential pitfalls. Originality.ai has the tools to let you check documents for accuracy, plagiarism, and the likelihood of AI-generated text.
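To make the idea of claim-level checking concrete, here is a purely illustrative sketch of how a fact-check result might be represented in code. The structure, field names, and thresholds are assumptions for demonstration only, not Originality.ai’s actual API or output format.

```python
# Purely illustrative: one possible shape for a claim-check result like the one
# described above. Field names and thresholds are assumptions, not
# Originality.ai's actual API or output format.
from dataclasses import dataclass, field


@dataclass
class ClaimCheck:
    passage: str                  # the highlighted passage being assessed
    likelihood_true: float        # estimated probability the claim is true (0-1)
    supporting_sources: list[str] = field(default_factory=list)
    countering_sources: list[str] = field(default_factory=list)

    def verdict(self) -> str:
        """Summarize the numeric likelihood as a human-readable label."""
        if self.likelihood_true >= 0.7:
            return "likely true"
        if self.likelihood_true <= 0.3:
            return "likely false"
        return "uncertain - verify manually"


# Example: the Ottawa Food Bank claim from the first case study above.
check = ClaimCheck(
    passage="The Ottawa Food Bank is a top tourist hotspot.",
    likelihood_true=0.05,
    countering_sources=["https://www.ottawafoodbank.ca/"],
)
print(check.verdict())  # -> likely false
```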