In artificial intelligence, a hallucination or artificial hallucination is a confident response by an AI system that does not seem to be justified by its training data; a model that tends to produce such responses is said to "hallucinate". The term is borrowed from the psychological concept of hallucination, with which it shares some characteristics. One danger of hallucinations is that the model's output can look correct even when it is wrong.

In natural language processing

In natural language processing, a hallucination is often defined as "generated content that is nonsensical or unfaithful to the provided source content". Errors in encoding and decoding between text and representations can cause hallucinations. Training an AI to produce diverse responses can also lead to hallucination. Hallucinations can also occur when the AI is trained on a dataset in which labeled summaries, despite being factually accurate, are not directly grounded in the labeled data purportedly being "summarized". Larger datasets can create a problem of parametric knowledge (knowledge that is hard-wired in learned system parameters), creating hallucinations if the system is overconfident in its hard-wired knowledge. In systems such as GPT-3, an AI generates each next word based on a sequence of previous words (including the words it has itself previously generated in the current response), which can cause a cascade of hallucinations as the response grows longer.

By 2022, newspapers such as ''The New York Times'' expressed concern that, as adoption of bots based on large language models continued to grow, unwarranted user confidence in bot output could lead to problems. In August 2022, Meta warned during its release of BlenderBot 3 that the system was prone to "hallucinations", which Meta defined as "confident statements that are not true". On 15 November 2022, Meta unveiled a demo of Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper from a real author who works in the relevant area. Meta withdrew Galactica on 17 November due to offensiveness and inaccuracy.

OpenAI's ChatGPT, released in beta to the public on 30 November 2022, is based on the GPT-3.5 family of large language models. Professor Ethan Mollick of Wharton has called ChatGPT an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kubacka has recounted deliberately making up the phrase "cycloidal inverted electromagnon" and testing ChatGPT by asking it about the (nonexistent) phenomenon. ChatGPT invented a plausible-sounding answer backed with plausible-looking citations, which compelled her to double-check whether she had accidentally typed in the name of a real phenomenon. Other scholars such as Oren Etzioni have joined Kubacka in assessing that such software can often give you "a very impressive-sounding answer that's just dead wrong".

Mike Pearl of ''Mashable'' tested ChatGPT with multiple questions. In one example, he asked the model for "the largest country in Central America that isn't Mexico". ChatGPT responded with Guatemala, when the answer is instead Nicaragua. When CNBC asked ChatGPT for the lyrics to "The Ballad of Dwight Fry", ChatGPT supplied invented lyrics rather than the actual lyrics. While writing a review of the then-new iPhone 14 Pro, ChatGPT incorrectly gave the relevant chipset as the A15 Bionic rather than the A16 Bionic, although this can be attributed to the fact that ChatGPT was trained on a dataset ending in 2021, a year before the release of the iPhone 14 Pro. Asked questions about New Brunswick, ChatGPT got many answers right but incorrectly classified Samantha Bee as a "person from New Brunswick". Asked about astrophysical magnetic fields, ChatGPT incorrectly volunteered that "(strong) magnetic fields of black holes are generated by the extremely strong gravitational forces in their vicinity". (In reality, as a consequence of the no-hair theorem, a black hole without an accretion disk is believed to have no magnetic field.) ''Fast Company'' asked ChatGPT to generate a news article on Tesla's last financial quarter; ChatGPT created a coherent article, but made up the financial numbers contained within.

Other examples involve baiting ChatGPT with a false premise to see if it embellishes upon the premise. When asked about "Harold Coward's idea of dynamic canonicity", ChatGPT fabricated that Coward wrote a book titled ''Dynamic Canonicity: A Model for Biblical and Theological Interpretation'', arguing that religious principles are actually in a constant state of change. When pressed, ChatGPT continued to insist that the book was real. Asked for proof that dinosaurs built a civilization, ChatGPT claimed there were fossil remains of dinosaur tools and stated "Some species of dinosaurs even developed primitive forms of art, such as engravings on stones". When prompted that "Scientists have recently discovered churros, the delicious fried-dough pastries... (are) ideal tools for home surgery", ChatGPT claimed that a "study published in the journal ''Science''" found that the dough is pliable enough to form into surgical instruments that can get into hard-to-reach places, and that the flavor has a calming effect on patients.

Several possible reasons have been proposed for why natural language models hallucinate. For example:
* Hallucination from data: there are divergences in the source content, which often happens with large training datasets.
* Hallucination from training: hallucination still occurs when there is little divergence in the dataset. In that case, it derives from the way the model is trained. Several factors can contribute to this type of hallucination, such as:
** An erroneous decoding from the transformer
** A bias from the historical sequences that the model previously generated
** A bias generated from the way the model encodes its knowledge in its parameters
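The word-by-word feedback loop described in this section, in which every generated word becomes part of the context for the next one, can be sketched in a few lines of Python. This is only a toy illustration: the lookup table `TOY_NEXT`, the helper names, and the `"<eos>"` stop marker are all made up for the example and do not correspond to GPT-3 or any real model. The function `prob_error_free` additionally assumes that each word is wrong independently with a fixed probability, which is a deliberate simplification; in a real model errors are not independent, since an early mistake is fed back into the context.

```python
from typing import Callable, List, Sequence

# Hypothetical toy "model": maps the last word of the context to the next word.
# A real language model conditions on the whole context and outputs probabilities.
TOY_NEXT = {
    "the": "paper",
    "paper": "by",
    "by": "Smith",      # made-up author name, purely illustrative
    "Smith": "shows",
}

STOP = "<eos>"  # assumed end-of-sequence marker for this sketch


def toy_next_word(context: Sequence[str]) -> str:
    """Return the next word for the context, or STOP if nothing is known."""
    if not context:
        return STOP
    return TOY_NEXT.get(context[-1], STOP)


def generate(prompt: Sequence[str],
             next_word: Callable[[Sequence[str]], str],
             max_new_words: int) -> List[str]:
    """Autoregressive loop: each new word is appended to the context,
    so later predictions depend on the model's own earlier output."""
    context = list(prompt)
    for _ in range(max_new_words):
        word = next_word(context)
        if word == STOP:
            break
        context.append(word)  # the generated word is fed back in
    return context


def prob_error_free(per_word_error: float, length: int) -> float:
    """Chance that none of `length` words is wrong, ASSUMING independent
    errors with a fixed per-word probability (a simplification)."""
    if not 0.0 <= per_word_error <= 1.0:
        raise ValueError("per_word_error must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - per_word_error) ** length


if __name__ == "__main__":
    print(generate(["the"], toy_next_word, 10))
    for n in (10, 100, 500):
        print(n, round(prob_error_free(0.01, n), 3))
```

Under the independence assumption, a 1% per-word error rate leaves roughly a 37% chance that a 100-word response contains no error at all, which gives a rough sense of why longer responses are more exposed to hallucination.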

In other artificial intelligence

The concept of "hallucination" is applied more broadly than just natural language processing. A confident response from any AI that seems unjustified by the training data can be labeled a hallucination. ''Wired'' noted in 2018 that, despite no recorded attacks "in the wild" (that is, outside of proof-of-concept attacks by researchers), there was "little dispute" that consumer gadgets, and systems such as automated driving, were susceptible to adversarial attacks that could cause AI to hallucinate. Examples included a stop sign rendered invisible to computer vision; an audio clip engineered to sound innocuous to humans but that software transcribed as "evil dot com"; and an image of two men on skis that Google Cloud Vision identified as 91% likely to be "a dog".

Analysis

Various researchers cited by ''Wired'' have classified adversarial hallucinations as a high-dimensional statistical phenomenon, or have attributed hallucinations to insufficient training data. Some researchers believe that some "incorrect" AI responses classified by humans as "hallucinations" in the case of object detection may in fact be justified by the training data, or even that an AI may be giving the "correct" answer that the human reviewers are failing to see. For example, an adversarial image that looks, to a human, like an ordinary image of a dog may in fact be seen by the AI to contain tiny patterns that (in authentic images) would only appear when viewing a cat. On this view, the AI is detecting real-world visual patterns that humans are insensitive to. However, these findings have been challenged by other researchers. For example, it has been objected that the models can be biased towards superficial statistics, so that adversarial training does not yield robustness in real-world scenarios.
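The "high-dimensional" explanation can be made concrete with a toy linear classifier. The sketch below is not taken from any of the attacks reported by ''Wired''; the weights, the bias, the 1000-feature "image", and the labels are all hypothetical values chosen for illustration. It uses a sign-gradient perturbation (similar in spirit to what the literature calls the fast gradient sign method): for a linear score the gradient with respect to the input is simply the weight vector, so nudging every feature by a tiny amount against the sign of its weight shifts the score by an amount that grows with the number of features.

```python
from typing import List, Sequence


def score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Linear score w.x + b of a toy classifier."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights: Sequence[float], bias: float, x: Sequence[float]) -> str:
    """Label the input "dog" if the score is positive, otherwise "cat"."""
    return "dog" if score(weights, bias, x) > 0 else "cat"


def sign(value: float) -> int:
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def perturb(weights: Sequence[float], x: Sequence[float],
            epsilon: float) -> List[float]:
    """Move every feature by at most epsilon in the direction that lowers
    the score. For a linear model, that direction is -sign(weight)."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical setup: 1000 features with alternating +1 / -1 weights.
DIM = 1000
WEIGHTS = [1.0 if i % 2 == 0 else -1.0 for i in range(DIM)]
BIAS = 1.0
IMAGE = [0.5] * DIM              # stand-in for a "dog" image
EPSILON = 0.01                   # per-feature change, 2% of the feature value
ADVERSARIAL = perturb(WEIGHTS, IMAGE, EPSILON)

if __name__ == "__main__":
    print(classify(WEIGHTS, BIAS, IMAGE))        # label before the perturbation
    print(classify(WEIGHTS, BIAS, ADVERSARIAL))  # label after the perturbation
    print(max(abs(a - b) for a, b in zip(IMAGE, ADVERSARIAL)))
```

In this setup no single feature changes by more than about 0.01, yet the score moves from 1 to roughly -9 because 1000 small shifts add up, and the label flips from "dog" to "cat". Real vision models are nonlinear, so this is only an analogy for how many imperceptible changes can combine into a confident misclassification.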

Mitigation methods

The hallucination phenomenon is still not completely understood, and research into mitigating it is ongoing. In particular, it has been shown that language models not only hallucinate but can also amplify hallucinations, and that this holds even for models designed to alleviate the issue.

References
