Question answering (QA) is a

computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

discipline within the fields of

information retrieval Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an Information needs, information need. The information need can be specified in the form ...

and

natural language processing Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related ...

(NLP) that is concerned with building systems that automatically answer

questions A question is an utterance which serves as a request for information. Questions are sometimes distinguished from interrogatives, which are the grammar, grammatical forms, typically used to express them. Rhetorical questions, for instance, are i ...

that are posed by humans in a

natural language A natural language or ordinary language is a language that occurs naturally in a human community by a process of use, repetition, and change. It can take different forms, typically either a spoken language or a sign language. Natural languages ...

Overview

A question-answering implementation, usually a computer program, may construct its answers by querying a structured

database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...

of knowledge or information, usually a

knowledge base In computer science, a knowledge base (KB) is a set of sentences, each sentence given in a knowledge representation language, with interfaces to tell new sentences and to ask questions about what is known, where either of these interfaces migh ...

. More commonly, question-answering systems can pull answers from an unstructured collection of natural language documents. Some examples of natural language document collections used for question answering systems include: * a collection of reference texts * internal organization documents and web pages * compiled

newswire A news agency is an organization that gathers news reports and sells them to subscribing news organizations, such as newspapers, magazines and radio and television broadcasters. A news agency may also be referred to as a wire service, newswir ...

reports * a set of

Wikipedia Wikipedia is a free content, free Online content, online encyclopedia that is written and maintained by a community of volunteers, known as Wikipedians, through open collaboration and the wiki software MediaWiki. Founded by Jimmy Wales and La ...

pages * a subset of

World Wide Web The World Wide Web (WWW or simply the Web) is an information system that enables Content (media), content sharing over the Internet through user-friendly ways meant to appeal to users beyond Information technology, IT specialists and hobbyis ...

pages

Types of question answering

Question-answering research attempts to develop ways of answering a wide range of question types, including fact, list,

definition A definition is a statement of the meaning of a term (a word, phrase, or other set of symbols). Definitions can be classified into two large categories: intensional definitions (which try to give the sense of a term), and extensional definitio ...

, how, why, hypothetical, semantically constrained, and cross-lingual questions. * Answering questions related to an article in order to evaluate

reading comprehension Reading comprehension is the ability to process written text, understanding, understand its meaning, and to integrate with what the reader already knows. Reading Comprehension of spoken language, comprehension relies on two abilities that are co ...

is one of the simpler form of question answering, since a given article is relatively short compared to the domains of other types of question-answering problems. An example of such a question is "What did Albert Einstein win the Nobel Prize for?" after an article about this subject is given to the system. * ''Closed-book'' question answering is when a system has memorized some facts during training and can answer questions without explicitly being given a context. This is similar to humans taking closed-book exams. * ''Closed-domain'' question answering deals with questions under a specific domain (for example, medicine or automotive maintenance) and can exploit domain-specific knowledge frequently formalized in

ontologies In information science, an ontology encompasses a representation, formal naming, and definitions of the categories, properties, and relations between the concepts, data, or entities that pertain to one, many, or all domains of discourse. More ...

. Alternatively, "closed-domain" might refer to a situation where only a limited type of questions are accepted, such as questions asking for

descriptive In the study of language, description or descriptive linguistics is the work of objectively analyzing and describing how language is actually used (or how it was used in the past) by a speech community. François & Ponsonnet (2013). All aca ...

rather than procedural information. Question answering systems machine reading applications have also been constructed in the medical domain, for instance Alzheimer's disease. * '' Open-domain'' question answering deals with questions about nearly anything and can only rely on general ontologies and world knowledge. Systems designed for open-domain question answering usually have much more data available from which to extract the answer. An example of an open-domain question is "What did Albert Einstein win the Nobel Prize for?" while no article about this subject is given to the system. Another way to categorize question-answering systems is by the technical approach used. There are a number of different types of QA systems, including *

rule-based system In computer science, a rule-based system is a computer system in which domain-specific knowledge is represented in the form of rules and general-purpose reasoning is used to solve problems in the domain. Two different kinds of rule-based systems ...

s, * statistical systems, and *

hybrid systems A hybrid system is a dynamical system that exhibits both continuous and discrete dynamic behavior – a system that can both ''flow'' (described by a differential equation) and ''jump'' (described by a state machine, automaton, or a differen ...

. Rule-based systems use a set of rules to determine the correct answer to a question. Statistical systems use statistical methods to find the most likely answer to a question. Hybrid systems use a combination of rule-based and statistical methods.

History

Two early question answering systems were BASEBALL and LUNAR. BASEBALL answered questions about Major League Baseball over a period of one year. LUNAR answered questions about the geological analysis of rocks returned by the Apollo Moon missions. Both question answering systems were very effective in their chosen domains. LUNAR was demonstrated at a lunar science convention in 1971 and it was able to answer 90% of the questions in its domain that were posed by people untrained on the system. Further restricted-domain question answering systems were developed in the following years. The common feature of all these systems is that they had a core database or knowledge system that was hand-written by experts of the chosen domain. The language abilities of BASEBALL and LUNAR used techniques similar to

ELIZA ELIZA is an early natural language processing computer program developed from 1964 to 1967 at MIT by Joseph Weizenbaum. Created to explore communication between humans and machines, ELIZA simulated conversation by using a pattern matching and ...

and

DOCTOR Doctor, Doctors, The Doctor or The Doctors may refer to: Titles and occupations * Physician, a medical practitioner * Doctor (title), an academic title for the holder of a doctoral-level degree ** Doctorate ** List of doctoral degrees awarded b ...

, the first chatterbot programs.

SHRDLU SHRDLU is an early natural-language understanding computer program that was developed by Terry Winograd at MIT in 1968–1970. In the program, the user carries on a conversation with the computer, moving objects, naming collections and query ...

was a successful question-answering program developed by

Terry Winograd Terry Allen Winograd (born February 24, 1946) is an American computer scientist. He is a professor at Stanford University, and co-director of the Stanford Human–Computer Interaction Group. He is known within the philosophy of mind and artificia ...

in the late 1960s and early 1970s. It simulated the operation of a robot in a toy world (the "blocks world"), and it offered the possibility of asking the robot questions about the state of the world. The strength of this system was the choice of a very specific domain and a very simple world with rules of physics that were easy to encode in a computer program. In the 1970s,

s were developed that targeted narrower domains of knowledge. The question answering systems developed to interface with these

expert system In artificial intelligence (AI), an expert system is a computer system emulating the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as ...

s produced and valid responses to questions within an area of knowledge. These expert systems closely resembled modern question answering systems except in their internal architecture. Expert systems rely heavily on expert-constructed and organized

s, whereas many modern question answering systems rely on statistical processing of a large, unstructured, natural language text corpus. The 1970s and 1980s saw the development of comprehensive theories in

computational linguistics Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics ...

, which led to the development of ambitious projects in text comprehension and question answering. One example was the Unix Consultant (UC), developed by

Robert Wilensky Robert Wilensky (26 March 1951 – 15 March 2013) was an American computer scientist and emeritus professor at the UC Berkeley School of Information, with his main focus of research in artificial intelligence. Academic career In 1971, Wilensky ...

at U.C. Berkeley in the late 1980s. The system answered questions pertaining to the

Unix Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...

operating system. It had a comprehensive, hand-crafted knowledge base of its domain, and it aimed at phrasing the answer to accommodate various types of users. Another project was LILOG, a text-understanding system that operated on the domain of tourism information in a German city. The systems developed in the UC and LILOG projects never went past the stage of simple demonstrations, but they helped the development of theories on computational linguistics and reasoning. Specialized natural-language question answering systems have been developed, such as EAGLi for health and life scientists.

Applications

QA systems are used in a variety of applications, including *

Fact-checking Fact-checking is the process of verifying the factual accuracy of questioned reporting and statements. Fact-checking can be conducted before or after the text or content is published or otherwise disseminated. Internal fact-checking is such che ...

if a fact is verified, by posing a question like: is fact ''X'' true or false? * customer service, * technical support, * market research, * generating reports or conducting research.

Architecture

, question-answering systems typically included a ''question classifier'' module that determined the type of question and the type of answer. Different types of question-answering systems employ different architectures. For example, modern open-domain question answering systems may use a retriever-reader architecture. The retriever is aimed at retrieving relevant documents related to a given question, while the reader is used to infer the answer from the retrieved documents. Systems such as

GPT-3 Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer model of deep neural network, which supersedes recurrence and convolution-based ...

, T5, and BART use an end-to-end architecture in which a transformer-based architecture stores large-scale textual data in the underlying parameters. Such models can answer questions without accessing any external knowledge sources.

Question answering methods

Question answering is dependent on a good search

corpus Corpus (plural ''corpora'') is Latin for "body". It may refer to: Linguistics * Text corpus, in linguistics, a large and structured set of texts * Speech corpus, in linguistics, a large set of speech audio files * Corpus linguistics, a branch of ...

; without documents containing the answer, there is little any question answering system can do. Larger collections generally mean better question answering performance, unless the question domain is orthogonal to the collection.

Data redundancy In computer main memory, auxiliary storage and computer buses, data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data. The additional data can simply be a com ...

in massive collections, such as the web, means that nuggets of information are likely to be phrased in many different ways in differing contexts and documents, leading to two benefits: # If the right information appears in many forms, the question answering system needs to perform fewer complex NLP techniques to understand the text. # Correct answers can be filtered from

false positive A false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negative is the opposite error, where the test resu ...

s because the system can rely on versions of the correct answer appearing more times in the corpus than incorrect ones. Some question answering systems rely heavily on

automated reasoning In computer science, in particular in knowledge representation and reasoning and metalogic, the area of automated reasoning is dedicated to understanding different aspects of reasoning. The study of automated reasoning helps produce computer progr ...

Open domain question answering

, an open-domain question answering system tries to return an answer in response to the user's question. The returned answer is in the form of short texts rather than a list of relevant documents. The system finds answers by using a combination of techniques from

, and

knowledge representation Knowledge representation (KR) aims to model information in a structured manner to formally represent it as knowledge in knowledge-based systems whereas knowledge representation and reasoning (KRR, KR&R, or KR²) also aims to understand, reason, and ...

. The system takes a

question as an input rather than a set of keywords, for example: "When is the national day of China?" It then transforms this input sentence into a query in its

logical form In logic, the logical form of a statement is a precisely specified semantic version of that statement in a formal system. Informally, the logical form attempts to formalize a possibly ambiguous statement into a statement with a precise, unamb ...

. Accepting natural language questions makes the system more user-friendly, but harder to implement, as there are a variety of question types and the system will have to identify the correct one in order to give a sensible answer. Assigning a question type to the question is a crucial task; the entire answer extraction process relies on finding the correct question type and hence the correct answer type. Keyword extraction is the first step in identifying the input question type. In some cases, words clearly indicate the question type, e.g., "Who", "Where", "When", or "How many"—these words might suggest to the system that the answers should be of type "Person", "Location", "Date", or "Number", respectively. POS (part-of-speech) tagging and syntactic parsing techniques can also determine the answer type. In the example above, the subject is "Chinese National Day", the predicate is "is" and the adverbial modifier is "when", therefore the answer type is "Date". Unfortunately, some interrogative words like "Which", "What", or "How" do not correspond to unambiguous answer types: Each can represent more than one type. In situations like this, other words in the question need to be considered. A lexical dictionary such as

WordNet WordNet is a lexical database of semantic relations between words that links words into semantic relations including synonyms, hyponyms, and meronyms. The synonyms are grouped into ''synsets'' with short definitions and usage examples. It can thu ...

can be used for understanding the context. Once the system identifies the question type, it uses an

system to find a set of documents that contain the correct keywords. A tagger and NP/Verb Group chunker can verify whether the correct entities and relations are mentioned in the found documents. For questions such as "Who" or "Where", a named-entity recogniser finds relevant "Person" and "Location" names from the retrieved documents. A

vector space model Vector space model or term vector model is an algebraic model for representing text documents (or more generally, items) as vector space, vectors such that the distance between vectors represents the relevance between the documents. It is used in i ...

can classify the candidate answers. Check if the answer is of the correct type as determined in the question type analysis stage. An inference technique can validate the candidate answers. A score is then given to each of these candidates according to the number of question words it contains and how close these words are to the candidate—the more and the closer the better. The answer is then translated by parsing into a compact and meaningful representation. In the previous example, the expected output answer is "1st Oct."

Mathematical question answering

An open-source, math-aware, question answering system called

MathQA Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP) that is concerned with building systems that automatically answer Question, questions that are posed by hum ...

, based on Ask Platypus and

Wikidata Wikidata is a collaboratively edited multilingual knowledge graph hosted by the Wikimedia Foundation. It is a common source of open data that Wikimedia projects such as Wikipedia, and anyone else, are able to use under the CC0 public domain ...

, was published in 2018. MathQA takes an English or Hindi natural language question as input and returns a mathematical formula retrieved from Wikidata as a succinct answer, translated into a computable form that allows the user to insert values for the variables. The system retrieves names and values of variables and common constants from Wikidata if those are available. It is claimed that the system outperforms a commercial computational mathematical knowledge engine on a test set. MathQA is hosted by Wikimedia at https://mathqa.wmflabs.org/. In 2022, it was extended to answer 15 math question types. MathQA methods need to combine natural and formula language. One possible approach is to perform supervised annotation via

Entity Linking In natural language processing, Entity Linking, also referred to as named-entity disambiguation (NED), named-entity recognition and disambiguation (NERD), named-entity normalization (NEN), or Concept Recognition, is the task of assigning a unique ...

. The "ARQMath Task" at

CLEF A clef (from French: 'key') is a musical symbol used to indicate which notes are represented by the lines and spaces on a musical staff. Placing a clef on a staff assigns a particular pitch to one of the five lines or four spaces, whic ...

2020 was launched to address the problem of linking newly posted questions from the platform Math

Stack Exchange Stack Exchange is a network of question-and-answer (Q&A) websites on topics in diverse fields, each site covering a specific topic, where questions, answers, and users are subject to a reputation award process. The reputation system allows t ...

to existing ones that were already answered by the community. Providing hyperlinks to already answered, semantically related questions helps users to get answers earlier but is a challenging problem because semantic relatedness is not trivial. The lab was motivated by the fact that 20% of mathematical queries in general-purpose search engines are expressed as well-formed questions. The challenge contained two separate sub-tasks. Task 1: "Answer retrieval" matching old post answers to newly posed questions, and Task 2: "Formula retrieval" matching old post formulae to new questions. Starting with the domain of mathematics, which involves formula language, the goal is to later extend the task to other domains (e.g., STEM disciplines, such as chemistry, biology, etc.), which employ other types of special notation (e.g., chemical formulae). The inverse of mathematical question answering—mathematical question generation—has also been researched. The PhysWikiQuiz physics question generation and test engine retrieves mathematical formulae from Wikidata together with semantic information about their constituting identifiers (names and values of variables). The formulae are then rearranged to generate a set of formula variants. Subsequently, the variables are substituted with random values to generate a large number of different questions suitable for individual student tests. PhysWikiquiz is hosted by Wikimedia at https://physwikiquiz.wmflabs.org/.

Progress

Question answering systems have been extended in recent years to encompass additional domains of knowledge For example, systems have been developed to automatically answer temporal and geospatial questions, questions of definition and terminology, biographical questions, multilingual questions, and questions about the content of audio, images, and video. Current question answering research topics include: * interactivity—clarification of questions or answers * answer reuse or caching *

semantic parsing Semantic parsing is the task of converting a natural language utterance to a logical form: a machine-understandable representation of its meaning. Semantic parsing can thus be understood as extracting the precise meaning of an utterance. Applicat ...

* answer presentation *

and semantic

entailment Logical consequence (also entailment or logical implication) is a fundamental concept in logic which describes the relationship between statements that hold true when one statement logically ''follows from'' one or more statements. A valid l ...

* social media analysis with question answering systems *

sentiment analysis Sentiment analysis (also known as opinion mining or emotion AI) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subje ...

* utilization of thematic roles * Image captioning for visual question answeringAnderson, Peter, et al.
Bottom-up and top-down attention for image captioning and visual question answering
" Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018. * Embodied question answering In 2011, Watson, a question answering computer system developed by

IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...

, competed in two exhibition matches of ''

Jeopardy! ''Jeopardy!'' is an American television game show created by Merv Griffin. The show is a quiz competition that reverses the traditional question-and-answer format of many quiz shows. Rather than being given questions, contestants are instead g ...

'' against Brad Rutter and

Ken Jennings Kenneth Wayne Jennings III (born May 23, 1974) is an American game show host, former contestant, and author. He is best known for his work on the syndicated quiz show ''Jeopardy!'' as a contestant and later its host. Jennings was born in Edm ...

, winning by a significant margin.

Facebook Research Onavo, Inc. was an Israeli mobile web analytics company that was purchased by Facebook, Inc. (now Meta Platforms), who changed the company's name to Facebook Israel. The company primarily performed its activities via consumer mobile apps, includ ...

made their DrQA system available under an

open source license Open-source licenses are software licenses that allow content to be used, modified, and shared. They facilitate free and open-source software (FOSS) development. Intellectual property (IP) laws restrict the modification and sharing of creative ...

. This system uses

as knowledge source. The

open source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use and view the source code, design documents, or content of the product. The open source model is a decentrali ...

framework Haystack by deepset combines open-domain question answering with generative question answering and supports the of the

language models A language model is a model of the human brain's ability to produce natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation,Andreas, Jacob, Andreas Vlachos, and Stephen Clark (2013)"S ...

for . Large Language Models (LLMs)^{6/sup> like GPT-4^{7/sup>, Gemini^{8/sup> are examples of successful QA systems that are enabling more sophisticated understanding and generation of text. When coupled with Multimodal^{9/sup> QA Systems, which can process and understand information from various modalities like text, images, and audio, LLMs significantly improve the capabilities of QA systems.

References

Further reading

* Dragomir R. Radev, John Prager, and Valerie Samn
Ranking suspected answers to natural language questions using predictive annotation
. In Proceedings of the 6th Conference on Applied Natural Language Processing, Seattle, WA, May 2000.
* John Prager, Eric Brown, Anni Coden, and Dragomir Radev
Question-answering by predictive annotation
. In Proceedings, 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Athens, Greece, July 2000.
*
* L. Fortnow, Steve Homer (2002/2003).
A Short History of Computational Complexity
In D. van Dalen, J. Dawson, and A. Kanamori, editors, ''The History of Mathematical Logic''. North-Holland, Amsterdam.
*

External links

Question Answering Evaluation at CLEF

{{Natural Language Processing

Tasks of natural language processing
Computational linguistics
Information retrieval genres
Deep learning}}}}