In information extraction, a named entity is a real-world object, such as a person, location, organization, or product, that can be denoted with a proper name. It can be abstract or have a physical existence. Examples of named entities include Barack Obama, New York City, Volkswagen Golf, or anything else that can be named. Named entities can simply be viewed as entity instances (e.g., New York City is an instance of a city).

Historically, the term "Named Entity" was coined during the MUC-6 evaluation campaign, where it covered ENAMEX (entity name expressions, e.g. persons, locations and organizations) and NUMEX (numerical expressions).

A more formal definition can be derived from Saul Kripke's notion of the rigid designator. In the expression "Named Entity", the word "Named" restricts the possible set of entities to only those for which one or more rigid designators stand for the referent. A designator is rigid when it designates the same thing in every possible world. By contrast, flaccid designators may designate different things in different possible worlds.

As an example, consider the sentence "Biden is the president of the United States". Both "Biden" and "United States" are named entities since they refer to specific objects (Joe Biden and the United States). However, "president" is not a named entity since it can refer to different objects in different worlds (different persons in different presidential periods, or different people in different countries or organizations). Rigid designators usually include proper names as well as certain natural terms like biological species and substances.

There is also general agreement in the named-entity recognition community to treat temporal and numerical expressions, such as amounts of money and other types of units, as named entities, even though this may violate the rigid designator perspective.

The task of recognizing named entities in text is called named-entity recognition, while the task of determining the identity of the named entities mentioned in text is called named-entity disambiguation. Both tasks require dedicated algorithms and resources.
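
The difference between the two tasks can be illustrated with a minimal Python sketch. It is a toy, not a description of how any real system works: the gazetteer, the identifiers such as `person:joe_biden`, and the function names `recognize`, `disambiguate` and `to_enamex` are all hypothetical and exist only for this example. Recognition locates mentions and assigns each a type; disambiguation maps each mention to one identifier; the last helper renders the result with ENAMEX-style tags in the spirit of the MUC-6 markup.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer: surface form -> (entity type, illustrative identifier).
# The identifiers are invented for this sketch, not taken from any knowledge base.
GAZETTEER = {
    "Biden": ("PERSON", "person:joe_biden"),
    "United States": ("LOCATION", "place:united_states"),
    "New York City": ("LOCATION", "place:new_york_city"),
}


@dataclass(frozen=True)
class Mention:
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    text: str    # surface form as it appears in the sentence
    label: str   # entity type assigned by recognition


def recognize(sentence):
    """Recognition: locate mentions in the text and assign each a type."""
    # Longer names come first so a multi-word name wins over a shorter overlap.
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return [
        Mention(m.start(), m.end(), m.group(), GAZETTEER[m.group()][0])
        for m in pattern.finditer(sentence)
    ]


def disambiguate(mention):
    """Disambiguation: map a recognized mention to a single identifier."""
    # A real system would use context to choose among several candidates;
    # this toy lookup assumes each surface form has exactly one referent.
    return GAZETTEER[mention.text][1]


def to_enamex(sentence, mentions):
    """Render recognized mentions with ENAMEX-style inline tags."""
    parts, cursor = [], 0
    for m in mentions:
        parts.append(sentence[cursor:m.start])
        parts.append(f'<ENAMEX TYPE="{m.label}">{m.text}</ENAMEX>')
        cursor = m.end
    parts.append(sentence[cursor:])
    return "".join(parts)


sentence = "Biden is the president of the United States"
mentions = recognize(sentence)
for m in mentions:
    print(m.text, m.label, disambiguate(m))
print(to_enamex(sentence, mentions))
```

In this sketch "president" is left untagged because it is not in the gazetteer, which mirrors the point that it is not a rigid designator. Practical recognizers typically rely on statistical or neural models rather than exact dictionary lookup.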


See also

* Named-entity recognition (also referred to as entity identification, entity chunking and entity extraction)
* Entity linking (also referred to as named entity linking (NEL), named entity disambiguation (NED), named entity recognition and disambiguation (NERD) or named entity normalization)
* Information extraction
* Knowledge extraction
* Text mining (also referred to as text data mining)
* Truecasing
* Apache OpenNLP
* spaCy
* General Architecture for Text Engineering
* Natural Language Toolkit

