Never-Ending Language Learning

Never-Ending Language Learning system (NELL) is a semantic machine learning system that as of 2010 was being developed by a research team at Carnegie Mellon University, and supported by grants from DARPA, Google, NSF, and CNPq with portions of the system running on a supercomputing cluster provided by Yahoo!.


Process and goals

NELL was programmed by its developers to identify a basic set of fundamental semantic relationships between a few hundred predefined categories of data, such as cities, companies, emotions and sports teams. Since the beginning of 2010, the Carnegie Mellon research team has been running NELL around the clock, sifting through hundreds of millions of web pages looking for connections between the information it already knows and what it finds through its search process, making new connections in a manner intended to mimic the way humans learn new information. For example, on encountering the word pair "Pikes Peak", NELL would notice that both words are capitalized, deduce from the second word that the pair is the name of a mountain, and then build on the relationships among the surrounding words to deduce other connections. The goal of NELL and other semantic learning systems, such as IBM's Watson system, is to develop means of answering questions posed by users in natural language with no human intervention in the process.
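
As a rough illustration of the "Pikes Peak" example, the toy sketch below guesses a category for a capitalized word pair from its final word. The `HEAD_WORD_CATEGORIES` table and the `guess_category` function are hypothetical names invented for this illustration; the source does not describe NELL's actual implementation, so this should not be read as how NELL itself is written.

```python
# Toy sketch only: a hand-written lookup table (an assumption made for
# this example) standing in for cues that a learning system would acquire.
HEAD_WORD_CATEGORIES = {
    "Peak": "mountain",
    "Mountain": "mountain",
    "University": "university",
    "River": "river",
}


def guess_category(phrase):
    """Return a category guess for a capitalized multi-word phrase, or None."""
    words = phrase.split()
    # Require at least two words, all capitalized, as in "Pikes Peak".
    if len(words) < 2 or not all(w[:1].isupper() for w in words):
        return None
    # Use the final word as the category cue; unknown words give None.
    return HEAD_WORD_CATEGORIES.get(words[-1])


print(guess_category("Pikes Peak"))   # mountain
print(guess_category("pikes peak"))   # None
```

The capitalization check mirrors the first cue mentioned in the example, and the final-word lookup mirrors the second.
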
Oren Etzioni of the University of Washington lauded the system's "continuous learning, as if NELL is exercising curiosity on its own, with little human help".

By October 2010, NELL had doubled the number of relationships available in its knowledge base and had learned 440,000 new facts, with an accuracy of 87%. Team leader Tom M. Mitchell, chairman of the machine learning department at Carnegie Mellon, described how NELL "self-corrects when it has more information, as it learns more", though it does sometimes arrive at incorrect conclusions. Accumulated errors, such as the deduction that Internet cookies were a kind of baked good, led NELL to deduce from the phrases "I deleted my Internet cookies" and "I deleted my files" that "computer files" also belonged in the baked goods category. Clear errors like these are corrected every few weeks by members of the research team, and the system is then allowed to continue its learning process.

By 2018, NELL had "acquired a knowledge base with 120mn diverse, confidence-weighted beliefs (e.g., ''servedWith(tea,biscuits)''), while learning thousands of interrelated functions that continually improve its reading competence over time." As of September 2023, the project's most recently gathered facts dated from February 2019 (according to its Twitter feed) or September 2018 (according to its home page).
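
The way one mistaken belief can spread, as in the Internet cookies example, can be sketched with a toy bootstrapping round. This is a simplified illustration under stated assumptions, not NELL's actual algorithm: the `beliefs` dictionary, the `sentences` list, and the `propagate` function are all hypothetical, and real systems weigh far more evidence than a single shared phrase.

```python
from collections import defaultdict

# Hypothetical seed beliefs: noun -> category. The second entry is the
# mistaken belief described in the text.
beliefs = {"cake": "baked good", "Internet cookies": "baked good"}

# Hypothetical text snippets, each pre-split into (context pattern, noun).
sentences = [
    ("I baked a _", "cake"),
    ("I deleted my _", "Internet cookies"),
    ("I deleted my _", "files"),
]


def propagate(beliefs, sentences):
    """Run one bootstrapping round and return the newly inferred beliefs."""
    # Step 1: link each context pattern to the categories of the
    # already-known nouns that appear in it.
    pattern_categories = defaultdict(set)
    for pattern, noun in sentences:
        if noun in beliefs:
            pattern_categories[pattern].add(beliefs[noun])
    # Step 2: give each unknown noun the category of its pattern,
    # but only when that pattern points to exactly one category.
    inferred = {}
    for pattern, noun in sentences:
        categories = pattern_categories[pattern]
        if noun not in beliefs and len(categories) == 1:
            inferred[noun] = next(iter(categories))
    return inferred


print(propagate(beliefs, sentences))  # {'files': 'baked good'}
```

Removing the mistaken seed belief, much as the research team corrects clear errors by hand, stops the bad inference in this sketch: with `"Internet cookies"` deleted from `beliefs`, `propagate` returns an empty dictionary.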


Reception

In his 2019 book "Human Compatible", Stuart Russell commented that "Unfortunately NELL has confidence in only 3 percent of its beliefs and relies on human experts to clean out false or meaningless beliefs on a regular basis – such as its beliefs that 'Nepal is a country also known as United States' and 'value is an agricultural product that is usually cut into basis.'" A 2023 paper commented that "While the ''never-ending'' part seems like the right approach, NELL still had the drawback that its focus remained much too grounded on object-language descriptions, and relied on web pages as its only source, which significantly influenced the type of grammar, symbolism, slang, etc. analysed."


See also

* Cognitive architecture
* Computational models of language acquisition
* Cyc
* Darwin among the Machines
* The Adolescence of P-1


External links


Project homepage