A knowledge base (KB) is a technology used to
store complex
structured and
unstructured information used by a
computer system. The initial use of the term was in connection with
expert systems, which were the first
knowledge-based systems.
Original usage of the term
The original use of the term knowledge base was to describe one of the two sub-systems of an
expert system. A
knowledge-based system
A knowledge-based system (KBS) is a computer program that automated reasoning, reasons and uses a knowledge base to problem solving, solve complex systems, complex problems. The term is broad and refers to many different kinds of systems. The one c ...
consists of a knowledge-base representing facts about the world and ways of
reasoning
Reason is the capacity of consciously applying logic by drawing conclusions from new or existing information, with the aim of seeking the truth. It is closely associated with such characteristically human activities as philosophy, science, lang ...
about those facts to deduce new facts or highlight inconsistencies.
Properties
The term "knowledge-base" was coined to distinguish this form of knowledge store from the more common and widely used term ''
database
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
''. During the 1970s, virtually all large
management information systems stored their
data
In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpret ...
in some type of
hierarchical
A hierarchy (from Greek: , from , 'president of sacred rites') is an arrangement of items (objects, names, values, categories, etc.) that are represented as being "above", "below", or "at the same level as" one another. Hierarchy is an important ...
or
relational database
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
. At this point in the history of
information technology
Information technology (IT) is the use of computers to create, process, store, retrieve, and exchange all kinds of data . and information. IT forms part of information and communications technology (ICT). An information technology system ...
, the distinction between a database and a knowledge-base was clear and unambiguous.
A database had the following properties:
*
Flat data: Data was usually represented in a tabular format with strings or numbers in each field.
* Multiple users: A conventional database needed to support more than one user or system logged into the same data at the same time.
*
Transactions: An essential requirement for a database was to maintain integrity and consistency among data accessed by
concurrent users. These are the so-called
ACID properties: Atomicity, Consistency, Isolation, and Durability.
* Large, long-lived data: A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires
structured data. Not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an
ontology
In metaphysics, ontology is the philosophical study of being, as well as related concepts such as existence, becoming, and reality.
Ontology addresses questions like how entities are grouped into categories and which of these entities ...
in
artificial intelligence
Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machine
A machine is a physical system using Power (physics), power to apply Force, forces and control Motion, moveme ...
literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.
Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze, and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by
Cordell Green et al.
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world. For example, to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge but instead would need to store information about thousands of tables that represented information about specific humans. Representing that all humans are mortal and being able to reason about any given human that they are mortal is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike,... and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.
As expert systems moved from being prototypes to systems deployed in corporate environments the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the
AI and
Object-Oriented
Object-oriented programming (OOP) is a programming paradigm based on the concept of " objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
communities,
object-oriented databases
An object database or object-oriented database is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are ...
such as
Versant
The Versant suite of tests are computerized tests of spoken language available from Pearson PLC. Versant tests were the first fully automated tests of spoken language to use advanced speech processing technology (including speech recognition) to ...
emerged. These were systems designed from the ground up to have support for object-oriented capabilities but also to support standard database services as well. On the other hand, the large database vendors such as
Oracle
An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination.
Description
The wor ...
added capabilities to their products that provided support for knowledge-base requirements such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the
Internet
The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a ''internetworking, network of networks'' that consists ...
. With the rise of the Internet, documents,
hypertext
Hypertext is text displayed on a computer display or other electronic devices with references ( hyperlinks) to other text that the reader can immediately access. Hypertext documents are interconnected by hyperlinks, which are typicall ...
, and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as
Web Content Management.
The other driver for document support was the rise of
knowledge management
Knowledge management (KM) is the collection of methods relating to creating, sharing, using and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieve organisational objectives by making ...
vendors such as
Lotus Notes
HCL Notes (formerly IBM Notes and Lotus Notes; see Branding below) and HCL Domino (formerly IBM Domino and Lotus Domino) are the client and server, respectively, of a collaborative client-server software platform formerly sold by IBM, now by HCL ...
.
Knowledge Management
Knowledge management (KM) is the collection of methods relating to creating, sharing, using and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieve organisational objectives by making ...
actually predated the Internet but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term "knowledge-base" to describe their
repositories but the meaning had a big difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning and knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
See also
*
Content management
*
Database
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
*
Enterprise bookmarking
*
Information repository
*
Knowledge-based system
A knowledge-based system (KBS) is a computer program that automated reasoning, reasons and uses a knowledge base to problem solving, solve complex systems, complex problems. The term is broad and refers to many different kinds of systems. The one c ...
*
Knowledge graph
The Google Knowledge Graph is a knowledge base from which Google serves relevant information in an infobox beside its search results. This allows the user to see the answer in a glance. The data is generated automatically from a variety of so ...
*
Knowledge management
Knowledge management (KM) is the collection of methods relating to creating, sharing, using and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieve organisational objectives by making ...
*
Microsoft Knowledge Base
*
Diffbot
*
Ontology engineering
In computer science, information science and systems engineering, ontology engineering is a field which studies the methods and methodologies for building ontologies, which encompasses a representation, formal naming and definition of the categ ...
*
Semantic network
A semantic network, or frame network is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices ...
*
Symbolic artificial intelligence
*
Text mining
*
Vadalog
The Vadalog system is a ''Knowledge Graph Management System'' (KGMS) that offers a language for performing complex logic reasoning tasks over knowledge graphs. At the same time, Vadalog delivers a platform to support the entire spectrum of data s ...
*
Wikidata
Wikidata is a collaboratively edited multilingual knowledge graph hosted by the Wikimedia Foundation. It is a common source of open data that Wikimedia projects such as Wikipedia, and anyone else, can use under the CC0 public domain licen ...
*
YAGO
References
External links
Technical communication
{{logic-stub