In computer science, information science and systems engineering, ontology engineering is a field that studies the methods and methodologies for building ontologies, which encompass a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities of a given domain of interest. In a broader sense, this field also includes the construction of domain knowledge using formal ontology representations such as OWL/RDF.
A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling.
Automated processing of information not interpretable by software agents can be improved by adding rich semantics to the corresponding resources, such as video files. One approach to the formal conceptualization of represented knowledge domains is the use of machine-interpretable ontologies, which provide structured data in, or based on, RDF, RDFS, and OWL. Ontology engineering is the design and creation of such ontologies, which can contain more than just a list of terms (a controlled vocabulary); they contain terminological, assertional, and relational axioms to define concepts (classes), individuals, and roles (properties) (TBox, ABox, and RBox, respectively). Ontology engineering is a relatively new field of study concerning the ontology development process, the ontology life cycle, the methods and methodologies for building ontologies (Asunción Gómez-Pérez, Mariano Fernández-López, Oscar Corcho, Ontological Engineering: With Examples from the Areas of Knowledge Management, E-commerce and the Semantic Web, Springer, 2004), and the tool suites and languages that support them.
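The split into terminological, assertional, and relational axioms can be illustrated with a minimal Python sketch. It is not the API of any ontology library: the `Ontology` class, its method names, and the sample terms (`Video`, `clip42`, `hasDirector`) are all assumptions made for this example, and only subclass, instance, and subproperty axioms are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology split into the three axiom boxes (hypothetical structure)."""
    # TBox: terminological axioms, here only "sub is a subclass of sup".
    tbox: set = field(default_factory=set)  # {(sub_class, super_class)}
    # ABox: assertional axioms, here only "individual is an instance of cls".
    abox: set = field(default_factory=set)  # {(individual, cls)}
    # RBox: relational axioms, here only "sub is a subproperty of sup".
    # Stored for completeness; this sketch does not reason over roles.
    rbox: set = field(default_factory=set)  # {(sub_role, super_role)}

    def superclasses(self, cls):
        """Return cls plus every class reachable through TBox subclass axioms."""
        seen, stack = {cls}, [cls]
        while stack:
            current = stack.pop()
            for sub, sup in self.tbox:
                if sub == current and sup not in seen:
                    seen.add(sup)
                    stack.append(sup)
        return seen

    def types_of(self, individual):
        """Infer all classes of an individual from the ABox and the TBox."""
        inferred = set()
        for ind, cls in self.abox:
            if ind == individual:
                inferred |= self.superclasses(cls)
        return inferred


onto = Ontology()
onto.tbox.add(("Video", "MultimediaResource"))
onto.tbox.add(("MultimediaResource", "Resource"))
onto.abox.add(("clip42", "Video"))
onto.rbox.add(("hasDirector", "hasContributor"))

print(sorted(onto.types_of("clip42")))
```

Running the sketch prints `['MultimediaResource', 'Resource', 'Video']`: the single ABox assertion about `clip42` combined with two TBox axioms yields class memberships that were never stated explicitly, which is the kind of inference a plain controlled vocabulary cannot support.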
A common way to provide the logical underpinning of ontologies is to formalize the axioms with description logics, which can then be translated to any serialization of RDF, such as RDF/XML or Turtle. Beyond the description logic axioms, ontologies might also contain SWRL rules. The concept definitions can be mapped to any kind of resource or resource segment in RDF, such as images, videos, and regions of interest, to annotate objects, persons, etc., and interlink them with related resources across knowledge bases, ontologies, and Linked Open Data (LOD) datasets. This information, based on human experience and knowledge, is valuable to reasoners for the automated interpretation of sophisticated and ambiguous content, such as the visual content of multimedia resources. Application areas of ontology-based reasoning include, but are not limited to, information retrieval, automated scene interpretation, and knowledge discovery.
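As a rough illustration of translating description logic axioms into an RDF serialization, the following Python sketch maps one subclass axiom and one class assertion to RDF triples and writes them as N-Triples, a line-based syntax that is also valid Turtle. The `rdf:type` and `rdfs:subClassOf` IRIs are the standard ones; the `http://example.org/media#` namespace, the helper function names, and the sample terms are assumptions for this example. Real projects would normally use an RDF library instead of string formatting.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/media#"  # hypothetical namespace for the example


def subclass_axiom(sub, sup):
    """Translate the DL axiom 'sub ⊑ sup' into one RDF triple."""
    return (EX + sub, RDFS_SUBCLASS_OF, EX + sup)


def class_assertion(individual, cls):
    """Translate the DL assertion 'cls(individual)' into one RDF triple."""
    return (EX + individual, RDF_TYPE, EX + cls)


def to_ntriples(triples):
    """Serialize IRI-only triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = [
    subclass_axiom("Video", "MultimediaResource"),  # Video ⊑ MultimediaResource
    class_assertion("clip42", "Video"),             # Video(clip42)
]
print(to_ntriples(triples))
```

Each output line is a complete statement of the form `<subject> <predicate> <object> .`, so the same axioms could be loaded by any RDF-aware tool regardless of which serialization it prefers.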
Languages
An ontology language is a formal language used to encode an ontology. There are a number of such languages, both proprietary and standards-based:
* Common Logic is ISO/IEC standard 24707, a specification for a family of ontology languages that can be accurately translated into each other.
* The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions.
* The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language.
* IDEF5 is a software engineering method to develop and maintain usable, accurate domain ontologies.
* KIF (Knowledge Interchange Format) is a syntax for first-order logic that is based on S-expressions.
* Rule Interchange Format (RIF), F-Logic and its successor ObjectLogic combine ontologies and rules.
* OWL (Web Ontology Language) is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML, and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources and identified by URIs.
* OntoUML is a well-founded language for specifying reference ontologies.
* SHACL (Shapes Constraint Language) is a language for describing the structure of RDF data. It can be used together with RDFS and OWL, or independently of them.
* XBRL (Extensible Business Reporting Language) is a syntax for expressing business semantics.
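To make the S-expression basis of KIF mentioned in the list above more concrete, here is a small Python sketch that reads a KIF-style sentence into a nested list. It handles only parentheses and whitespace-separated atoms (no strings, comments, or quoting), so it is an illustration of the notation, not a KIF parser. The function names and the sample sentence are assumptions for this example.

```python
def tokenize(text):
    """Split an S-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front of the list and build a nested list."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the matching ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token  # an atom such as a predicate name or a ?variable


# "Every Video is a MultimediaResource", written in KIF-style prefix notation.
sentence = "(forall (?x) (=> (Video ?x) (MultimediaResource ?x)))"
tree = parse(tokenize(sentence))
print(tree)
```

The result is `['forall', ['?x'], ['=>', ['Video', '?x'], ['MultimediaResource', '?x']]]`: the quantifier, its variable list, and the implication appear as nested lists, which is what makes S-expression syntaxes straightforward to process mechanically.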
Methodologies and tools
* DOGMA
* KAON
* OntoClean
* Protégé
* Large language models
In life sciences
The life sciences are flourishing with ontologies that biologists use to make sense of their experiments. To infer correct conclusions from experiments, ontologies have to be structured optimally against the knowledge base they represent. The structure of an ontology needs to be revised continuously so that it remains an accurate representation of the underlying domain.
Recently, an automated method was introduced for engineering ontologies in the life sciences such as the Gene Ontology (GO), one of the most successful and widely used biomedical ontologies. Based on information theory, it restructures ontologies so that the levels represent the desired specificity of the concepts. Similar information-theoretic approaches have also been used for optimal partitioning of the Gene Ontology. Given the mathematical nature of such engineering algorithms, these optimizations can be automated to produce a principled and scalable architecture for restructuring ontologies such as GO.
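The text does not spell out the restructuring algorithm, so the following Python sketch shows only one widely used information-theoretic measure of a term's specificity: information content, IC(t) = -log2 p(t), where p(t) is the share of annotations at or below term t. Whether the cited method uses exactly this measure is an assumption. The tiny hierarchy, the annotation counts, and the function names are invented for illustration and are not real GO data.

```python
import math

# Hypothetical toy hierarchy: term -> list of parent terms.
PARENTS = {
    "biological_process": [],
    "metabolic_process": ["biological_process"],
    "lipid_metabolic_process": ["metabolic_process"],
    "signaling": ["biological_process"],
}

# Hypothetical number of annotations made directly to each term.
DIRECT_ANNOTATIONS = {
    "biological_process": 0,
    "metabolic_process": 2,
    "lipid_metabolic_process": 2,
    "signaling": 4,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    found, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def information_content():
    """Compute IC(t) = log2(total / count(t)), which equals -log2 p(t)."""
    cumulative = dict.fromkeys(PARENTS, 0)
    for term, count in DIRECT_ANNOTATIONS.items():
        # An annotation to a term also counts for every ancestor of that term.
        for anc in ancestors(term):
            cumulative[anc] += count
    total = sum(DIRECT_ANNOTATIONS.values())
    return {t: math.log2(total / c) for t, c in cumulative.items() if c > 0}


ic = information_content()
for term, bits in sorted(ic.items(), key=lambda kv: kv[1]):
    print(f"{term}: {bits:.2f} bits")
```

In this toy data `metabolic_process` and `signaling` sit at the same depth and happen to carry the same information content, while `lipid_metabolic_process` is more specific. In a real ontology, terms on the same level often differ widely in such a measure, which is the kind of mismatch a restructuring step would aim to reduce.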
Open Biomedical Ontologies (OBO), a 2006 initiative of the U.S. National Center for Biomedical Ontology, provides a common 'foundry' for various ontology initiatives, among them:
*The Generic Model Organism Project (GMOD)
*Gene Ontology Consortium
*Sequence Ontology
*Ontology Lookup Service
*The Plant Ontology Consortium
*Standards and Ontologies for Functional Genomics
See also
* Ontology (information science)
* Ontology components
* Ontology double articulation
* Ontology modularization
* Semantic decision table
* Semantic integration
* Semantic technology
* Semantic Web
* Linked data