Semantic integration is the process of interrelating information from diverse sources, for example calendars and to-do lists, email archives, presence information (physical, psychological, and social), documents of all sorts, contacts (including social graphs), search results, and advertising and marketing relevance derived from them. In this regard, semantics focuses on the organization of and action upon information by acting as an intermediary between heterogeneous data sources, which may conflict not only in structure but also in context or value.
Applications and methods
In enterprise application integration (EAI), semantic integration can facilitate or even automate the communication between computer systems using metadata publishing. Metadata publishing potentially offers the ability to automatically link ontologies. One approach to (semi-)automated ontology mapping requires the definition of a semantic distance or its inverse, semantic similarity, and appropriate rules. Other approaches include so-called "lexical methods", as well as methodologies that rely on exploiting the structures of the ontologies. For explicitly stating similarity or equality, most ontology languages provide special properties or relationships. OWL (Web Ontology Language), for example, has "owl:equivalentClass", "owl:equivalentProperty" and "owl:sameAs".
Eventually, system designs may see the advent of composable architectures in which published semantic-based interfaces are joined together to enable new and meaningful capabilities. These could predominantly be described by means of design-time declarative specifications that could ultimately be rendered and executed at run-time.
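As a rough illustration of the lexical methods mentioned earlier in this section, the following Python sketch proposes candidate equivalences between two sets of class labels using plain string similarity. The function names, the threshold of 0.8, and the sample labels are assumptions made for illustration; practical ontology matchers typically combine lexical, structural, and semantic evidence, and a human would normally review the proposals before asserting "owl:equivalentClass".

```python
from difflib import SequenceMatcher


def lexical_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two labels, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def propose_equivalences(source_labels, target_labels, threshold=0.8):
    """Pair each source label with its best-scoring target label.

    A pair is proposed only if its score reaches the (hypothetical) threshold.
    """
    proposals = []
    for s in source_labels:
        best = max(target_labels, key=lambda t: lexical_similarity(s, t))
        score = lexical_similarity(s, best)
        if score >= threshold:
            proposals.append((s, "owl:equivalentClass", best, round(score, 2)))
    return proposals


# Made-up class labels from two hypothetical ontologies.
candidates = propose_equivalences(
    ["Customer", "PostalAddress"],
    ["customer", "Address", "Invoice"],
)
```

Here "Customer" and "customer" match once case is ignored, while "PostalAddress" and "Address" fall below the threshold and are left for other methods (or a person) to resolve.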
Semantic integration can also be used to facilitate the design-time activities of interface design and mapping. In this model, semantics are explicitly applied only at design time, and the run-time systems work at the syntax level. This "early semantic binding" approach can improve overall system performance while retaining the benefits of semantic-driven design.
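One way to picture early semantic binding is sketched below in Python, under stated assumptions: field annotations with concept identifiers (the `ex:` names are invented) are resolved once at design time into a plain field-name map, and the run-time step is then a purely syntactic rename with no ontology lookup. This is an illustrative sketch, not a description of any particular product.

```python
# Design-time annotations: each field is tagged with a shared concept
# identifier. All field names and "ex:" identifiers are hypothetical.
SOURCE_FIELDS = {"cust_nm": "ex:customerName", "zip": "ex:postalCode"}
TARGET_FIELDS = {"customerName": "ex:customerName", "postcode": "ex:postalCode"}


def compile_field_map(source_fields, target_fields):
    """Design time: resolve semantic annotations into a source -> target name map."""
    concept_to_target = {concept: name for name, concept in target_fields.items()}
    return {
        src: concept_to_target[concept]
        for src, concept in source_fields.items()
        if concept in concept_to_target
    }


# Computed once, before any records are processed.
FIELD_MAP = compile_field_map(SOURCE_FIELDS, TARGET_FIELDS)


def translate(record, field_map=FIELD_MAP):
    """Run time: rename known fields; fields without a mapping are dropped."""
    return {field_map[k]: v for k, v in record.items() if k in field_map}


translated = translate({"cust_nm": "Ada", "zip": "12345", "internal_flag": True})
```

The run-time function only performs dictionary lookups, which is where the performance benefit described above would come from in this simplified picture.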
Semantic integration situations
In an industry use case, it was observed that semantic mappings were performed only within the scope of the ontology class or the datatype property. The identified semantic integration situations are (1) integration of ontology class instances into another ontology class without any constraint, (2) integration of selected instances of one ontology class into another ontology class, selected by a range constraint on a property value, and (3) integration of ontology class instances into another ontology class with a value transformation of an instance property. Each situation requires a particular mapping relationship, respectively: (1) an equivalence or subsumption mapping relationship, (2) a conditional mapping relationship that constrains the value of a property (data range) and (3) a transformation mapping relationship that transforms the value of a property (unit transformation). Each identified mapping relationship can accordingly be defined as one of three mapping types: (1) direct mapping, (2) data range mapping or (3) unit transformation mapping.
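The three mapping types can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Mapping` structure, the class names `A:Sensor` and `B:Device`, the `length` property, and the metre-to-centimetre conversion are all invented for the example and do not come from the use case itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Mapping:
    source_class: str
    target_class: str
    prop: Optional[str] = None
    # Inclusive (low, high) bounds: used by a data range mapping.
    value_range: Optional[Tuple[float, float]] = None
    # Value conversion: used by a unit transformation mapping.
    transform: Optional[Callable[[float], float]] = None


def apply_mapping(instances, mapping):
    """Return copies of matching instances re-classified under the target class."""
    result = []
    for inst in instances:
        if inst["class"] != mapping.source_class:
            continue
        if mapping.value_range is not None:
            low, high = mapping.value_range
            if not (low <= inst[mapping.prop] <= high):
                continue  # instance is outside the permitted data range
        mapped = {**inst, "class": mapping.target_class}
        if mapping.transform is not None:
            mapped[mapping.prop] = mapping.transform(inst[mapping.prop])
        result.append(mapped)
    return result


# Hypothetical instances with a length recorded in metres.
sensors = [
    {"class": "A:Sensor", "length": 2.0},
    {"class": "A:Sensor", "length": 50.0},
]

direct = apply_mapping(sensors, Mapping("A:Sensor", "B:Device"))
ranged = apply_mapping(
    sensors, Mapping("A:Sensor", "B:SmallDevice", prop="length", value_range=(0, 20))
)
converted = apply_mapping(
    sensors,
    Mapping("A:Sensor", "B:Device", prop="length", transform=lambda m: m * 100.0),
)
```

A direct mapping carries over every instance, a data range mapping keeps only the instances whose property value lies within the bounds, and a unit transformation mapping rewrites the property value while re-classifying the instance.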
KG vs. RDB approaches
When integrating a supplemental data source:
* A knowledge graph (KG) formally represents the meaning involved in information by describing concepts, relationships between things, and categories of things. Embedding these semantics with the data offers significant advantages, such as reasoning over data and dealing with heterogeneous data sources. Rules can be applied to a KG more efficiently using graph queries. For example, a graph query performs data inference through the connected relations, instead of repeated full searches of the tables in a relational database. A KG facilitates the integration of new heterogeneous data by simply adding new relationships between existing information and new entities. This is particularly helpful for integration with existing popular linked open data sources such as Wikidata.org.
* An SQL query is tightly coupled to, and rigidly constrained by, the datatypes within a specific database. It can join tables on any columns whose datatypes match and extract data from them, and the result is generally a table. SPARQL is the standard query language and protocol for Linked Open Data on the web. A SPARQL query is loosely coupled with the database, which facilitates reusability, and it can extract data through relations independently of datatype. It can not only extract data but also generate an additional knowledge graph using more sophisticated logical operations (transitive, symmetric, inverseOf, functional). An inference-based query (a query on the existing asserted facts, without the generation of new facts by logic) can be fast compared to a reasoning-based query (a query on the existing facts plus the facts generated or discovered by logic).
* Integrating heterogeneous data sources in a traditional database is intricate: it requires redesigning the database tables, for example by changing their structure and/or adding new data. A SPARQL query, by contrast, reflects the relationships between entities in a way that is aligned with a human's understanding of the domain, so the semantic intention of the query can be seen in the query itself. An SQL query reflects the specific structure of the database and is derived from matching the relevant primary and foreign keys of tables, so it loses the semantics of the query by omitting the relationships between entities. Below is an example that compares SPARQL and SQL queries for medications that treat "TB of vertebra".
SELECT ?medication
WHERE

SELECT DRUG.medID
FROM DIAGNOSIS, DRUG, DRUG_DIAGNOSIS
WHERE DIAGNOSIS.diagnosisID = DRUG_DIAGNOSIS.diagnosisID
AND DRUG.medID = DRUG_DIAGNOSIS.medID
AND DIAGNOSIS.name = 'TB of vertebra'
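To make the contrast concrete, the Python sketch below runs the SQL query against a small in-memory SQLite database and then answers the same question over a set of triples. Everything beyond the table and column names in the query is an assumption: the sample rows, the placeholder drug names, and the `name` and `canTreat` predicates are invented for illustration, and the triple lookup is a simplified stand-in for a graph query rather than actual SPARQL.

```python
import sqlite3

# Hypothetical sample data; the schema follows the SQL query in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DIAGNOSIS (diagnosisID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE DRUG (medID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE DRUG_DIAGNOSIS (medID INTEGER, diagnosisID INTEGER);
INSERT INTO DIAGNOSIS VALUES (1, 'TB of vertebra'), (2, 'Other diagnosis');
INSERT INTO DRUG VALUES (10, 'DrugA'), (11, 'DrugB');
INSERT INTO DRUG_DIAGNOSIS VALUES (10, 1), (11, 2);
""")

# The relationship "drug treats diagnosis" is implicit in the key joins.
sql = """
SELECT DRUG.medID
FROM DIAGNOSIS, DRUG, DRUG_DIAGNOSIS
WHERE DIAGNOSIS.diagnosisID = DRUG_DIAGNOSIS.diagnosisID
  AND DRUG.medID = DRUG_DIAGNOSIS.medID
  AND DIAGNOSIS.name = ?
"""
sql_result = [row[0] for row in conn.execute(sql, ("TB of vertebra",))]

# The same facts as (subject, predicate, object) triples.
# The predicate names are made up for this example.
triples = {
    ("diag1", "name", "TB of vertebra"),
    ("diag2", "name", "Other diagnosis"),
    ("med10", "canTreat", "diag1"),
    ("med11", "canTreat", "diag2"),
}


def medications_for(diagnosis_name):
    """Follow the named 'canTreat' relationship instead of joining on keys."""
    diagnoses = {s for (s, p, o) in triples if p == "name" and o == diagnosis_name}
    return sorted(s for (s, p, o) in triples if p == "canTreat" and o in diagnoses)


graph_result = medications_for("TB of vertebra")
```

In the relational version the meaning of the join table has to be known in advance, whereas the triple version states the relationship by name in the query itself.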
Examples
The Pacific Symposium on Biocomputing has been a venue for the popularization of the ontology mapping task in the biomedical domain, and a number of papers on the subject can be found in its proceedings.
See also
* Data integration
* Dataspaces
* Enterprise integration
* Ontology-based data integration
* Ontology alignment
* Ontology engineering
* Ontology matching
* Semantic heterogeneity
* Semantic technology
* Semantic translation
* Semantic unification
External links
* Semantic Integration: Loosely Coupling the Meaning of Data
* Ontology Mapping: The State of the Art (2005 paper)
* 2010 paper by Carl Hewitt
* OpenCyc to Oracle Interface