Semantic publishing on the Web, or semantic web publishing, refers to publishing information on the web as documents accompanied by semantic markup. Semantic publication provides a way for computers to understand the structure and even the meaning of the published information, making information search and data integration more efficient.
Although semantic publishing is not specific to the Web, it has been driven by the rise of the semantic web. In the semantic web, published information is accompanied by metadata describing the information, providing a "semantic" context.
Although semantic publishing has the potential to change the face of web publishing, acceptance depends on the emergence of compelling applications. Web sites can already be built with all content in both HTML format and a semantic format. RSS 1.0 uses the RDF format (a semantic web standard), although it has become less popular than RSS 2.0 and Atom.
Web2express.org applies RDF to various data feeds. Its service can be used to create and provide RDF data resources and data feeds for products, news, events, jobs and studies.
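Because RSS 1.0 is an RDF vocabulary, a feed in that format is an ordinary RDF/XML document. The following is a minimal sketch of what such a feed could look like when generated with Python's standard library. The function name `build_rss10` and the example.org URLs are hypothetical, and only the basic `channel`, `items` and `item` elements are shown.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"

# Serialize with the usual prefixes: rdf: for RDF, default namespace for RSS 1.0.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("", RSS_NS)


def build_rss10(channel_url, title, items):
    """Return an RSS 1.0 (RDF/XML) feed as a string.

    `items` is a list of (item_url, item_title) pairs. Hypothetical helper.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")

    # The channel is an RDF resource identified by rdf:about.
    channel = ET.SubElement(
        root, f"{{{RSS_NS}}}channel", {f"{{{RDF_NS}}}about": channel_url}
    )
    ET.SubElement(channel, f"{{{RSS_NS}}}title").text = title
    ET.SubElement(channel, f"{{{RSS_NS}}}link").text = channel_url
    ET.SubElement(channel, f"{{{RSS_NS}}}description").text = title

    # The channel lists its items as an ordered rdf:Seq of resource references.
    toc = ET.SubElement(channel, f"{{{RSS_NS}}}items")
    seq = ET.SubElement(toc, f"{{{RDF_NS}}}Seq")
    for item_url, _ in items:
        ET.SubElement(seq, f"{{{RDF_NS}}}li", {f"{{{RDF_NS}}}resource": item_url})

    # Each item is described as its own resource at the top level.
    for item_url, item_title in items:
        item = ET.SubElement(
            root, f"{{{RSS_NS}}}item", {f"{{{RDF_NS}}}about": item_url}
        )
        ET.SubElement(item, f"{{{RSS_NS}}}title").text = item_title
        ET.SubElement(item, f"{{{RSS_NS}}}link").text = item_url

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_rss10(
        "http://example.org/news",
        "Example news",
        [("http://example.org/news/1", "First story")],
    ))
```

Since every element and attribute is namespace-qualified, a generic RDF parser can read the feed as a set of statements about the channel and its items, which is what distinguishes it from RSS 2.0.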
Semantic publishing has the potential to revolutionize scientific publishing. Tim Berners-Lee predicted in 2001 that the semantic web "will likely profoundly change the very nature of how scientific knowledge is produced and shared, in ways that we can now barely imagine". Revisiting the semantic web in 2006, he and his colleagues believed the semantic web "could bring about a revolution in how, for example, scientific content is managed throughout its life cycle". Researchers could directly self-publish their experiment data in "semantic" format on the web. Semantic search engines could then make these data widely available. The W3C interest group in healthcare and life sciences is exploring this idea.
Two approaches
* Publish information as data objects using semantic web languages like RDF and OWL. An ontology is usually developed for a specific information domain and can formally represent the data in that domain. Semantic publishing of more general information like product information, news, and job openings uses a so-called shallow ontology. The SWEO Linking Open Data Project maintains a list of data sources that follow this approach as well as a list of Semantic Publishing Tools.
* Express structured data in markup languages with RDFa, or embed or publish information using the JSON-LD, Turtle, or TriG syntaxes.
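The difference between the two approaches can be illustrated with a small sketch that publishes the same facts about a job opening in both ways: once as standalone RDF statements in N-Triples, and once as JSON-LD wrapped in a `script` element for embedding in an HTML page. The function names, the example.org URI and the choice of the schema.org vocabulary are assumptions made for illustration, and only plain string values are handled.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOCAB = "https://schema.org/"  # assumed vocabulary for this sketch


def _escape(text):
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject, rdf_type, properties):
    """Approach 1: publish the resource as standalone RDF statements."""
    lines = [f"<{subject}> <{RDF_TYPE}> <{VOCAB}{rdf_type}> ."]
    for name, value in properties.items():
        lines.append(f'<{subject}> <{VOCAB}{name}> "{_escape(value)}" .')
    return "\n".join(lines)


def to_jsonld_script(subject, rdf_type, properties):
    """Approach 2: embed the same facts in an HTML page as JSON-LD."""
    doc = {
        "@context": {"@vocab": VOCAB},
        "@id": subject,
        "@type": rdf_type,
        **properties,
    }
    # Escape "</" so a value can never close the script element early.
    body = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    job = {"title": "Data engineer", "datePosted": "2024-01-15"}
    print(to_ntriples("http://example.org/jobs/1", "JobPosting", job))
    print(to_jsonld_script("http://example.org/jobs/1", "JobPosting", job))
```

In the first case the data is served as its own resource. In the second it travels inside the human-readable document, so a single HTML page serves both readers and machines.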
Examples
Examples of free or open source tools and services
* Ambra Project: Open source software designed to publish open access journals with RDF. Used by PLoS.
* Semantic MediaWiki: An extension to the wiki application MediaWiki that allows users to semantically annotate data on the wiki, and then publish it in formats such as RDF/XML.
* D2R Server: Tool for publishing relational databases on the Semantic Web as Linked Data and SPARQL endpoints.
* Utopia Documents: Interactive documents.
* dokieli: A client-side editor for decentralized article publishing in HTML+RDFa (and embeddable TriG, Turtle, JSON-LD), annotations and social interactions. It implements the W3C specifications Web Annotation, Linked Data Notifications, Activity Streams 2.0 and ActivityPub. It employs WebID+TLS and WebID+OIDC for authentication and Web access control lists, and is compliant with the Linked Data Platform. Articles and annotations can each be assigned a Creative Commons license as well as a language. Its source code uses the Apache License, Version 2.0.
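The idea behind publishing a relational database as Linked Data, as in the D2R Server entry above, can be sketched as a mapping from rows to RDF statements: each row becomes a resource with its own URI and each column becomes a property. The sketch below is not D2R Server's mapping language or API. The base URI, the URI patterns and the function name `rows_to_triples` are assumptions chosen for illustration, and all values are emitted as plain string literals.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI for minted resources


def rows_to_triples(conn, table, key_column):
    """Yield N-Triples lines: one subject URI per row, one predicate per column.

    `table` and `key_column` are interpolated into SQL and URIs, so they
    must come from trusted configuration, not from user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            # The key is already part of the subject URI; NULLs carry no statement.
            if column == key_column or value is None:
                continue
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            yield f'{subject} <{BASE}vocab/{table}#{column}> "{literal}" .'


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE paper (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
    conn.execute("INSERT INTO paper VALUES (1, 'Semantic publishing', 2009)")
    for triple in rows_to_triples(conn, "paper", "id"):
        print(triple)
```

A real deployment would additionally map foreign keys to links between resources and expose the result through dereferenceable URIs and a SPARQL endpoint, which this sketch leaves out.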
See also
* JSON-LD
* Metadata
* Metadata publishing
* Open Semantic Framework
* Semantic technology
* RDF feed
* Data feed
Further reading
* Tutorial on How to publish Linked Data on the Web
* Resources for semantic publishing
* SePublica 2011, the first international workshop on semantic publishing