Microdata is a WHATWG HTML specification used to nest metadata within existing content on web pages. Search engines, web crawlers, and browsers can extract and process Microdata from a web page and use it to provide a richer browsing experience for users. Search engines benefit greatly from direct access to this structured data because it allows them to understand the information on web pages and provide more relevant results to users. Microdata uses a supporting vocabulary to describe an item and name-value pairs to assign values to its properties.
Microdata is an attempt to provide a simpler way of annotating HTML elements with machine-readable tags than the similar approaches of using RDFa and microformats.
In 2013, because the W3C HTML Working Group failed to find someone to serve as an editor for the Microdata HTML specification, its development was terminated with a 'Note'. However, since that time, two new editors were selected, and five newer versions of the working draft have been published, the most recent being the Working Draft of 26 April 2018.
Vocabularies
Microdata vocabularies provide the semantics, or meaning, of an item. Web developers can design a custom vocabulary or use vocabularies available on the web. A collection of commonly used markup vocabularies is provided by Schema.org schemas, which include: Person, Place, Event, Organization, Product, Review, Review-aggregate, Breadcrumb, Offer, Offer-aggregate. The website schema.org was established by search engine operators like Google, Microsoft, Yahoo!, and Yandex, which use microdata markup to improve search results.
For some purposes, an ad-hoc vocabulary is adequate. For others, a vocabulary will need to be designed. Where possible, authors are encouraged to re-use existing vocabularies, as this makes content re-use easier.
Localization
In some cases, search engines covering specific regions may provide locally-specific extensions of microdata. For example, Yandex, a major search engine in Russia, supports microformats such as hCard (company contact information), hRecipe (food recipe), hReview (market reviews) and hProduct (product data) and provides its own format for definition of the terms and encyclopedic articles. This extension was made in order to solve transliteration problems between the Cyrillic and Latin alphabets. After the implementation of additional parameters from Schema's vocabulary, indexation of information in Russian-language web pages became more successful.
Global attributes
* itemscope – Creates the item and indicates that descendants of this element contain information about it.
* itemtype – A valid URL of a vocabulary that describes the item and its properties context.
* itemid – Indicates a unique identifier of the item.
* itemprop – Indicates that its containing tag holds the value of the specified item property. The property's name and value context are described by the item's vocabulary. Property values usually consist of string values, but can also use URLs using the a element and its href attribute, the img element and its src attribute, or other elements that link to or embed external resources.
* itemref – Properties that are not descendants of the element with the itemscope attribute can be associated with the item using this attribute. Provides a list of element ids (not itemids) with additional properties elsewhere in the document.
* datetime – Used on the time element; indicates a date or duration as specified by the ISO 8601 standard.
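The way an itemprop value depends on the element that carries it can be illustrated with a small Python sketch. The function name property_value and the VALUE_ATTRIBUTE table are hypothetical names chosen for this illustration, and the mapping covers only the elements mentioned in the list above; a full implementation would handle more element types and resolve relative URLs.

```python
# Simplified mapping from element name to the attribute that supplies
# the property value. Only the cases described in the list above are
# covered; this is an illustrative assumption, not a complete table.
VALUE_ATTRIBUTE = {
    "a": "href",         # links contribute their target URL
    "img": "src",        # embedded images contribute their source URL
    "time": "datetime",  # dates and durations in ISO 8601 form
}


def property_value(tag, attrs, text):
    """Return the value of an itemprop carried by the given element.

    tag   -- lower-case element name, e.g. "span"
    attrs -- dict of the element's attributes
    text  -- the element's text content
    """
    attr = VALUE_ATTRIBUTE.get(tag)
    if tag == "time":
        # A time element falls back to its text when datetime is absent.
        return attrs.get("datetime", text.strip())
    if attr is not None:
        # URL-carrying elements use the attribute, never the visible text.
        return attrs.get(attr, "")
    # Every other element contributes its text content as a string.
    return text.strip()
```

For instance, a span with itemprop="name" yields its text, while an a element with itemprop="url" yields the value of href rather than the link text.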
Example
The following HTML5 markup may be found on a typical “About” page containing information about a person:
<section> Hello, my name is John Doe, I am a graduate research assistant at
the University of Dreams.
My friends call me Johnny.
You can visit my homepage at <a href="http://www.example.com/~JohnnyD">www.example.com/~JohnnyD</a>.
I live at 1234 Peach Drive, Warner Robins, Georgia.</section>
Here is the same markup with added Schema.org Microdata:
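<section itemscope itemtype="http://schema.org/Person">
  Hello, my name is
  <span itemprop="name">John Doe</span>,
  I am a
  <span itemprop="jobTitle">graduate research assistant</span>
  at the
  <span itemprop="affiliation">University of Dreams</span>.
  My friends call me
  <span itemprop="additionalName">Johnny</span>.
  You can visit my homepage at
  <a href="http://www.example.com/~JohnnyD" itemprop="url">www.example.com/~JohnnyD</a>.
  <section itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    I live at
    <span itemprop="streetAddress">1234 Peach Drive</span>,
    <span itemprop="addressLocality">Warner Robins</span>,
    <span itemprop="addressRegion">Georgia</span>.
  </section>
</section>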
As the above example shows, Microdata items can be nested. In this case, an item of type http://schema.org/PostalAddress is nested inside an item of type http://schema.org/Person.
The following text shows how Google parses the Microdata from the above example code. Developers can test pages containing Microdata using Google's Rich Snippet Testing Tool.
Item
  Type: http://schema.org/Person
  name = John Doe
  jobTitle = graduate research assistant
  affiliation = University of Dreams
  additionalName = Johnny
  url = http://www.example.com/~JohnnyD
  address = Item(1)
Item 1
  Type: http://schema.org/PostalAddress
  streetAddress = 1234 Peach Drive
  addressLocality = Warner Robins
  addressRegion = Georgia
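A parse like the one above can be approximated with Python's standard html.parser module. The following is a minimal sketch, not Google's parser: the class MicrodataSketch and the function extract_items are hypothetical names, the element names in the embedded markup (section, span, a) are an assumption chosen to match the property names in the parsed output, and the sketch ignores itemref, itemid and void elements such as img.

```python
from html.parser import HTMLParser

# Annotated markup; element names are assumed, property names and values
# are taken from the parsed output shown above.
HTML = """
<section itemscope itemtype="http://schema.org/Person">
  Hello, my name is <span itemprop="name">John Doe</span>,
  I am a <span itemprop="jobTitle">graduate research assistant</span>
  at the <span itemprop="affiliation">University of Dreams</span>.
  My friends call me <span itemprop="additionalName">Johnny</span>.
  You can visit my homepage at
  <a href="http://www.example.com/~JohnnyD" itemprop="url">www.example.com/~JohnnyD</a>.
  <section itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    I live at <span itemprop="streetAddress">1234 Peach Drive</span>,
    <span itemprop="addressLocality">Warner Robins</span>,
    <span itemprop="addressRegion">Georgia</span>.
  </section>
</section>
"""


class MicrodataSketch(HTMLParser):
    """Toy extractor; assumes every start tag has a matching end tag."""

    def __init__(self):
        super().__init__()
        self.top_items = []   # items that are not a property of another item
        self.item_stack = []  # items whose itemscope element is still open
        self.open_tags = []   # one (kind, payload) entry per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if prop and self.item_stack:
                # A nested item becomes the value of the parent's property.
                self.item_stack[-1]["properties"][prop] = item
            else:
                self.top_items.append(item)
            self.item_stack.append(item)
            self.open_tags.append(("item", None))
        elif prop and self.item_stack and tag == "a":
            # Links contribute their href, not their visible text.
            self.item_stack[-1]["properties"][prop] = attrs.get("href", "")
            self.open_tags.append(("other", None))
        elif prop and self.item_stack:
            # Other elements contribute their text, collected until the end tag.
            self.open_tags.append(("text", (prop, [])))
        else:
            self.open_tags.append(("other", None))

    def handle_data(self, data):
        for kind, payload in self.open_tags:
            if kind == "text":
                payload[1].append(data)

    def handle_endtag(self, tag):
        if not self.open_tags:
            return
        kind, payload = self.open_tags.pop()
        if kind == "item":
            self.item_stack.pop()
        elif kind == "text":
            name, parts = payload
            self.item_stack[-1]["properties"][name] = "".join(parts).strip()


def extract_items(html):
    """Return the top-level microdata items found in html."""
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.top_items


person = extract_items(HTML)[0]
for name, value in person["properties"].items():
    print(name, "=", value if isinstance(value, str) else "Item(" + value["type"] + ")")
```

The nested PostalAddress item is stored as the value of the address property of the Person item, which mirrors the "address = Item(1)" line in the parsed output.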
The same machine-readable terms can be used not only in HTML Microdata, but also in other annotations such as RDFa or JSON-LD in the markup, or in an external RDF file in a serialization such as RDF/XML, Notation3, or Turtle.
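As an illustration of reusing the same terms in another annotation, the Person item from the example could be expressed as JSON-LD. The sketch below builds the structure as a Python dictionary and wraps it in a script element of type application/ld+json, the usual way of embedding JSON-LD in a page. The function name to_json_ld_script is hypothetical, and the "@context" value is an assumption that follows the http://schema.org URLs used in the example.

```python
import json

# The same Schema.org terms as in the Microdata example, as JSON-LD.
person = {
    "@context": "http://schema.org",
    "@type": "Person",
    "name": "John Doe",
    "jobTitle": "graduate research assistant",
    "affiliation": "University of Dreams",
    "additionalName": "Johnny",
    "url": "http://www.example.com/~JohnnyD",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1234 Peach Drive",
        "addressLocality": "Warner Robins",
        "addressRegion": "Georgia",
    },
}


def to_json_ld_script(data):
    """Serialize data as a JSON-LD script element for embedding in HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_json_ld_script(person))
```

Unlike Microdata, this form keeps the metadata separate from the visible text, so the values are stated a second time rather than annotated in place.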
Support
* Servers: Google can use microdata in its result pages. It was the preferred snippet format for the Google+ social network.
* Browsers: No major browser supports the Microdata DOM API. Opera supported it from version 11.60 (released in 2011) but has since removed its implementation. Firefox removed it in version 49.
See also
* Semantic web
* Microformat
* RDFa Lite
* JSON-LD
* Semantic HTML
* Semantic social network