Tag (metadata)
   HOME

TheInfoList



OR:

In
information system An information system (IS) is a formal, sociotechnical, organizational system designed to collect, process, store, and distribute information. From a sociotechnical perspective, information systems are composed by four components: task, people ...
s, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark,
multimedia Multimedia is a form of communication that uses a combination of different content forms such as text, audio, images, animations, or video into a single interactive presentation, in contrast to tradit ...
, database record, or
computer file A computer file is a computer resource for recording data in a computer storage device, primarily identified by its file name. Just as words can be written to paper, so can data be written to a computer file. Files can be shared with and trans ...
). This kind of metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system, although they may also be chosen from a
controlled vocabulary Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Control ...
. Tagging was popularized by
website A website (also written as a web site) is a collection of web pages and related content that is identified by a common domain name and published on at least one web server. Examples of notable websites are Google, Facebook, Amazon, and W ...
s associated with Web 2.0 and is an important feature of many Web 2.0 services. It is now also part of other database systems, desktop applications, and
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
s.


Overview

People use tags to aid classification, mark ownership, note
boundaries Boundary or Boundaries may refer to: * Border, in political geography Entertainment * ''Boundaries'' (2016 film), a 2016 Canadian film * ''Boundaries'' (2018 film), a 2018 American-Canadian road trip film *Boundary (cricket), the edge of the pla ...
, and indicate online identity. Tags may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is
museum A museum ( ; plural museums or, rarely, musea) is a building or institution that cares for and displays a collection of artifacts and other objects of artistic, cultural, historical, or scientific importance. Many public museums make th ...
object tagging. People were using textual keywords to classify information and objects long before computers. Computer based search algorithms made the use of such keywords a rapid way of exploring records. Tagging gained popularity due to the growth of social bookmarking, image sharing, and social networking websites. These sites allow users to create and manage labels (or "tags") that categorize content using simple keywords. Websites that include tags often display collections of tags as tag clouds,For example, Blogger and
WordPress WordPress (WP or WordPress.org) is a free and open-source content management system (CMS) written in hypertext preprocessor language and paired with a MySQL or MariaDB database with supported HTTPS. Features include a plugin architectu ...
can display tag clouds.
as do some desktop applications.For example: Leap is a
macOS macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and la ...
application that features a clickable tag cloud of macOS tags: TaggTool is a
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for se ...
application that permits tagging files and displaying a tag cloud:
On websites that aggregate the tags of all users, an individual user's tags can be useful both to them and to the larger community of the website's users. Tagging systems have sometimes been classified into two kinds: ''top-down'' and ''bottom-up''. Top-down taxonomies are created by an authorized group of designers (sometimes in the form of a
controlled vocabulary Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Control ...
), whereas bottom-up taxonomies (called folksonomies) are created by all users. This definition of "top down" and "bottom up" should not be confused with the distinction between a ''single hierarchical'' tree structure (in which there is one correct way to classify each item) versus ''multiple non-hierarchical''
set Set, The Set, SET or SETS may refer to: Science, technology, and mathematics Mathematics *Set (mathematics), a collection of elements *Category of sets, the category whose objects and morphisms are sets and total functions, respectively Electro ...
s (in which there are multiple ways to classify an item); the structure of both top-down and bottom-up taxonomies may be either hierarchical, non-hierarchical, or a combination of both. Some researchers and applications have experimented with combining hierarchical and non-hierarchical tagging to aid in information retrieval. Others are combining top-down and bottom-up tagging, including in some large library catalogs ( OPACs) such as WorldCat. When tags or other taxonomies have further properties (or
semantics Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
) such as relationships and attributes, they constitute an
ontology In metaphysics, ontology is the philosophy, philosophical study of being, as well as related concepts such as existence, Becoming (philosophy), becoming, and reality. Ontology addresses questions like how entities are grouped into Category ...
. Metadata tags as described in this article should not be confused with the use of the word "tag" in some software to refer to an automatically generated cross-reference; examples of the latter are ''tags tables'' in
Emacs Emacs , originally named EMACS (an acronym for "Editor MACroS"), is a family of text editors that are characterized by their extensibility. The manual for the most widely used variant, GNU Emacs, describes it as "the extensible, customizable, ...
and ''smart tags'' in
Microsoft Office Microsoft Office, or simply Office, is the former name of a family of client software, server software, and services developed by Microsoft. It was first announced by Bill Gates on August 1, 1988, at COMDEX in Las Vegas. Initially a marketin ...
.


History

The use of keywords as part of an identification and classification system long predates computers. Paper data storage devices, notably edge-notched cards, that permitted classification and sorting by multiple criteria were already in use prior to the twentieth century, and faceted classification has been used by libraries since the 1930s. In the late 1970s and early 1980s, the
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, ...
text editor
Emacs Emacs , originally named EMACS (an acronym for "Editor MACroS"), is a family of text editors that are characterized by their extensibility. The manual for the most widely used variant, GNU Emacs, describes it as "the extensible, customizable, ...
offered a companion software program called ''Tags'' that could automatically build a table of cross-references called a ''tags table'' that Emacs could use to jump between a function call and that function's definition. This use of the word "tag" did not refer to metadata tags, but was an early use of the word "tag" in software to refer to a word index.
Online database An online database is a database accessible from a local network or the Internet, as opposed to one that is stored locally on an individual computer or its attached storage (such as a CD). Online databases are hosted on websites, made available as ...
s and early websites deployed keyword tags as a way for publishers to help users find content. In the early days of the World Wide Web, the keywords meta element was used by web designers to tell web search engines what the web page was about, but these keywords were only visible in a web page's
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
and were not modifiable by users. In 1997, the collaborative portal "A Description of the Equator and Some ØtherLands" produced by
documenta ''documenta'' is an exhibition of contemporary art which takes place every five years in Kassel, Germany. The ''documenta'' was founded by artist, teacher and curator Arnold Bode in 1955 as part of the Bundesgartenschau (Federal Horticultural ...
X, Germany, used the folksonomic term ''Tag'' for its co-authors and guest authors on its Upload page. In "The Equator" the term ''Tag'' for user-input was described as an ''abstract literal or keyword'' to aid the user. However, users defined singular ''Tags'', and did not share ''Tags'' at that point. In 2003, the social bookmarking website Delicious provided a way for its users to add "tags" to their bookmarks (as a way to help find them later); Delicious also provided browseable aggregated views of the bookmarks of all users featuring a particular tag. Within a couple of years, the photo sharing website Flickr allowed its users to add their own text tags to each of their pictures, constructing flexible and easy metadata that made the pictures highly searchable. The success of Flickr and the influence of Delicious popularized the concept, and other social software websites—such as
YouTube YouTube is a global online video sharing and social media platform headquartered in San Bruno, California. It was launched on February 14, 2005, by Steve Chen, Chad Hurley, and Jawed Karim. It is owned by Google, and is the second mo ...
, Technorati, and Last.fm—also implemented tagging. In 2005, the
Atom Every atom is composed of a nucleus and one or more electrons bound to the nucleus. The nucleus is made of one or more protons and a number of neutrons. Only the most common variety of hydrogen has no neutrons. Every solid, liquid, gas, a ...
web syndication standard provided a "category" element for inserting subject categories into
web feed On the World Wide Web, a web feed (or news feed) is a data format used for providing users with frequently updated content. Content distributors '' syndicate'' a web feed, thereby allowing users to ''subscribe'' a channel to it by adding the fe ...
s, and in 2007
Tim Bray Timothy William Bray (born June 21, 1955) is a Canadian software developer, environmentalist, political activist and one of the co-authors of the original XML specification. He worked for Amazon Web Services from December 2014 until May 2020 w ...
proposed a "tag" URN.


Examples


Within a blog

Many systems (and other web content management systems) allow authors to add free-form tags to a post, along with (or instead of) placing the post into a predetermined category. For example, a post may display that it has been tagged with baseball and tickets. Each of those tags is usually a web link leading to a index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.


Within application software

Some desktop applications and
web application A web application (or web app) is application software that is accessed using a web browser. Web applications are delivered on the World Wide Web to users with an active network connection. History In earlier computing models like client-serv ...
s feature their own tagging systems, such as email tagging in Gmail and Mozilla Thunderbird, bookmark tagging in Firefox, audio tagging in
iTunes iTunes () is a software program that acts as a media player, media library, mobile device management utility, and the client app for the iTunes Store. Developed by Apple Inc., it is used to purchase, play, download, and organize digital mu ...
or Winamp, and photo tagging in various applications. Some of these applications display collections of tags as tag clouds.


Assigned to computer files

There are various systems for applying tags to the files in a computer's file system. In
Apple An apple is an edible fruit produced by an apple tree (''Malus domestica''). Apple trees are cultivated worldwide and are the most widely grown species in the genus '' Malus''. The tree originated in Central Asia, where its wild ancest ...
's Mac System 7, released in 1991, users could assign one of seven editable colored labels (with editable names such as "Essential", "Hot", and "In Progress") to each file and folder. In later iterations of the Mac operating system ever since
OS X 10.9 OS X Mavericks (version 10.9) is the 10th major release of macOS, Apple Inc.'s desktop and server operating system for Macintosh computers. OS X Mavericks was announced on June 10, 2013, at WWDC 2013, and was released on October 22, 2013, wo ...
was released in 2013, users could assign multiple arbitrary tags as extended file attributes to any file or folder, and before that time the open-source OpenMeta standard provided similar tagging functionality for
Mac OS X macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and lap ...
. Several semantic file systems that implement tags are available for the Linux kernel, including Tagsistant.
Microsoft Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ...
allows users to set tags only on
Microsoft Office Microsoft Office, or simply Office, is the former name of a family of client software, server software, and services developed by Microsoft. It was first announced by Bill Gates on August 1, 1988, at COMDEX in Las Vegas. Initially a marketin ...
documents and some kinds of picture files.
Cross-platform In computing, cross-platform software (also called multi-platform software, platform-agnostic software, or platform-independent software) is computer software that is designed to work in several computing platforms. Some cross-platform software ...
file tagging standards include Extensible Metadata Platform (XMP), an
ISO standard The International Organization for Standardization (ISO ) is an international standard development organization composed of representatives from the national standards organizations of member countries. Membership requirements are given in A ...
for embedding metadata into popular image, video and document file formats, such as
JPEG JPEG ( ) is a commonly used method of lossy compression for digital images, particularly for those images produced by digital photography. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and imag ...
and PDF, without breaking their readability by applications that do not support XMP. XMP largely supersedes the earlier
IPTC Information Interchange Model The Information Interchange Model (IIM) is a file structure and set of metadata attributes that can be applied to text, images and other media types. It was developed in the early 1990s by the International Press Telecommunications Council (IPTC) ...
. Exif is a standard that specifies the image and audio file formats used by digital cameras, including some metadata tags. TagSpaces is an open-source cross-platform application for tagging files; it inserts tags into the filename.


For an event

An ''official tag'' is a keyword adopted by events and conferences for participants to use in their web publications, such as blog entries, photos of the event, and presentation slides. Search engines can then index them to make relevant materials related to the event searchable in a uniform way. In this case, the tag is part of a
controlled vocabulary Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Control ...
.


In research

A researcher may work with a large collection of items (e.g. press quotes, a bibliography, images) in digital form. If he/she wishes to associate each with a small number of themes (e.g. to chapters of a book, or to sub-themes of the overall subject), then a group of tags for these themes can be attached to each of the items in the larger collection. In this way, freeform classification allows the author to manage what would otherwise be unwieldy amounts of information.


Special types


Triple tags

A triple tag or machine tag uses a special
syntax In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituenc ...
to define extra semantic information about the tag, making it easier or more meaningful for interpretation by a computer program. Triple tags comprise three parts: a namespace, a
predicate Predicate or predication may refer to: * Predicate (grammar), in linguistics * Predication (philosophy) * several closely related uses in mathematics and formal logic: **Predicate (mathematical logic) **Propositional function **Finitary relation, o ...
, and a value. For example, geo:long=50.123456 is a tag for the geographical
longitude Longitude (, ) is a geographic coordinate that specifies the east– west position of a point on the surface of the Earth, or another celestial body. It is an angular measurement, usually expressed in degrees and denoted by the Greek let ...
coordinate whose value is 50.123456. This triple structure is similar to the Resource Description Framework model for information. The triple tag format was first devised for geolicious in November 2004, to map Delicious bookmarks, and gained wider acceptance after its adoption by Mappr and GeoBloggers to map Flickr photos. In January 2007, Aaron Straup Cope at Flickr introduced the term ''machine tag'' as an alternative name for the triple tag, adding some questions and answers on purpose, syntax, and use. Specialized metadata for geographical identification is known as '' geotagging''; machine tags are also used for other purposes, such as identifying photos taken at a specific event or naming species using
binomial nomenclature In Taxonomy (biology), taxonomy, binomial nomenclature ("two-term naming system"), also called nomenclature ("two-name naming system") or binary nomenclature, is a formal system of naming species of living things by giving each a name compos ...
.


Hashtags

A hashtag is a kind of metadata tag marked by the prefix #, sometimes known as a "hash" symbol. This form of tagging is used on microblogging and social networking services such as
Twitter Twitter is an online social media and social networking service owned and operated by American company Twitter, Inc., on which users post and interact with 280-character-long messages known as "tweets". Registered users can post, like, and ...
,
Facebook Facebook is an online social media and social networking service owned by American company Meta Platforms. Founded in 2004 by Mark Zuckerberg with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dust ...
, Google+, VK and
Instagram Instagram is a photo and video sharing social networking service owned by American company Meta Platforms. The app allows users to upload media that can be edited with filters and organized by hashtags and geographical tagging. Posts can ...
. The hash is used to distinguish tag text, as distinct, from other text in the post.


Knowledge tags

A knowledge tag is a type of meta-information that describes or defines some aspect of a piece of information (such as a
document A document is a written, drawn, presented, or memorialized representation of thought, often the manifestation of non-fictional, as well as fictional, content. The word originates from the Latin ''Documentum'', which denotes a "teaching" o ...
, digital image, database table, or web page). Knowledge tags are more than traditional non-hierarchical keywords or terms; they are a type of metadata that captures knowledge in the form of descriptions, categorizations, classifications,
semantics Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
, comments, notes, annotations, hyperdata, hyperlinks, or references that are collected in tag profiles (a kind of
ontology In metaphysics, ontology is the philosophy, philosophical study of being, as well as related concepts such as existence, Becoming (philosophy), becoming, and reality. Ontology addresses questions like how entities are grouped into Category ...
). These tag profiles reference an information resource that resides in a distributed, and often heterogeneous, storage repository. Knowledge tags are part of a
knowledge management Knowledge management (KM) is the collection of methods relating to creating, sharing, using and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieve organisational objectives by making ...
discipline that leverages Enterprise 2.0 methodologies for users to capture insights, expertise, attributes, dependencies, or relationships associated with a data resource. Different kinds of knowledge can be captured in knowledge tags, including factual knowledge (that found in books and data), conceptual knowledge (found in perspectives and concepts), expectational knowledge (needed to make judgments and hypothesis), and methodological knowledge (derived from reasoning and strategies). These forms of
knowledge Knowledge can be defined as awareness of facts or as practical skills, and may also refer to familiarity with objects or situations. Knowledge of facts, also called propositional knowledge, is often defined as true belief that is distin ...
often exist outside the data itself and are derived from personal experience, insight, or expertise. Knowledge tags are considered an expansion of the information itself that adds additional value, context, and meaning to the information. Knowledge tags are valuable for preserving organizational intelligence that is often lost due to turnover, for sharing knowledge stored in the minds of individuals that is typically isolated and unharnessed by the organization, and for connecting knowledge that is often lost or disconnected from an information resource.


Advantages and disadvantages

In a typical tagging system, there is no explicit information about the meaning or
semantics Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
of each tag, and a user can apply new tags to an item as easily as applying older tags. Hierarchical classification systems can be slow to change, and are rooted in the culture and era that created them; in contrast, the flexibility of tagging allows users to classify their collections of items in the ways that they find useful, but the personalized variety of terms can present challenges when searching and browsing. When users can freely choose tags (creating a
folksonomy Folksonomy is a classification system in which end users apply public tags to online items, typically to make those items easier for themselves or others to find later. Over time, this can give rise to a classification system based on those tags ...
, as opposed to selecting terms from a
controlled vocabulary Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Control ...
), the resulting metadata can include
homonym In linguistics, homonyms are words which are homographs (words that share the same spelling, regardless of pronunciation), or homophones ( equivocal words, that share the same pronunciation, regardless of spelling), or both. Using this definitio ...
s (the same tags used with different meanings) and synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject. For example, the tag "orange" may refer to the
fruit In botany, a fruit is the seed-bearing structure in flowering plants that is formed from the ovary after flowering. Fruits are the means by which flowering plants (also known as angiosperms) disseminate their seeds. Edible fruits in partic ...
or the
color Color (American English) or colour (British English) is the visual perceptual property deriving from the spectrum of light interacting with the photoreceptor cells of the eyes. Color categories and physical specifications of color are associ ...
, and items related to a version of the Linux kernel may be tagged "Linux", "kernel", "Penguin", "software", or a variety of other terms. Users can also choose tags that are different
inflection In linguistic morphology, inflection (or inflexion) is a process of word formation in which a word is modified to express different grammatical categories such as tense, case, voice, aspect, person, number, gender, mood, animacy, and ...
s of words (such as singular and plural), which can contribute to navigation difficulties if the system does not include stemming of tags when searching or browsing. Larger-scale folksonomies address some of the problems of tagging, in that users of tagging systems tend to notice the current use of "tag terms" within these systems, and thus use existing tags in order to easily form connections to related items. In this way, folksonomies may collectively develop a partial set of tagging conventions.


Complex system dynamics

Despite the apparent lack of control, research has shown that a simple form of shared vocabulary emerges in social bookmarking systems. Collaborative tagging exhibits a form of complex systems dynamics (or self-organizing dynamics). Thus, even if no central controlled vocabulary constrains the actions of individual users, the distribution of tags converges over time to stable power law distributions. Once such stable distributions form, simple folksonomic vocabularies can be extracted by examining the correlations that form between different tags. In addition, research has suggested that it is easier for
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
algorithms to learn tag semantics when users tag "verbosely"—when they annotate resources with a wealth of freely associated, descriptive keywords.


Spamming

Tagging systems open to the public are also open to tag spam, in which people apply an excessive number of tags or unrelated tags to an item (such as a
YouTube YouTube is a global online video sharing and social media platform headquartered in San Bruno, California. It was launched on February 14, 2005, by Steve Chen, Chad Hurley, and Jawed Karim. It is owned by Google, and is the second mo ...
video) in order to attract viewers. This abuse can be mitigated using human or statistical identification of spam items. The number of tags allowed may also be limited to reduce spam.


Syntax

Some tagging systems provide a single text box to enter tags, so to be able to tokenize the string, a separator must be used. Two popular separators are the space character and the
comma The comma is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark () in many typefaces, but it differs from them in being placed on the baseline ...
. To enable the use of separators in the tags, a system may allow for higher-level separators (such as quotation marks) or escape characters. Systems can avoid the use of separators by allowing only one tag to be added to each input Web widget, widget at a time, although this makes adding multiple tags more time-consuming. A syntax for use within HTML is to use the rel-tag microformat which uses the Rel attribute, ''rel'' attribute with value "tag" (i.e., rel="tag") to indicate that the linked-to page acts as a tag for the current context.


See also

* Annotation * Collective intelligence * Concept map * Enterprise bookmarking * Enterprise social software * Expert system * Explicit knowledge * Human–computer interaction * Information ecology * Knowledge transfer * Knowledge worker * Management information system * Metaknowledge * Organisational memory * RRID * Semantics * Semantic web * Social network aggregation * Subject (documents) * Subject indexing


References

{{DEFAULTSORT:Tag (Metadata) Collective intelligence Computer jargon Information retrieval techniques Knowledge representation Metadata Reference Web 2.0