
In information systems, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark, multimedia, database record, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system, although they may also be chosen from a controlled vocabulary.
Tagging was popularized by websites associated with Web 2.0 and is an important feature of many Web 2.0 services. It is now also part of other database systems, desktop applications, and operating systems.
Overview
People use tags to aid classification, mark ownership, note boundaries, and indicate online identity. Tags may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is museum object tagging. People were using textual keywords to classify information and objects long before computers. Computer-based search algorithms made the use of such keywords a rapid way of exploring records.
Tagging gained popularity due to the growth of social bookmarking, image sharing, and social networking websites. These sites allow users to create and manage labels (or "tags") that categorize content using simple keywords. Websites that include tags often display collections of tags as tag clouds, as do some desktop applications. On websites that aggregate the tags of all users, an individual user's tags can be useful both to them and to the larger community of the website's users.
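The aggregation behind a tag cloud can be sketched in a few lines. This is a minimal illustration, not any particular site's implementation: the function names `tag_counts` and `cloud_weights`, the sample users, and the linear mapping onto five display sizes are all assumptions made for the example.

```python
from collections import Counter


def tag_counts(tags_by_user):
    """Count how often each tag was applied across all users."""
    counts = Counter()
    for user_tags in tags_by_user.values():
        counts.update(user_tags)
    return counts


def cloud_weights(counts, min_size=1, max_size=5):
    """Map each tag's count linearly onto a display size (hypothetical scale)."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    weights = {}
    for tag, n in counts.items():
        if span == 0:
            # Every tag is equally common, so use the smallest size for all.
            weights[tag] = min_size
        else:
            weights[tag] = min_size + round((n - lo) * (max_size - min_size) / span)
    return weights


# Hypothetical sample data: each user's personal tags.
tags_by_user = {
    "alice": ["baseball", "tickets", "travel"],
    "bob": ["baseball", "travel"],
    "carol": ["baseball"],
}
counts = tag_counts(tags_by_user)
weights = cloud_weights(counts)
```

Each user's own list remains useful to that user, while the combined counts give the community-wide view that a tag cloud displays.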
Tagging systems have sometimes been classified into two kinds: ''top-down'' and ''bottom-up''. Top-down taxonomies are created by an authorized group of designers (sometimes in the form of a controlled vocabulary), whereas bottom-up taxonomies (called folksonomies) are created by all users.
This definition of "top down" and "bottom up" should not be confused with the distinction between a ''single hierarchical'' tree structure (in which there is one correct way to classify each item) versus ''multiple non-hierarchical'' sets (in which there are multiple ways to classify an item); the structure of both top-down and bottom-up taxonomies may be either hierarchical, non-hierarchical, or a combination of both.
Some researchers and applications have experimented with combining hierarchical and non-hierarchical tagging to aid in information retrieval. Others are combining top-down and bottom-up tagging, including in some large library catalogs (OPACs) such as WorldCat.
When tags or other taxonomies have further properties (or semantics) such as relationships and attributes, they constitute an ontology.
In a folder system, a file cannot exist in two or more folders at once, so tag systems have been thought more convenient. Moving to a tag system, however, requires awareness of the differences between the two. In a folder system, the classification information is kept outside the file, so a file can be reclassified at once by moving it to another folder. In a tag system that stores the classification information inside the file, changing a tag means changing the file itself, which must then be saved again and takes time.
Metadata tags as described in this article should not be confused with the use of the word "tag" in some software to refer to an automatically generated cross-reference; examples of the latter are ''tags tables'' in Emacs and ''smart tags'' in Microsoft Office.
History
The use of keywords as part of an identification and classification system long predates computers. Paper data storage devices, notably edge-notched cards, that permitted classification and sorting by multiple criteria were already in use prior to the twentieth century, and faceted classification has been used by libraries since the 1930s.
In the late 1970s and early 1980s, Emacs, the text editor for Unix systems, offered a companion software program called ''Tags'' that could automatically build a table of cross-references called a ''tags table'' that Emacs could use to jump between a function call and that function's definition. This use of the word "tag" did not refer to metadata tags, but was an early use of the word "tag" in software to refer to a word index.
Online databases and early websites deployed keyword tags as a way for publishers to help users find content. In the early days of the World Wide Web, the keywords meta element was used by web designers to tell web search engines what the web page was about, but these keywords were only visible in a web page's source code and were not modifiable by users.
In 1997, the collaborative portal "A Description of the Equator and Some ØtherLands" produced by documenta X, Germany, used the folksonomic term ''Tag'' for its co-authors and guest authors on its Upload page. In "The Equator" the term ''Tag'' for user-input was described as an ''abstract literal or keyword'' to aid the user. However, users defined singular ''Tags'', and did not share ''Tags'' at that point.
In 2003, the social bookmarking website Delicious provided a way for its users to add "tags" to their bookmarks (as a way to help find them later); Delicious also provided browseable aggregated views of the bookmarks of all users featuring a particular tag. Within a couple of years, the photo sharing website Flickr allowed its users to add their own text tags to each of their pictures, constructing flexible and easy metadata that made the pictures highly searchable. The success of Flickr and the influence of Delicious popularized the concept, and other social software websites, such as YouTube, Technorati, and Last.fm, also implemented tagging. In 2005, the Atom web syndication standard provided a "category" element for inserting subject categories into web feeds, and in 2007 Tim Bray proposed a "tag" URN.
Examples
Within a blog
Many blog systems (and other web content management systems) allow authors to add free-form tags to a post, along with (or instead of) placing the post into a predetermined category. For example, a post may display that it has been tagged with "baseball" and "tickets". Each of those tags is usually a web link leading to an index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.
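The bookkeeping described above can be sketched as a two-way index between posts and tags. This is an illustrative sketch only: the `TagIndex` class, its method names, and the post identifiers are hypothetical and do not correspond to any specific blog engine.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index linking posts and free-form tags."""

    def __init__(self):
        self._posts_by_tag = defaultdict(set)  # tag -> post ids (the "index page")
        self._tags_by_post = {}                # post id -> current set of tags

    def set_tags(self, post_id, tags):
        """Replace a post's tag list and update every affected index page."""
        new_tags = set(tags)
        old_tags = self._tags_by_post.get(post_id, set())
        for tag in old_tags - new_tags:
            self._posts_by_tag[tag].discard(post_id)
            if not self._posts_by_tag[tag]:
                # Drop tags that no post uses any more.
                del self._posts_by_tag[tag]
        for tag in new_tags - old_tags:
            self._posts_by_tag[tag].add(post_id)
        self._tags_by_post[post_id] = new_tags

    def posts_for(self, tag):
        """Posts listed on the index page for one tag."""
        return sorted(self._posts_by_tag.get(tag, set()))

    def all_tags(self):
        """Tags currently in use, as a sidebar might list them."""
        return sorted(self._posts_by_tag)


index = TagIndex()
index.set_tags("opening-day", ["baseball", "tickets"])
index.set_tags("road-trip", ["baseball"])
# Reclassifying a post only means editing its tag list.
index.set_tags("opening-day", ["baseball", "travel"])
```

Because reclassification is a change to a set of labels, no post has to be moved within a category hierarchy; the index pages follow automatically.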
Within application software
Some desktop applications and web applications feature their own tagging systems, such as email tagging in Gmail and Mozilla Thunderbird, bookmark tagging in Firefox, audio tagging in iTunes or Winamp, and photo tagging in various applications. Some of these applications display collections of tags as tag clouds.
Assigned to computer files
There are various systems for applying tags to the files in a computer's file system. In Apple's Mac System 7, released in 1991, users could assign one of seven editable colored labels (with editable names such as "Essential", "Hot", and "In Progress") to each file and folder. Since OS X 10.9, released in 2013, users have been able to assign multiple arbitrary tags as extended file attributes to any file or folder; before that, the open-source OpenMeta standard provided similar tagging functionality for Mac OS X.
Several semantic file systems that implement tags are available for the Linux kernel, including Tagsistant. Microsoft Windows allows users to set tags only on Microsoft Office documents and some kinds of picture files.
Cross-platform file tagging standards include Extensible Metadata Platform (XMP), an ISO standard for embedding metadata into popular image, video and document file formats, such as JPEG and PDF, without breaking their readability by applications that do not support XMP. XMP largely supersedes the earlier IPTC Information Interchange Model. Exif is a standard that specifies the image and audio file formats used by digital cameras, including some metadata tags. TagSpaces is an open-source cross-platform application for tagging files; it inserts tags into the filename.
For an event
An ''official tag'' is a keyword adopted by events and conferences for participants to use in their web publications, such as blog entries, photos of the event, and presentation slides. Search engines can then index them to make relevant materials related to the event searchable in a uniform way. In this case, the tag is part of a controlled vocabulary.
In research
A researcher may work with a large collection of items (e.g. press quotes, a bibliography, images) in digital form. If they wish to associate each item with a small number of themes (e.g. chapters of a book, or sub-themes of the overall subject), then a group of tags for these themes can be attached to each of the items in the larger collection. In this way, freeform classification allows the author to manage what would otherwise be unwieldy amounts of information.
Special types
Triple tags
A triple tag or machine tag uses a special syntax to define extra semantic information about the tag, making it easier or more meaningful for interpretation by a computer program. Triple tags comprise three parts: a namespace, a predicate, and a value. For example, geo:long=50.123456 is a tag for the geographical longitude coordinate whose value is 50.123456. This triple structure is similar to the Resource Description Framework model for information.
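A program can split such a tag into its three parts with a small parser. The sketch below is a hedged illustration: the `namespace:predicate=value` shape comes from the example above, but the exact character rules in the regular expression and the names `TripleTag` and `parse_triple_tag` are assumptions, not a published grammar.

```python
import re
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed shape: letters, digits and underscores for namespace and predicate,
# then any non-empty value after the equals sign.
_TRIPLE = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Return the three parts of a triple tag, or None for an ordinary tag."""
    match = _TRIPLE.match(tag)
    if match is None:
        return None
    return TripleTag(*match.groups())


longitude = parse_triple_tag("geo:long=50.123456")
plain = parse_triple_tag("baseball")
```

Here `longitude` holds the namespace `geo`, the predicate `long`, and the value `50.123456` as a string; interpreting the value (for example converting it to a number) is left to the consuming application.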
The triple tag format was first devised for geolicious in November 2004, to map Delicious bookmarks, and gained wider acceptance after its adoption by Mappr and GeoBloggers to map Flickr photos. In January 2007, Aaron Straup Cope at Flickr introduced the term ''machine tag'' as an alternative name for the triple tag, adding some questions and answers on purpose, syntax, and use.
Specialized metadata for geographical identification is known as ''geotagging''; machine tags are also used for other purposes, such as identifying photos taken at a specific event or naming species using binomial nomenclature.
Hashtags
A hashtag is a kind of metadata tag marked by the prefix #, sometimes known as a "hash" symbol. This form of tagging is used on microblogging and social networking services such as Twitter, Facebook, Google+, VK and Instagram. The hash distinguishes tag text from the other text in the post.
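Because the hash prefix marks where a tag begins, tags can be pulled out of a post's text mechanically. The following is a minimal sketch under stated assumptions: a tag is taken to be a run of word characters after a `#` that does not directly follow another word character, and tags are lowercased for comparison. Real services apply their own, more detailed rules, and `extract_hashtags` is a name chosen for this example.

```python
import re

# Assumed rule: '#' must not directly follow a word character, so "issue#42"
# is not treated as a hashtag.
_HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def extract_hashtags(post):
    """Return the hashtags in a post, lowercased, in order of appearance."""
    return [tag.lower() for tag in _HASHTAG.findall(post)]


found = extract_hashtags("Got #Baseball tickets for opening day! #tickets issue#42")
```

With this rule, `found` contains `baseball` and `tickets`, while the `#42` embedded in `issue#42` is left as ordinary text.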
Knowledge tags
A knowledge tag is a type of meta-information that describes or defines some aspect of a piece of information (such as a document, digital image, database table, or web page).
Knowledge tags are more than traditional non-hierarchical keywords or terms; they are a type of metadata that captures knowledge in the form of descriptions, categorizations, classifications, semantics, comments, notes, annotations, hyperdata, hyperlinks, or references that are collected in tag profiles (a kind of ontology). These tag profiles reference an information resource that resides in a distributed, and often heterogeneous, storage repository.
Knowledge tags are part of a knowledge management discipline that leverages Enterprise 2.0 methodologies for users to capture insights, expertise, attributes, dependencies, or relationships associated with a data resource. Different kinds of knowledge can be captured in knowledge tags, including factual knowledge (that found in books and data), conceptual knowledge (found in perspectives and concepts), expectational knowledge (needed to make judgments and hypotheses), and methodological knowledge (derived from reasoning and strategies).
These forms of knowledge often exist outside the data itself and are derived from personal experience, insight, or expertise. Knowledge tags are considered an expansion of the information itself that adds value, context, and meaning to the information. Knowledge tags are valuable for preserving organizational intelligence that is often lost due to turnover, for sharing knowledge stored in the minds of individuals that is typically isolated and unharnessed by the organization, and for connecting knowledge that is often lost or disconnected from an information resource.
Advantages and disadvantages
In a typical tagging system, there is no explicit information about the meaning or semantics of each tag, and a user can apply new tags to an item as easily as applying older tags.
Hierarchical classification systems can be slow to change, and are rooted in the culture and era that created them; in contrast, the flexibility of tagging allows users to classify their collections of items in the ways that they find useful, but the personalized variety of terms can present challenges when searching and browsing.
When users can freely choose tags (creating a folksonomy, as opposed to selecting terms from a controlled vocabulary), the resulting metadata can include homonyms (the same tags used with different meanings) and synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject. For example, the tag "orange" may refer to the fruit or the color, and items related to a version of the Linux kernel may be tagged "Linux", "kernel", "Penguin", "software", or a variety of other terms. Users can also choose tags that are different inflections of words (such as singular and plural), which can contribute to navigation difficulties if the system does not include stemming of tags when searching or browsing. Larger-scale folksonomies address some of the problems of tagging, in that users of tagging systems tend to notice the current use of "tag terms" within these systems, and thus use existing tags in order to easily form connections to related items. In this way, folksonomies may collectively develop a partial set of tagging conventions.
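As a rough illustration of how a system might reduce the synonym and inflection problems described above, the following Python sketch folds tags to a common form before indexing. The synonym table, the plural rule, and the function names are all assumptions made for this example; a production system would more likely use a proper stemmer and a curated vocabulary. Note that this approach does nothing for homonyms such as "orange", which need context to resolve.

```python
from collections import defaultdict

# Hypothetical synonym table: maps a variant tag to one preferred tag.
SYNONYMS = {"penguin": "linux"}


def normalize(tag: str) -> str:
    """Lowercase and trim a tag, fold a naive plural, then apply synonyms."""
    t = tag.strip().lower()
    # Crude plural folding (an assumption, not a real stemmer): it will
    # mangle words such as "status", so treat it as a placeholder.
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return SYNONYMS.get(t, t)


def build_index(tagged_items: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized tag to the set of item ids that carry it."""
    index: dict[str, set[str]] = defaultdict(set)
    for item_id, tags in tagged_items.items():
        for tag in tags:
            index[normalize(tag)].add(item_id)
    return dict(index)


if __name__ == "__main__":
    items = {"a": ["Linux", "kernels"], "b": ["Penguin", "Kernel"]}
    for tag, ids in sorted(build_index(items).items()):
        print(tag, sorted(ids))
```

With normalization applied at both indexing and search time, a query for "kernel" would find items tagged "Kernel" or "kernels" alike.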
Complex system dynamics
Despite the apparent lack of control, research has shown that a simple form of shared vocabulary emerges in social bookmarking systems. Collaborative tagging exhibits a form of complex systems dynamics (or self-organizing dynamics). Thus, even if no central controlled vocabulary constrains the actions of individual users, the distribution of tags converges over time to stable power law distributions. Once such stable distributions form, simple folksonomic vocabularies can be extracted by examining the correlations that form between different tags. In addition, research has suggested that it is easier for machine learning algorithms to learn tag semantics when users tag "verbosely", that is, when they annotate resources with a wealth of freely associated, descriptive keywords.
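The ideas of tag distributions and tag correlations can be made concrete with a small Python sketch. It counts how often each tag is used, counts how often pairs of tags appear on the same item, and estimates the slope of the rank-frequency curve on log-log axes. The function names and the input shape (a list of tag lists, one per item) are assumptions for illustration, and the least-squares slope is only a crude diagnostic, not a rigorous test for a power law.

```python
import math
from collections import Counter
from itertools import combinations


def tag_frequencies(posts):
    """Count the number of items on which each tag appears."""
    return Counter(tag for tags in posts for tag in set(tags))


def cooccurrence(posts):
    """Count how many items carry each unordered pair of tags."""
    pairs = Counter()
    for tags in posts:
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two positive frequencies. A roughly straight line
    with negative slope is consistent with a power law, nothing more.
    """
    ordered = sorted(freqs, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(f) for f in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    posts = [["linux", "kernel"], ["linux", "software"],
             ["linux", "kernel", "software"]]
    print(tag_frequencies(posts).most_common())
    print(cooccurrence(posts).most_common())
```

Pairs with high co-occurrence counts relative to their individual frequencies are candidates for related terms in an extracted vocabulary.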
Spamming
Tagging systems open to the public are also open to tag spam, in which people apply an excessive number of tags or unrelated tags to an item (such as a YouTube video) in order to attract viewers. This abuse can be mitigated using human or statistical identification of spam items. The number of tags allowed may also be limited to reduce spam.
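A tag limit of the kind mentioned above could be enforced along the following lines. This is only a sketch: the limit of 10, the function names, and the choice to deduplicate case-insensitively are assumptions, and it addresses excessive tagging only, not unrelated tags.

```python
MAX_TAGS = 10  # assumed limit; real systems choose their own value


def clean_tags(tags, max_tags=MAX_TAGS):
    """Trim, lowercase, and deduplicate tags, keeping the first max_tags."""
    seen = dict.fromkeys(t.strip().lower() for t in tags if t.strip())
    return list(seen)[:max_tags]


def exceeds_limit(tags, max_tags=MAX_TAGS):
    """Report whether a submission has more distinct tags than allowed."""
    distinct = {t.strip().lower() for t in tags if t.strip()}
    return len(distinct) > max_tags


if __name__ == "__main__":
    submitted = ["Music", "music", " live ", "", "free"]
    print(clean_tags(submitted, max_tags=2))
    print(exceeds_limit(submitted, max_tags=2))
```

A system might silently truncate with `clean_tags`, or use `exceeds_limit` to flag the submission for human or statistical review instead.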
Syntax
Some tagging systems provide a single text box to enter tags, so a separator must be used to tokenize the string. Two popular separators are the space character and the comma. To enable the use of separators within the tags themselves, a system may allow for higher-level separators (such as quotation marks) or escape characters. Systems can avoid the use of separators by allowing only one tag to be added to each input widget at a time, although this makes adding multiple tags more time-consuming.
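The separator, quotation-mark, and escape-character conventions described above can be sketched with Python's standard library. Using `shlex` for space-separated input and `csv` for comma-separated input is a convenience chosen for this example, not something any particular tagging system is known to do, and the function names are hypothetical.

```python
import csv
import shlex


def split_space_tags(text):
    """Split on whitespace; double quotes or a backslash keep spaces in a tag.

    shlex also treats apostrophes as quotes, so an unbalanced quote
    raises ValueError. Callers should be prepared to handle that.
    """
    return shlex.split(text)


def split_comma_tags(text):
    """Split on commas; double quotes allow a comma inside a tag."""
    row = next(csv.reader([text], skipinitialspace=True), [])
    return [tag.strip() for tag in row if tag.strip()]


if __name__ == "__main__":
    print(split_space_tags('linux kernel "open source"'))
    print(split_space_tags(r"new\ york paris"))
    print(split_comma_tags('linux, "free, open", kernel'))
```

Comma separation lets multi-word tags be typed without quoting, at the cost of needing quotes for the rarer tag that contains a comma; space separation makes the opposite trade.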
Within HTML, the rel-tag microformat uses the rel attribute with the value "tag" (i.e., rel="tag") to indicate that the linked-to page acts as a tag for the current context.
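To show how a consumer might read rel-tag links, the following Python sketch collects tags from anchors whose rel attribute includes "tag". It assumes the rel-tag convention that the tag text is the last path component of the link's URL; the class name, the trailing-slash handling, and the example.com URLs are illustrative assumptions rather than part of the source.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collect tags from <a rel="tag" href="..."> links in a document."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, e.g. "tag nofollow".
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "tag" in rels and href:
            # Assumed convention: the tag is the last path component.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


if __name__ == "__main__":
    page = (
        '<a href="https://example.com/tags/linux" rel="tag">Linux</a> '
        '<a href="/tags/open%20source/" rel="tag nofollow">Open source</a> '
        '<a href="/about">About</a>'
    )
    parser = RelTagParser()
    parser.feed(page)
    print(parser.tags)
```

The visible link text ("Linux", "Open source") is ignored here; only the URL determines the tag under the assumed convention.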
See also
* Annotation
* Collective intelligence
* Concept map
* Enterprise bookmarking
* Enterprise social software
* Expert system
* Explicit knowledge
* Human–computer interaction
* Information ecology
* Knowledge transfer
* Knowledge worker
* Management information system
* Meta-knowledge
* Organizational memory
* RRID
* Semantics
* Semantic Web
* Social network aggregation
* Subject (documents)
* Subject indexing