Identifier Separation Protocol

An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique ''class'' of objects, where the "object" or class may be an idea, person, physical countable object (or class thereof), or physical noncountable substance (or class thereof). The abbreviation ID often refers to identity, identification (the process of identifying), or an identifier (that is, an instance of identification). An identifier may be a word, number, letter, symbol, or any combination of those.

The words, numbers, letters, or symbols may follow an encoding system (wherein letters, digits, words, or symbols ''stand for'', that is, represent, ideas or longer names) or they may simply be arbitrary. When an identifier follows an encoding system, it is often referred to as a code or ID code. For instance, the ISO/IEC 11179 metadata registry standard defines a code as a ''system of valid symbols that substitute for longer values'', in contrast to identifiers without symbolic meaning. Identifiers that do not follow any encoding scheme are often said to be arbitrary IDs; they are arbitrarily assigned and have no greater meaning. (Sometimes identifiers are called "codes" even when they are actually arbitrary, whether because the speaker believes that they have deeper meaning or simply because they are speaking casually and imprecisely.)

A unique identifier (UID) is an identifier that refers to ''only one instance'': only one particular object in the universe. A part number is an identifier, but it is not a ''unique'' identifier; for that, a serial number is needed, to identify ''each instance'' of the part design. Thus the ''identifier'' "Model T" identifies the ''class'' ''(model)'' of automobiles that Ford's Model T comprises, whereas the ''unique identifier'' "Model T Serial Number 159,862" identifies one specific member of that class, that is, one particular Model T car, owned by one specific person.

The concepts of ''name'' and ''identifier'' are denotatively equal, and the terms are thus denotatively synonymous; but they are not always connotatively synonymous, because code names and ID numbers are often connotatively distinguished from names in the sense of traditional natural language naming. For example, both "Jamie Zawinski" and "Netscape employee number 20" are identifiers for the same specific human being; but normal English-language connotation may consider "Jamie Zawinski" a "name" and not an "identifier", whereas it considers "Netscape employee number 20" an "identifier" but not a "name." This is an emic indistinction rather than an etic one.


Metadata

In metadata, an identifier is a language-independent label, sign or token that uniquely identifies an object within an identification scheme. The suffix "identifier" is also used as a representation term when naming a data element.

ID codes may inherently carry metadata along with them. For example, when you know that the food package in front of you has the identifier "2011-09-25T15:42Z-MFR5-P02-243-45", you not only have that data, you also have the metadata that tells you that it was packaged on September 25, 2011, at 3:42 pm UTC, manufactured by Licensed Vendor Number 5, at the Peoria, IL, USA plant, in Building 2, and was the 243rd package off the line in that shift, and was inspected by Inspector Number 45.

Arbitrary identifiers might lack metadata. For example, if a food package just says 100054678214, its ID may not tell anything except identity: no date, manufacturer name, production sequence rank, or inspector number. In some cases, arbitrary identifiers such as sequential serial numbers still leak information (as in the German tank problem). Opaque identifiers, which are designed to avoid leaking even that small amount of information, include "really opaque pointers" and Version 4 UUIDs.
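The contrast between an encoded ID and an opaque one can be sketched in Python. This is only an illustration: the field layout in the pattern below is inferred from the single example identifier above and is an assumption, as are the names `PackageInfo` and `parse_package_id`. The `P02` segment is kept as an uninterpreted site code because the text does not say how plant and building are encoded in it.

```python
import re
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical layout, inferred from the one example ID in the text:
# <UTC timestamp>Z-MFR<vendor>-<site code>-<sequence>-<inspector>
_PACKAGE_ID = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2})Z"
    r"-MFR(?P<vendor>\d+)"
    r"-(?P<site>P\d+)"
    r"-(?P<sequence>\d+)"
    r"-(?P<inspector>\d+)$"
)


@dataclass(frozen=True)
class PackageInfo:
    packaged_at: datetime  # packaging time, in UTC
    vendor: int            # licensed vendor number
    site: str              # site code, left uninterpreted (assumption)
    sequence: int          # position off the line in that shift
    inspector: int         # inspector number


def parse_package_id(identifier: str) -> Optional[PackageInfo]:
    """Return the metadata carried by an encoded ID, or None if it is opaque to us."""
    match = _PACKAGE_ID.match(identifier)
    if match is None:
        return None
    packaged_at = datetime.strptime(match["stamp"], "%Y-%m-%dT%H:%M").replace(
        tzinfo=timezone.utc
    )
    return PackageInfo(
        packaged_at=packaged_at,
        vendor=int(match["vendor"]),
        site=match["site"],
        sequence=int(match["sequence"]),
        inspector=int(match["inspector"]),
    )


encoded = parse_package_id("2011-09-25T15:42Z-MFR5-P02-243-45")
arbitrary = parse_package_id("100054678214")  # carries no recoverable fields

# A Version 4 UUID is generated from random bits, so it encodes no sequence or date.
opaque_id = uuid.uuid4()
```

Here `encoded` exposes the date, vendor, sequence and inspector, while `arbitrary` is `None` because the bare number follows no scheme this parser knows about.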


In computer science

In computer science, identifiers (IDs) are lexical tokens that name entities. Identifiers are used extensively in virtually all information processing systems. Identifying entities makes it possible to refer to them, which is essential for any kind of symbolic processing.


In computer languages

In computer languages, identifiers are tokens (also called symbols) which name language entities. Some of the kinds of entities an identifier might denote include variables, types, labels, subroutines, and packages.
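As one concrete illustration, Python exposes its own identifier rules through `str.isidentifier()` and `keyword.iskeyword()`. The helper name `is_usable_name` and the candidate tokens below are made up for this sketch, and other languages apply different lexical rules.

```python
import keyword


def is_usable_name(token: str) -> bool:
    """True if the token is lexically an identifier and not a reserved word."""
    return token.isidentifier() and not keyword.iskeyword(token)


# Candidate tokens: some could name a variable, type, or subroutine; others cannot.
candidates = ["total", "_cache", "Point2D", "2fast", "class", "my-var"]
usable = [token for token in candidates if is_usable_name(token)]
```

`"2fast"` and `"my-var"` fail the lexical rule, and `"class"` is lexically valid but reserved, so only the first three tokens remain in `usable`.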


Ambiguity


Identifiers (IDs) versus Unique identifiers (UIDs)

A resource may carry multiple identifiers. Typical examples are:
* One person with multiple names, nicknames, and forms of address (titles, salutations)
** ''For example:'' One specific person may be identified by all of the following identifiers: Jane Smith; Jane Elizabeth Meredith Smith; Jane E. M. Smith; Jane E. Smith; Janie Smith; Janie; Little Janie (as opposed to her mother or sister or cousin, Big Janie); Aunt Jane; Auntie Janie; Mom; Grandmom; Nana; Kelly's mother; Billy's grandmother; Ms. Smith; Dr. Smith; Jane E. Smith, PhD; and Fuzzy (her jocular nickname at work).
* One document with multiple versions
* One substance with multiple names (for example, CAS index names versus IUPAC names; INN generic drug names versus USAN generic drug names versus brand names)

The inverse is also possible, where multiple resources are represented with the same identifier (discussed below).
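Both directions of this many-to-many relationship can be sketched as a small alias table. The class `AliasRegistry` and the UIDs `person-001` and `person-002` are hypothetical and serve only to show that several identifiers can resolve to one resource while one identifier can resolve to several.

```python
from collections import defaultdict
from typing import Dict, Set


class AliasRegistry:
    """Maps non-unique identifiers (aliases) to the unique IDs they may denote."""

    def __init__(self) -> None:
        self._by_alias: Dict[str, Set[str]] = defaultdict(set)

    def register(self, uid: str, *aliases: str) -> None:
        # One resource may carry many identifiers.
        for alias in aliases:
            self._by_alias[alias].add(uid)

    def resolve(self, alias: str) -> Set[str]:
        # One identifier may denote several resources, so return a set.
        return set(self._by_alias.get(alias, set()))


registry = AliasRegistry()
registry.register("person-001", "Jane Smith", "Janie", "Little Janie", "Dr. Smith")
registry.register("person-002", "Janie", "Big Janie")

unambiguous = registry.resolve("Dr. Smith")
ambiguous = registry.resolve("Janie")
```

`unambiguous` contains a single UID, whereas `ambiguous` contains both, which is the situation in which extra context is needed to tell the two people apart.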


Implicit context and namespace conflicts

Many codes and nomenclatural systems originate within a small namespace. Over the years, some of them bleed into larger namespaces (as people interact in ways they formerly had not, e.g., cross-border trade, scientific collaboration, military alliance, and general cultural interconnection or assimilation). When such dissemination happens, the limitations of the original naming convention, which had formerly been latent and moot, become painfully apparent, often necessitating retronymy, synonymity, translation/transcoding, and so on.

Such limitations generally accompany the shift away from the original context to the broader one. Typically the system shows implicit context (context was formerly assumed, and narrow), lack of capacity (e.g., low number of possible IDs, reflecting the outmoded narrow context), lack of extensibility (no features defined and reserved against future needs), and lack of specificity and disambiguating capability (related to the context shift, where longstanding uniqueness encounters novel nonuniqueness). Within computer science, this problem is called a naming collision.

The story of the origination and expansion of the CODEN system provides a good case study from recent decades in technical nomenclature. The capitalization variations seen with specific designators reveal an instance of this problem occurring in natural languages, where the proper noun/common noun distinction (and its complications) must be dealt with.

A universe in which every object had a UID would not need any namespaces, which is to say that it would constitute one gigantic namespace; but human minds could never keep track of, or semantically interrelate, so many UIDs.
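A naming collision of this kind can be sketched with two small namespaces that are merged into a larger one. The namespaces `lab_a` and `lab_b`, their contents, and the `namespace:name` qualification format are invented for illustration; real systems choose their own qualification scheme.

```python
from typing import Dict, List

# Two small namespaces whose short IDs were unique only within their own context.
lab_a = {"S1": "sodium sample", "S2": "sulfur sample"}
lab_b = {"S1": "soil core", "S3": "silt core"}


def colliding_names(first: Dict[str, str], second: Dict[str, str]) -> List[str]:
    """Names that become ambiguous once the two namespaces share one context."""
    return sorted(first.keys() & second.keys())


def qualify(namespace: str, names: Dict[str, str]) -> Dict[str, str]:
    """Make the formerly implicit context explicit by prefixing the namespace."""
    return {f"{namespace}:{name}": value for name, value in names.items()}


collisions = colliding_names(lab_a, lab_b)

# After qualification, every entry survives the merge under a distinct key.
merged = {**qualify("lab_a", lab_a), **qualify("lab_b", lab_b)}
```

The bare name `S1` collides, but the qualified names `lab_a:S1` and `lab_b:S1` do not, at the cost of longer identifiers.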

