HOME

TheInfoList



OR:

In computer
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming l ...
s, an identifier is a lexical token (also called a
symbol A symbol is a mark, sign, or word that indicates, signifies, or is understood as representing an idea, object, or relationship. Symbols allow people to go beyond what is known or seen by creating linkages between otherwise very different conc ...
, but not to be confused with the symbol primitive data type) that names the language's entities. Some of the kinds of entities an identifier might denote include variables,
data type In computer science and computer programming, a data type (or simply type) is a set of possible values and a set of allowed operations on it. A data type tells the compiler or interpreter how the programmer intends to use the data. Most progra ...
s,
labels A label (as distinct from signage) is a piece of paper, plastic film, cloth, metal, or other material affixed to a container or product, on which is written or printed information or symbols about the product or item. Information printed dir ...
,
subroutine In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Functions ma ...
s, and modules.


Lexical form

Which character sequences constitute identifiers depends on the lexical grammar of the language. A common rule is
alphanumeric Alphanumericals or alphanumeric characters are a combination of alphabetical and numerical characters. More specifically, they are the collection of Latin letters and Arabic digits. An alphanumeric code is an identifier made of alphanumeric ...
sequences, with underscore also allowed (in some languages, _ is not allowed), and with the condition that it can not begin with a numerical digit (to simplify lexing by avoiding confusing with integer literals) – so foo, foo1, foo_bar, _foo are allowed, but 1foo is not – this is the definition used in earlier versions of C and C++, Python, and many other languages. Later versions of these languages, along with many other modern languages, support many more
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, ...
characters in an identifier. However, a common restriction is not to permit whitespace characters and language operators; this simplifies tokenization by making it free-form and context-free. For example, forbidding + in identifiers due to its use as a binary operation means that a+b and a + b can be tokenized the same, while if it were allowed, a+b would be an identifier, not an addition. Whitespace in identifier is particularly problematic, as if spaces are allowed in identifiers, then a clause such as if rainy day then 1 is legal, with rainy day as an identifier, but tokenizing this requires the phrasal context of being in the condition of an if clause. Some languages do allow spaces in identifiers, however, such as
ALGOL 68 ALGOL 68 (short for ''Algorithmic Language 1968'') is an imperative programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously ...
and some ALGOL variants – for example, the following is a valid statement: real half pi; which could be entered as .real. half pi; (keywords are represented in boldface, concretely via stropping). In ALGOL this was possible because keywords are syntactically differentiated, so there is no risk of collision or ambiguity, spaces are eliminated during the line reconstruction phase, and the source was processed via scannerless parsing, so lexing could be context-sensitive. In most languages, some character sequences have the lexical form of an identifier but are known as
keywords Keyword may refer to: Computing * Keyword (Internet search), a word or phrase typically used by bloggers or online content creator to rank a web page on a particular topic * Index term, a term used as a keyword to documents in an information syste ...
– for example, if is frequently a keyword for an if clause, but lexically is of the same form as ig or foo namely a sequence of letters. This overlap can be handled in various ways: these may be forbidden from being identifiers – which simplifies tokenization and parsing – in which case they are reserved words; they may both be allowed but distinguished in other ways, such as via stropping; or keyword sequences may be allowed as identifiers and which sense is determined from context, which requires a context-sensitive lexer. Non-keywords may also be reserved words (forbidden as identifiers), particularly for
forward compatibility Forward compatibility or upward compatibility is a design characteristic that allows a system to accept input intended for a later version of itself. The concept can be applied to entire systems, electrical interfaces, telecommunication signal ...
, in case a word may become a keyword in future. In a few languages, e.g., PL/1, the distinction is not clear.


Semantics

The scope, or accessibility within a program of an identifier can be either local or global. A global identifier is declared outside of functions and is available throughout the program. A local identifier is declared within a specific function and only available within that function. For implementations of programming languages that are using a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
, identifiers are often only
compile time In computer science, compile time (or compile-time) describes the time window during which a computer program is compiled. The term is used as an adjective to describe concepts related to the context of program compilation, as opposed to concept ...
entities. That is, at runtime the compiled program contains references to memory addresses and offsets rather than the textual identifier tokens (these memory addresses, or offsets, having been assigned by the compiler to each identifier). In languages that support reflection, such as interactive evaluation of source code (using an interpreter or an incremental compiler), identifiers are also runtime entities, sometimes even as first-class objects that can be freely manipulated and evaluated. In
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lispi ...
, these are called
symbols A symbol is a mark, sign, or word that indicates, signifies, or is understood as representing an idea, object, or relationship. Symbols allow people to go beyond what is known or seen by creating linkages between otherwise very different co ...
. Compilers and interpreters do not usually assign any semantic meaning to an identifier based on the actual character sequence used. However, there are exceptions. For example: * In
Perl Perl is a family of two High-level programming language, high-level, General-purpose programming language, general-purpose, Interpreter (computing), interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it ...
a variable is indicated using a prefix called a sigil, which specifies aspects of how the variable is interpreted in expressions. * In
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum (aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sapp ...
a variable is automatically considered
immutable In object-oriented and functional programming, an immutable object (unchangeable object) is an object whose state cannot be modified after it is created.Goetz et al. ''Java Concurrency in Practice''. Addison Wesley Professional, 2006, Section 3.4 ...
if its identifier starts with a capital letter. * In Fortran, the first letter in a variable's name indicates whether by default it is created as an
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
or
floating point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can be r ...
variable. * In Go, the capitalization of the first letter of a variable's name determines its visibility (uppercase for public, lowercase for private). In some languages such as Go, identifiers uniqueness is based on their spelling and their visibility. In
HTML The HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It can be assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScri ...
an identifier is one of the possible attributes of an
HTML element An HTML element is a type of HTML (HyperText Markup Language) document component, one of several types of HTML nodes (there are also text nodes, comment nodes and others). The first used version of HTML was written by Tim Berners-Lee in 1993 ...
. It is unique within the document.


References

{{reflist


See also

* Naming convention (programming) Programming language concepts Metadata Syntactic entities