Integer literal
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
, an integer literal is a kind of literal for an
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
whose value is directly represented in
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the w ...
. For example, in the assignment statement x = 1, the string 1 is an integer literal indicating the value 1, while in the statement x = 0x10 the string 0x10 is an integer literal indicating the value 16, which is represented by 10 in hexadecimal (indicated by the 0x prefix). By contrast, in x = cos(0), the expression cos(0) evaluates to 1 (as the cosine of 0), but the value 1 is not ''literally'' included in the source code. More simply, in x = 2 + 2, the expression 2 + 2 evaluates to 4, but the value 4 is not literally included. Further, in x = "1" the "1" is a
string literal A string literal or anonymous string is a string value in the source code of a computer program. Modern programming languages commonly use a quoted sequence of characters, formally " bracketed delimiters", as in x = "foo", where "foo" is a string ...
, not an integer literal, because it is in quotes. The value of the string is 1, which happens to be an integer string, but this is semantic analysis of the string literal – at the syntactic level "1" is simply a string, no different from "foo".


Parsing

Recognizing a string (sequence of characters in the source code) as an integer literal is part of the
lexical analysis In computer science, lexical analysis, lexing or tokenization is the process of converting a sequence of characters (such as in a computer program or web page) into a sequence of ''lexical tokens'' ( strings with an assigned and thus identified ...
(lexing) phase, while evaluating the literal to its value is part of the semantic analysis phase. Within the lexer and phrase grammar, the token class is often denoted integer, with the lowercase indicating a lexical-level token class, as opposed to phrase-level production rule (such as ListOfIntegers). Once a string has been lexed (tokenized) as an integer literal, its value cannot be determined syntactically (it is ''just'' an integer), and evaluation of its value becomes a semantic question. Integer literals are generally lexed with
regular expression A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" ...
s, as in
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
.2.4.4. Integer and long integer literals


Evaluation

As with other literals, integer literals are generally evaluated at compile time, as part of the semantic analysis phase. In some cases this semantic analysis is done in the lexer, immediately on recognition of an integer literal, while in other cases this is deferred until the parsing stage, or until after the parse tree has been completely constructed. For example, on recognizing the string 0x10 the lexer could immediately evaluate this to 16 and store that (a token of type integer and value 16), or defer evaluation and instead record a token of type integer and value 0x10. Once literals have been evaluated, further semantic analysis in the form of constant folding is possible, meaning that literal expressions involving literal values can be evaluated at the compile phase. For example, in the statement x = 2 + 2 after the literals have been evaluated and the expression 2 + 2 has been parsed, it can then be evaluated to 4, though the value 4 does not itself appear as a literal.


Affixes

Integer literals frequently have prefixes indicating base, and less frequently suffixes indicating type. For example, in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
0x10ULL indicates the value 16 (because hexadecimal) as an unsigned long long integer. Common prefixes include: * 0x or 0X for hexadecimal (base 16); * 0, 0o or 0O for
octal The octal numeral system, or oct for short, is the base-8 number system, and uses the digits 0 to 7. This is to say that 10octal represents eight and 100octal represents sixty-four. However, English, like most languages, uses a base-10 number ...
(base 8); * 0b or 0B for
binary Binary may refer to: Science and technology Mathematics * Binary number, a representation of numbers using only two digits (0 and 1) * Binary function, a function that takes two arguments * Binary operation, a mathematical operation that ta ...
(base 2). Common suffixes include: * l or L for long integer; * ll or LL for long long integer; * u or U for unsigned integer. These affixes are somewhat similar to
sigils A sigil () is a type of symbol used in magic. The term has usually referred to a pictorial signature of a deity or spirit. In modern usage, especially in the context of chaos magic, sigil refers to a symbolic representation of the practitione ...
, though sigils attach to identifiers (names), not literals.


Digit separators

In some languages, integer literals may contain digit separators to allow
digit grouping A decimal separator is a symbol used to separate the integer part from the fractional part of a number written in decimal form (e.g., "." in 12.45). Different countries officially designate different symbols for use as the separator. The choi ...
into more legible forms. If this is available, it can usually be done for floating point literals as well. This is particularly useful for
bit field A bit field is a data structure that consists of one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected. A bit field is most commonly used to ...
s and makes it easier to see the size of large numbers (such as a million) at a glance by subitizing rather than counting digits. It is also useful for numbers that are typically grouped, such as
credit card number A payment card number, primary account number (PAN), or simply a card number, is the card identifier found on payment cards, such as credit cards and debit cards, as well as stored-value cards, gift cards and other similar cards. In some situati ...
or social security numbers. Very long numbers can be further grouped by doubling up separators. Typically decimal numbers (base-10) are grouped in three digit groups (representing one of 1000 possible values), binary numbers (base-2) in four digit groups (one nibble, representing one of 16 possible values), and hexadecimal numbers (base-16) in two digit groups (each digit is one nibble, so two digits are one
byte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable uni ...
, representing one of 256 possible values). Numbers from other systems (such as id numbers) are grouped following whatever convention is in use.


Examples

In Ada, C# (from version 7.0), D, Eiffel, Go (from version 1.13),
Haskell Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
(from GHC version 8.6.1),
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
(from version 7),
Julia Julia is usually a feminine given name. It is a Latinate feminine form of the name Julio and Julius. (For further details on etymology, see the Wiktionary entry "Julius".) The given name ''Julia'' had been in use throughout Late Antiquity (e.g ...
,
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
,
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
(from version 3.6),
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
,
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO( ...
and
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIFT, ...
, integer literals and float literals can be separated with an
underscore An underscore, ; also called an underline, low line, or low dash; is a line drawn under a segment of text. In proofreading, underscoring is a convention that says "set this text in italic type", traditionally used on manuscript or typescript a ...
(_). There can be some restrictions on placement; for example, in Java they cannot appear at the start or end of the literal, nor next to a decimal point. Note that while the period, comma, and (thin) spaces are used in normal writing for digit separation, these conflict with their existing use in programming languages as
radix point A decimal separator is a symbol used to separate the integer part from the fractional part of a number written in decimal form (e.g., "." in 12.45). Different countries officially designate different symbols for use as the separator. The choi ...
, list separator (and in C/C++, the comma operator), and token separator. Examples include: int oneMillion = 1_000_000; int creditCardNumber = 1234_5678_9012_3456; int socialSecurityNumber = 123_45_6789; In
C++14 C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on Augus ...
(2014) and the next version of C , C23, the apostrophe character may be used to separate digits arbitrarily in numeric literals. The underscore was initially proposed, with an initial proposal in 1993, and again for
C++11 C++11 is a version of the ISO/ IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions b ...
, following other languages. However, this caused conflict with user-defined literals, so the apostrophe was proposed instead, as an " upper comma" (which is used in some other contexts). auto integer_literal = 1'000'000; auto binary_literal = 0b0100'1100'0110; auto very_long_binary_literal = 0b0000'0001'0010'0011''0100'0101'0110'0111;


Notes


References

{{reflist Literal Source code