In computing and telecommunications, an escape character is a character that invokes an alternative interpretation of the following characters in a character sequence. An escape character is a particular kind of metacharacter. Generally, whether something counts as an escape character depends on the context.
In the telecommunications field, escape characters are used to indicate that the following characters are encoded differently. This is used to alter control characters that would otherwise be noticed and acted on by the underlying telecommunications hardware, such as illegal characters. In this context, the use of escape characters is often referred to as quoting.
Definition
An escape character may not have its own meaning, so all escape sequences are of two or more characters.
Escape characters are part of the syntax for many programming languages, data formats, and communication protocols. For a given alphabet, an escape character's purpose is to start character sequences (so named escape sequences), which have to be interpreted differently from the same characters occurring without the prefixed escape character.
The functions of escape sequences include:
* To encode a syntactic entity, such as device commands or special data, which cannot be directly represented by the alphabet.
* To represent characters which cannot be typed in the current context, or would have an undesired interpretation; this is referred to as character quoting. In this case, an escape sequence is a digraph consisting of an escape character itself and a "quoted" character.
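The two functions listed above can be sketched with a toy decoder. This is a minimal illustration, not any particular language's rules: the backslash as escape character, the `NAMED` table, and the `decode` function are all assumptions made for the example.

```python
ESCAPE = "\\"                     # assumed escape character for this sketch
NAMED = {"n": "\n", "t": "\t"}    # sequences that encode entities hard to type directly

def decode(text: str) -> str:
    """Interpret two-character escape sequences in text."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch != ESCAPE:
            out.append(ch)        # ordinary character, taken literally
            continue
        quoted = next(chars, None)
        if quoted is None:
            # The escape character has no meaning on its own.
            raise ValueError("dangling escape character")
        # Either a named sequence, or character quoting: the quoted
        # character is taken literally (this covers \" and \\).
        out.append(NAMED.get(quoted, quoted))
    return "".join(out)

print(repr(decode(r'say \"hi\"\n')))
```

A trailing lone escape character is rejected here, which reflects the point that every escape sequence spans at least two characters.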
Control character
Generally, an escape character is not a particular case of (device) control characters, nor vice versa. If we define control characters as non-graphic, or as having a special meaning for an output device (e.g. printer or text terminal), then any escape character for this device is a control one. But escape characters used in programming (such as the backslash, "\") are graphic, hence are not control characters. Conversely, most (but not all) of the ASCII "control characters" have some control function in isolation, and therefore they are not escape characters.
In many programming languages, an escape character also forms some escape sequences which are referred to as control characters. For example, line break has an escape sequence of \n.
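The distinction can be seen in a short Python sketch (Python is used here only for illustration): the two-character sequence in source text denotes one control character, while the backslash itself is an ordinary graphic character.

```python
import unicodedata

newline = "\n"    # escape sequence in source: backslash followed by n
raw = r"\n"       # raw string: the same two characters, left uninterpreted

print(len(newline), ord(newline))        # 1 10
print(len(raw))                          # 2

# Unicode general categories: Cc is "control", Po is "other punctuation".
print(unicodedata.category(newline))     # Cc
print(unicodedata.category("\\"))        # Po
```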
Examples
JavaScript
JavaScript uses the \ (backslash) as an escape character for:
* \' single quote
* \" double quote
* \\ backslash
* \n new line
* \r carriage return
* \t tab
* \b backspace
* \f form feed
* \v vertical tab (Internet Explorer 9 and older treats '\v' as 'v' instead of a vertical tab ('\x0B'). If cross-browser compatibility is a concern, use \x0B instead of \v.)
* \0 null character (U+0000 NULL) (only if the next character is not a decimal digit; else it is an octal escape sequence)
* \xFF character represented by the hexadecimal byte "FF"
The \v and \0 escapes are not allowed in JSON strings.
Example code:
console.log("Using \\n \nwill shift the characters after \\n one row down");
console.log("Using \\t \twill shift the characters after \\t one tab length to the right");
// \r returns the cursor to the start of the row, so on many terminals the text after it overwrites the row.
// Windows text files end lines with \r\n instead of \n alone.
console.log("Using \\r \rwill imitate a carriage return, which means shifting to the start of the row");
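The JSON restriction on \v and \0 can be checked with any strict JSON parser. The sketch below uses Python's standard `json` module; the helper name `is_valid_json_string` is made up for this example.

```python
import json

def is_valid_json_string(literal: str) -> bool:
    """Return True if literal parses as JSON text."""
    try:
        json.loads(literal)
    except json.JSONDecodeError:
        return False
    return True

# Raw strings keep the backslashes so the JSON parser sees them.
print(is_valid_json_string(r'"\n"'))      # True
print(is_valid_json_string(r'"\v"'))      # False
print(is_valid_json_string(r'"\0"'))      # False
print(is_valid_json_string(r'"\u000b"'))  # True: a vertical tab via a \u escape
```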
ASCII escape character
The ASCII "escape" character (octal: \033, hexadecimal: \x1B, or, in decimal, 27, also represented by the sequences ^[ or \e) is used in many output devices to start a series of characters called a control sequence or escape sequence. Typically, the escape character was sent first in such a sequence to alert the device that the following characters were to be interpreted as a control sequence rather than as plain characters, then one or more characters would follow to specify some detailed action, after which the device would go back to interpreting characters normally. For example, the sequence of ESC, followed by the printable characters [2;10H, would cause a Digital Equipment Corporation (DEC) VT102 terminal to move its cursor to the 10th cell of the 2nd line of the screen. This was later developed into ANSI escape codes covered by the ANSI X3.64 standard. The escape character also starts each command sequence in the Hewlett-Packard Printer Command Language.
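As a small sketch of the cursor-positioning sequence described above, the following Python builds the string without sending it to a terminal. The helper name `cursor_to` is an assumption for illustration; Python has no \e escape, so the character is written as \x1b.

```python
ESC = "\x1b"   # the ASCII escape character

# The same character in octal, hexadecimal, and decimal notation.
assert ESC == "\033" == chr(27)

def cursor_to(row: int, col: int) -> str:
    """Build the sequence ESC [ row ; col H."""
    return f"{ESC}[{row};{col}H"

sequence = cursor_to(2, 10)
print(repr(sequence))   # '\x1b[2;10H'
```

Writing `sequence` to a terminal that understands ANSI escape codes would move the cursor rather than display the characters.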
An early reference to the term "escape character" is found in the IBM technical publications of Bob Bemer, who is credited with inventing this mechanism during his work on the ASCII character set.
The Escape key is usually found on standard PC keyboards. However, it is commonly absent from keyboards for PDAs and other devices not designed primarily for ASCII communications. The DEC VT220 series was one of the few popular keyboards that did not have a dedicated Esc key, instead using one of the keys above the main keypad. In user interfaces of the 1970s–1980s it was not uncommon to use this key as an escape character, but on modern desktop computers such use has been dropped. Sometimes the key was identified with AltMode (for alternative mode). Even with no dedicated key, the escape character code could be generated by typing [ while simultaneously holding down Ctrl.
Programming and data formats
Many modern programming languages specify the double-quote character (") as a delimiter for a string literal. The backslash (\) escape character typically provides two ways to include double-quotes inside a string literal, either by modifying the meaning of the double-quote character embedded in the string (\" becomes "), or by modifying the meaning of a sequence of characters including the hexadecimal value of a double-quote character (\x22 becomes "). C, C++, and Ruby all allow exactly the same two backslash escape styles; Java accepts \" but has no \x escape and offers octal escapes such as \42 instead. The PostScript language and Microsoft Rich Text Format also use backslash escapes. The quoted-printable encoding uses the equals sign as an escape character.
URL and URI use %-escapes to quote characters with a special meaning, as well as non-ASCII characters. The ampersand (&) character may be considered as an escape character in SGML and derived formats such as HTML and XML.
Some programming languages also provide other ways to represent special characters in literals, without requiring an escape character (see e.g.
delimiter collision).
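The %-, &-, and =-based schemes mentioned in this section can be compared side by side. The sketch below leans on Python's standard library (`urllib.parse.quote`, `html.escape`, `quopri.encodestring`); the sample strings are arbitrary.

```python
import html
import quopri
from urllib.parse import quote, unquote

text = "a b&c=é"

# URL/URI: '%' introduces two hexadecimal digits per escaped byte.
url_form = quote(text)
print(url_form)                  # a%20b%26c%3D%C3%A9

# HTML/XML: '&' introduces a character reference.
html_form = html.escape("a < b & c")
print(html_form)                 # a &lt; b &amp; c

# Quoted-printable: '=' introduces two hexadecimal digits,
# so a literal '=' must itself be escaped.
qp_form = quopri.encodestring("é=".encode("utf-8"))
print(qp_form)
```

In each format the escape character must itself be escapable (%25, &amp;, =3D), otherwise literal occurrences would be ambiguous.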
Communication protocols
The Point-to-Point Protocol (PPP) uses the 0x7D octet (\175, or ASCII: }) as an escape character. The octet immediately following should be XORed by 0x20 before being passed to a higher level protocol. This is applied to both 0x7D itself and the control character 0x7E (which is used in PPP to mark the beginning and end of a frame) when those octets need to be transmitted by a higher level protocol encapsulated by PPP, as well as other octets negotiated when the link is established. That is, when a higher level protocol wishes to transmit 0x7D, it is transmitted as the sequence 0x7D 0x5D, and 0x7E is transmitted as 0x7D 0x5E.
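The octet stuffing described above can be sketched as follows. This is an illustrative model only: it covers the escape and flag octets plus an optional caller-supplied set standing in for the octets negotiated at link setup, and omits framing and checksums. The function names and the `extra` parameter are assumptions.

```python
ESCAPE = 0x7D    # PPP escape octet
FLAG = 0x7E      # frame delimiter
XOR_MASK = 0x20

def ppp_escape(payload: bytes, extra: frozenset = frozenset()) -> bytes:
    """Escape the octets that must not appear literally inside a frame."""
    out = bytearray()
    for octet in payload:
        if octet in (ESCAPE, FLAG) or octet in extra:
            out.append(ESCAPE)
            out.append(octet ^ XOR_MASK)
        else:
            out.append(octet)
    return bytes(out)

def ppp_unescape(data: bytes) -> bytes:
    """Reverse ppp_escape: XOR the octet that follows each escape octet."""
    out = bytearray()
    octets = iter(data)
    for octet in octets:
        if octet == ESCAPE:
            following = next(octets, None)
            if following is None:
                raise ValueError("escape octet at end of data")
            out.append(following ^ XOR_MASK)
        else:
            out.append(octet)
    return bytes(out)

print(ppp_escape(b"\x7d").hex())   # 7d5d
print(ppp_escape(b"\x7e").hex())   # 7d5e
```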
Bourne shell
In Bourne shell (sh), the asterisk (*) and question mark (?) characters are wildcard characters expanded via globbing. Without a preceding escape character, an * will expand to the names of all files in the working directory that do not start with a period if and only if there are such files; otherwise * remains unexpanded. So to refer to a file literally called "*", the shell must be told not to interpret it in this way, by preceding it with a backslash (\). This modifies the interpretation of the asterisk (*).
Compare rm *, which removes every file matched by the glob, with rm \*, which removes only the file literally named "*".
Similarly, characters like the ampersand, pipe and semicolon (used for command chaining), angle brackets (used for redirection), and parentheses have special syntactic meaning to the Bourne shell. These must also be escaped (referred to as "quoting" in the manual page) in order to be used literally as arguments to another program:
$ echo (`-´)> # not escaped or quoted
bash: syntax error near unexpected token ``-´'
$ echo \(`-´\)\> # escaped with backslashes
(`-´)>
$ echo '(`-´)>' # protected by single quotes; same effect as above
(`-´)>
$ echo ;) # syntax error
$ echo ';)' \;\) # both OK
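When a program builds shell command lines, the quoting shown above is usually delegated to a library rather than done by hand. The sketch below uses Python's standard `shlex` module, which protects arguments with single quotes rather than backslashes; the sample arguments are arbitrary.

```python
import shlex

# quote() leaves safe words alone and wraps anything else in single quotes.
for arg in ["plain", "*", "a;b", "(x)>"]:
    print(arg, "->", shlex.quote(arg))

# split() applies POSIX-style unquoting, so backslash escapes and
# single quotes yield the same literal arguments.
print(shlex.split(r"echo \* ';)' \;\)"))
```

Note that `shlex.split` models only quoting and word splitting; it does not perform globbing or any other shell expansion.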
Windows Command Prompt
The Windows command-line interpreter uses a caret character (^) to escape reserved characters that have special meanings (in particular: &, |, (, ), <, >, ^). The DOS command-line interpreter, though it has similar syntax, does not support this.
For example, on the Windows Command Prompt, this will result in a syntax error:
C:\>echo <hello world>
The syntax of the command is incorrect.
whereas this will output the string <hello world>:
C:\>echo ^<hello world^>
<hello world>
Windows PowerShell
In Windows, the backslash is used as a path separator; therefore, it generally cannot be used as an escape character. PowerShell uses backtick ( ` ) instead.
For example, the following command prints a tab before the first line, then a line break:
PS C:\> echo "`tFirst line`nNew line"
        First line
New line
Others
* Quoted-printable, which encodes 8-bit data into 7-bit data of limited line lengths, uses the equals sign (=) as an escape character.
See also
* AltGr key, used to type characters that are unusual for the locale of the keyboard layout
* Escape sequences in C
* Leaning toothpick syndrome
* Nested quotation
* Stropping (syntax) – in some conventions a leading character (such as an apostrophe) functions as an escape character
External links
* That Powerful ESCAPE Character -- Key and Sequences – Bob Bemer