The null character (also null terminator) is a control character with the value zero. It is present in many character sets, including those defined by the Baudot and ITA2 codes, ISO/IEC 646 (or ASCII), the C0 control code, the Universal Coded Character Set (or Unicode), and EBCDIC. It is available in nearly all mainstream programming languages. It is often abbreviated as NUL (or NULL, though in some contexts that term is used for the null pointer). In 8-bit codes, it is known as a null byte.

The original meaning of this character was like NOP: when sent to a printer or a terminal, it has no effect (some terminals, however, incorrectly display it as a space). When electromechanical teleprinters were used as computer output devices, one or more null characters were sent at the end of each printed line to allow time for the mechanism to return to the first printing position on the next line. On punched tape, the character is represented with no holes at all, so a new unpunched tape is initially filled with null characters, and text could often be inserted into a reserved run of null characters by punching the new characters into the tape over the nulls.

Today the character has much more significance in the programming language C and its derivatives and in many data formats, where it serves as a reserved character used to signify the end of a string, often called a null-terminated string. This allows the string to be any length with only the overhead of one byte; the alternative of storing a count requires either a string length limit of 255 (with a one-byte count) or an overhead of more than one byte (other advantages and disadvantages are described in the null-terminated string article).
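The trade-off between a terminator and a stored count can be illustrated with a minimal Python sketch. The function names (`write_c_string`, `read_c_string`, `write_counted_string`, `read_counted_string`) are hypothetical and chosen only for this illustration; the counted layout assumes a single count byte placed in front of the data.

```python
def write_c_string(payload: bytes) -> bytes:
    """Append a single null byte as the terminator (one byte of overhead)."""
    if b"\x00" in payload:
        # The terminator value is reserved, so it cannot appear in the data.
        raise ValueError("payload must not contain a null byte")
    return payload + b"\x00"


def read_c_string(buffer: bytes, start: int = 0) -> bytes:
    """Return the bytes from `start` up to, but not including, the next null."""
    end = buffer.index(b"\x00", start)  # raises ValueError if no terminator
    return buffer[start:end]


def write_counted_string(payload: bytes) -> bytes:
    """Prefix the data with a one-byte count, which caps the length at 255."""
    if len(payload) > 255:
        raise ValueError("a one-byte count cannot describe more than 255 bytes")
    return bytes([len(payload)]) + payload


def read_counted_string(buffer: bytes, start: int = 0) -> bytes:
    """Read the count byte at `start`, then that many bytes of data."""
    length = buffer[start]
    return buffer[start + 1 : start + 1 + length]


if __name__ == "__main__":
    packed = write_c_string(b"hello") + write_c_string(b"world")
    print(read_c_string(packed), read_c_string(packed, 6))
    print(read_counted_string(write_counted_string(b"hello")))
```

In this sketch the terminated form accepts data of any length but cannot carry a null byte inside the data, while the counted form can carry any byte value but rejects anything longer than 255 bytes.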


Representation

The null character is often represented as the escape sequence \0 in source code, string literals or character constants (Kernighan and Ritchie, C, p. 38). In many languages (such as C, which introduced this notation), this is not a separate escape sequence, but an octal escape sequence with a single octal digit 0; as a consequence, \0 must not be followed by any of the digits 0 through 7; otherwise it is interpreted as the start of a longer octal escape sequence. Other escape sequences found in various languages are \000, \x00, \z, and \u0000. A null character can be placed in a URL with the percent code %00.

The ability to represent a null character does not always mean the resulting string will be correctly interpreted, as many programs will consider the null to be the end of the string. Thus the ability to type it (in the case of unchecked user input) creates a vulnerability known as null byte injection and can lead to security exploits (Null Byte Injection, WASC Threat Classification, Null Byte Attack section).

In caret notation the null character is ^@. On some keyboards, one can enter a null character by holding down Ctrl and pressing @ (on US layouts just Ctrl+2 will often work, there being no need for Shift to get the @ sign).

In documentation, the null character is sometimes represented as a single-em-width symbol containing the letters "NUL". In Unicode, there is a character with a corresponding glyph for visual representation of the null character, symbol for null, U+2400 (␀), not to be confused with the actual null character, U+0000. The hexadecimal notation for null is 00, and decoding the Base64 string AA== also yields the null character. In the Windows operating system the U+2400 symbol can be typed in some GUI applications, such as WordPad, by typing 2400 followed immediately by the Alt+X key combination.
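The notations above can be checked with a short Python sketch. Python is used here only as a convenient illustration; its string literals happen to follow the same octal-escape rule described for C. The helper names `c_style_view` and `looks_like_image`, and the file name `report.php%00.png`, are hypothetical and exist only to mimic a null byte injection in which a suffix check and a null-terminated consumer disagree.

```python
import base64
import unicodedata
from urllib.parse import quote, unquote

NUL = chr(0)

# Escape spellings that Python accepts for the same single character.
SPELLINGS = ["\0", "\000", "\x00", "\u0000"]

# "\0" is an octal escape, so any following digits 0-7 extend it.
MERGED = "\012"       # one character: octal 12, which is a line feed
SEPARATE = "\0" "12"  # three characters: NUL, "1", "2"

PERCENT_DECODED = unquote("%00")          # percent code in a URL
PERCENT_ENCODED = quote(NUL)
FROM_BASE64 = base64.b64decode("AA==")    # Base64 form of one zero byte
FROM_HEX = bytes.fromhex("00")            # hexadecimal notation
FROM_CARET = chr(ord("@") ^ 0x40)         # caret notation ^@
SYMBOL_NAME = unicodedata.name("\u2400")  # the visible symbol, not U+0000


def c_style_view(text: str) -> str:
    """Mimic a consumer that treats the first null as the end of the string."""
    return text.split(NUL, 1)[0]


def looks_like_image(name: str) -> bool:
    """A naive (hypothetical) check that inspects only the suffix."""
    return name.endswith(".png")


if __name__ == "__main__":
    requested = unquote("report.php%00.png")
    print(looks_like_image(requested))     # the suffix check passes
    print(repr(c_style_view(requested)))   # a terminator-based reader sees less
```

The last two lines show why unchecked input is risky: the full string ends in `.png`, but a layer that stops at the first null sees only the part before it.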


Encoding

In all modern character sets, the null character has a code point value of zero. In most encodings, this is translated to a single code unit with a zero value. For instance, in UTF-8 it is a single zero byte. However, in Modified UTF-8 the null character is encoded as two bytes: 0xC0, 0x80. This allows the byte with the value of zero, which is now not used for any character, to be used as a string terminator.
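The two-byte encoding can be sketched in Python as follows. The function names are hypothetical, and the sketch models only the treatment of the null character described above: it starts from standard UTF-8 and swaps the zero byte for 0xC0, 0x80, deliberately ignoring any other differences between the two encodings.

```python
def encode_nul_as_two_bytes(text: str) -> bytes:
    """Encode as UTF-8, then replace each zero byte with 0xC0 0x80.

    In UTF-8 a zero byte can only come from U+0000, so a plain byte
    replacement is enough for this illustration.
    """
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_from_two_bytes(data: bytes) -> str:
    """Reverse the substitution, then decode as ordinary UTF-8.

    The byte 0xC0 never occurs in valid UTF-8, so the pair 0xC0 0x80
    cannot be confused with any other character.
    """
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


if __name__ == "__main__":
    encoded = encode_nul_as_two_bytes("a\x00b")
    print(encoded.hex(" "))
    # The zero byte is now free to act as a terminator after the data.
    print((encoded + b"\x00").index(b"\x00") == len(encoded))
```

Because the encoded data no longer contains a zero byte, a terminator appended after it is always the first zero byte a reader encounters.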


External links


* Null Byte Injection, WASC Threat Classification, Null Byte Attack section
* Poison Null Byte Introduction, Introduction to Null Byte Attack
* Apple null byte injection QR code vulnerability