HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...
, a null pointer (sometimes shortened to nullptr or null) or null reference is a value saved for indicating that the pointer or
reference A reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a ''nam ...
does not refer to a valid
object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an a ...
. Programs routinely use null pointers to represent conditions such as the end of a
list A list is a Set (mathematics), set of discrete items of information collected and set forth in some format for utility, entertainment, or other purposes. A list may be memorialized in any number of ways, including existing only in the mind of t ...
of unknown length or the failure to perform some action; this use of null pointers can be compared to
nullable type Nullable types are a feature of some programming languages which allow a value to be set to the special value NULL instead of the usual possible values of the data type. In statically typed languages, a nullable type is an option type, while i ...
s and to the ''Nothing'' value in an
option type Option or Options may refer to: Computing * Option key, a key on Apple computer keyboards * Option type, a polymorphic data type in programming languages * Command-line option, an optional parameter to a command *OPTIONS, an HTTP request metho ...
. A null pointer should not be confused with an uninitialized pointer: a null pointer is guaranteed to compare unequal to any pointer that points to a valid object. However, in general, most languages do not offer such guarantee for uninitialized pointers. It might compare equal to other, valid pointers; or it might compare equal to null pointers. It might do both at different times; or the comparison might be
undefined behavior In computer programming, a program exhibits undefined behavior (UB) when it contains, or is executing code for which its programming language specification does not mandate any specific requirements. This is different from unspecified behavior, ...
. Also, in languages offering such support, the correct use depends on the individual experience of each developer and linter tools. Even when used properly, null pointers are semantically incomplete, since they do not offer the possibility to express the difference between "not applicable", "not known", and "future" values. Because a null pointer does not point to a meaningful object, an attempt to access the data stored at that (invalid) memory location may cause a run-time error or immediate program crash. This is the null pointer error, or null pointer exception. It is one of the most common types of software weaknesses, and
Tony Hoare Sir Charles Antony Richard Hoare (; born 11 January 1934), also known as C. A. R. Hoare, is a British computer scientist who has made foundational contributions to programming languages, algorithms, operating systems, formal verification, and ...
, who introduced the concept, has referred to it as a "billion dollar mistake".


C

In C, two null pointers of any type are guaranteed to compare equal. Prior to C23, the preprocessor macro NULL was provided, defined as an implementation-defined null pointer constant in < stdlib.h>, which in
C99 C99 (previously C9X, formally ISO/IEC 9899:1999) is a past version of the C programming language open standard. It extends the previous version ( C90) with new features for the language and the standard library, and helps implementations mak ...
can be portably expressed as ((void *)0), the integer value 0 converted to the type void* (see pointer to void type). Since C23, a null pointer is represented with nullptr which is of type nullptr_t (originally introduced to C++11), providing a type safe null pointer. The C standard does not say that the null pointer is the same as the pointer to
memory address In computing, a memory address is a reference to a specific memory location in memory used by both software and hardware. These addresses are fixed-length sequences of digits, typically displayed and handled as unsigned integers. This numeric ...
 0, though that may be the case in practice. Dereferencing a null pointer is
undefined behavior In computer programming, a program exhibits undefined behavior (UB) when it contains, or is executing code for which its programming language specification does not mandate any specific requirements. This is different from unspecified behavior, ...
in C, and a conforming implementation is allowed to assume that any pointer that is dereferenced is not null. In practice, dereferencing a null pointer may result in an attempted read or write from
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...
that is not mapped, triggering a
segmentation fault In computing, a segmentation fault (often shortened to segfault) or access violation is a Interrupt, failure condition raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted ...
or memory access violation. This may manifest itself as a program crash, or be transformed into a software exception that can be caught by program code. There are, however, certain circumstances where this is not the case. For example, in
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
real mode Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20- bit s ...
, the address 0000:0000 is readable and also usually writable, and dereferencing a pointer to that address is a perfectly valid but typically unwanted action that may lead to undefined but non-crashing behavior in the application; if a null pointer is represented as a pointer to that address, dereferencing it will lead to that behavior. There are occasions when dereferencing a pointer to address zero ''is'' intentional and well-defined; for example,
BIOS In computing, BIOS (, ; Basic Input/Output System, also known as the System BIOS, ROM BIOS, BIOS ROM or PC BIOS) is a type of firmware used to provide runtime services for operating systems and programs and to perform hardware initialization d ...
code written in C for 16-bit real-mode x86 devices may write the
interrupt descriptor table The interrupt descriptor table (IDT) is a data structure used by the x86 architecture to implement an interrupt vector table. The IDT is used by the processor to determine the memory addresses of the handlers to be executed on interrupts and exc ...
(IDT) at physical address 0 of the machine by dereferencing a pointer with the same value as a null pointer for writing. It is also possible for the compiler to optimize away the null pointer dereference, avoiding a segmentation fault but causing other undesired behavior.


C++

In C++, while the NULL macro was inherited from C, the integer literal for zero has been traditionally preferred to represent a null pointer constant. However,
C++11 C++11 is a version of a joint technical standard, ISO/IEC 14882, by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), for the C++ programming language. C++11 replaced the prior vers ...
introduced the explicit null pointer constant nullptr and type nullptr_t to be used instead, providing a type safe null pointer. nullptr and type nullptr_t were later introduced to C in C23.


Other languages

In some programming language environments (at least one proprietary Lisp implementation, for example), the value used as the null pointer (called nil in
Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...
) may actually be a pointer to a block of internal data useful to the implementation (but not explicitly reachable from user programs), thus allowing the same register to be used as a useful constant and a quick way of accessing implementation internals. This is known as the nil vector. In languages with a
tagged architecture In computer science, a tagged architecture is a type of computer architecture where every word (data type), word of memory constitutes a tagged union, being divided into a number of bits of data, and a ''tag'' section that describes the type of the ...
, a possibly null pointer can be replaced with a
tagged union In computer science, a tagged union, also called a variant, variant record, choice type, discriminated union, disjoint union, sum type, or coproduct, is a data structure used to hold a value that could take on several different, but fixed, types. ...
which enforces explicit handling of the exceptional case; in fact, a possibly null pointer can be seen as a tagged pointer with a computed tag. Programming languages use different literals for the ''null pointer''. In Python, for example, a null value is called None. In
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
and C#, the literal null is provided as a literal for reference types. In Pascal and
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIF ...
, a null pointer is called nil. In Eiffel, it is called a void reference.


Null dereferencing

Because a null pointer does not point to a meaningful object, an attempt to
dereference In computer science, a pointer is an object (computer science), object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped I/O, memo ...
(i.e., access the data stored at that memory location) a null pointer usually (but not always) causes a run-time error or immediate program crash.
MITRE The mitre (Commonwealth English) or miter (American English; American and British English spelling differences#-re, -er, see spelling differences; both pronounced ; ) is a type of headgear now known as the traditional, ceremonial headdress of ...
lists the null pointer error as one of the most commonly exploited software weaknesses. * In C, dereferencing a null pointer is
undefined behavior In computer programming, a program exhibits undefined behavior (UB) when it contains, or is executing code for which its programming language specification does not mandate any specific requirements. This is different from unspecified behavior, ...
. ISO/IEC 9899, clause 6.5.3.2, paragraph 4, esp. footnote 87. Many implementations cause such code to result in the program being halted with an
access violation In computing, a segmentation fault (often shortened to segfault) or access violation is a Interrupt, failure condition raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted ...
, because the null pointer representation is chosen to be an address that is never allocated by the system for storing objects. However, this behavior is not universal. It is also not guaranteed, since compilers are permitted to optimize programs under the assumption that they are free of undefined behavior. This behaviour is the same in C++, as there is no null pointer exception in the C++ language. On platforms such as
Unix-like A Unix-like (sometimes referred to as UN*X, *nix or *NIX) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Uni ...
systems and
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
with the
Visual Studio Visual Studio is an integrated development environment (IDE) developed by Microsoft. It is used to develop computer programs including web site, websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development ...
compiler, an access violation causes a C/C++
SIGSEGV In computing, a segmentation fault (often shortened to segfault) or access violation is a Interrupt, failure condition raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted ...
signal to be issued. Although in C/C++ null dereferences are not exceptions which can be caught in C++ try/catch blocks, it is possible to "catch" such an access violation by using (std::)signal() in C/C++ to specify a handler to be called when that signal is issued. * In D, much like C++, a null pointer dereference results in a segmentation fault. * In
Delphi Delphi (; ), in legend previously called Pytho (Πυθώ), was an ancient sacred precinct and the seat of Pythia, the major oracle who was consulted about important decisions throughout the ancient Classical antiquity, classical world. The A ...
and many other Pascal implementations, the constant represents a null pointer to the first address in memory which is also used to initialize managed variables. Dereferencing it raises an external OS exception which is mapped onto a Pascal exception instance if the unit is linked in the clause. * In
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, access to a null reference (null) causes a (NPE), which can be caught by error handling code, but the preferred practice is to ensure that such exceptions never occur. * In
Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...
, is a first class object. By convention, (first nil) is , as is (rest nil). So dereferencing in these contexts will not cause an error, but poorly written code can get into an infinite loop. * In
.NET The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
and C#, access to null reference (null) causes a to be thrown. Although catching these is generally considered bad practice, this exception type can be caught and handled by the program. * In
Objective-C Objective-C is a high-level general-purpose, object-oriented programming language that adds Smalltalk-style message passing (messaging) to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was ...
, messages may be sent to a object (which is a null pointer) without causing the program to be interrupted; the message is simply ignored, and the return value (if any) is or , depending on the type. * In Python, dereferencing None raises an AttributeError, as NoneType has no (user-facing) methods or attributes defined. * In
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
, dereferencing a null pointer (std::ptr::null()) in an unsafe block results in undefined behaviour, which usually results in a segmentation fault or corrupted memory. * Before the introduction of
Supervisor Mode Access Prevention Supervisor Mode Access Prevention (SMAP) is a feature of some CPU implementations such as the Intel Broadwell microarchitecture that allows supervisor mode programs to optionally set user-space memory mappings so that access to those mappings f ...
(SMAP), a null pointer dereference bug could be exploited by mapping page zero into the attacker's
address space In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity. For software programs to save and retrieve ...
and hence causing the null pointer to point to that region. This could lead to code execution in some cases.


Mitigation

While there could be languages with no nulls, most do have the possibility of nulls so there are techniques to avoid or aid debugging null pointer dereferences. Bond et al. suggest modifying the Java Virtual Machine (JVM) to keep track of null propagation. There are three levels of handling null references, in order of effectiveness: # languages with no null; # languages that can statically analyse code to avoid the possibility of null dereference at run time; # if null dereference can occur at runtime, tools that aid debugging. Pure
functional languages In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map ...
are an example of level 1 since no direct access is provided to pointers and all code and data is immutable. User code running in interpreted or virtual-machine languages generally does not suffer the problem of null pointer dereferencing. Where a language does provide or utilise pointers which could become void, it is possible to avoid runtime null dereferences by providing compilation-time checking via
static analysis Static analysis, static projection, or static scoring is a simplified analysis wherein the effect of an immediate change to a system is calculated without regard to the longer-term response of the system to that change. If the short-term effect i ...
or other techniques, with syntactic assistance from language features such as those seen in the Eiffel programming language with Void safety to avoid null derefences, D, and
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
. In some languages analysis can be performed using external tools, but these are weak compared to direct language support with compiler checks since they are limited by the language definition itself. The last resort of level 3 is when a null reference occurs at runtime, debugging aids can help.


History

In 2009,
Tony Hoare Sir Charles Antony Richard Hoare (; born 11 January 1934), also known as C. A. R. Hoare, is a British computer scientist who has made foundational contributions to programming languages, algorithms, operating systems, formal verification, and ...
stated that he invented the null reference in 1965 as part of the
ALGOL W ALGOL W is a programming language. It is based on a proposal for ALGOL X by Niklaus Wirth and Tony Hoare as a successor to ALGOL 60. ALGOL W is a relatively simple upgrade of the original ALGOL 60, adding string, bitstring, complex number a ...
language. In that 2009 reference Hoare describes his invention as a "billion-dollar mistake":


See also

*
Memory debugger A memory debugger is a debugger for finding software memory problems such as memory leaks and buffer overflows. These are due to bugs related to the allocation and deallocation of dynamic memory. Programs written in languages that have garba ...
*
Zero page The zero page or base page is the block of memory at the very beginning of a computer's address space; that is, the page whose starting address is zero. The size of a page depends on the context, and the significance of zero page memory versus h ...


Notes


References

* {{Nulls Pointers (computer programming)