History
In 1955, Soviet Ukrainian computer scientist Kateryna Yushchenko created the Address programming language that made possible indirect addressing and addresses of the highest rank – analogous to pointers. This language was widely used on the Soviet Union computers. However, it was unknown outside the Soviet Union and usually Harold Lawson is credited with the invention, in 1964, of the pointer. In 2000, Lawson was presented the Computer Pioneer Award by theFormal description
InArchitectural roots
Pointers are a very thinUses
Pointers are directly supported without restrictions in languages such asC pointers
The basicptr
as the identifier of an object of the following type:
* pointer that points to an object of type int
This is usually stated more succinctly as "ptr
is a pointer to int
."
Because the C language does not specify an implicit initialization for objects of automatic storage duration, care should often be taken to ensure that the address to which ptr
points is valid; this is why it is sometimes suggested that a pointer be explicitly initialized to the NULL
: ISO/IEC 9899, clause 7.17, paragraph 3: ''NULL... which expands to an implementation-defined null pointer constant...''
a
to ptr
. For example, if a
is stored at memory location of 0x8130 then the value of ptr
will be 0x8130 after the assignment. To dereference the pointer, an asterisk is used again:
ptr
(which is 0x8130), "locate" that address in memory and set its value to 8.
If a
is later accessed again, its new value will be 8.
This example may be clearer if memory is examined directly.
Assume that a
is located at address 0x8130 in memory and ptr
at 0x8134; also assume this is a 32-bit machine such that an int is 32-bits wide. The following is what would be in memory after the following code snippet is executed:
a
to ptr
:
ptr
by coding:
ptr
(which is 0x8130), 'locate' that address, and assign 8 to that location yielding the following memory:
Clearly, accessing a
will yield the value of 8 because the previous instruction modified the contents of a
by way of the pointer ptr
.
Use in data structures
When setting upC arrays
In C, array indexing is formally defined in terms of pointer arithmetic; that is, the language specification requires thatarray /code> be equivalent to *(array + i)
. Thus in C, arrays can be thought of as pointers to consecutive areas of memory (with no gaps), and the syntax for accessing arrays is identical for that which can be used to dereference pointers. For example, an array array
can be declared and used in the following manner:
int array /* Declares 5 contiguous integers */
int *ptr = array; /* Arrays can be used as pointers */
ptr = 1; /* Pointers can be indexed with array syntax */
*(array + 1) = 2; /* Arrays can be dereferenced with pointer syntax */
*(1 + array) = 2; /* Pointer addition is commutative */
2 rray= 4; /* Subscript operator is commutative */
This allocates a block of five integers and names the block array
, which acts as a pointer to the block. Another common use of pointers is to point to dynamically allocated memory from malloc
C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely , , , and .
The C++ programming language includ ...
which returns a consecutive block of memory of no less than the requested size that can be used as an array.
While most operators on arrays and pointers are equivalent, the result of the sizeof
sizeof is a unary operator in the C and C++ programming languages that evaluates to the storage size of an expression or a data type, measured in units sized as char. Consequently, the expression sizeof(char) evaluates to 1. The number of b ...
operator differs. In this example, sizeof(array)
will evaluate to 5*sizeof(int)
(the size of the array), while sizeof(ptr)
will evaluate to sizeof(int*)
, the size of the pointer itself.
Default values of an array can be declared like:
int array = ;
If array
is located in memory starting at address 0x1000 on a 32-bit little-endian
'' Jonathan_Swift.html" ;"title="Gulliver's Travels'' by Jonathan Swift">Gulliver's Travels'' by Jonathan Swift, the novel from which the term was coined
In computing, endianness is the order in which bytes within a word (data type), word of d ...
machine then memory will contain the following (values are in hexadecimal
Hexadecimal (also known as base-16 or simply hex) is a Numeral system#Positional systems in detail, positional numeral system that represents numbers using a radix (base) of sixteen. Unlike the decimal system representing numbers using ten symbo ...
, like the addresses):
:
Represented here are five integers: 2, 4, 3, 1, and 5. These five integers occupy 32 bits (4 bytes) each with the least-significant byte stored first (this is a little-endian CPU architecture
In computer science and computer engineering, computer architecture is a description of the structure of a computer system made from component parts. It can sometimes be a high-level description that ignores details of the implementation. At a mo ...
) and are stored consecutively starting at address 0x1000.
The syntax for C with pointers is:
* array
means 0x1000;
* array + 1
means 0x1004: the "+ 1" means to add the size of 1 int
, which is 4 bytes;
* *array
means to dereference the contents of array
. Considering the contents as a memory address (0x1000), look up the value at that location (0x0002);
* array /code> means element number i
, 0-based, of array
which is translated into *(array + i)
.
The last example is how to access the contents of array
. Breaking it down:
* array + i
is the memory location of the (i)th element of array
, starting at i=0;
* *(array + i)
takes that memory address and dereferences it to access the value.
C linked list
Below is an example definition of a linked list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
in C.
/* the empty linked list is represented by NULL
* or some other sentinel value */
#define EMPTY_LIST NULL
struct link ;
This pointer-recursive definition is essentially the same as the reference-recursive definition from the language Haskell
Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
:
data Link a = Nil
, Cons a (Link a)
Nil
is the empty list, and Cons a (Link a)
is a cons
In computer programming, ( or ) is a fundamental function in most dialects of the Lisp programming language. ''constructs'' memory objects which hold two values or pointers to two values. These objects are referred to as (cons) cells, conses, ...
cell of type a
with another link also of type a
.
The definition with references, however, is type-checked and does not use potentially confusing signal values. For this reason, data structures in C are usually dealt with via wrapper function
A wrapper function is a function (another word for a ''subroutine'') in a software library or a computer program whose main purpose is to call a second subroutine or a system call with little or no additional computation. Wrapper functions sim ...
s, which are carefully checked for correctness.
Pass-by-address using pointers
Pointers can be used to pass variables by their address, allowing their value to be changed. For example, consider the following C code:
/* a copy of the int n can be changed within the function without affecting the calling code */
void passByValue(int n)
/* a pointer m is passed instead. No copy of the value pointed to by m is created */
void passByAddress(int *m)
int main(void)
Dynamic memory allocation
In some programs, the required amount of memory depends on what ''the user'' may enter. In such cases the programmer needs to allocate memory dynamically. This is done by allocating memory at the ''heap'' rather than on the ''stack'', where variables usually are stored (although variables can also be stored in the CPU registers). Dynamic memory allocation can only be made through pointers, and names – like with common variables – cannot be given.
Pointers are used to store and manage the addresses of dynamically allocated blocks of memory. Such blocks are used to store data objects or arrays of objects. Most structured and object-oriented languages provide an area of memory, called the ''heap'' or ''free store'', from which objects are dynamically allocated.
The example C code below illustrates how structure objects are dynamically allocated and referenced. The standard C library
The C standard library, sometimes referred to as libc, is the standard library for the C programming language, as specified in the ISO C standard.ISO/ IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the origin ...
provides the function malloc()
for allocating memory blocks from the heap. It takes the size of an object to allocate as a parameter and returns a pointer to a newly allocated block of memory suitable for storing the object, or it returns a null pointer if the allocation failed.
/* Parts inventory item */
struct Item ;
/* Allocate and initialize a new Item object */
struct Item * make_item(const char *name)
The code below illustrates how memory objects are dynamically deallocated, i.e., returned to the heap or free store. The standard C library provides the function free()
for deallocating a previously allocated memory block and returning it back to the heap.
/* Deallocate an Item object */
void destroy_item(struct Item *item)
Memory-mapped hardware
On some computing architectures, pointers can be used to directly manipulate memory or memory-mapped devices.
Assigning addresses to pointers is an invaluable tool when programming microcontroller
A microcontroller (MC, uC, or μC) or microcontroller unit (MCU) is a small computer on a single integrated circuit. A microcontroller contains one or more CPUs (processor cores) along with memory and programmable input/output peripherals. Pro ...
s. Below is a simple example declaring a pointer of type int and initialising it to a hexadecimal
Hexadecimal (also known as base-16 or simply hex) is a Numeral system#Positional systems in detail, positional numeral system that represents numbers using a radix (base) of sixteen. Unlike the decimal system representing numbers using ten symbo ...
address in this example the constant 0x7FFF:
int *hardware_address = (int *)0x7FFF;
In the mid 80s, using the BIOS
In computing, BIOS (, ; Basic Input/Output System, also known as the System BIOS, ROM BIOS, BIOS ROM or PC BIOS) is a type of firmware used to provide runtime services for operating systems and programs and to perform hardware initialization d ...
to access the video capabilities of PCs was slow. Applications that were display-intensive typically used to access CGA video memory directly by casting the hexadecimal
Hexadecimal (also known as base-16 or simply hex) is a Numeral system#Positional systems in detail, positional numeral system that represents numbers using a radix (base) of sixteen. Unlike the decimal system representing numbers using ten symbo ...
constant 0xB8000 to a pointer to an array of 80 unsigned 16-bit int values. Each value consisted of an ASCII
ASCII ( ), an acronym for American Standard Code for Information Interchange, is a character encoding standard for representing a particular set of 95 (English language focused) printable character, printable and 33 control character, control c ...
code in the low byte, and a colour in the high byte. Thus, to put the letter 'A' at row 5, column 2 in bright white on blue, one would write code like the following:
#define VID ((unsigned short (*) 00xB8000)
void foo(void)
Use in control tables
Control tables that are used to control program flow
In computer science, control flow (or flow of control) is the order in which individual Statement (computer science), statements, Instruction (computer science), instructions or function calls of an imperative programming, imperative computer pro ...
usually make extensive use of pointers. The pointers, usually embedded in a table entry, may, for instance, be used to hold the entry points to subroutine
In computer programming, a function (also procedure, method, subroutine, routine, or subprogram) is a callable unit of software logic that has a well-defined interface and behavior and can be invoked multiple times.
Callable units provide a ...
s to be executed, based on certain conditions defined in the same table entry. The pointers can however be simply indexes to other separate, but associated, tables comprising an array of the actual addresses or the addresses themselves (depending upon the programming language constructs available). They can also be used to point to earlier table entries (as in loop processing) or forward to skip some table entries (as in a switch
In electrical engineering, a switch is an electrical component that can disconnect or connect the conducting path in an electrical circuit, interrupting the electric current or diverting it from one conductor to another. The most common type o ...
or "early" exit from a loop). For this latter purpose, the "pointer" may simply be the table entry number itself and can be transformed into an actual address by simple arithmetic.
Typed pointers and casting
In many languages, pointers have the additional restriction that the object they point to has a specific type
Type may refer to:
Science and technology Computing
* Typing, producing text via a keyboard, typewriter, etc.
* Data type, collection of values used for computations.
* File type
* TYPE (DOS command), a command to display contents of a file.
* ...
. For example, a pointer may be declared to point to an integer
An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
; the language will then attempt to prevent the programmer from pointing it to objects which are not integers, such as floating-point number
In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a signed sequence of a fixed number of digits in some base) multiplied by an integer power of that base.
Numbers of this form ...
s, eliminating some errors.
For example, in C
int *money;
char *bags;
money
would be an integer pointer and bags
would be a char pointer.
The following would yield a compiler warning of "assignment from incompatible pointer type" under GCC
bags = money;
because money
and bags
were declared with different types.
To suppress the compiler warning, it must be made explicit that you do indeed wish to make the assignment by typecasting
In film, television, and theatre, typecasting is the process by which a particular actor becomes strongly identified with a specific character, one or more particular roles, or characters having the same traits or coming from the same social or ...
it
bags = (char *)money;
which says to cast the integer pointer of money
to a char pointer and assign to bags
.
A 2005 draft of the C standard requires that casting a pointer derived from one type to one of another type should maintain the alignment correctness for both types (6.3.2.3 Pointers, par. 7):
char *external_buffer = "abcdef";
int *internal_data;
internal_data = (int *)external_buffer; // UNDEFINED BEHAVIOUR if "the resulting pointer
// is not correctly aligned"
In languages that allow pointer arithmetic, arithmetic on pointers takes into account the size of the type. For example, adding an integer number to a pointer produces another pointer that points to an address that is higher by that number times the size of the type. This allows us to easily compute the address of elements of an array of a given type, as was shown in the C arrays example above. When a pointer of one type is cast to another type of a different size, the programmer should expect that pointer arithmetic will be calculated differently. In C, for example, if the money
array starts at 0x2000 and sizeof(int)
is 4 bytes whereas sizeof(char)
is 1 byte, then money + 1
will point to 0x2004, but bags + 1
would point to 0x2001. Other risks of casting include loss of data when "wide" data is written to "narrow" locations (e.g. bags = 65537;
), unexpected results when bit-shifting values, and comparison problems, especially with signed vs unsigned values.
Although it is impossible in general to determine at compile-time which casts are safe, some languages store run-time type information
In computer programming, run-time type information or run-time type identification (RTTI) is a feature of some programming languages (such as C++, Object Pascal, and Ada) that exposes information about an object's data type at runtime. Run-time ...
which can be used to confirm that these dangerous casts are valid at runtime. Other languages merely accept a conservative approximation of safe casts, or none at all.
Value of pointers
In C and C++, even if two pointers compare as equal that doesn't mean they are equivalent. In these languages ''and'' LLVM
LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a Compiler#Front end, frontend for any programming language and a Compiler#Back end, backend for any instruction set architecture. LLVM i ...
, the rule is interpreted to mean that "just because two pointers point to the same address, does not mean they are equal in the sense that they can be used interchangeably", the difference between the pointers referred to as their ''provenance''. Casting to an integer type such as uintptr_t
is implementation-defined and the comparison it provides does not provide any more insight as to whether the two pointers are interchangeable. In addition, further conversion to bytes and arithmetic will throw off optimizers trying to keep track the use of pointers, a problem still being elucidated in academic research.
Making pointers safer
As a pointer allows a program to attempt to access an object that may not be defined, pointers can be the origin of a variety of programming errors. However, the usefulness of pointers is so great that it can be difficult to perform programming tasks without them. Consequently, many languages have created constructs designed to provide some of the useful features of pointers without some of their pitfalls, also sometimes referred to as ''pointer hazards''. In this context, pointers that directly address memory (as used in this article) are referred to as s, by contrast with smart pointers or other variants.
One major problem with pointers is that as long as they can be directly manipulated as a number, they can be made to point to unused addresses or to data which is being used for other purposes. Many languages, including most functional programming languages and recent imperative programming
In computer science, imperative programming is a programming paradigm of software that uses Statement (computer science), statements that change a program's state (computer science), state. In much the same way that the imperative mood in natural ...
languages like Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, replace pointers with a more opaque type of reference, typically referred to as simply a ''reference'', which can only be used to refer to objects and not manipulated as numbers, preventing this type of error. Array indexing is handled as a special case.
A pointer which does not have any address assigned to it is called a wild pointer. Any attempt to use such uninitialized pointers can cause unexpected behavior, either because the initial value is not a valid address, or because using it may damage other parts of the program. The result is often a segmentation fault, storage violation or wild branch (if used as a function pointer or branch address).
In systems with explicit memory allocation, it is possible to create a dangling pointer
Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references a ...
by deallocating the memory region it points into. This type of pointer is dangerous and subtle because a deallocated memory region may contain the same data as it did before it was deallocated but may be then reallocated and overwritten by unrelated code, unknown to the earlier code. Languages with garbage collection prevent this type of error because deallocation is performed automatically when there are no more references in scope.
Some languages, like C++, support smart pointers, which use a simple form of reference counting
In computer science, reference counting is a programming technique of storing the number of references, pointers, or handles to a resource, such as an object, a block of memory, disk space, and others.
In garbage collection algorithms, refere ...
to help track allocation of dynamic memory in addition to acting as a reference. In the absence of reference cycles, where an object refers to itself indirectly through a sequence of smart pointers, these eliminate the possibility of dangling pointers and memory leaks. Delphi
Delphi (; ), in legend previously called Pytho (Πυθώ), was an ancient sacred precinct and the seat of Pythia, the major oracle who was consulted about important decisions throughout the ancient Classical antiquity, classical world. The A ...
strings support reference counting natively.
The Rust programming language introduces a ''borrow checker'', ''pointer lifetimes'', and an optimisation based around option types for null pointer
In computing, a null pointer (sometimes shortened to nullptr or null) or null reference is a value saved for indicating that the Pointer (computer programming), pointer or reference (computer science), reference does not refer to a valid Object (c ...
s to eliminate pointer bugs, without resorting to Garbage collection (computer science), garbage collection.
Special kinds of pointers
Kinds defined by value
Null pointer
A null pointer has a value reserved for indicating that the pointer does not refer to a valid object. Null pointers are routinely used to represent conditions such as the end of a List (computing), list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types and to the ''Nothing'' value in an option type.
Dangling pointer
A dangling pointer is a pointer that does not point to a valid object and consequently may make a program crash or behave oddly. In the Pascal or C (programming language), C programming languages, pointers that are not specifically initialized may point to unpredictable addresses in memory.
The following example code shows a dangling pointer:
int func(void)
Here, p2
may point to anywhere in memory, so performing the assignment *p2 = 'b';
can corrupt an unknown area of memory or trigger a segmentation fault.
Wild branch
Where a pointer is used as the address of the entry point to a program or start of a subroutine, function which doesn't return anything and is also either uninitialized or corrupted, if a call or unconditional branch, jump is nevertheless made to this address, a " wild branch" is said to have occurred. In other words, a wild branch is a function pointer that is wild (dangling).
The consequences are usually unpredictable and the error may present itself in several different ways depending upon whether or not the pointer is a "valid" address and whether or not there is (coincidentally) a valid instruction (opcode) at that address. The detection of a wild branch can present one of the most difficult and frustrating debugging exercises since much of the evidence may already have been destroyed beforehand or by execution of one or more inappropriate instructions at the branch location. If available, an instruction set simulator can usually not only detect a wild branch before it takes effect, but also provide a complete or partial trace of its history.
Kinds defined by structure
Autorelative pointer
An autorelative pointer is a pointer whose value is interpreted as an offset from the address of the pointer itself; thus, if a data structure has an autorelative pointer member that points to some portion of the data structure itself, then the data structure may be relocated in memory without having to update the value of the auto relative pointer.
The cited patent also uses the term self-relative pointer to mean the same thing. However, the meaning of that term has been used in other ways:
* to mean an offset from the address of a structure rather than from the address of the pointer itself;
* to mean a pointer containing its own address, which can be useful for reconstructing in any arbitrary region of memory a collection of data structures that point to each other.
Based pointer
A based pointer is a pointer whose value is an offset from the value of another pointer. This can be used to store and load blocks of data, assigning the address of the beginning of the block to the base pointer.
Kinds defined by use or datatype
Multiple indirection
In some languages, a pointer can reference another pointer, requiring multiple dereference operations to get to the original value. While each level of indirection may add a performance cost, it is sometimes necessary in order to provide correct behavior for complex data structures
In computer science, a data structure is a data organization and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, and the functi ...
. For example, in C it is typical to define a linked list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
in terms of an element that contains a pointer to the next element of the list:
struct element ;
struct element *head = NULL;
This implementation uses a pointer to the first element in the list as a surrogate for the entire list. If a new value is added to the beginning of the list, head
has to be changed to point to the new element. Since C arguments are always passed by value, using double indirection allows the insertion to be implemented correctly, and has the desirable side-effect of eliminating special case code to deal with insertions at the front of the list:
// Given a sorted list at *head, insert the element item at the first
// location where all earlier elements have lesser or equal value.
void insert(struct element **head, struct element *item)
// Caller does this:
insert(&head, item);
In this case, if the value of item
is less than that of head
, the caller's head
is properly updated to the address of the new item.
A basic example is in the argv argument to the main function#C and C++, main function in C (and C++), which is given in the prototype as char **argv
—this is because the variable argv
itself is a pointer to an array of strings (an array of arrays), so *argv
is a pointer to the 0th string (by convention the name of the program), and **argv
is the 0th character of the 0th string.
Function pointer
In some languages, a pointer can reference executable code, i.e., it can point to a function, method, or procedure. A function pointer will store the address of a function to be invoked. While this facility can be used to call functions dynamically, it is often a favorite technique of virus and other malicious software writers.
int sum(int n1, int n2)
int main(void)
Back pointer
In doubly linked list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
s or tree (data structure), tree structures, a back pointer held on an element 'points back' to the item referring to the current element. These are useful for navigation and manipulation, at the expense of greater memory use.
Simulation using an array index
It is possible to simulate pointer behavior using an index to an (normally one-dimensional) array.
Primarily for languages which do not support pointers explicitly but ''do'' support arrays, the Array data type, array can be thought of and processed as if it were the entire memory range (within the scope of the particular array) and any index to it can be thought of as equivalent to a general-purpose register
A processor register is a quickly accessible location available to a computer's processor. Registers usually consist of a small amount of fast storage, although some registers have specific hardware functions, and may be read-only or write-onl ...
in assembly language (that points to the individual bytes but whose actual value is relative to the start of the array, not its absolute address in memory).
Assuming the array is, say, a contiguous 16 megabyte character data structure
In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
, individual bytes (or a String (computer science), string of contiguous bytes within the array) can be directly addressed and manipulated using the name of the array with a 31 bit unsigned integer
An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
as the simulated pointer (this is quite similar to the ''C arrays'' example shown above). Pointer arithmetic can be simulated by adding or subtracting from the index, with minimal additional overhead compared to genuine pointer arithmetic.
It is even theoretically possible, using the above technique, together with a suitable instruction set simulator to simulate ''any'' machine code or the intermediate (byte code) of ''any'' processor/language in another language that does not support pointers at all (for example Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
/ JavaScript). To achieve this, the binary numeral system, binary code can initially be loaded into contiguous bytes of the array for the simulator to "read", interpret and execute entirely within the memory containing the same array.
If necessary, to completely avoid buffer overflow problems, bounds checking
In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as ...
can usually be inserted by the compiler (or if not, hand coded in the simulator).
Support in various programming languages
Ada
Ada (programming language), Ada is a strongly typed language where all pointers are typed and only safe type conversions are permitted. All pointers are by default initialized to null
, and any attempt to access data through a null
pointer causes an Exception handling, exception to be raised. Pointers in Ada are called ''access types''. Ada 83 did not permit arithmetic on access types (although many compiler vendors provided for it as a non-standard feature), but Ada 95 supports “safe” arithmetic on access types via the package System.Storage_Elements
.
BASIC
Several old versions of BASIC for the Windows platform had support for STRPTR() to return the address of a string, and for VARPTR() to return the address of a variable. Visual Basic 5 also had support for OBJPTR() to return the address of an object interface, and for an ADDRESSOF operator to return the address of a function. The types of all of these are integers, but their values are equivalent to those held by pointer types.
Newer dialects of BASIC, such as FreeBASIC
FreeBASIC is a FOSS, free and open source multiplatform compiler and programming language based on BASIC licensed under the GNU General Public License, GNU GPL for Microsoft Windows, protected-mode MS-DOS (DOS extender), Linux, FreeBSD and Xbox ...
or BlitzMax, have exhaustive pointer implementations, however. In FreeBASIC, arithmetic on ANY
pointers (equivalent to C's void*
) are treated as though the ANY
pointer was a byte width. ANY
pointers cannot be dereferenced, as in C. Also, casting between ANY
and any other type's pointers will not generate any warnings.
dim as integer f = 257
dim as any ptr g = @f
dim as integer ptr i = g
assert(*i = 257)
assert( (g + 4) = (@f + 1) )
C and C++
In C and C++ pointers are variables that store addresses and can be ''null''. Each pointer has a type it points to, but one can freely cast between pointer types (but not between a function pointer and an object pointer). A special pointer type called the “void pointer” allows pointing to any (non-function) object, but is limited by the fact that it cannot be dereferenced directly (it shall be cast). The address itself can often be directly manipulated by casting a pointer to and from an integral type of sufficient size, though the results are implementation-defined and may indeed cause undefined behavior; while earlier C standards did not have an integral type that was guaranteed to be large enough, C99 specifies the uintptr_t
''typedef name'' defined in C_data_types#Fixed-width_integer_types,
, but an implementation need not provide it.
C++ fully supports C pointers and C typecasting. It also supports a new group of typecasting operators to help catch some unintended dangerous casts at compile-time. Since C++11, the C++ Standard Library, C++ standard library also provides smart pointers (unique_ptr
, shared_ptr
and weak_ptr
) which can be used in some situations as a safer alternative to primitive C pointers. C++ also supports another form of reference, quite different from a pointer, called simply a ''reference (C++), reference'' or ''reference type''.
Pointer arithmetic, that is, the ability to modify a pointer's target address with arithmetic operations (as well as magnitude comparisons), is restricted by the language standard to remain within the bounds of a single array object (or just after it), and will otherwise invoke undefined behavior
In computer programming, a program exhibits undefined behavior (UB) when it contains, or is executing code for which its programming language specification does not mandate any specific requirements. This is different from unspecified behavior, ...
. Adding or subtracting from a pointer moves it by a multiple of the size of its datatype. For example, adding 1 to a pointer to 4-byte integer values will increment the pointer's pointed-to byte-address by 4. This has the effect of incrementing the pointer to point at the next element in a contiguous array of integers—which is often the intended result. Pointer arithmetic cannot be performed on void
pointers because the void type has no size, and thus the pointed address can not be added to, although GNU Compiler Collection, gcc and other compilers will perform byte arithmetic on void*
as a non-standard extension, treating it as if it were char *
.
Pointer arithmetic provides the programmer with a single way of dealing with different types: adding and subtracting the number of elements required instead of the actual offset in bytes. (Pointer arithmetic with char *
pointers uses byte offsets, because sizeof(char)
is 1 by definition.) In particular, the C definition explicitly declares that the syntax a[n]
, which is the n
-th element of the array a
, is equivalent to *(a + n)
, which is the content of the element pointed by a + n
. This implies that n[a]
is equivalent to a[n]
, and one can write, e.g., a[3]
or 3[a]
equally well to access the fourth element of an array a
.
While powerful, pointer arithmetic can be a source of Software bug, computer bugs. It tends to confuse novice programmers, forcing them into different contexts: an expression can be an ordinary arithmetic one or a pointer arithmetic one, and sometimes it is easy to mistake one for the other. In response to this, many modern high-level computer languages (for example Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
) do not permit direct access to memory using addresses. Also, the safe C dialect Cyclone programming language, Cyclone addresses many of the issues with pointers. See C (programming language)#Pointers, C programming language for more discussion.
The void
pointer, or void*
, is supported in ANSI C and C++ as a generic pointer type. A pointer to void
can store the address of any object (not function), and, in C, is implicitly converted to any other object pointer type on assignment, but it must be explicitly cast if dereferenced.
The C Programming Language, K&R C used char*
for the “type-agnostic pointer” purpose (before ANSI C).
int x = 4;
void* p1 = &x;
int* p2 = p1; // void* implicitly converted to int*: valid C, but not C++
int a = *p2;
int b = *(int*)p1; // when dereferencing inline, there is no implicit conversion
C++ does not allow the implicit conversion of void*
to other pointer types, even in assignments. This was a design decision to avoid careless and even unintended casts, though most compilers only output warnings, not errors, when encountering other casts.
int x = 4;
void* p1 = &x;
int* p2 = p1; // this fails in C++: there is no implicit conversion from void*
int* p3 = (int*)p1; // C-style cast
int* p4 = reinterpret_cast(p1); // C++ cast
In C++, there is no void&
(reference to void) to complement void*
(pointer to void), because references behave like aliases to the variables they point to, and there can never be a variable whose type is void
.
Pointer-to-member
In C++ pointers to non-static members of a class can be defined. If a class C
has a member T a
then &C::a
is a pointer to the member a
of type T C::*
. This member can be an object or a Function pointer#Method pointers, function. They can be used on the right-hand side of operators .*
and ->*
to access the corresponding member.
struct S ;
S s1;
S* ptrS = &s1;
int S::* ptr = &S::a; // pointer to S::a
int (S::* fp)()const = &S::f; // pointer to S::f
s1.*ptr = 1;
std::cout << (s1.*fp)() << "\n"; // prints 1
ptrS->*ptr = 2;
std::cout << (ptrS->*fp)() << "\n"; // prints 2
Pointer declaration syntax overview
These pointer declarations cover most variants of pointer declarations. Of course it is possible to have triple pointers, but the main principles behind a triple pointer already exist in a double pointer. The naming used here is what the expression typeid(type).name()
equals for each of these types when using g++ or clang.
char A5_A5_c /* array of arrays of chars */
char *A5_Pc /* array of pointers to chars */
char **PPc; /* pointer to pointer to char ("double pointer") */
char (*PA5_c) /* pointer to array(s) of chars */
char *FPcvE(); /* function which returns a pointer to char(s) */
char (*PFcvE)(); /* pointer to a function which returns a char */
char (*FPA5_cvE()) /* function which returns pointer to an array of chars */
char (*A5_PFcvE[5])(); /* an array of pointers to functions which return a char */
The following declarations involving pointers-to-member are valid only in C++:
class C;
class D;
char C::* M1Cc; /* pointer-to-member to char */
char C::*A5_M1Cc /* array of pointers-to-member to char */
char* C::* M1CPc; /* pointer-to-member to pointer to char(s) */
char C::** PM1Cc; /* pointer to pointer-to-member to char */
char (*M1CA5_c) /* pointer-to-member to array(s) of chars */
char C::* FM1CcvE(); /* function which returns a pointer-to-member to char */
char D::* C::* M1CM1Dc; /* pointer-to-member to pointer-to-member to pointer to char(s) */
char C::* C::* M1CMS_c; /* pointer-to-member to pointer-to-member to pointer to char(s) */
char (C::* FM1CA5_cvE()) /* function which returns pointer-to-member to an array of chars */
char (C::* M1CFcvE)() /* pointer-to-member-function which returns a char */
char (C::* A5_M1CFcvE[5])(); /* an array of pointers-to-member-functions which return a char */
The ()
and []
have a higher priority than *
.
C#
In the C Sharp (programming language), C# programming language, pointers are supported by either marking blocks of code that include pointers with the unsafe
keyword, or by using
the System.Runtime.CompilerServices
assembly provisions for pointer access.
The syntax is essentially the same as in C++, and the address pointed can be either Managed code, managed or Managed code, unmanaged memory. However, pointers to managed memory (any pointer to a managed object) must be declared using the fixed
keyword, which prevents the Garbage collection (computer science), garbage collector from moving the pointed object as part of memory management while the pointer is in scope, thus keeping the pointer address valid.
However, an exception to this is from using the IntPtr
structure, which is a memory managed equivalent to int*
, and does not require the unsafe
keyword nor the CompilerServices
assembly. This type is often returned when using methods from the System.Runtime.InteropServices
, for example:
// Get 16 bytes of memory from the process's unmanaged memory
IntPtr pointer = System.Runtime.InteropServices.Marshal.AllocHGlobal(16);
// Do something with the allocated memory
// Free the allocated memory
System.Runtime.InteropServices.Marshal.FreeHGlobal(pointer);
The .NET Framework, .NET framework includes many classes and methods in the System
and System.Runtime.InteropServices
namespaces (such as the Marshal
class) which convert .NET types (for example, System.String
) to and from many Managed code, unmanaged types and pointers (for example, LPWSTR
or void*
) to allow communication with Managed code, unmanaged code. Most such methods have the same security permission requirements as unmanaged code, since they can affect arbitrary places in memory.
COBOL
The COBOL programming language supports pointers to variables. Primitive or group (record) data objects declared within the LINKAGE SECTION
of a program are inherently pointer-based, where the only memory allocated within the program is space for the address of the data item (typically a single memory word). In program source code, these data items are used just like any other WORKING-STORAGE
variable, but their contents are implicitly accessed indirectly through their LINKAGE
pointers.
Memory space for each pointed-to data object is typically dynamic memory allocation, allocated dynamically using external subroutine, CALL
statements or via embedded extended language constructs such as EXEC CICS, EXEC CICS
or SQL, EXEC SQL
statements.
Extended versions of COBOL also provide pointer variables declared with USAGE
IS
POINTER
clauses. The values of such pointer variables are established and modified using SET
and SET
ADDRESS
statements.
Some extended versions of COBOL also provide PROCEDURE-POINTER
variables, which are capable of storing the function pointer, addresses of executable code.
PL/I
The PL/I
PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
language provides full support for pointers to all data types (including pointers to structures), recursion, Computer multitasking, multitasking, string handling, and extensive built-in subroutine, functions. PL/I was quite a leap forward compared to the programming languages of its time. PL/I pointers are untyped, and therefore no casting is required for pointer dereferencing or assignment. The declaration syntax for a pointer is DECLARE xxx POINTER;
, which declares a pointer named "xxx". Pointers are used with BASED
variables. A based variable can be declared with a default locator (DECLARE xxx BASED(ppp);
or without (DECLARE xxx BASED;
), where xxx is a based variable, which may be an element variable, a structure, or an array, and ppp is the default pointer). Such a variable can be address without an explicit pointer reference (xxx=1;
, or may be addressed with an explicit reference to the default locator (ppp), or to any other pointer (qqq->xxx=1;
).
Pointer arithmetic is not part of the PL/I standard, but many compilers allow expressions of the form ptr = ptr±expression
. IBM PL/I also has the builtin function PTRADD
to perform the arithmetic. Pointer arithmetic is always performed in bytes.
IBM ''Enterprise'' PL/I compilers have a new form of typed pointer called a HANDLE
.
D
The D (programming language), D programming language is a derivative of C and C++ which fully supports C pointers and C typecasting.
Eiffel
The Eiffel (programming language), Eiffel object-oriented language employs value and reference semantics without pointer arithmetic. Nevertheless, pointer classes are provided. They offer pointer arithmetic, typecasting, explicit memory management,
interfacing with non-Eiffel software, and other features.
Fortran
Fortran, Fortran-90 introduced a strongly typed pointer capability. Fortran pointers contain more than just a simple memory address. They also encapsulate the lower and upper bounds of array dimensions, strides (for example, to support arbitrary array sections), and other metadata. An ''association operator'', =>
is used to associate a POINTER
to a variable which has a TARGET
attribute. The Fortran-90 ALLOCATE
statement may also be used to associate a pointer to a block of memory. For example, the following code might be used to define and create a linked list structure:
type real_list_t
real :: sample_data(100)
type (real_list_t), pointer :: next => null ()
end type
type (real_list_t), target :: my_real_list
type (real_list_t), pointer :: real_list_temp
real_list_temp => my_real_list
do
read (1,iostat=ioerr) real_list_temp%sample_data
if (ioerr /= 0) exit
allocate (real_list_temp%next)
real_list_temp => real_list_temp%next
end do
Fortran-2003 adds support for procedure pointers. Also, as part of the ''C Interoperability'' feature, Fortran-2003 supports intrinsic functions for converting C-style pointers into Fortran pointers and back.
Go
Go (programming language), Go has pointers. Its declaration syntax is equivalent to that of C, but written the other way around, ending with the type. Unlike C, Go has garbage collection, and disallows pointer arithmetic. Reference types, like in C++, do not exist. Some built-in types, like maps and channels, are boxed (i.e. internally they are pointers to mutable structures), and are initialized using the make
function. In an approach to unified syntax between pointers and non-pointers, the arrow (->
) operator has been dropped: the dot operator on a pointer refers to the field or method of the dereferenced object. This, however, only works with 1 level of indirection.
Java
There is no explicit representation of pointers in Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
. Instead, more complex data structures like Object-oriented programming, objects and Array data structure, arrays are implemented using reference (computer science), references. The language does not provide any explicit pointer manipulation operators. It is still possible for code to attempt to dereference a null reference (null pointer), however, which results in a run-time exception handling, exception being thrown. The space occupied by unreferenced memory objects is recovered automatically by garbage collection at run-time.
Modula-2
Pointers are implemented very much as in Pascal, as are VAR
parameters in procedure calls. Modula-2 is even more strongly typed than Pascal, with fewer ways to escape the type system. Some of the variants of Modula-2 (such as Modula-3) include garbage collection.
Oberon
Much as with Modula-2, pointers are available. There are still fewer ways to evade the type system and so Oberon (programming language), Oberon and its variants are still safer with respect to pointers than Modula-2 or its variants. As with Modula-3, garbage collection is a part of the language specification.
Pascal
Unlike many languages that feature pointers, standard International Organization for Standardization, ISO Pascal only allows pointers to reference dynamically created variables that are anonymous and does not allow them to reference standard static or local variables. It does not have pointer arithmetic. Pointers also must have an associated type and a pointer to one type is not compatible with a pointer to another type (e.g. a pointer to a char is not compatible with a pointer to an integer). This helps eliminate the type security issues inherent with other pointer implementations, particularly those used for PL/I
PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
or C (Programming Language), C. It also removes some risks caused by dangling pointers, but the ability to dynamically let go of referenced space by using the dispose
standard procedure (which has the same effect as the free
library function found in C (Programming Language), C) means that the risk of dangling pointers has not been entirely eliminated.
However, in some commercial and open source Pascal (or derivatives) compiler implementations —like Free Pascal, Turbo Pascal or the Object Pascal in Embarcadero Delphi— a pointer is allowed to reference standard static or local variables and can be cast from one pointer type to another. Moreover, pointer arithmetic is unrestricted: adding or subtracting from a pointer moves it by that number of bytes in either direction, but using the Inc
or Dec
standard procedures with it moves the pointer by the size of the data type
In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
it is ''declared'' to point to. An untyped pointer is also provided under the name Pointer
, which is compatible with other pointer types.
Perl
The Perl programming language
A programming language is a system of notation for writing computer programs.
Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
supports pointers, although rarely used, in the form of the pack and unpack functions. These are intended only for simple interactions with compiled OS libraries. In all other cases, Perl uses references
A reference is a relationship between Object (philosophy), objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. ...
, which are typed and do not allow any form of pointer arithmetic. They are used to construct complex data structures.
See also
* Address constant
* Bounded pointer
* Buffer overflow
* Cray pointer
* Fat pointer
* Function pointer
* Hazard pointer
* Iterator
* Opaque pointer
* Pointee
* Pointer swizzling
* Reference (computer science)
* Static program analysis
* Storage violation
* Tagged pointer
* Variable (computer science)
* Zero-based numbering
Notes
References
External links
PL/I List Processing
Paper from the June, 1967 issue of CACM
cdecl.org
A tool to convert pointer declarations to plain English
Over IQ.com
A beginner level guide describing pointers in a plain English.
Pointers and Memory
Introduction to pointers – Stanford Computer Science Education Library
A visual model for beginner C programmiers
0pointer.de
A terse list of minimum length source codes that dereference a null pointer in several different programming languages
* Committee draft.
{{DEFAULTSORT:Pointer (Computing)
Articles with example C code
Pointers (computer programming)
Primitive types
American inventions
Programming language comparisons
Articles with example Ada code
Articles with example BASIC code
Articles with example C++ code
Articles with example C Sharp code
Articles with example D code
Articles with example Eiffel code
Articles with example Fortran code
Articles with example Java code
Articles with example Pascal code
sv:Datatyp#Pekare och referenstyper