0x7D0, 0x7D4, 0x7D8, ..., 0x7F4) so that the element with index ''i'' has the address 2000 + (''i'' × 4).
The memory address of the first element of an array is called the first address, foundation address, or base address.
Because the mathematical concept of a
History
The first digital computers used machine-language programming to set up and access array structures for data tables, vector and matrix computations, and many other purposes.
Applications
Arrays are used to implement mathematical vectors and
IF statements. They are known in this context as
Element identifier and addressing formulas
When data objects are stored in an array, individual objects are selected by an index that is usually a non-negative scalar integer. For example, a two-dimensional array ''A'' with three rows and four columns might provide access to the element at the 2nd row and 4th column by the expression ''A''[1][3]
in the case of a zero-based indexing system. Thus two indices are used for a two-dimensional array, three for a three-dimensional array, and ''n'' for an ''n''-dimensional array.
The number of indices needed to specify an element is called the dimension, dimensionality, or rank of the array.
In standard arrays, each index is restricted to a certain range of consecutive integers (or consecutive values of some enumerated type).
One-dimensional arrays
A one-dimensional array (or single-dimension array) is a type of linear array. Accessing its elements involves a single subscript, which can represent either a row or a column index. As an example, consider the C declaration int anArrayName[10];
which declares a one-dimensional array of ten integers, that is, an array that can store ten elements of type int. Its indices run from zero through nine. For example, the expressions anArrayName[0] and anArrayName[9] are the first and last elements, respectively.
For a vector with linear addressing, the element with index ''i'' is located at the address ''B'' + ''c'' · ''i'', where ''B'' is a fixed ''base address'' and ''c'' a fixed constant, sometimes called the ''address increment'' or ''stride''.
If the valid element indices begin at 0, the constant ''B'' is simply the address of the first element of the array. For this reason, the C programming language specifies that array indices always begin at 0, and many programmers will call that element "zeroth" rather than "first".
However, one can choose the index of the first element by an appropriate choice of the base address ''B''. For example, if the array has five elements, indexed 1 through 5, and the base address ''B'' is replaced by ''B'' − 30''c'', then the indices of those same elements will be 31 to 35. If the numbering does not start at 0, the constant ''B'' may not be the address of any element.
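The linear addressing rule can be sketched in C as follows. This is only an illustration: the function names `element_address` and `shifted_base` are hypothetical, and addresses are modelled as plain integers so that a shifted base that points at no element remains well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of the element with index i in a vector with linear addressing:
   B + c * i, where B is the base address and c the stride in bytes.
   Unsigned arithmetic wraps, so negative indices are handled as well. */
uintptr_t element_address(uintptr_t B, size_t c, long i)
{
    assert(c != 0);  /* a zero stride would map every index to one address */
    return B + (uintptr_t)((intptr_t)c * i);
}

/* Base address to use when the element stored at address `first_elem`
   should have index `first` instead of 0: B = first_elem - c * first. */
uintptr_t shifted_base(uintptr_t first_elem, size_t c, long first)
{
    assert(c != 0);
    return first_elem - (uintptr_t)((intptr_t)c * first);
}
```

With a first element at address 2000 and a stride of 4, `shifted_base(2000, 4, 1)` gives 1996, which is not the address of any element, yet `element_address(1996, 4, 1)` is again 2000.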
Multidimensional arrays
For a two-dimensional array, the element with indices ''i'', ''j'' would have address ''B'' + ''c'' · ''i'' + ''d'' · ''j'', where the coefficients ''c'' and ''d'' are the ''row'' and ''column address increments'', respectively.
More generally, in a ''k''-dimensional array, the address of an element with indices ''i''_1, ''i''_2, ..., ''i''_''k'' is
: ''B'' + ''c''_1 · ''i''_1 + ''c''_2 · ''i''_2 + … + ''c''_''k'' · ''i''_''k''.
For example, the C declaration int a[2][3]; defines an array a of integer type with 2 rows and 3 columns. It holds 6 elements, which are stored linearly: first the elements of the first row, then those of the second row. The array is therefore stored as a11, a12, a13, a21, a22, a23.
The addressing formula requires only ''k'' multiplications and ''k'' additions, for any array that can fit in memory. Moreover, if any coefficient is a fixed power of 2, the multiplication can be replaced by bit shifting.
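A minimal sketch of the general formula in C, assuming the increments are given in units of elements rather than bytes. The name `element_offset` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of the element with indices idx[0..k-1], given the
   per-index increments inc[0..k-1]: the sum of inc[d] * idx[d].
   Adding the result to the base address gives the element's address. */
size_t element_offset(size_t k, const size_t inc[], const size_t idx[])
{
    assert(k == 0 || (inc != NULL && idx != NULL));
    size_t offset = 0;
    for (size_t d = 0; d < k; d++) {
        offset += inc[d] * idx[d];  /* one multiplication and one addition per index */
    }
    return offset;
}
```

For an array with 2 rows and 3 columns stored row by row, the increments are 3 and 1, so the element in the second row and third column (zero-based indices 1 and 2) lies at offset 3 · 1 + 1 · 2 = 5.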
The coefficients ''c''_1, ..., ''c''_''k'' must be chosen so that every valid index tuple maps to the address of a distinct element.
If the minimum legal value for every index is 0, then ''B'' is the address of the element whose indices are all zero. As in the one-dimensional case, the element indices may be changed by changing the base address ''B''. Thus, if a two-dimensional array has rows and columns indexed from 1 to 10 and 1 to 20, respectively, then replacing ''B'' by ''B'' + ''c''_1 − 3''c''_2 will cause them to be renumbered from 0 through 9 and 4 through 23, respectively. Taking advantage of this feature, some languages (like FORTRAN 77) specify that array indices begin at 1, as in mathematical tradition, while other languages (like Fortran 90, Pascal and Algol) let the user choose the minimum value for each index.
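The renumbering in this example can be checked with a short derivation. It assumes the two-dimensional form of the general addressing formula, with row increment c_1 and column increment c_2; the primed symbols are introduced here only for illustration.

```latex
\begin{align*}
\text{address}(i, j) &= B + c_1 i + c_2 j, \qquad 1 \le i \le 10,\ 1 \le j \le 20 \\
i' = i - 1,\quad j' = j + 3 \;&\Rightarrow\; 0 \le i' \le 9,\ 4 \le j' \le 23 \\
B + c_1 i + c_2 j &= B + c_1 (i' + 1) + c_2 (j' - 3) \\
&= (B + c_1 - 3 c_2) + c_1 i' + c_2 j'
\end{align*}
```

The same elements are therefore addressed by the new indices when the base address is B' = B + c_1 − 3c_2.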
Dope vectors
The addressing formula is completely defined by the dimension ''k'', the base address ''B'', and the increments ''c''_1, ''c''_2, ..., ''c''_''k''. It is often useful to pack these parameters into a record called the array's ''descriptor'', ''stride vector'', or ''dope vector''. The size of each element, and the minimum and maximum values allowed for each index, may also be included in the dope vector. The dope vector is a complete handle for the array, and is a convenient way to pass arrays as arguments to procedures. Many useful array slicing operations (such as selecting a sub-array, swapping indices, or reversing the direction of the indices) can be performed very efficiently by manipulating the dope vector.
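As an illustration, a dope vector for arrays of int might look like the following sketch. The type `dope_vector`, the limit `MAX_RANK`, and the function names are all hypothetical; increments are counted in elements, and every index is assumed to start at 0.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 4   /* arbitrary limit chosen for this sketch */

typedef struct {
    int      *base;              /* address of the element whose indices are all zero */
    size_t    rank;              /* number of indices */
    size_t    extent[MAX_RANK];  /* number of valid values of each index */
    ptrdiff_t stride[MAX_RANK];  /* address increment of each index, in elements */
} dope_vector;

/* Select an element through the descriptor, checking the index bounds. */
int *dv_at(const dope_vector *dv, const size_t idx[])
{
    ptrdiff_t offset = 0;
    for (size_t d = 0; d < dv->rank; d++) {
        assert(idx[d] < dv->extent[d]);
        offset += dv->stride[d] * (ptrdiff_t)idx[d];
    }
    return dv->base + offset;
}

/* Swap two indices (a transpose when rank == 2) without moving any element. */
dope_vector dv_swap(dope_vector dv, size_t a, size_t b)
{
    size_t    e = dv.extent[a]; dv.extent[a] = dv.extent[b]; dv.extent[b] = e;
    ptrdiff_t s = dv.stride[a]; dv.stride[a] = dv.stride[b]; dv.stride[b] = s;
    return dv;
}

/* Reverse the direction of index d: move the base to the last element
   along d and negate the increment. */
dope_vector dv_reverse(dope_vector dv, size_t d)
{
    dv.base += dv.stride[d] * (ptrdiff_t)(dv.extent[d] - 1);
    dv.stride[d] = -dv.stride[d];
    return dv;
}
```

Both `dv_swap` and `dv_reverse` return a new descriptor for the same storage, so their cost does not depend on the number of elements.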
Compact layouts
Often the coefficients are chosen so that the elements occupy a contiguous area of memory. However, that is not necessary. Even if arrays are always created with contiguous elements, some array slicing operations may create non-contiguous sub-arrays from them.
There are two systematic compact layouts for a two-dimensional array. For example, consider the matrix
:
In the row-major order layout (adopted by C for statically declared arrays), the elements in each row are stored in consecutive positions and all of the elements of a row have a lower address than any of the elements of a consecutive row:
:
In column-major order (traditionally used by Fortran), the elements in each column are consecutive in memory and all of the elements of a column have a lower address than any of the elements of a consecutive column:
:
For arrays with three or more indices, "row major order" puts in consecutive positions any two elements whose index tuples differ only by one in the ''last'' index. "Column major order" is analogous with respect to the ''first'' index.
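The two compact layouts differ only in how the address increments are derived from the extents of the array. The sketch below computes them for any number of indices; the function names are hypothetical and the increments are counted in elements.

```c
#include <assert.h>
#include <stddef.h>

/* Fill inc[0..k-1] with the increments of a compact row-major layout:
   the last index varies fastest. */
void row_major_increments(size_t k, const size_t extent[], size_t inc[])
{
    assert(extent != NULL && inc != NULL);
    size_t step = 1;
    for (size_t d = k; d-- > 0; ) {
        inc[d] = step;
        step *= extent[d];
    }
}

/* The same for a compact column-major layout: the first index varies fastest. */
void column_major_increments(size_t k, const size_t extent[], size_t inc[])
{
    assert(extent != NULL && inc != NULL);
    size_t step = 1;
    for (size_t d = 0; d < k; d++) {
        inc[d] = step;
        step *= extent[d];
    }
}
```

As an illustrative case, a matrix with rows (1 2 3) and (4 5 6) has extents 2 and 3. Its row-major increments are 3 and 1, giving the stored sequence 1 2 3 4 5 6, while its column-major increments are 1 and 2, giving 1 4 2 5 3 6.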
In systems which use processor cache or virtual memory, scanning an array is much faster if successive elements are stored in consecutive positions in memory, rather than sparsely scattered. Many algorithms that use multidimensional arrays will scan them in a predictable order. A programmer (or a sophisticated compiler) may use this information to choose between row- or column-major layout for each array. For example, when computing the product ''A''·''B'' of two matrices, it would be best to have ''A'' stored in row-major order, and ''B'' in column-major order.
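The matrix product case can be sketched as follows, assuming matrices of double stored in flat arrays. The function name `matmul_rowcol` and its parameter order are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B, where A is n x m in row-major order, B is m x p in
   column-major order, and C is n x p in row-major order.
   The inner loop reads both A and B at consecutive addresses. */
void matmul_rowcol(size_t n, size_t m, size_t p,
                   const double *A, const double *B, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < p; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < m; k++) {
                sum += A[i * m + k] * B[j * m + k];  /* row i of A, column j of B */
            }
            C[i * p + j] = sum;
        }
    }
}
```

With this choice of layouts, row ''i'' of ''A'' and column ''j'' of ''B'' are each one contiguous run of ''m'' elements.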
Resizing
Static arrays have a size that is fixed when they are created and consequently do not allow elements to be inserted or removed. However, by allocating a new array and copying the contents of the old array to it, it is possible to effectively implement a ''dynamic'' version of an array; see dynamic array. If this operation is done infrequently, insertions at the end of the array require only amortized constant time.
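A minimal sketch of the allocate-and-copy approach, assuming a growable array of int whose capacity doubles whenever it is full, so that reallocation happens infrequently. The type `int_vec` and the function `int_vec_push` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable array of int. Initialise with {NULL, 0, 0}. */
typedef struct {
    int   *data;
    size_t size;      /* elements in use */
    size_t capacity;  /* elements allocated */
} int_vec;

/* Append x; returns 0 on success, -1 if memory could not be allocated. */
int int_vec_push(int_vec *v, int x)
{
    assert(v != NULL);
    if (v->size == v->capacity) {
        /* Allocate a larger array and copy the old contents into it. */
        size_t new_cap = v->capacity ? 2 * v->capacity : 4;
        int *new_data = malloc(new_cap * sizeof *new_data);
        if (new_data == NULL) return -1;
        if (v->size > 0) memcpy(new_data, v->data, v->size * sizeof *new_data);
        free(v->data);
        v->data = new_data;
        v->capacity = new_cap;
    }
    v->data[v->size++] = x;
    return 0;
}
```

Because the capacity doubles, appending 100 elements triggers only six reallocations (capacities 4, 8, 16, 32, 64 and 128).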
Some array data structures do not reallocate storage, but do store a count of the number of elements of the array in use, called the count or size. This effectively makes the array a dynamic array with a fixed maximum size or capacity; Pascal strings are examples of this.
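The counted variant can be sketched in the same style. The capacity of 8 and the names `bounded_array` and `bounded_push` are arbitrary choices for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CAPACITY 8   /* arbitrary fixed maximum for this sketch */

typedef struct {
    size_t count;            /* number of elements in use */
    int    items[CAPACITY];  /* storage is never reallocated */
} bounded_array;

/* Append x if there is room; returns 1 on success, 0 when the array is full. */
int bounded_push(bounded_array *a, int x)
{
    assert(a != NULL && a->count <= CAPACITY);
    if (a->count == CAPACITY) return 0;
    a->items[a->count++] = x;
    return 1;
}
```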
Non-linear formulas
More complicated (non-linear) formulas are occasionally used. For a compact two-dimensional triangular array, for instance, the addressing formula is a polynomial of degree 2.
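One possible form of such a formula, assuming zero-based indices and a lower-triangular array in which row ''i'' holds ''i'' + 1 elements, is sketched below. The name `tri_offset` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j), with 0 <= j <= i, in a compact lower-triangular
   array. Rows 0..i-1 hold 1 + 2 + ... + i = i(i+1)/2 elements in total,
   so the offset is quadratic in i. */
size_t tri_offset(size_t i, size_t j)
{
    assert(j <= i);
    return i * (i + 1) / 2 + j;
}
```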
Efficiency
Both ''store'' and ''select'' take (deterministic worst case) constant time. Arrays take linear (O(''n'')) space in the number of elements ''n'' that they hold.
In an array with element size ''k'' and on a machine with a cache line size of B bytes, iterating through an array of ''n'' elements requires a minimum of ceiling(''nk''/B) cache misses, because its elements occupy contiguous memory locations. This is roughly a factor of B/''k'' better than the number of cache misses needed to access ''n'' elements at random memory locations. As a consequence, sequential iteration over an array is noticeably faster in practice than iteration over many other data structures, a property called locality of reference. (This does ''not'' mean, however, that using a perfect hash or trivial hash within the same (local) array will not be even faster, and achievable in constant time.) Libraries provide low-level optimized facilities for copying ranges of memory (such as memcpy), which can be used to move contiguous blocks of array elements significantly faster than can be achieved through individual element access. The speedup of such optimized routines varies by array element size, architecture, and implementation.
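A block copy with the standard C function memcpy might be wrapped as in the following sketch. The helper name `copy_block` is hypothetical, and the source and destination ranges are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n elements from src[from..from+n-1] to dst[to..to+n-1] with a single
   library call. The two ranges must not overlap; memmove would be needed
   for overlapping ranges. */
void copy_block(int *dst, size_t to, const int *src, size_t from, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst + to, src + from, n * sizeof *src);
}
```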
Memory-wise, arrays are compact data structures with no per-element overhead. There may be a per-array overhead (e.g., to store index bounds), but this is language-dependent. It can also happen that elements stored in an array require ''less'' memory than the same elements stored in individual variables, because several array elements can be stored in a single word; such arrays are often called ''packed'' arrays. An extreme (but commonly used) case is the bit array, where every bit represents a single element. A single octet can thus hold up to 256 different combinations of up to 8 different conditions, in the most compact form.
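A bit array can be sketched in C on top of an array of unsigned char. The function names below are hypothetical; element ''i'' is assumed to live in bit ''i'' mod CHAR_BIT of byte ''i'' / CHAR_BIT.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Set element i of the bit array to 1. */
void bit_set(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Set element i of the bit array to 0. */
void bit_clear(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/* Return element i of the bit array (0 or 1). */
int bit_test(const unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    return (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
}
```

On a machine with 8-bit bytes, an array of two unsigned char values holds 16 one-bit elements in this scheme.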
Array accesses with statically predictable access patterns are a major source of data parallelism.
Comparison with other data structures
Dynamic arrays or growable arrays are similar to arrays but add the ability to insert and delete elements; adding and deleting at the end is particularly efficient. However, they reserve linear (Θ(''n'')) additional storage, whereas arrays do not reserve additional storage.
Associative arrays provide a mechanism for array-like functionality without huge storage overheads when the index values are sparse. For example, an array that contains values only at indexes 1 and 2 billion may benefit from using such a structure. Specialized associative arrays with integer keys include Patricia tries, Judy arrays, and van Emde Boas trees.
Balanced trees require O(log ''n'') time for indexed access, but also permit inserting or deleting elements in O(log ''n'') time, whereas growable arrays require linear (Θ(''n'')) time to insert or delete elements at an arbitrary position.
Linked lists allow constant time removal and insertion in the middle but take linear time for indexed access. Their memory use is typically worse than arrays, but is still linear.
An Iliffe vector is an alternative to a multidimensional array structure. It uses a one-dimensional array of references to arrays of one dimension less. For two dimensions, in particular, this alternative structure would be a vector of pointers to vectors, one for each row (pointers in C or C++). Thus an element in row ''i'' and column ''j'' of an array ''A'' would be accessed by double indexing (''A''[''i''][''j''] in typical notation). This alternative structure allows jagged arrays, where each row may have a different size or, in general, where the valid range of each index depends on the values of all preceding indices. It also saves one multiplication (by the column address increment), replacing it by a bit shift (to index the vector of row pointers) and one extra memory access (fetching the row address), which may be worthwhile in some architectures.
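A jagged two-dimensional array built as a vector of row pointers might be allocated as in this sketch. The names `jagged_new` and `jagged_free` are hypothetical, and error handling is kept minimal.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a jagged two-dimensional array of int as an Iliffe vector:
   a vector of row pointers, with row i holding len[i] zeroed elements.
   Returns NULL if any allocation fails. */
int **jagged_new(size_t rows, const size_t len[])
{
    assert(rows > 0 && len != NULL);
    int **a = malloc(rows * sizeof *a);
    if (a == NULL) return NULL;
    for (size_t i = 0; i < rows; i++) {
        /* Allocate at least one element so an empty row is not mistaken for failure. */
        a[i] = calloc(len[i] ? len[i] : 1, sizeof **a);
        if (a[i] == NULL) {
            while (i-- > 0) free(a[i]);
            free(a);
            return NULL;
        }
    }
    return a;
}

/* Release every row, then the vector of row pointers. */
void jagged_free(int **a, size_t rows)
{
    if (a == NULL) return;
    for (size_t i = 0; i < rows; i++) free(a[i]);
    free(a);
}
```

An element is then reached by double indexing, for example `a[1][3]`, which first fetches the row pointer `a[1]` and then indexes into that row.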
Dimension
The ''dimension'' of an array is the number of indices needed to select an element. Thus, if the array is seen as a function on a set of possible index combinations, it is the dimension of the space of which its domain is a discrete subset. Thus a one-dimensional array is a list of data, a two-dimensional array is a rectangle of data, a three-dimensional array a block of data, etc.
This should not be confused with the dimension of the set of all matrices with a given domain, that is, the number of elements in the array. For example, an array with 5 rows and 4 columns is two-dimensional, but such matrices form a 20-dimensional space. Similarly, a three-dimensional vector can be represented by a one-dimensional array of size three.
See also
* Dynamic array
* Parallel array
* Variable-length array
* Bit array
* Array slicing
* Offset (computer science)
* Row- and column-major order
* Stride of an array