In computer science, array programming refers to solutions which allow the application of operations to an entire set of values at once. Such solutions are commonly used in scientific and engineering settings.

Modern programming languages that support array programming (also known as vector or multidimensional languages) have been engineered specifically to generalize operations on scalars to apply transparently to vectors, matrices, and higher-dimensional arrays. These include APL, J, Fortran 90, MATLAB, Analytica, Octave, R, Cilk Plus, Julia, and Perl Data Language (PDL). In these languages, an operation that operates on entire arrays can be called a vectorized operation, regardless of whether it is executed on a vector processor, which implements vector instructions. Array programming primitives concisely express broad ideas about data manipulation. The level of concision can be dramatic in certain cases: it is not uncommon to find array programming language one-liners that would otherwise require several pages of object-oriented code.

Concepts of array

The fundamental idea behind array programming is that operations apply at once to an entire set of values. This makes it a high-level programming model as it allows the programmer to think and operate on whole aggregates of data, without having to resort to explicit loops of individual scalar operations.

Kenneth E. Iverson described the rationale behind array programming (actually referring to APL) as follows:

The basis behind array programming and thinking is to find and exploit the properties of data where individual elements are similar or adjacent. Unlike object orientation, which implicitly breaks down data into its constituent parts (or scalar quantities), array orientation looks to group data and apply a uniform handling.

Function rank is an important concept in array programming languages in general, by analogy to tensor rank in mathematics: functions that operate on data may be classified by the number of dimensions they act on. Ordinary multiplication, for example, is a scalar ranked function because it operates on zero-dimensional data (individual numbers). The cross product operation is an example of a vector rank function because it operates on vectors, not scalars. Matrix multiplication is an example of a 2-rank function, because it operates on 2-dimensional objects (matrices). Collapse operators reduce the dimensionality of an input data array by one or more dimensions. For example, summing over elements collapses the input array by 1 dimension.
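The notions of function rank and collapse operators described above can be sketched in Python with the NumPy library. This is only an illustration under the assumption that NumPy is available; the variable names are made up for the example, and NumPy itself does not use the term "function rank" in this way.

```python
import numpy as np

# Scalar-rank function: ordinary multiplication acts on individual numbers,
# so on arrays it is applied element by element.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
elementwise = a * b

# Vector-rank function: the cross product consumes whole 3-vectors.
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
cross = np.cross(x, y)

# 2-rank function: matrix multiplication consumes whole matrices.
m = np.arange(6).reshape(2, 3)   # 2 x 3 matrix [[0, 1, 2], [3, 4, 5]]
n = np.arange(6).reshape(3, 2)   # 3 x 2 matrix [[0, 1], [2, 3], [4, 5]]
product = m @ n                  # 2 x 2 result

# Collapse operator: summing along one axis removes that dimension.
col_sums = m.sum(axis=0)         # 2-D input, 1-D output
total = m.sum()                  # every dimension collapsed to a scalar
```

Here `m.ndim` is 2 while `col_sums.ndim` is 1, which is the one-dimension collapse the text refers to.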

Uses

Array programming is very well suited to implicit parallelization, a topic of much current research. Further, Intel and compatible CPUs developed and produced after 1997 contained various instruction set extensions, starting from MMX and continuing through SSSE3 and 3DNow!, which include rudimentary SIMD array capabilities. Array processing is distinct from parallel processing in that one physical processor performs operations on a group of items simultaneously, while parallel processing aims to split a larger problem into smaller ones (MIMD) to be solved piecemeal by numerous processors. Processors with two or more cores are increasingly common today.

Languages

The canonical examples of array programming languages are Fortran, APL, and J. Others include: A+, Analytica, Chapel, IDL, Julia, K, Klong, Q, MATLAB, GNU Octave, PDL, R, S-Lang, SAC, Nial, ZPL and TI-BASIC.

Scalar languages

In scalar languages such as C and Pascal, operations apply only to single values, so a + b expresses the addition of two numbers. In such languages, adding one array to another requires indexing and looping, the coding of which is tedious.

for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        a[i][j] += b[i][j];

In array-based languages, for example in Fortran, the nested for-loop above can be written in array format in one line,

a = a + b

or alternatively, to emphasize the array nature of the objects,

a(:,:) = a(:,:) + b(:,:)

While scalar languages like C do not have native array programming elements as part of the language proper, this does not mean programs written in these languages never take advantage of the underlying techniques of vectorization (i.e., utilizing a CPU's single instruction, multiple data (SIMD) vector instructions if it has them, or using multiple CPU cores). Some C compilers, such as GCC, at some optimization levels detect and vectorize sections of code that their heuristics determine would benefit from it. Another approach is given by the OpenMP API, which allows one to parallelize applicable sections of code by taking advantage of multiple CPU cores.
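The contrast between the explicit nested loop and the one-line array form can also be sketched in Python. The loop version below mirrors the C code on nested lists, and the array version assumes the NumPy library; the helper name `add_with_loops` is hypothetical and chosen only for this example.

```python
import numpy as np


def add_with_loops(a, b):
    """Scalar style: visit every element explicitly, as in the C loop."""
    for i in range(len(a)):
        for j in range(len(a[i])):
            a[i][j] += b[i][j]
    return a


# Scalar style on plain nested lists.
a_list = [[1, 2], [3, 4]]
b_list = [[10, 20], [30, 40]]
looped = add_with_loops(a_list, b_list)

# Array style: one expression covers the whole matrix.
a_arr = np.array([[1, 2], [3, 4]])
b_arr = np.array([[10, 20], [30, 40]])
a_arr = a_arr + b_arr
```

Both versions produce the same values; only the second leaves the iteration order to the library.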

Array languages

In array languages, operations are generalized to apply to both scalars and arrays. Thus, a + b expresses the sum of two scalars if a and b are scalars, or the sum of two arrays if they are arrays.

An array language simplifies programming but possibly at a cost known as the abstraction penalty. Because the additions are performed in isolation from the rest of the coding, they may not produce the most efficient code. (For example, additions of other elements of the same array may be subsequently encountered during the same execution, causing unnecessary repeated lookups.) Even the most sophisticated optimizing compiler would have an extremely hard time amalgamating two or more apparently disparate functions which might appear in different program sections or sub-routines, even though a programmer could do this easily, aggregating sums on the same pass over the array to minimize overhead.
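The abstraction penalty can be illustrated with a small Python sketch, assuming NumPy for the array-style half. Two array expressions each traverse the data on their own, whereas a hand-written loop can aggregate both sums in a single pass. The function name `fused_sums` is hypothetical. In Python the explicit loop is typically slower in practice because it is interpreted, so the sketch shows only the difference in traversal structure, not a performance recommendation.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])

# Array style: each expression is evaluated in isolation.
total = data.sum()                # one traversal of data
total_sq = (data * data).sum()    # builds a temporary array, then traverses it


def fused_sums(values):
    """Hand-fused version: both sums are aggregated on the same pass."""
    running_total = 0.0
    running_total_sq = 0.0
    for v in values:
        running_total += v
        running_total_sq += v * v
    return running_total, running_total_sq


fused_total, fused_total_sq = fused_sums(data.tolist())
```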

Ada

The previous C code would become the following in the Ada language, which supports array-programming syntax.

A := A + B;

APL

APL uses single-character Unicode symbols with no syntactic sugar.

A ← A + B

This operation works on arrays of any rank (including rank 0), and on a scalar and an array. Dyalog APL extends the original language with augmented assignments:

A +← B

Analytica

Analytica provides the same economy of expression as Ada.

A := A + B;

BASIC

Dartmouth BASIC had MAT statements for matrix and array manipulation in its third edition (1966).

DIM A(4),B(4),C(4)
MAT A = 1
MAT B = 2 * A
MAT C = A + B
MAT PRINT A,B,C

Mata

Stata's matrix programming language Mata supports array programming. Below, we illustrate addition, multiplication, addition of a matrix and a scalar, element by element multiplication, subscripting, and one of Mata's many inverse matrix functions.

. mata:

: A = (1,2,3) \(4,5,6)

: A
       1   2   3
    +-------------+
  1 |  1   2   3  |
  2 |  4   5   6  |
    +-------------+

: B = (2..4) \(1..3)

: B
       1   2   3
    +-------------+
  1 |  2   3   4  |
  2 |  1   2   3  |
    +-------------+

: C = J(3,2,1)           // A 3 by 2 matrix of ones

: C
       1   2
    +---------+
  1 |  1   1  |
  2 |  1   1  |
  3 |  1   1  |
    +---------+

: D = A + B

: D
       1   2   3
    +-------------+
  1 |  3   5   7  |
  2 |  5   7   9  |
    +-------------+

: E = A*C

: E
        1    2
    +-----------+
  1 |   6    6  |
  2 |  15   15  |
    +-----------+

: F = A:*B

: F
        1    2    3
    +----------------+
  1 |   2    6   12  |
  2 |   4   10   18  |
    +----------------+

: G = E :+ 3

: G
        1    2
    +-----------+
  1 |   9    9  |
  2 |  18   18  |
    +-----------+

: H = F[(2\1), (1, 2)]    // Subscripting to get a submatrix of F and
:                         // switch row 1 and 2

: H
        1    2
    +-----------+
  1 |   4   10  |
  2 |   2    6  |
    +-----------+

: I = invsym(F'*F)        // Generalized inverse (F*F^(-1)F=F) of a
:                         // symmetric positive semi-definite matrix

: I
[symmetric]
                 1             2             3
    +-------------------------------------------+
  1 |            0                              |
  2 |            0          3.25                |
  3 |            0         -1.75   .9444444444  |
    +-------------------------------------------+

: end

MATLAB

The implementation in MATLAB allows the same economy allowed by using the Fortran language.

A = A + B;

A variant of the MATLAB language is the GNU Octave language, which extends the original language with augmented assignments:

A += B;

Both MATLAB and GNU Octave natively support linear algebra operations such as matrix multiplication, matrix inversion, and the numerical solution of systems of linear equations, even using the Moore–Penrose pseudoinverse.

The inner product of two arrays can be implemented using the native matrix multiplication operator. If a is a row vector of size [1 n] and b is a corresponding column vector of size [n 1]:

a * b;

The inner product between two matrices having the same number of elements can be implemented with the auxiliary operator (:), which reshapes a given matrix into a column vector, and the transpose operator ':

A(:)' * B(:);
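For comparison, the two inner products above could be written roughly as follows in Python, assuming NumPy. This is an analogy rather than a translation: `ravel()` flattens in row-major order by default while the MATLAB (:) operator works column by column, but the inner product is unaffected as long as both operands are flattened the same way.

```python
import numpy as np

# Inner product of two vectors via the matrix multiplication operator.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
inner_vec = a @ b

# Inner product of two matrices with the same number of elements:
# flatten both to vectors, then multiply.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
inner_mat = A.ravel() @ B.ravel()
```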

rasql

The rasdaman query language is a database-oriented array-programming language. For example, two arrays could be added with the following query:

SELECT A + B
FROM   A, B

R

The R language supports the array paradigm by default. The following example illustrates a process of multiplication of two matrices followed by an addition of a scalar (which is, in fact, a one-element vector) and a vector:

> A <- matrix(1:6, nrow=2)         # A has 2 rows and 3 columns
> A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> B <- t( matrix(6:1, nrow=2) )    # t() is the transpose operator, so B has 3 rows and 2 columns
> B
     [,1] [,2]
[1,]    6    5
[2,]    4    3
[3,]    2    1
> C <- A %*% B
> C
     [,1] [,2]
[1,]   28   19
[2,]   40   28
> D <- C + 1
> D
     [,1] [,2]
[1,]   29   20
[2,]   41   29
> D + c(1, 1)                      # c() creates a vector
     [,1] [,2]
[1,]   30   21
[2,]   42   30

Mathematical reasoning and language notation

The matrix left-division operator concisely expresses some semantic properties of matrices. As in the scalar equivalent, if the (determinant of the) coefficient (matrix) A is not null then it is possible to solve the (vectorial) equation A * x = b by left-multiplying both sides by the inverse of A: A⁻¹ (in both MATLAB and GNU Octave languages: A^-1). The following mathematical statements hold when A is a full rank square matrix:

A^-1 * (A * x) == A^-1 * (b)
(A^-1 * A) * x == A^-1 * b       (matrix-multiplication associativity)
x = A^-1 * b

where == is the equivalence relational operator. The previous statements are also valid MATLAB expressions if the third one is executed before the others (numerical comparisons may be false because of round-off errors).

If the system is overdetermined (so that A has more rows than columns), the pseudoinverse A⁺ (in MATLAB and GNU Octave languages: pinv(A)) can replace the inverse A⁻¹, as follows:

pinv(A) * (A * x) == pinv(A) * (b)
(pinv(A) * A) * x == pinv(A) * b       (matrix-multiplication associativity)
x = pinv(A) * b

However, these solutions are neither the most concise ones (e.g. there still remains the need to notationally differentiate overdetermined systems) nor the most computationally efficient. The latter point is easy to understand when considering again the scalar equivalent a * x = b, for which the solution x = a^-1 * b would require two operations instead of the more efficient x = b / a. The problem is that matrix multiplication is generally not commutative, as the extension of the scalar solution to the matrix case would require:

(a * x) / a == b / a
(x * a) / a == b / a       (commutativity does not hold for matrices!)
x * (a / a) == b / a       (associativity also holds for matrices)
x = b / a

The MATLAB language introduces the left-division operator \ to maintain the essential part of the analogy with the scalar case, therefore simplifying the mathematical reasoning and preserving the conciseness:

A \ (A * x) == A \ b
(A \ A) * x == A \ b       (associativity also holds for matrices, commutativity is no longer required)
x = A \ b

This is not only an example of terse array programming from the coding point of view but also from the computational efficiency perspective, which in several array programming languages benefits from quite efficient linear algebra libraries such as ATLAS or LAPACK.

Returning to the previous quotation of Iverson, the rationale behind it should now be evident.
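The difference between forming an inverse and solving the system directly can be sketched in Python, assuming NumPy. The functions `np.linalg.solve` and `np.linalg.lstsq` are used here only as rough analogues of the left-division operator for the square and overdetermined cases respectively; they are separate functions rather than a single operator, and the matrices are made-up example data.

```python
import numpy as np

# Square, full-rank system A * x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x_inv = np.linalg.inv(A) @ b       # explicit inverse, then a multiplication
x_solve = np.linalg.solve(A, b)    # solves directly, no inverse is formed

# Overdetermined system: more rows than columns.
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
b_tall = np.array([1.0, 2.0, 3.0])

x_pinv = np.linalg.pinv(A_tall) @ b_tall                    # pseudoinverse route
x_lstsq, *_ = np.linalg.lstsq(A_tall, b_tall, rcond=None)   # least-squares solve
```

In both cases the two routes agree up to round-off, which is why the comparisons are better made with a tolerance (for example `np.allclose`) than with exact equality.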

Third-party libraries

The use of specialized and efficient libraries to provide more terse abstractions is also common in other programming languages. In C++ several linear algebra libraries exploit the language's ability to overload operators. In some cases a very terse abstraction in those languages is explicitly influenced by the array programming paradigm, as the NumPy extension library to Python, Armadillo and Blitz++ libraries do.
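How operator overloading lets a library offer array-style notation can be sketched with a toy class in plain Python. The class `Vec` below is hypothetical and written only for this illustration; it is not how NumPy, Armadillo or Blitz++ are implemented, and it supports nothing beyond addition.

```python
from numbers import Number


class Vec:
    """Hypothetical minimal vector type, for illustration only."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # Vec + scalar: apply the scalar to every element.
        if isinstance(other, Number):
            return Vec(v + other for v in self.values)
        # Vec + Vec: add element by element.
        if isinstance(other, Vec):
            if len(other.values) != len(self.values):
                raise ValueError("length mismatch")
            return Vec(x + y for x, y in zip(self.values, other.values))
        return NotImplemented

    # scalar + Vec reuses the same logic, since addition is commutative.
    __radd__ = __add__

    def __repr__(self):
        return f"Vec({self.values})"


a = Vec([1, 2, 3])
b = Vec([10, 20, 30])
vec_sum = a + b        # whole-array addition through the overloaded +
shifted = a + 1        # the same operator also accepts a scalar
shifted_left = 1 + a   # and a scalar on the left-hand side
```

With the overload in place, `a + b` reads the same whether the operands are scalars or vectors, which is the generalization the array languages above build into the language itself.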

External links

"No stinking loops" programming