HOME

TheInfoList



OR:

In computing, row-major order and column-major order are methods for storing multidimensional arrays in linear storage such as
random access memory Random-access memory (RAM; ) is a form of electronic computer memory that can be read and changed in any order, typically used to store working data and machine code. A random-access memory device allows data items to be read or written ...
. The difference between the orders lies in which elements of an array are contiguous in memory. In row-major order, the consecutive elements of a row reside next to each other, whereas the same holds true for consecutive elements of a column in column-major order. While the terms allude to the rows and columns of a two-dimensional array, i.e. a
matrix Matrix (: matrices or matrixes) or MATRIX may refer to: Science and mathematics * Matrix (mathematics), a rectangular array of numbers, symbols or expressions * Matrix (logic), part of a formula in prenex normal form * Matrix (biology), the m ...
, the orders can be generalized to arrays of any dimension by noting that the terms row-major and column-major are equivalent to lexicographic and colexicographic orders, respectively. It is also worth noting that matrices, being commonly represented as collections of row or column vectors, using this approach are effectively stored as consecutive vectors or consecutive vector components. Such ways of storing data are referred to as
AoS and SoA In computing, an array of structures (AoS), structure of arrays (SoA) or array of structures of arrays (AoSoA) are contrasting ways to arrange a sequence of records in memory, with regard to interleaving, and are of interest in SIMD and SIMT pro ...
respectively. Data layout is critical for correctly passing arrays between programs written in different programming languages. It is also important for performance when traversing an array because modern CPUs process sequential data more efficiently than nonsequential data. This is primarily due to CPU caching which exploits spatial
locality of reference In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...
. In addition, contiguous access makes it possible to use
SIMD Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...
instructions that operate on vectors of data. In some media such as
magnetic-tape data storage Magnetic-tape data storage is a system for storing digital information on magnetic tape using digital recording. Tape was an important medium for primary data storage in early computers, typically using large open reels of 7-track, later ...
, accessing sequentially is
orders of magnitude In a ratio scale based on powers of ten, the order of magnitude is a measure of the nearness of two figures. Two numbers are "within an order of magnitude" of each other if their ratio is between 1/10 and 10. In other words, the two numbers are wi ...
faster than nonsequential access.


Explanation and example

The terms row-major and column-major stem from the terminology related to ordering objects. A general way to order objects with many attributes is to first group and order them by one attribute, and then, within each such group, group and order them by another attribute, etc. If more than one attribute participates in ordering, the first would be called ''major'' and the last ''minor''. If two attributes participate in ordering, it is sufficient to name only the major attribute. In the case of arrays, the attributes are the indices along each dimension. For
matrices Matrix (: matrices or matrixes) or MATRIX may refer to: Science and mathematics * Matrix (mathematics), a rectangular array of numbers, symbols or expressions * Matrix (logic), part of a formula in prenex normal form * Matrix (biology), the ...
in mathematical notation, the first index indicates the ''row'', and the second indicates the ''column'', e.g., given a matrix A, the entry a_ is in its first row and second column. This convention is carried over to the syntax in programming languages, although often with indexes starting at 0 instead of 1. Even though the row is indicated by the ''first'' index and the column by the ''second'' index, no grouping order between the dimensions is implied by this. The choice of how to group and order the indices, either by row-major or column-major methods, is thus a matter of convention. The same terminology can be applied to even higher dimensional arrays. Row-major grouping starts from the ''leftmost'' index and column-major from the ''rightmost'' index, leading to lexicographic and colexicographic (or colex) orders, respectively. For example, the array :A = a_ = \begin \color a_ & \color a_ & \color a_ \\ \color a_ & \color a_ & \color a_ \end could be stored in two possible ways: Programming languages handle this in different ways. In C, multidimensional arrays are stored in row-major order, and the array indexes are written row-first (lexicographical access order): On the other hand, in Fortran, arrays are stored in column-major order, while the array indexes are still written row-first (colexicographical access order): Note how the use of A j] with multi-step indexing as in C, as opposed to a neutral notation like A(i,j) as in Fortran, almost inevitably implies row-major order for syntactic reasons, so to speak, because it can be rewritten as (A /code>, and the A /code> row part can even be assigned to an intermediate variable that is then indexed in a separate expression. (No other implications should be assumed, e.g., Fortran is not column-major simply ''because'' of its notation, and even the above implication could intentionally be circumvented in a new language.) To use column-major order in a row-major environment, or vice versa, for whatever reason, one workaround is to assign non-conventional roles to the indexes (using the first index for the column and the second index for the row), and another is to bypass language syntax by explicitly computing positions in a one-dimensional array. Of course, deviating from convention probably incurs a cost that increases with the degree of necessary interaction with conventional language features and other code, not only in the form of increased vulnerability to mistakes (forgetting to also invert matrix multiplication order, reverting to convention during code maintenance, etc.), but also in the form of having to actively rearrange elements, all of which have to be weighed against any original purpose such as increasing performance. Running the loop row-wise is preferred in row-major languages like C and vice versa for column-major languages.


Programming languages and libraries

Programming languages or their standard libraries that support multi-dimensional arrays typically have a native row-major or column-major storage order for these arrays. Row-major order is used in C/ C++/
Objective-C Objective-C is a high-level general-purpose, object-oriented programming language that adds Smalltalk-style message passing (messaging) to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was ...
(for C-style arrays),
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
, Pascal,
Speakeasy A speakeasy, also called a beer flat or blind pig or blind tiger, was an illicit establishment that sold alcoholic beverages. The term may also refer to a retro style bar that replicates aspects of historical speakeasies. In the United State ...
, and SAS. Column-major order is used in Fortran, IDL,
MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementat ...
,MATLAB documentation
MATLAB Data Storage
(retrieved from Mathworks.co.uk, January 2014).
GNU Octave GNU Octave is a scientific programming language for scientific computing and numerical computation. Octave helps in solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly ...
, Julia, S,
S-PLUS S-PLUS is a commercial implementation of the S (programming language), S programming language sold by TIBCO Software Inc. It features object-oriented programming capabilities and advanced analytical algorithms. Its statistical analysis capabilit ...
,: R,
Scilab Scilab is a free and open-source, cross-platform numerical computational package and a high-level, numerically oriented programming language. It can be used for signal processing, statistical analysis, image enhancement, fluid dynamics simul ...
,
Yorick Yorick is an unseen character in William Shakespeare's play ''Hamlet''. He is the dead court jester whose Human skull, skull is exhumed by the The Gravediggers, First Gravedigger in Act 5, Scene 1, of the play. The sight of Yorick's skull evokes ...
, and
Rasdaman rasdaman ("raster data manager") is an Array DBMS, that is: a Database Management System which adds capabilities for storage and retrieval of massive multi-dimensional arrays, such as sensor, image, simulation, and statistics data. A frequently ...
.


Neither row-major nor column-major

A typical alternative for dense array storage is to use
Iliffe vector In computer programming, an Iliffe vector, also known as a display, is a data structure used to implement multi-dimensional array data structure, arrays. Data structure An Iliffe vector for an ''n''-dimensional array (where ''n'' ≥ 2) ...
s, which typically store pointers to elements in the same row contiguously (like row-major order), but not the rows themselves. They are used in (ordered by age):
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, C#/
CLI CLI may refer to: Computing * Call Level Interface, an SQL database management API * Command-line interface, of a computer program * Command-line interpreter or command language interpreter; see List of command-line interpreters * CLI (x86 instruc ...
/
.Net The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
, Scala, and
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIF ...
. Even less dense is to use lists of lists, e.g., in Python, and in the
Wolfram Language The Wolfram Language ( ) is a proprietary, very high-level multi-paradigm programming language developed by Wolfram Research. It emphasizes symbolic computation, functional programming, and rule-based programming and can employ arbitrary stru ...
of
Wolfram Mathematica Wolfram (previously known as Mathematica and Wolfram Mathematica) is a software system with built-in libraries for several areas of technical computing that allows machine learning, statistics, symbolic computation, data manipulation, network ...
. An alternative approach uses tables of tables, e.g., in Lua.


External libraries

Support for multi-dimensional arrays may also be provided by external libraries, which may even support arbitrary orderings, where each dimension has a stride value, and row-major or column-major are just two possible resulting interpretations. Row-major order is the default in
NumPy NumPy (pronounced ) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. The predeces ...
(for Python). Column-major order is the default in
Eigen Eigen may refer to: People with the given name *, Japanese sport shooter *, Japanese professional wrestler * Frauke Eigen (born 1969) German photographer, photojournalist and artist * Manfred Eigen (1927–2019), German biophysicist * Michael Ei ...
an
Armadillo
(both for C++). A special case would be
OpenGL OpenGL (Open Graphics Library) is a Language-independent specification, cross-language, cross-platform application programming interface (API) for rendering 2D computer graphics, 2D and 3D computer graphics, 3D vector graphics. The API is typic ...
(and
OpenGL ES OpenGL for Embedded Systems (OpenGL ES or GLES) is a subset of the OpenGL computer graphics rendering application programming interface (API) for rendering 2D and 3D computer graphics such as those used by video games, typically hardware-accelerate ...
) for graphics processing. Since "recent mathematical treatments of linear algebra and related fields invariably treat vectors as columns," designer Mark Segal decided to substitute this for the convention in predecessor IRIS GL, which was to write vectors as rows; for compatibility, transformation matrices would still be stored in vector-major (=row-major) rather than coordinate-major (=column-major) order, and he then used the trick " osay that matrices in OpenGL are stored in column-major order". This was really only relevant for presentation, because matrix multiplication was stack-based and could still be interpreted as post-multiplication, but, worse, reality leaked through the C-based
API An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...
because individual elements would be accessed as M ectorcoordinate] or, effectively, M olumnrow], which unfortunately muddled the convention that the designer sought to adopt, and this was even preserved in the OpenGL Shading Language that was later added (although this also makes it possible to access coordinates by name instead, e.g., M ectory). As a result, many developers will now simply declare that having the column as the first index is the definition of column-major, even though this is clearly not the case with a real column-major language like Fortran.
Torch A torch is a stick with combustible material at one end which can be used as a light source or to set something on fire. Torches have been used throughout history and are still used in processions, symbolic and religious events, and in juggl ...
(for Lua) changed from column-major to row-major default order.


Transposition

As exchanging the indices of an array is the essence of array transposition, an array stored as row-major but read as column-major (or vice versa) will appear transposed. As actually performing this rearrangement in memory is typically an expensive operation, some systems provide options to specify individual matrices as being stored transposed. The programmer must then decide whether or not to rearrange the elements in memory, based on the actual usage (including the number of times that the array is reused in a computation). For example, the
Basic Linear Algebra Subprograms Basic Linear Algebra Subprograms (BLAS) is a specification (technical standard), specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector space, vector addition, scalar multiplicati ...
functions are passed flags indicating which arrays are transposed.


Address calculation in general

The concept generalizes to arrays with more than two dimensions. For a ''d''-dimensional N_1 \times N_2 \times \cdots \times N_d array with dimensions ''N''''k'' (''k''=1...''d''), a given element of this array is specified by a
tuple In mathematics, a tuple is a finite sequence or ''ordered list'' of numbers or, more generally, mathematical objects, which are called the ''elements'' of the tuple. An -tuple is a tuple of elements, where is a non-negative integer. There is o ...
(n_1, n_2, \ldots, n_d) of ''d'' (zero-based) indices n_k \in ,N_k - 1/math>. In row-major order, the ''last'' dimension is contiguous, so that the memory-offset of this element is given by: n_d + N_d \cdot (n_ + N_ \cdot (n_ + N_ \cdot (\cdots + N_2 n_1)\cdots)) = \sum_^d \left( \prod_^d N_\ell \right) n_k In column-major order, the ''first'' dimension is contiguous, so that the memory-offset of this element is given by: n_1 + N_1 \cdot (n_2 + N_2 \cdot (n_3 + N_3 \cdot (\cdots + N_ n_d)\cdots)) = \sum_^d \left( \prod_^ N_\ell \right) n_k where the
empty product In mathematics, an empty product, or nullary product or vacuous product, is the result of multiplication, multiplying no factors. It is by convention equal to the multiplicative identity (assuming there is an identity for the multiplication operat ...
is the multiplicative
identity element In mathematics, an identity element or neutral element of a binary operation is an element that leaves unchanged every element when the operation is applied. For example, 0 is an identity element of the addition of real numbers. This concept is use ...
, i.e., \prod_^ N_\ell = \prod_^ N_\ell = 1. For a given order, the stride in dimension ''k'' is given by the multiplication value in parentheses before index ''n''''k'' in the right-hand side summations above. More generally, there are ''d!'' possible orders for a given array, one for each
permutation In mathematics, a permutation of a set can mean one of two different things: * an arrangement of its members in a sequence or linear order, or * the act or process of changing the linear order of an ordered set. An example of the first mean ...
of dimensions (with row-major and column-order just 2 special cases), although the lists of stride values are not necessarily permutations of each other, e.g., in the 2-by-3 example above, the strides are (3,1) for row-major and (1,2) for column-major.


See also

*
Array (data structure) In computer science, an array is a data structure consisting of a collection of ''elements'' (value (computer science), values or variable (programming), variables), of same memory size, each identified by at least one ''array index'' or ''key' ...
* Comparison of programming languages (array) * Index origin, another difference between array types across programming languages * Matrix representation *
Morton order In mathematical analysis and computer science, functions which are Z-order, Lebesgue curve, Morton space-filling curve, Morton order or Morton code map multidimensional data to one dimension while preserving locality of the data points (two ...
, another way of mapping multidimensional data to a one-dimensional index, useful in tree data structures * CSR format, a technique for storing sparse matrices in memory *
Vectorization (mathematics) In mathematics, especially in linear algebra and matrix theory, the vectorization of a matrix is a linear transformation which converts the matrix into a vector. Specifically, the vectorization of a matrix ''A'', denoted vec(''A''), is the ...
, the equivalent of turning a matrix into the corresponding column-major vector


References

{{Reflist


Sources

* Donald E. Knuth, ''
The Art of Computer Programming ''The Art of Computer Programming'' (''TAOCP'') is a comprehensive multi-volume monograph written by the computer scientist Donald Knuth presenting programming algorithms and their analysis. it consists of published volumes 1, 2, 3, 4A, and 4 ...
Volume 1: Fundamental Algorithms'', third edition, section 2.2.6 (Addison-Wesley: New York, 1997). Arrays