In
computing
Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...
, an array of structures (AoS), structure of arrays (SoA) or array of structures of arrays (AoSoA) are contrasting ways to arrange a sequence of
records in
memory
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...
, with regard to
interleaving
Interleaving may refer to:
* Interleaving, a technique for making forward error correction more robust with respect to burst errors
* An optical interleaver, a fiber-optic device to combine two sets of dense wavelength-division multiplexing (DW ...
, and are of interest in
SIMD
Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...
and
SIMT programming.
Structure of arrays
Structure of arrays (SoA) is a layout separating elements of a
record (or 'struct' in the
C programming language
C (''pronounced'' '' – like the letter c'') is a general-purpose programming language. It was created in the 1970s by Dennis Ritchie and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of ...
) into one parallel array per
field
Field may refer to:
Expanses of open ground
* Field (agriculture), an area of land used for agricultural purposes
* Airfield, an aerodrome that lacks the infrastructure of an airport
* Battlefield
* Lawn, an area of mowed grass
* Meadow, a grass ...
. The motivation is easier manipulation with packed
SIMD instructions in most
instruction set architecture
In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, ...
s, since a single
SIMD register can load
homogeneous data, possibly transferred by a wide
internal datapath (e.g.
128-bit
General home computing and gaming utility emerged at 8-bit word sizes, as 28=256 Word (computer architecture), words, a natural unit of data, became possible. Early 8-bit CPUs (such as the Zilog Z80 and MOS Technology 6502, used in the 1977 Co ...
). If only a specific part of the record is needed, only those parts need to be iterated over, allowing more data to fit onto a single cache line. The downside is requiring more
cache ways when traversing data, and inefficient
indexed addressing.
For example, to store N points in 3D space using a structure of arrays:
struct pointlist3D ;
struct pointlist3D points;
float get_point_x(int i)
Array of structures
Array of structures (AoS) is the opposite (and more conventional) layout, in which data for different fields is interleaved.
This is often more intuitive, and supported directly by most
programming languages
A programming language is a system of notation for writing computer programs.
Programming languages are described in terms of their syntax (form) and semantics (meaning), usually defined by a formal language. Languages usually provide features ...
.
For example, to store N points in 3D space using an array of structures:
struct point3D ;
struct point3D points
float get_point_x(int i)
Array of structures of arrays
Array of structures of arrays (AoSoA) or tiled array of structs is a hybrid approach between the previous layouts, in which data for different fields is interleaved using tiles or blocks with size equal to the SIMD vector size. This is often less intuitive, but can achieve the memory throughput of the SoA approach, while being more friendly to the cache locality and load port architectures of modern processors. In particular, memory requests in modern processors have to be fulfilled in fixed width (e.g., size of a cacheline). The tiled storage of AoSoA aligns the memory access pattern to the requests' fixed width, leading to fewer access operations to complete a memory request and thus increasing the efficiency.
For example, to store N points in 3D space using an array of structures of arrays with a SIMD register width of 8 floats (or 8×32 = 256 bits):
struct point3Dx8 ;
struct point3Dx8 points N+7)/8
float get_point_x(int i)
A different width may be needed depending on the actual SIMD register width. The interior arrays may be replaced with SIMD types such as for languages with such support.
Alternatives
It is possible to split some subset of a structure (rather than each individual field) into a
parallel array In computing, a group of parallel arrays (also known as structure of arrays or SoA) is a form of implicit data structure that uses multiple arrays to represent a singular array of records. It keeps a separate, homogeneous data array for each fi ...
and this can actually improve
locality of reference
In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...
if different pieces of fields are used at different times in the program (see
data oriented design).
Some
SIMD
Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...
architectures provide
strided load/store instructions to load homogeneous data from the SoA format. Yet another option used in some
Cell
Cell most often refers to:
* Cell (biology), the functional basic unit of life
* Cellphone, a phone connected to a cellular network
* Clandestine cell, a penetration-resistant form of a secret or outlawed organization
* Electrochemical cell, a de ...
libraries is to de-interleave data from the AoS format when loading sources into registers, and interleave when writing out results (facilitated by the
superscalar issue of
permutes). Some
vector maths libraries align
floating point
In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a signed sequence of a fixed number of digits in some base) multiplied by an integer power of that base.
Numbers of this form ...
4D vectors with the SIMD register to leverage the associated data path and instructions, while still providing programmer convenience, although this does not scale to SIMD units wider than four lanes.
4D vectors
AoS vs. SoA presents a choice when considering 3D or
4D vector
In computer science, a 4D vector is a 4-component vector data type. Uses include homogeneous coordinates for 3-dimensional space in computer graphics, and ''red green blue alpha'' (RGBA) values for bitmap images with a color and alpha channel ( ...
data on machines with four-lane SIMD hardware. SIMD ISAs are usually designed for homogeneous data, however some provide a
dot product
In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a Scalar (mathematics), scalar as a result". It is also used for other symmetric bilinear forms, for example in a pseudo-Euclidean space. N ...
instruction and additional permutes, making the AoS case easier to handle.
Although most
GPU
A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...
hardware has moved away from 4D instructions to scalar
SIMT pipelines, modern
compute kernel
In computing, a compute kernel is a routine compiled for high throughput accelerators (such as graphics processing units (GPUs), digital signal processors (DSPs) or field-programmable gate arrays (FPGAs)), separate from but used by a main pro ...
s using SoA instead of AoS can still give better performance due to memory coalescing.
Software support
Most languages support the AoS format more naturally by combining
records and various array
abstract data types
In computer science, an abstract data type (ADT) is a mathematical model for data types, defined by its behavior (semantics) from the point of view of a '' user'' of the data, specifically in terms of possible values, possible operations on data ...
.
SoA is mostly found in languages, libraries, or
metaprogramming
Metaprogramming is a computer programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyse, or transform other programs, and even modi ...
tools used to support a
data-oriented design. Examples include:
* "Data frames," as implemented in
R,
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (prog ...
's Pandas package, and
Julia
Julia may refer to:
People
*Julia (given name), including a list of people with the name
*Julia (surname), including a list of people with the name
*Julia gens, a patrician family of Ancient Rome
*Julia (clairvoyant) (fl. 1689), lady's maid of Qu ...
's DataFrames.jl package, are interfaces to access SoA like AoS.
* The Julia package StructArrays.jl allows for accessing SoA as AoS to combine the performance of SoA with the intuitiveness of AoS.
* Code generators for the C language, includin
Datadrawand the
X Macro technique.
Automated creation of AoSoA is more complex. An example of AoSoA in metaprogramming is found in
LANL
Los Alamos National Laboratory (often shortened as Los Alamos and LANL) is one of the sixteen research and development laboratories of the United States Department of Energy (DOE), located a short distance northwest of Santa Fe, New Mexico, in ...
's Cabana library written in C++; it assumes a vector width of 16 lanes by default.
References
{{reflist
SIMD computing