The flattening transformation is an

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...

that transforms nested

data parallelism Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied on regular data structures lik ...

into flat data parallelism. It was pioneered by Guy Blelloch as part of the

NESL NESL is a parallel programming language developed at Carnegie Mellon by the SCandAL project and released in 1993. It integrates various ideas from parallel algorithms, functional programming, and array programming languages. The most important ne ...

programming language. The flattening transformation is also sometimes called vectorization, but is completely unrelated to

automatic vectorization Automatic vectorization, in parallel computing, is a special case of automatic parallelization, where a computer program is converted from a scalar implementation, which processes a single pair of operands at a time, to a vector implementation, w ...

. The original flattening algorithm was concerned solely with first-order multidimensional arrays containing primitive types, but was extended to handle higher-order and recursive data types in the work on Data Parallel Haskell.

Overview

Flattening works by ''lifting'' functions to operate on arrays instead of on single values. For example, a function

f : A \rightarrow B

is lifted to a function

f' : \rightarrow /math>.  This means an expression map~f~x can be replaced with an application of the lifted function: f'~x .  Intuitively, flattening thus works by replacing all function applications with applications of the corresponding lifted function.

After flattening, arrays are represented as single-dimensional value vector ''V'' containing scalar elements, alongside auxiliary information recording the nested structure, typically in the form of a boolean flag vector ''F''.  The flag vector indicates, for the corresponding element in the value vector, whether it is the beginning of a new ''segment''.   For example, the two-dimensional irregular array A=

1,2,3 1-2-3; 1, 2, 3; or One, Two, Three may refer to: Brands * 1-2-3 (fuel station), in Norway * Lotus 1-2-3, a computer spreadsheet program * .123, a file extension used by Lotus 1-2-3 * Jell-O 1-2-3, a dessert Film, TV and books * ''One, Two, Three'' ...

,5 [], [6 can be represented as the data vector

V = [1,2,3,4,5,6,7]

alongside the flag vector

F = [1, 0, 0, 1, 0, 1, 1]

. This flag vector is necessary in order to correctly flatten nested parallelism. For example, it is used in the flattening of

prefix sum In computer science, the prefix sum, cumulative sum, inclusive scan, or simply scan of a sequence of numbers is a second sequence of numbers , the sums of prefixes (running totals) of the input sequence: : : : :... For instance, the prefix sums of ...

to segmented scan. Flattening can increase the asymptotic work and space complexity of the original program, leading to a much less efficient result.

Usage

Flattening was originally developed for vector machines such as the

Connection Machine A Connection Machine (CM) is a member of a series of massively parallel supercomputers that grew out of doctoral research on alternatives to the traditional von Neumann architecture of computers by Danny Hillis at Massachusetts Institute of Te ...

, and often produces code that is not a good fit for modern

multicore A multi-core processor is a microprocessor on a single integrated circuit with two or more separate processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary CPU instructions (such ...

CPUs. However, the principles underlying its simpler cases can be found in constructs such as the vmap in

Google Jax Google JAX is a machine learning framework for transforming numerical functions. It is described as bringing together a modified version oautograd(automatic obtaining of the gradient function through differentiation of a function) and TensorFlow'X ...

References

{{Reflist Compiler optimizations Parallel computing