In
parallel computing
Parallel computing is a type of computing, computation in which many calculations or Process (computing), processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. ...
, an embarrassingly parallel workload or problem (also called embarrassingly parallelizable, perfectly parallel, delightfully parallel or pleasingly parallel) is one where little or no effort is needed to split the problem into a number of parallel tasks. This is due to minimal or no dependency upon communication between the parallel tasks, or for results between them.
[Section 1.4.4 of: ]
These differ from
distributed computing
Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different networked computers.
The components of a distributed system commu ...
problems, which need communication between tasks, especially communication of intermediate results. They are easier to perform on
server farm
A server farm or server cluster is a collection of Server (computing), computer servers, usually maintained by an organization to supply server functionality far beyond the capability of a single machine. They often consist of thousands of compu ...
s which lack the special infrastructure used in a true
supercomputer
A supercomputer is a type of computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instruc ...
cluster. They are well-suited to large, Internet-based
volunteer computing
Volunteer computing is a type of distributed computing in which people donate their computers' unused resources to a research-oriented project, and sometimes in exchange for credit points. The fundamental idea behind it is that a modern desktop ...
platforms such as
BOINC, and suffer less from
parallel slowdown. The opposite of embarrassingly parallel problems are
inherently serial problems, which cannot be parallelized at all.
A common example of an embarrassingly parallel problem is 3D video rendering handled by a
graphics processing unit
A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...
, where each frame (forward method) or pixel (
ray tracing method) can be handled with no interdependency.
Some forms of
password cracking are another embarrassingly parallel task that is easily distributed on
central processing unit
A central processing unit (CPU), also called a central processor, main processor, or just processor, is the primary Processor (computing), processor in a given computer. Its electronic circuitry executes Instruction (computing), instructions ...
s,
CPU cores, or clusters.
Etymology
"Embarrassingly" is used here to refer to parallelization problems which are "embarrassingly easy". The term may imply embarrassment on the part of developers or compilers: "Because so many important problems remain unsolved mainly due to their intrinsic computational complexity, it would be embarrassing not to develop parallel implementations of polynomial
homotopy
In topology, two continuous functions from one topological space to another are called homotopic (from and ) if one can be "continuously deformed" into the other, such a deformation being called a homotopy ( ; ) between the two functions. ...
continuation methods." The term is first found in the literature in a 1986 book on multiprocessors by
MATLAB
MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementat ...
's creator
Cleve Moler,
who claims to have invented the term.
An alternative term, ''pleasingly parallel'', has gained some use, perhaps to avoid the negative connotations of embarrassment in favor of a positive reflection on the parallelizability of the problems: "Of course, there is nothing embarrassing about these programs at all."
Examples
A trivial example involves serving static data. It would take very little effort to have many processing units produce the same set of bits. Indeed, the famous
Hello World problem could easily be parallelized with few programming considerations or computational costs.
Some examples of embarrassingly parallel problems include:
*
Monte Carlo method
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be ...
* Distributed relational database queries usin
distributed set processing
*
Numerical integration
In analysis, numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral.
The term numerical quadrature (often abbreviated to quadrature) is more or less a synonym for "numerical integr ...
* Bulk processing of unrelated files of similar nature in general, such as photo gallery resizing and conversion.
* The
Mandelbrot set
The Mandelbrot set () is a two-dimensional set (mathematics), set that is defined in the complex plane as the complex numbers c for which the function f_c(z)=z^2+c does not Stability theory, diverge to infinity when Iteration, iterated starting ...
,
Perlin noise and similar images, where each point is calculated independently.
*
Rendering of
computer graphics
Computer graphics deals with generating images and art with the aid of computers. Computer graphics is a core technology in digital photography, film, video games, digital art, cell phone and computer displays, and many specialized applications. ...
. In
computer animation
Computer animation is the process used for digitally generating Film, moving images. The more general term computer-generated imagery (CGI) encompasses both still images and moving images, while computer animation refers to moving images. Virtu ...
, each
frame or pixel may be rendered independently .
* Some
brute-force search
In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of Iteration#Computing, systematically checking all possible candida ...
es in
cryptography
Cryptography, or cryptology (from "hidden, secret"; and ''graphein'', "to write", or ''-logy, -logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of Adversary (cryptography), ...
. Notable real-world examples include
distributed.net and
proof-of-work systems used in
cryptocurrency
A cryptocurrency (colloquially crypto) is a digital currency designed to work through a computer network that is not reliant on any central authority, such as a government or bank, to uphold or maintain it.
Individual coin ownership record ...
.
*
BLAST searches in
bioinformatics
Bioinformatics () is an interdisciplinary field of science that develops methods and Bioinformatics software, software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, ...
with split databases.
* Large scale
facial recognition system
A facial recognition system is a technology potentially capable of matching a human face from a digital image or a Film frame, video frame against a database of faces. Such a system is typically employed to authenticate users through ID verif ...
s that compare thousands of arbitrary acquired faces (e.g., a security or surveillance video via
closed-circuit television
Closed-circuit television (CCTV), also known as video surveillance, is the use of closed-circuit television cameras to transmit a signal to a specific place on a limited set of monitors. It differs from broadcast television in that the signa ...
) with similarly large number of previously stored faces (e.g., a ''
rogues gallery'' or similar
watch list).
* Computer simulations comparing many independent scenarios.
*
Genetic algorithm
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to g ...
s.
*
Ensemble calculations of
numerical weather prediction
Numerical weather prediction (NWP) uses mathematical models of the atmosphere and oceans to weather forecasting, predict the weather based on current weather conditions. Though first attempted in the 1920s, it was not until the advent of comput ...
.
* Event simulation and reconstruction in
particle physics
Particle physics or high-energy physics is the study of Elementary particle, fundamental particles and fundamental interaction, forces that constitute matter and radiation. The field also studies combinations of elementary particles up to the s ...
.
* The
marching squares algorithm.
* Sieving step of the
quadratic sieve and the
number field sieve.
* Tree growth step of the
random forest machine learning technique.
*
Discrete Fourier transform
In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced Sampling (signal processing), samples of a function (mathematics), function into a same-length sequence of equally-spaced samples of the discre ...
where each harmonic is independently calculated.
*
Convolutional neural network
A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different ty ...
s running on
GPUs.
* Parallel search in
constraint programming
Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state t ...
Implementations
* In
R (programming language)
R is a programming language for statistical computing and Data and information visualization, data visualization. It has been widely adopted in the fields of data mining, bioinformatics, data analysis, and data science.
The core R language is ...
– The Simple Network of Workstations (SNOW) package implements a simple mechanism for using a set of workstations or a
Beowulf cluster for embarrassingly parallel computations.
Simple Network of Workstations (SNOW) package
/ref> Similar R packages include "future", "parallel" and others.
See also
* Amdahl's law defines value '' P'', which would be almost or exactly equal to 1 for embarrassingly parallel problems.
*Cellular automaton
A cellular automaton (pl. cellular automata, abbrev. CA) is a discrete model of computation studied in automata theory. Cellular automata are also called cellular spaces, tessellation automata, homogeneous structures, cellular structures, tessel ...
*Connection Machine
The Connection Machine (CM) is a member of a series of massively parallel supercomputers sold by Thinking Machines Corporation. The idea for the Connection Machine grew out of doctoral research on alternatives to the traditional von Neumann arch ...
* CUDA framework
* Manycore processor
* Map (parallel pattern)
* Massively parallel
*Multiprocessing
Multiprocessing (MP) is the use of two or more central processing units (CPUs) within a single computer system. The term also refers to the ability of a system to support more than one processor or the ability to allocate tasks between them. The ...
*Parallel computing
Parallel computing is a type of computing, computation in which many calculations or Process (computing), processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. ...
* Process-oriented programming
* Shared-nothing architecture (SN)
* Symmetric multiprocessing (SMP)
* Vector processor
References
External links
Embarrassingly Parallel Computations
Engineering a Beowulf-style Compute Cluster
Star-P: High Productivity Parallel Computing
{{Parallel computing
Applications of distributed computing
Distributed computing problems
Parallel computing