HOME

TheInfoList



OR:

In parallel computing, an embarrassingly parallel workload or problem (also called embarrassingly parallelizable, perfectly parallel, delightfully parallel or pleasingly parallel) is one where little or no effort is needed to separate the problem into a number of parallel tasks. This is often the case where there is little or no dependency or need for communication between those parallel tasks, or for results between them.Section 1.4.4 of: Thus, these are different from
distributed computing A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. Distributed computing is a field of computer sci ...
problems that need communication between tasks, especially communication of intermediate results. They are easy to perform on
server farm A server farm or server cluster is a collection of computer servers, usually maintained by an organization to supply server functionality far beyond the capability of a single machine. They often consist of thousands of computers which require ...
s which lack the special infrastructure used in a true supercomputer cluster. They are thus well suited to large, Internet-based volunteer computing platforms such as BOINC, and do not suffer from
parallel slowdown Parallel slowdown is a phenomenon in parallel computing where parallelization of a parallel algorithm beyond a certain point causes the program to run slower (take more time to run to completion). Parallel slowdown is typically the result of ...
. The opposite of embarrassingly parallel problems are inherently serial problems, which cannot be parallelized at all. A common example of an embarrassingly parallel problem is 3D video rendering handled by a
graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mo ...
, where each frame (forward method) or pixel ( ray tracing method) can be handled with no interdependency. Some forms of
password cracking In cryptanalysis and computer security, password cracking is the process of recovering passwords from data that has been stored in or transmitted by a computer system in scrambled form. A common approach ( brute-force attack) is to repeatedly t ...
are another embarrassingly parallel task that is easily distributed on
central processing unit A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, an ...
s, CPU cores, or clusters.


Etymology

"Embarrassingly" is used here in the same sense as in the phrase "an embarrassment of riches", meaning an overabundance—here referring to parallelization problems which are "embarrassingly easy". The term may also imply embarrassment on the part of developers or compilers: "Because so many important problems remain unsolved mainly due to their intrinsic computational complexity, it would be embarrassing not to develop parallel implementations of polynomial
homotopy In topology, a branch of mathematics, two continuous functions from one topological space to another are called homotopic (from grc, ὁμός "same, similar" and "place") if one can be "continuously deformed" into the other, such a defor ...
continuation methods." The term is first found in the literature in a 1986 book on multiprocessors by
MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementa ...
's creator Cleve Moler, who claims to have invented the term. An alternative term, ''pleasingly parallel'', has gained some use, perhaps to avoid the negative connotations of embarrassment in favor of a positive reflection on the parallelizability of the problems: "Of course, there is nothing embarrassing about these programs at all."


Examples

Some examples of embarrassingly parallel problems include: * Monte Carlo analysis * Distributed relational database queries usin
distributed set processing
*
Numerical integration In analysis, numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral, and by extension, the term is also sometimes used to describe the numerical solution of differential equations ...
* Serving static files on a webserver to multiple users at once. * Bulk processing of unrelated files of similar nature in general, such as photo gallery resizing and conversion. * The
Mandelbrot set The Mandelbrot set () is the set of complex numbers c for which the function f_c(z)=z^2+c does not diverge to infinity when iterated from z=0, i.e., for which the sequence f_c(0), f_c(f_c(0)), etc., remains bounded in absolute value. This ...
,
Perlin noise Perlin noise is a type of gradient noise developed by Ken Perlin. History Ken Perlin developed Perlin noise in 1983 as a result of his frustration with the "machine-like" look of computer-generated imagery (CGI) at the time. He formally descr ...
and similar images, where each point is calculated independently. * Rendering of
computer graphics Computer graphics deals with generating images with the aid of computers. Today, computer graphics is a core technology in digital photography, film, video games, cell phone and computer displays, and many specialized applications. A great deal ...
. In
computer animation Computer animation is the process used for digitally generating animations. The more general term computer-generated imagery (CGI) encompasses both static scenes ( still images) and dynamic images ( moving images), while computer animation re ...
, each frame or pixel may be rendered independently (see parallel rendering). * Some brute-force searches in
cryptography Cryptography, or cryptology (from grc, , translit=kryptós "hidden, secret"; and ''graphein'', "to write", or ''-logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of adve ...
. Notable real-world examples include
distributed.net Distributed.net is a volunteer computing effort that is attempting to solve large scale problems using otherwise idle CPU or GPU time. It is governed by Distributed Computing Technologies, Incorporated (DCTI), a non-profit organization under U. ...
and proof-of-work systems used in
cryptocurrency A cryptocurrency, crypto-currency, or crypto is a digital currency designed to work as a medium of exchange through a computer network that is not reliant on any central authority, such as a government or bank, to uphold or maintain it. It ...
. * BLAST searches in
bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combin ...
with split databases. * Large scale
facial recognition system A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. Such a system is typically employed to authenticate users through ID verification services, and wor ...
s that compare thousands of arbitrary acquired faces (e.g., a security or surveillance video via closed-circuit television) with similarly large number of previously stored faces (e.g., a '' rogues gallery'' or similar watch list). * Computer simulations comparing many independent scenarios. *
Genetic algorithm In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to gen ...
s. * Ensemble calculations of
numerical weather prediction Numerical weather prediction (NWP) uses mathematical models of the atmosphere and oceans to predict the weather based on current weather conditions. Though first attempted in the 1920s, it was not until the advent of computer simulation in th ...
. * Event simulation and reconstruction in
particle physics Particle physics or high energy physics is the study of fundamental particles and forces that constitute matter and radiation. The fundamental particles in the universe are classified in the Standard Model as fermions (matter particles) and ...
. * The
marching squares In computer graphics, marching squares is an algorithm that generates contours for a two-dimensional scalar field (rectangular array of individual numerical values). A similar method can be used to contour 2D triangle meshes. The contours can b ...
algorithm. * Sieving step of the quadratic sieve and the number field sieve. * Tree growth step of the
random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th ...
machine learning technique. *
Discrete Fourier transform In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced Sampling (signal processing), samples of a function (mathematics), function into a same-length sequence of equally-spaced samples of the discre ...
where each harmonic is independently calculated. *
Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...
s running on GPUs. * Hyperparameter grid search in machine learning. * Parallel search in
constraint programming Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state th ...


Implementations

* In
R (programming language) R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinform ...
– The Simple Network of Workstations (SNOW) package implements a simple mechanism for using a set of workstations or a Beowulf cluster for embarrassingly parallel computations.Simple Network of Workstations (SNOW) package
/ref> Similar R packages include "future", "parallel" and others.


See also

* Amdahl's law defines value '' P'', which would be almost or exactly equal to 1 for embarrassingly parallel problems. * Map (parallel pattern) *
Multiprocessing Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system. The term also refers to the ability of a system to support more than one processor or the ability to allocate tasks between them. There ar ...
*
Massively parallel Massively parallel is the term for using a large number of computer processors (or separate computers) to simultaneously perform a set of coordinated computations in parallel. GPUs are massively parallel architecture with tens of thousands of t ...
* Parallel computing * Process-oriented programming * Shared-nothing architecture (SN) *
Symmetric multiprocessing Symmetric multiprocessing or shared-memory multiprocessing (SMP) involves a multiprocessor computer hardware and software architecture where two or more identical processors are connected to a single, shared main memory, have full access to all ...
(SMP) *
Connection Machine A Connection Machine (CM) is a member of a series of massively parallel supercomputers that grew out of doctoral research on alternatives to the traditional von Neumann architecture of computers by Danny Hillis at Massachusetts Institute of Te ...
*
Cellular automaton A cellular automaton (pl. cellular automata, abbrev. CA) is a discrete model of computation studied in automata theory. Cellular automata are also called cellular spaces, tessellation automata, homogeneous structures, cellular structures, tess ...
* CUDA framework * Manycore processor *
Vector processor In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called ...


References


External links


Embarrassingly Parallel Computations
Engineering a Beowulf-style Compute Cluster *
Star-P: High Productivity Parallel Computing
{{Parallel computing Parallel computing Distributed computing problems Applications of distributed computing