
In computer science, a parallel external memory (PEM) model is a
cache-aware, external-memory
abstract machine
An abstract machine is a computer science theoretical model that allows for a detailed and precise analysis of how a computer system functions. It is analogous to a mathematical function in that it receives inputs and produces outputs based on p ...
.
It is the parallel-computing analogy to the single-processor
external memory (EM) model. In a similar way, it is the cache-aware analogy to the
parallel random-access machine
In computer science, a parallel random-access machine (parallel RAM or PRAM) is a shared-memory abstract machine. As its name indicates, the PRAM is intended as the parallel-computing analogy to the random-access machine (RAM) (not to be confuse ...
(PRAM). The PEM model consists of a number of processors, together with their respective private caches and a shared main memory.
__TOC__
Model
Definition
The PEM model
is a combination of the EM model and the PRAM model. The PEM model is a computation model which consists of
processors and a two-level
memory hierarchy
In computer architecture, the memory hierarchy separates computer storage into a hierarchy based on response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by their performance and controlli ...
. This memory hierarchy consists of a large
external memory (main memory) of size
and
small
internal memories (caches). The processors share the main memory. Each cache is exclusive to a single processor. A processor can't access another’s cache. The caches have a size
which is partitioned in blocks of size
. The processors can only perform operations on data which are in their cache. The data can be transferred between the main memory and the cache in blocks of size
.
I/O complexity
The
complexity measure of the PEM model is the I/O complexity,
which determines the number of parallel blocks transfers between the main memory and the cache. During a parallel block transfer each processor can transfer a block. So if
processors load parallelly a data block of size
form the main memory into their caches, it is considered as an I/O complexity of
not
. A program in the PEM model should minimize the data transfer between main memory and caches and operate as much as possible on the data in the caches.
Read/write conflicts
In the PEM model, there is no
direct communication network between the P processors. The processors have to communicate indirectly over the main memory. If multiple processors try to access the same block in main memory concurrently read/write conflicts
occur. Like in the PRAM model, three different variations of this problem are considered:
* Concurrent Read Concurrent Write (CRCW): The same block in main memory can be read and written by multiple processors concurrently.
* Concurrent Read Exclusive Write (CREW): The same block in main memory can be read by multiple processors concurrently. Only one processor can write to a block at a time.
* Exclusive Read Exclusive Write (EREW): The same block in main memory cannot be read or written by multiple processors concurrently. Only one processor can access a block at a time.
The following two algorithms
solve the CREW and EREW problem if
processors write to the same block simultaneously.
A first approach is to serialize the write operations. Only one processor after the other writes to the block. This results in a total of
parallel block transfers. A second approach needs
parallel block transfers and an additional block for each processor. The main idea is to schedule the write operations in a
binary tree fashion and gradually combine the data into a single block. In the first round
processors combine their blocks into
blocks. Then
processors combine the
blocks into
. This procedure is continued until all the data is combined in one block.
Comparison to other models
Examples
Multiway partitioning
Let
be a vector of d-1 pivots sorted in increasing order. Let
be an unordered set of N elements. A d-way partition
of
is a set
, where
and
for