Explicitly parallel instruction computing
   HOME

TheInfoList



OR:

Explicitly parallel instruction computing (EPIC) is a term coined in 1997 by the HP–Intel alliance to describe a
computing paradigm Programming paradigms are a way to classify programming languages based on their features. Languages can be classified into multiple paradigms. Some paradigms are concerned mainly with implications for the execution model of the language, suc ...
that researchers had been investigating since the early 1980s. This paradigm is also called ''Independence'' architectures. It was the basis for
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 seri ...
and HP development of the Intel
Itanium Itanium ( ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance computin ...
architecture, and HP later asserted that "EPIC" was merely an old term for the Itanium architecture. EPIC permits microprocessors to execute software instructions in parallel by using the
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
, rather than complex on-
die Die, as a verb, refers to death, the cessation of life. Die may also refer to: Games * Die, singular of dice, small throwable objects used for producing random numbers Manufacturing * Die (integrated circuit), a rectangular piece of a semicondu ...
circuitry, to control parallel instruction execution. This was intended to allow simple performance scaling without resorting to higher clock frequencies.


Roots in VLIW

By 1989, researchers at HP recognized that
reduced instruction set computer In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comput ...
(RISC) architectures were reaching a limit at one
instruction per cycle In computer architecture, instructions per cycle (IPC), commonly called instructions per clock is one aspect of a processor's performance: the average number of instructions executed for each clock cycle. It is the multiplicative inverse of cycl ...
. They began an investigation into a new architecture, later named EPIC. The basis for the research was
VLIW Very long instruction word (VLIW) refers to instruction set architectures designed to exploit instruction level parallelism (ILP). Whereas conventional central processing units (CPU, processor) mostly allow programs to specify instructions to exe ...
, in which multiple operations are encoded in every instruction, and then processed by multiple execution units. One goal of EPIC was to move the complexity of instruction scheduling from the CPU hardware to the software
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
, which can do the instruction scheduling statically (with help of trace feedback information). This eliminates the need for complex scheduling circuitry in the CPU, which frees up space and power for other functions, including additional execution resources. An equally important goal was to further exploit
instruction level parallelism Instruction-level parallelism (ILP) is the parallel or simultaneous execution of a sequence of instructions in a computer program. More specifically ILP refers to the average number of instructions run per step of this parallel execution. Discu ...
(''ILP'') by using the compiler to find and exploit additional opportunities for parallel execution. ''VLIW'' (at least the original forms) has several short-comings that precluded it from becoming mainstream: * VLIW
instruction set In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ' ...
s are not
backward compatible Backward compatibility (sometimes known as backwards compatibility) is a property of an operating system, product, or technology that allows for interoperability with an older legacy system, or with input designed for such a system, especially i ...
between implementations. When wider implementations (more
execution unit In computer engineering, an execution unit (E-unit or EU) is a part of the central processing unit (CPU) that performs the operations and calculations as instructed by the computer program. It may have its own internal control sequence unit (not ...
s) are built, the instruction set for the wider machines is not backward compatible with older, narrower implementations. * Load responses from a
memory hierarchy In computer architecture, the memory hierarchy separates computer storage into a hierarchy based on response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by their performance and controlli ...
which includes
CPU cache A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which ...
s and
DRAM Dynamic random-access memory (dynamic RAM or DRAM) is a type of random-access semiconductor memory that stores each bit of data in a memory cell, usually consisting of a tiny capacitor and a transistor, both typically based on metal-oxid ...
do not have a deterministic delay. This makes static scheduling of load instructions by the compiler very difficult. EPIC architecture evolved from VLIW architecture, but retained many concepts of the
superscalar A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a sup ...
architecture.


Moving beyond VLIW

''EPIC'' architectures add several features to get around the deficiencies of VLIW: * Each group of multiple software instructions is called a ''bundle''. Each of the bundles has a
stop bit Asynchronous serial communication is a form of serial communication in which the communicating endpoints' interfaces are not continuously synchronized by a common clock signal. Instead of a common synchronization signal, the data stream contai ...
indicating if this set of operations is depended upon by the subsequent bundle. With this capability, future implementations can be built to issue multiple bundles in parallel. The dependency information is calculated by the compiler, so the hardware does not have to perform operand dependency checking. * A software prefetch instruction is used as a type of data prefetch. This prefetch increases the chances for a cache hit for loads, and can indicate the degree of temporal locality needed in various levels of the cache. * A speculative load instruction is used to speculatively load data before it is known whether it will be used (bypassing control dependencies), or whether it will be modified before it is used (bypassing data dependencies). * A check load instruction aids speculative loads by checking whether a speculative load was dependent on a later store, and thus must be reloaded. The ''EPIC'' architecture also includes a ''grab-bag'' of architectural concepts to increase ''ILP'': * Predicated execution is used to decrease the occurrence of branches and to increase the
speculative execution Speculative execution is an optimization technique where a computer system performs some task that may not be needed. Work is done before it is known whether it is actually needed, so as to prevent a delay that would have to be incurred by doing t ...
of instructions. In this feature, branch conditions are converted to predicate registers which are used to kill results of executed instructions from the side of the branch which is not taken. * Delayed exceptions, using a not a thing bit within the general purpose registers, allow speculative execution past possible exceptions. * Very large architectural
register file A register file is an array of processor registers in a central processing unit (CPU). Register banking is the method of using a single name to access multiple different physical registers depending on the operating mode. Modern integrated circuit- ...
s avoid the need for
register renaming In computer architecture, register renaming is a technique that abstracts logical registers from physical registers. Every logical register has a set of physical registers associated with it. When a machine language instruction refers to a particu ...
. * Multi-way branch instructions improve branch prediction by combining many alternative branches into one bundle. The
Itanium Itanium ( ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance computin ...
architecture also added rotating register files, a tool useful for
software pipelining In computer science, software pipelining is a technique used to optimize loops, in a manner that parallels hardware pipelining. Software pipelining is a type of out-of-order execution, except that the reordering is done by a compiler (or in the ...
since it avoids having to manually unroll and rename registers.


Other research and development

There have been other investigations into EPIC architectures that are not directly tied to the development of the Itanium architecture: *The
IMPACT Impact may refer to: * Impact (mechanics), a high force or shock (mechanics) over a short time period * Impact, Texas, a town in Taylor County, Texas, US Science and technology * Impact crater, a meteor crater caused by an impact event * Impact ...
project at
University of Illinois at Urbana–Champaign The University of Illinois Urbana-Champaign (U of I, Illinois, University of Illinois, or UIUC) is a public land-grant research university in Illinois in the twin cities of Champaign and Urbana. It is the flagship institution of the Universit ...
, led by Wen-mei Hwu, was the source of much influential research on this topic. *The PlayDoh architecture from HP-labs was another major research project. *
Gelato Gelato (; ) is the common word in Italian for all kinds of ice cream. In English, it specifically refers to a frozen dessert of Italian origin. Artisanal gelato in Italy generally contains 6%–9% butterfat, which is lower than other styles o ...
was an open source development community in which academic and commercial researchers worked to develop more effective compilers for Linux applications running on Itanium servers.


See also

*
Complex instruction set computer A complex instruction set computer (CISC ) is a computer architecture in which single instructions can execute several low-level operations (such as a load from memory, an arithmetic operation, and a memory store) or are capable of multi-step ...
(CISC) *
Reduced instruction set computer In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comput ...
(RISC) *
Very long instruction word Very long instruction word (VLIW) refers to instruction set architectures designed to exploit instruction level parallelism (ILP). Whereas conventional central processing units (CPU, processor) mostly allow programs to specify instructions to exe ...
(VLIW) *
Computer architecture In computer engineering, computer architecture is a description of the structure of a computer system made from component parts. It can sometimes be a high-level description that ignores details of the implementation. At a more detailed level, t ...
*
Superscalar A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a sup ...
*
Wide-issue A wide-issue architecture is a computer processor that issues more than one instruction per clock cycle. They can be considered in three broad types: * Statically-scheduled superscalar architectures execute instructions in the order presented; ...


References


External links


Historical background for EPIC
* Mark Smotherman (2002)
Understanding EPIC Architectures and Implementations
{{DEFAULTSORT:Explicitly Parallel Instruction Computing Instruction processing Very long instruction word computing