IBM 801

The 801 was an experimental central processing unit (CPU) design developed by IBM during the 1970s. It is considered to be the first modern RISC design, relying on processor registers for all computations and eliminating the many variant addressing modes found in CISC designs. Originally developed as the processor for a telephone switch, it was later used as the basis for a minicomputer and a number of products for IBM's mainframe line. The initial design was a 24-bit processor; it was soon replaced by 32-bit implementations of the same concepts, and the original 24-bit 801 was used only into the early 1980s.

The 801 was extremely influential in the computer market. Armed with huge amounts of performance data, IBM was able to demonstrate that the simple design easily outperformed even the most powerful classic CPU designs, while producing machine code that was only marginally larger than heavily optimized CISC instructions. Applying these same techniques even to existing processors such as the System/370 generally doubled their performance as well. This demonstrated the value of the RISC concept, and all of IBM's future systems were based on the principles developed during the 801 project.

For his work on the 801, John Cocke was recognized with several awards and medals, including the Turing Award in 1987, the National Medal of Technology in 1991, and the National Medal of Science in 1994.


History


Original concept

In 1974, IBM began examining the possibility of constructing a telephone switch to handle one million calls an hour, or about 300 calls per second. They calculated that each call would require 20,000 instructions to complete, and when one added timing overhead and other considerations, such a machine required a performance of about 12 MIPS. This would require a significant advance in performance; their current top-of-the-line machine, the IBM System/370 Model 168 of late 1972, offered about 3 MIPS.
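
As a rough check of the stated figures: 300 calls per second × 20,000 instructions per call comes to 6 million instructions per second, or 6 MIPS; the quoted 12 MIPS target thus implies that the timing overhead and other considerations roughly doubled the raw requirement.
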
The group working on this project at the Thomas J. Watson Research Center, including John Cocke, designed a processor for this purpose. To reach the required performance, they considered the sorts of operations such a machine required and removed any that were not appropriate. This led to the removal of the floating-point unit, for instance, which would not be needed in this application. More critically, they also removed many of the instructions that worked on data in main memory, leaving only those instructions that worked on the internal processor registers, as these were much faster to use and the simple code in a telephone switch could be written to use only these types of instructions. The result of this work was a conceptual design for a simplified processor with the required performance.

The telephone switch project was canceled in 1975, but the team had made considerable progress on the concept, and in October IBM decided to continue it as a general-purpose design. With no obvious project to attach it to, the team decided to call it the "801" after the building they worked in. For the general-purpose role, the team began to consider real-world programs that would be run on a typical minicomputer. IBM had collected enormous amounts of statistical data on the performance of real-world workloads on their machines, and this data demonstrated that over half the time in a typical program was spent performing only five instructions: load value from memory, store value to memory, branch, compare fixed-point numbers, and add fixed-point numbers. This suggested that the same simplified processor design would work just as well for a general-purpose minicomputer as for a special-purpose switch.
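
To see how such a small set of operations can dominate execution time, consider an ordinary inner loop. The following is an illustrative sketch only (not IBM's data or code); the comments map each step onto the five instruction types named above:

```c
/* Illustrative sketch: a typical inner loop, annotated with the five
   operations IBM's workload data showed dominate execution time. */
int sum_array(const int *a, int n) {
    int sum = 0;                    /* value held in a register           */
    for (int i = 0; i < n; i++) {   /* compare fixed-point, then branch   */
        sum = sum + a[i];           /* load from memory, add fixed-point  */
    }
    return sum;                     /* result eventually stored to memory */
}
```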


Microcode is bad

This conclusion flew in the face of contemporary processor design, which was based on the concept of using microcode. IBM had been among the first to make widespread use of this technique as part of their System/360 series. The 360s and 370s came in a variety of performance levels that all ran the same machine language code. On the high-end machines, many of these instructions were implemented directly in hardware, like a floating-point unit, while low-end machines could instead simulate those instructions using a sequence of other instructions. This allowed a single application binary interface to run across the entire line and let customers feel confident that if more performance was ever needed, they could move up to a faster machine without any other changes.

Microcode allowed a simple processor to offer many instructions, which had been used by the designers to implement a wide variety of addressing modes. For instance, an instruction like ADD might have a dozen versions: one that adds two numbers in internal registers, one that adds a register to a value in memory, one that adds two values from memory, and so on. This allowed the programmer to select the exact variation needed for any particular task. The processor would read that instruction and use microcode to break it into a series of internal instructions. For instance, adding two numbers in memory might be implemented by loading those two numbers into registers, adding them, and then saving the result back out again.

The team noticed a side effect of this concept: when faced with the plethora of possible versions of a given instruction, compiler authors would almost always pick a single version. This was almost always the one that was implemented in hardware on the low-end machines, which ensured that the machine code generated by the compiler would run as fast as possible on the entire lineup. While other versions of an instruction might run even faster on a machine that implemented them in hardware, the complexity of knowing which one to pick on an ever-changing list of machines made this extremely unattractive, and compiler authors largely ignored these possibilities. As a result, the majority of the instructions available in the instruction set were never used in compiled programs. And it was here that the team made the key realization of the 801 project:

Imposing microcode between a computer and its users imposes an expensive overhead in performing the most frequently executed instructions.

Microcode takes a non-zero time to examine the instruction before it is performed. The same underlying processor with the microcode removed would eliminate this overhead and run those instructions faster. Since microcode essentially ran small subroutines dedicated to a particular hardware implementation, it was ultimately performing the same basic task as the compiler: implementing higher-level instructions as a sequence of machine-specific instructions. Simply removing the microcode and implementing its job in the compiler could result in a faster machine.

One concern was that programs written for such a machine would take up more memory; some tasks that could be accomplished with a single instruction on the 370 would have to be expressed as multiple instructions on the 801. For instance, adding two numbers from memory would require two load-to-register instructions, a register-to-register add, and then a store-to-memory. This could potentially slow the system overall if it had to spend more time reading instructions from memory than it formerly took to decode them. As the team continued work on the design and improved their compilers, they found that overall program length continued to fall, eventually becoming roughly the same length as programs written for the 370.
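
The trade-off can be made concrete with a minimal sketch (hypothetical variable names; the comments show the register-oriented sequence described above, which replaces a single CISC memory-to-memory add):

```c
#include <stdio.h>

int main(void) {
    int mem_a = 2, mem_b = 3, mem_c;   /* values living in main memory */

    /* One CISC instruction, an ADD on two memory operands, becomes
       four simple register instructions on the 801: */
    int r1 = mem_a;    /* load  r1, a   -- load value from memory     */
    int r2 = mem_b;    /* load  r2, b   -- load value from memory     */
    r1 = r1 + r2;      /* add   r1, r2  -- register-to-register add   */
    mem_c = r1;        /* store r1, c   -- store value back to memory */

    printf("%d\n", mem_c);   /* prints 5 */
    return 0;
}
```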


First implementations

The initially proposed architecture was a machine with sixteen 24-bit registers and without virtual memory. It used a two-operand instruction format, so that instructions were generally of the form A = A + B, as opposed to the three-operand format A = B + C. The resulting CPU was operational by the summer of 1980. It was implemented using Motorola MECL-10K discrete component technology on large wire-wrapped custom boards, was clocked at 66 ns cycles (approximately 15.15 MHz), and delivered approximately 15 MIPS.
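
The stated figures are mutually consistent: a 66 ns cycle time corresponds to a clock rate of 1/(66 ns) ≈ 15.15 MHz, and the matching rating of about 15 MIPS implies the design completed roughly one instruction per cycle.
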
The 801 architecture was used in a variety of IBM devices, including channel controllers for their S/370 mainframes (such as the IBM 3090), various networking devices, and as a vertical microcode execution unit in the 9373 and 9375 processors of the IBM 9370 mainframe family. The original version of the 801 architecture was the basis for the architecture of the IBM ROMP microprocessor used in the IBM RT PC workstation and several experimental computers from IBM Research. A derivative of the 801 architecture with 32-bit addressing, named ''Iliad'', was intended to serve as the primary processor of the unsuccessful Fort Knox midrange system project.


Later modifications

Having been originally designed for a limited-function system, the 801 design lacked a number of features seen on larger machines. Notable among these was the lack of hardware support for virtual memory, which was not needed for the controller role and had been implemented in software on early 801 systems that needed it. For more widespread use, hardware support was a must-have feature. Additionally, by the 1980s the computer world as a whole was moving towards 32-bit systems, and there was a desire to do the same with the 801.

Moving to a 32-bit format had another significant advantage. In practice, the two-operand format had proven difficult to use in typical math code. Ideally, both input operands would remain in registers where they could be re-used in subsequent operations, but because the output of an operation overwrote one of its inputs, one of the values often had to be re-loaded from memory. The extra bits in a 32-bit instruction word allowed an additional register to be specified, so that the output of such operations could be directed to a separate register. The larger instruction word also allowed the number of registers to be increased from sixteen to thirty-two, a change clearly suggested by examination of 801 code. In spite of the expansion of the instruction words from 24 to 32 bits, programs did not grow by the corresponding 33%, because the loads and stores avoided by these two changes offset the larger words.
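
The reload problem can be sketched as follows (hypothetical values; the comments show the register-level effect of each step under the two-operand format):

```c
#include <stdio.h>

int main(void) {
    int mem_a = 10, mem_b = 3;

    /* Two-operand form: the destination is also a source, so computing
       a + b as "add r1, r2" destroys the old value of a held in r1. */
    int r1 = mem_a;      /* load r1, a                                */
    int r2 = mem_b;      /* load r2, b                                */
    r1 = r1 + r2;        /* add  r1, r2  -- a's value in r1 is gone   */

    /* Re-using a (say, for a - b) now forces an extra load: */
    int r3 = mem_a;      /* reload forced by the two-operand format   */
    int r4 = r3 - r2;    /* with a three-operand "add r4, r1, r2"
                            earlier, r1 would have stayed intact and
                            this reload would have been unnecessary   */

    printf("%d %d\n", r1, r4);   /* prints 13 7 */
    return 0;
}
```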
Other desirable additions included instructions for working with string data encoded in "packed" format, with several ASCII characters in a single memory word, and instructions for working with binary-coded decimal, including an adder that could carry across four-bit decimal numbers.
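
As an illustration of what carrying across four-bit decimal numbers involves, the following sketch shows the algorithm such an adder implements, digit by digit; it is not the 801's actual circuit, and the function name is hypothetical:

```c
#include <stdio.h>
#include <stdint.h>

/* Add two 8-digit packed-BCD words, propagating a decimal carry
   across each 4-bit digit. */
uint32_t bcd_add(uint32_t a, uint32_t b) {
    uint32_t result = 0;
    unsigned carry = 0;
    for (int shift = 0; shift < 32; shift += 4) {
        unsigned digit = ((a >> shift) & 0xF) + ((b >> shift) & 0xF) + carry;
        carry = digit > 9;            /* decimal carry into the next digit */
        if (carry) digit -= 10;
        result |= (uint32_t)digit << shift;
    }
    return result;
}

int main(void) {
    /* 1234 + 5678 = 6912, computed on packed-BCD operands */
    printf("%08X\n", bcd_add(0x00001234, 0x00005678));   /* 00006912 */
    return 0;
}
```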
When the new version of the 801 was run as a simulator on the 370, the team was surprised to find that code compiled to the 801 and run in the simulator would often run faster than the same source code compiled directly to 370 machine code using the 370's PL/I compiler. When they ported their experimental "PL.8" language back to the 370 and compiled applications with it, those applications also ran faster than existing PL/I code, as much as three times as fast. This was due to the compiler making RISC-like decisions about how to compile the code to internal registers, thereby optimizing out as many memory accesses as possible. These accesses were just as expensive on the 370 as on the 801, but their cost was normally hidden by the simplicity of a single line of CISC code. The PL.8 compiler was much more aggressive about avoiding loads and stores, resulting in higher performance even on a CISC processor.


The Cheetah, Panther, and America projects

In the early 1980s, the lessons learned on the 801 were combined with those from the IBM Advanced Computer Systems project, resulting in an experimental processor called "Cheetah". Cheetah was a 2-way superscalar processor, which evolved into a processor called "Panther" in 1985, and finally into a 4-way superscalar design called "America" in 1986. This was a three-chip processor set including an instruction processor that fetches and decodes instructions, a fixed-point processor that shares duty with the instruction processor, and a floating-point processor for those systems that require it. Designed by the 801 team, the final design was sent to IBM's Austin office in 1986, where it was developed into the IBM RS/6000 system. The RS/6000 running at 25 MHz was one of the fastest machines of its era; it outperformed other RISC machines by two to three times on common tests and trivially outperformed older CISC systems. After the RS/6000, the company turned its attention to a version of the 801 concepts that could be efficiently fabricated at various scales. The result was the IBM POWER instruction set architecture and the PowerPC offshoot.


Recognition

For his work on the 801, John Cocke was awarded several awards and medals:

* 1985: Eckert–Mauchly Award
* 1987: A.M. Turing Award
* 1989: Computer Pioneer Award
* 1991: National Medal of Technology
* 1994: IEEE John von Neumann Medal
* 1994: National Medal of Science
* 2000: Benjamin Franklin Medal (The Franklin Institute)

Michael J. Flynn views the 801 as the first RISC.




Further reading

*"Altering Computer Architecture is Way to Raise Throughput, Suggests IBM Researchers". ''
Electronics The field of electronics is a branch of physics and electrical engineering that deals with the emission, behaviour and effects of electrons using electronic devices. Electronics uses active devices to control electron flow by amplification ...
'' V. 49, N. 25 (23 December 1976), pp. 3031. *V. McLellan: "IBM Mini a Radical Departure". ''
Datamation ''Datamation'' is a computer magazine that was published in print form in the United States between 1957 and 1998,
'' V. 25, N. 11 (October 1979), pp. 5355. * *


External links


*The 801 Minicomputer - An Overview
*IBM System 801 Principles of Operation, Version 2
*801 I/O Subsystem Definition
*IBM Archives: A Brief History of RISC, the IBM RS/6000 and the IBM eServer pSeries