
The Harvard architecture is a computer architecture with separate storage and signal pathways for instructions and data. It is often contrasted with the von Neumann architecture, where program instructions and data share the same memory and pathways. The Harvard architecture is often used in real-time processing or low-power applications.
The term is often stated as having originated from the Harvard Mark I relay-based computer, which stored instructions on punched tape (24 bits wide) and data in electro-mechanical counters. These early machines had data storage entirely contained within the central processing unit, and provided no access to the instruction storage as data. Programs needed to be loaded by an operator; the processor could not initialize itself. However, in the only peer-reviewed paper on the topic, published in 2022, the author states that:
* 'The term "Harvard architecture" was coined decades later, in the context of microcontroller design' and only 'retrospectively applied to the Harvard machines and subsequently applied to RISC microprocessors with separated caches';
* 'The so-called "Harvard" and "von Neumann" architectures are often portrayed as a dichotomy, but the various devices labeled as the former have far more in common with the latter than they do with each other';
* 'In short, [t]he Harvard architecture isn't an architecture and didn't derive from work at Harvard'.
Modern processors appear to the user to be systems with a von Neumann architecture, with the program code stored in the same main memory as the data. For performance reasons, internally and largely invisible to the user, most designs have separate processor caches for instructions and data, with separate pathways into the processor for each. This is one form of what is known as the modified Harvard architecture.
The Harvard architecture has historically, and traditionally, been split into two address spaces, but designs with three (for example, one instruction space and two data spaces, all accessed in each cycle) also exist, though they are rare.
Memory details
In a Harvard architecture, there is no need to make the two memories share characteristics. In particular, the word width, timing, implementation technology, and memory address structure can differ. In some systems, instructions for pre-programmed tasks can be stored in read-only memory while data memory generally requires read-write memory. In some systems, there is much more instruction memory than data memory, so instruction addresses are wider than data addresses.
Contrast with von Neumann architectures
In a system with a pure von Neumann architecture, instructions and data are stored in the same memory, so instructions are fetched over the same data path used to fetch data. This means that a CPU cannot simultaneously read an instruction and read or write data from or to the memory. In a computer using the Harvard architecture, the CPU can both read an instruction and perform a data memory access at the same time, even without a cache ("386 vs. 030: the Crowded Fast Lane", ''Dr. Dobb's Journal'', January 1988). A Harvard architecture computer can thus be faster for a given circuit complexity because instruction fetches and data accesses do not contend for a single memory pathway.
Also, a Harvard architecture machine has distinct code and data address spaces: instruction address zero is not the same as data address zero. Instruction address zero might identify a twenty-four-bit value, while data address zero might indicate an eight-bit byte that is not part of that twenty-four-bit value.
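As an illustration (not drawn from the article itself), the sketch below assumes an 8051-family microcontroller and the SDCC compiler, whose memory-space keywords expose the separate code and internal-data address spaces; two objects can share the same numeric address while living in different memories, and the compiler emits different instructions (MOVC versus MOV) to reach each one.

```c
/* Minimal sketch for an 8051-family MCU built with SDCC (assumed toolchain).
 * Code space and internal data space are distinct, so "address 0x0000" is
 * ambiguous until you say which memory you mean. */
#include <stdint.h>

__code const uint8_t table[4] = {1, 2, 3, 4}; /* resides in program (code) memory */
__idata uint8_t buffer[4];                    /* resides in internal RAM          */

void copy_table(void)
{
    /* Reads of table[] compile to MOVC (code-space) accesses, while writes to
     * buffer[] use ordinary MOV (data-space) accesses, even if both arrays
     * happen to start at the same numeric address in their own spaces. */
    for (uint8_t i = 0; i < 4; i++)
        buffer[i] = table[i];
}
```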
Contrast with modified Harvard architecture
A modified Harvard architecture machine is very much like a Harvard architecture machine, but it relaxes the strict separation between instruction and data while still letting the CPU concurrently access two (or more) memory buses. The most common modification includes separate instruction and data caches backed by a common address space. While the CPU executes from cache, it acts as a pure Harvard machine. When accessing backing memory, it acts like a von Neumann machine (where code can be moved around like data, which is a powerful technique). This modification is widespread in modern processors, such as the ARM architecture, Power ISA and x86 processors. It is sometimes loosely called a Harvard architecture, overlooking the fact that it is actually "modified".
Another modification provides a pathway between the instruction memory (such as ROM or flash memory) and the CPU to allow words from the instruction memory to be treated as read-only data. This technique is used in some microcontrollers, including the Atmel AVR. This allows constant data, such as text strings or function tables, to be accessed without first having to be copied into data memory, preserving scarce (and power-hungry) data memory for read/write variables. Special machine language instructions are provided to read data from the instruction memory, or the instruction memory can be accessed using a peripheral interface. (This is distinct from instructions which themselves embed constant data, although for individual constants the two mechanisms can substitute for each other.)
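As a concrete sketch of this technique (assuming an AVR target and the avr-gcc/avr-libc toolchain; the article names only the mechanism, not an API), avr-libc's <avr/pgmspace.h> places constants in flash with PROGMEM and reads them back through the LPM instruction via pgm_read_byte():

```c
/* Hedged example: keep a constant string in AVR program (flash) memory so it
 * consumes no SRAM, then copy it out through the instruction-memory read path. */
#include <avr/pgmspace.h>
#include <stdint.h>

static const char greeting[] PROGMEM = "Hello from flash";

/* dst must have room for sizeof(greeting) bytes, including the trailing '\0'.
 * pgm_read_byte() expands to an LPM instruction that reads code space. */
void load_greeting(char *dst)
{
    for (uint8_t i = 0; i < sizeof(greeting); i++)
        dst[i] = pgm_read_byte(&greeting[i]);
}
```

avr-libc also offers helpers such as strcpy_P() that wrap the same mechanism.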
Speed
In recent years, the speed of the CPU has grown many times relative to the access speed of main memory. Care needs to be taken to reduce the number of times main memory is accessed in order to maintain performance. If, for instance, every instruction run in the CPU requires an access to memory, the computer gains nothing from increased CPU speed, a problem referred to as being memory bound.
It is possible to make extremely fast memory, but this is only practical for small amounts of memory, for cost, power and signal routing reasons. The solution is to provide a small amount of very fast memory known as a CPU cache which holds recently accessed data. As long as the data that the CPU needs is in the cache, the performance is much higher than it is when the CPU has to get the data from the main memory. On the other hand, a cache is limited in size, helps most with repeatedly accessed code and data, and brings its own complications.
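The effect is easy to observe in ordinary code. The following sketch (plain C, not tied to any particular architecture; absolute timings will vary from machine to machine) walks the same large buffer twice, once sequentially and once with a stride large enough to defeat the cache, so the second walk becomes memory bound:

```c
/* Illustration of memory-bound behaviour: identical work, different access
 * patterns.  The sequential walk reuses each fetched cache line; the strided
 * walk touches a new line on nearly every access and stalls on main memory. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1u << 24)                 /* 16M ints = 64 MiB, larger than any cache */

static long walk(const int *a, size_t stride)
{
    long sum = 0;
    for (size_t start = 0; start < stride; start++)  /* visit every element once */
        for (size_t i = start; i < N; i += stride)
            sum += a[i];
    return sum;
}

int main(void)
{
    int *a = malloc(N * sizeof *a);
    if (!a)
        return 1;
    for (size_t i = 0; i < N; i++)   /* fill the buffer so pages really exist */
        a[i] = 1;

    clock_t t0 = clock();
    long s1 = walk(a, 1);            /* cache-friendly */
    clock_t t1 = clock();
    long s2 = walk(a, 4096);         /* cache-hostile  */
    clock_t t2 = clock();

    printf("sums %ld %ld  sequential %.2fs  strided %.2fs\n", s1, s2,
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC);
    free(a);
    return 0;
}
```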
Internal versus external design
Modern high-performance CPU chip designs incorporate aspects of both Harvard and von Neumann architecture. In particular, the "split cache" version of the modified Harvard architecture is very common. CPU cache memory is divided into an instruction cache and a data cache. Harvard architecture is used as the CPU accesses the cache. In the case of a cache miss, however, the data is retrieved from the main memory, which is not formally divided into separate instruction and data sections, although it may well have separate memory controllers used for concurrent access to RAM, ROM and (NOR) flash memory.
Thus, while a von Neumann architecture is visible in some contexts, such as when data and code come through the same memory controller, the hardware implementation gains the efficiencies of the Harvard architecture for cache accesses and at least some main memory accesses.
In addition, CPUs often have write buffers which let them proceed after writes to non-cached regions. The von Neumann nature of memory is then visible when instructions are written as data by the CPU, and software must ensure that the caches (data and instruction) and the write buffer are synchronized before trying to execute those just-written instructions.
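A minimal sketch of that synchronization (assuming a Unix-like system and GCC or Clang; the article does not name a specific API) is shown below. After writing freshly generated instructions as ordinary data, the program flushes the data cache and invalidates the instruction cache for that range with the compiler builtin __builtin___clear_cache before jumping to the new code; on strongly coherent designs such as x86 the builtin compiles to little or nothing.

```c
/* Hedged example of executing just-written code on a split-cache (modified
 * Harvard) CPU.  Assumes POSIX mmap and a platform that allows writable,
 * executable mappings; 'code' must already contain valid machine instructions
 * for the target architecture. */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

typedef int (*generated_fn)(void);

int run_generated(const unsigned char *code, size_t len)
{
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, code, len);              /* instructions written as data */

    /* Make the instruction side of the memory system see the new bytes. */
    __builtin___clear_cache((char *)buf, (char *)(buf + len));

    int result = ((generated_fn)buf)();  /* now safe to execute */
    munmap(buf, len);
    return result;
}
```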
Modern uses of the Harvard architecture
The principal advantage of the pure Harvard architecture (simultaneous access to more than one memory system) has been reduced by modified Harvard processors using modern CPU cache systems. Relatively pure Harvard architecture machines are used mostly in applications where trade-offs, like the cost and power savings from omitting caches, outweigh the programming penalties from featuring distinct code and data address spaces.
* Digital signal processors (DSPs) generally execute small, highly optimized audio or video processing algorithms using a Harvard architecture. They avoid caches because their behavior must be extremely reproducible. The difficulties of coping with multiple address spaces are of secondary concern to speed of execution. Consequently, some DSPs feature multiple data memories in distinct address spaces to facilitate SIMD and VLIW processing. Texas Instruments TMS320 C55x processors, for one example, feature multiple parallel data buses (two write, three read) and one instruction bus.
* Microcontrollers are characterized by having small amounts of program (flash memory) and data (SRAM) memory, and take advantage of the Harvard architecture to speed processing by concurrent instruction and data access. The separate storage means the program and data memories may feature different bit widths, for example using 16-bit-wide instructions and 8-bit-wide data. It also means that instruction prefetch can be performed in parallel with other activities. Examples include the PIC by Microchip Technology, Inc. and the AVR by Atmel Corp. (now part of Microchip Technology).
Even in these cases, it is common to employ special instructions to access program memory as though it were data, for read-only tables or for reprogramming; those processors are modified Harvard architecture processors.
External links
* Harvard Architecture
* Harvard vs von Neumann Architectures
* Difference Between Harvard Architecture And Von Neumann Architecture