
ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of RISC instruction set architectures (ISAs) for computer processors. Arm Holdings develops the ISAs and licenses them to other companies, which build the physical devices that use the instruction set. It also designs and licenses cores that implement these ISAs. Due to their low cost, low power consumption, and low heat generation, ARM processors are useful for light, portable, battery-powered devices, including smartphones, laptops, and tablet computers, as well as embedded systems. However, ARM processors are also used for desktops and servers, including Fugaku, the world's fastest supercomputer from 2020 to 2022. With over 230 billion ARM chips produced, ARM is the most widely used family of instruction set architectures.

There have been several generations of the ARM design. The original ARM1 used a 32-bit internal structure but had a 26-bit address space that limited it to 64 MB of main memory. This limitation was removed in the ARMv3 series, which has a 32-bit address space, and several additional generations up to ARMv7 remained 32-bit. Released in 2011, the ARMv8-A architecture added support for a 64-bit address space and 64-bit arithmetic with its new 32-bit fixed-length instruction set. Arm Holdings has also released a series of additional instruction sets for different roles: the "Thumb" extensions add both 32- and 16-bit instructions for improved code density, while Jazelle added instructions for directly handling Java bytecode. More recent changes include the addition of simultaneous multithreading (SMT) for improved performance or fault tolerance.


History


BBC Micro

Acorn Computers' first widely successful design was the BBC Micro, introduced in December 1981. This was a relatively conventional machine based on the MOS Technology 6502 CPU but ran at roughly double the performance of competing designs like the Apple II due to its use of faster dynamic random-access memory (DRAM). Typical DRAM of the era ran at about 2 MHz; Acorn arranged a deal with Hitachi for a supply of faster 4 MHz parts.

Machines of the era generally shared memory between the processor and the framebuffer, which allowed the processor to quickly update the contents of the screen without having to perform separate input/output (I/O). As the timing of the video display is exacting, the video hardware had to have priority access to that memory. Due to a quirk of the 6502's design, the CPU left the memory untouched for half of the time. Thus, by running the CPU at 1 MHz, the video system could read data during those down times, taking up the total 2 MHz bandwidth of the RAM. In the BBC Micro, the use of 4 MHz RAM allowed the same technique to be used, but running at twice the speed. This allowed it to outperform any similar machine on the market.
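The bandwidth sharing described above comes down to simple arithmetic. The sketch below is only an illustrative model: it assumes the RAM's access slots are split evenly between the video hardware and a 6502-style CPU, and the function name `cpu_clock_mhz` is hypothetical rather than anything from Acorn's design.

```python
def cpu_clock_mhz(ram_rate_mhz: float) -> float:
    """Return the CPU clock when video and CPU take alternate RAM slots.

    Assumption for illustration: the CPU touches memory during only half
    of each cycle, so the video hardware can use the other half and the
    CPU ends up clocked at half the RAM access rate.
    """
    return ram_rate_mhz / 2


# 2 MHz DRAM, typical of the era: the CPU is limited to 1 MHz.
typical = cpu_clock_mhz(2.0)

# 4 MHz DRAM, as used in the BBC Micro: the CPU can run at 2 MHz.
bbc_micro = cpu_clock_mhz(4.0)

# Ratio between the two, matching the "twice the speed" claim in the text.
speedup = bbc_micro / typical
```

Under these assumptions the model reproduces the figures in the text: doubling the RAM rate doubles the CPU clock while the video system keeps its guaranteed share of memory accesses.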


Acorn Business Computer

1981 was also the year that the IBM Personal Computer was introduced. Using the recently introduced Intel 8088, a 16-bit CPU compared to the 6502's 8-bit design, it offered higher overall performance. Its introduction changed the desktop computer market radically: what had been largely a hobby and gaming market emerging over the prior five years began to turn into a market for a must-have business tool, where the earlier 8-bit designs simply could not compete. Even newer 32-bit designs were also coming to market, such as the Motorola 68000 and National Semiconductor NS32016.

Acorn began considering how to compete in this market and produced a new paper design named the Acorn Business Computer. They set themselves the goal of producing a machine with ten times the performance of the BBC Micro, but at the same price. This would outperform and underprice the PC. At the same time, the recent introduction of the Apple Lisa brought the graphical user interface (GUI) concept to a wider audience and suggested the future belonged to machines with a GUI. The Lisa, however, cost $9,995, as it was packed with support chips, large amounts of memory, and a hard disk drive, all very expensive then.

The engineers then began studying all of the CPU designs available. Their conclusion about the existing 16-bit designs was that they were a lot more expensive and were still "a bit crap", offering only slightly higher performance than their BBC Micro design. They also almost always demanded a large number of support chips to operate even at that level, which drove up the cost of the computer as a whole. These systems would simply not hit the design goal. They also considered the new 32-bit designs, but these cost even more and had the same issues with support chips. According to Sophie Wilson, all the processors tested at that time performed about the same, with about a 4 Mbit/s bandwidth.

Two key events led Acorn down the path to ARM. One was the publication of a series of reports from the University of California, Berkeley, which suggested that a simple chip design could nevertheless have extremely high performance, much higher than the latest 32-bit designs on the market. The second was a visit by Steve Furber and Sophie Wilson to the Western Design Center, a company run by Bill Mensch and his sister, which had become the logical successor to the MOS team and was offering new versions like the WDC 65C02. The Acorn team saw high school students producing chip layouts on Apple II machines, which suggested that anyone could do it. In contrast, a visit to another design firm working on a modern 32-bit CPU revealed a team with over a dozen members who were already on revision H of their design, and yet it still contained bugs. This cemented their late 1983 decision to begin their own CPU design, the Acorn RISC Machine.


Design concepts

The original Berkeley RISC designs were in some sense teaching systems, not designed specifically for outright performance. To the RISC's basic register-heavy and load/store concepts, ARM added a number of the well-received design notes of the 6502. Primary among them was the ability to quickly service interrupts, which allowed the machines to offer reasonable input/output performance with no added external hardware. To offer interrupts with performance similar to the 6502's, the ARM design limited its physical address space to 64 MB of total addressable space, requiring 26 bits of address. As instructions were 4 bytes (32 bits) long, and required to be aligned on 4-byte boundaries, the lower 2 bits of an instruction address were always zero. This meant the program counter (PC) only needed to be 24 bits, allowing it to be stored along with the eight bits of processor flags in a single 32-bit register. That meant that upon receiving an interrupt, the entire machine state could be saved in a single operation, whereas had the PC been a full 32-bit value, it would require separate operations to store the PC and the status flags. This decision halved the interrupt overhead.

Another change, and among the most important in terms of practical real-world performance, was the modification of the instruction set to take advantage of page mode DRAM. Recently introduced, page mode allowed subsequent accesses of memory to run twice as fast if they were roughly in the same location, or "page", in the DRAM chip. Berkeley's design did not consider page mode and treated all memory equally. The ARM design added special vector-like memory access instructions, the "S-cycles", that could be used to fill or save multiple registers in a single page using page mode. This doubled memory performance when they could be used, and was especially important for graphics performance.

The Berkeley RISC designs used register windows to reduce the number of register saves and restores performed in procedure calls; the ARM design did not adopt this.

Wilson developed the instruction set, writing a simulation of the processor in BBC BASIC that ran on a BBC Micro with a second 6502 processor. This convinced Acorn engineers they were on the right track. Wilson approached Acorn's CEO, Hermann Hauser, and requested more resources. Hauser gave his approval and assembled a small team to design the actual processor based on Wilson's ISA. The official Acorn RISC Machine project started in October 1983.
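The packing of the program counter and the processor flags into one 32-bit register, described above, can be sketched in a few lines of Python. This is a minimal illustration, not a reference model: the helper names `pack_r15` and `unpack_r15` are hypothetical, and the bit positions (six condition and interrupt-disable flags in bits 31 to 26, the word-aligned PC in bits 25 to 2, two mode bits in bits 1 to 0) are an assumption based on the commonly documented 26-bit ARM layout, which the text above does not spell out.

```python
ADDRESS_BITS = 26
ADDRESS_SPACE = 1 << ADDRESS_BITS   # 2**26 bytes = 64 MB

PC_MASK = 0x03FFFFFC     # bits 25..2: the 24 significant PC bits
FLAGS_SHIFT = 26         # bits 31..26: six flag bits (assumed N Z C V I F)
FLAGS_MASK = 0x3F
MODE_MASK = 0x3          # bits 1..0: processor mode (assumed)


def pack_r15(pc: int, flags: int, mode: int) -> int:
    """Combine a word-aligned PC, 6 flag bits and 2 mode bits into 32 bits."""
    if not 0 <= pc < ADDRESS_SPACE or pc & 0x3:
        raise ValueError("PC must be word-aligned and below 64 MB")
    if not 0 <= flags <= FLAGS_MASK or not 0 <= mode <= MODE_MASK:
        raise ValueError("flags must fit in 6 bits and mode in 2 bits")
    # The two low PC bits are always zero, so the mode bits can live there.
    return (flags << FLAGS_SHIFT) | pc | mode


def unpack_r15(word: int) -> tuple[int, int, int]:
    """Split a packed 32-bit value back into (pc, flags, mode)."""
    pc = word & PC_MASK
    flags = (word >> FLAGS_SHIFT) & FLAGS_MASK
    mode = word & MODE_MASK
    return pc, flags, mode


# Example: PC at 0x8000, only the top flag bit set, mode bits 0b11.
saved = pack_r15(0x8000, 0b100000, 0b11)
restored = unpack_r15(saved)
```

Because everything fits in one word, saving `saved` on interrupt entry preserves both the return address and the status bits in a single store, which is the point the text makes about halving the interrupt overhead.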


ARM1

Acorn chose VLSI Technology as the "silicon partner", as they were a source of ROMs and custom chips for Acorn. Acorn provided the design and VLSI provided the layout and production. The first samples of ARM silicon worked properly when first received and tested on 26 April 1985. Known as ARM1, these versions ran at 6 MHz.

The first ARM application was as a second processor for the BBC Micro, where it helped in developing simulation software to finish development of the support chips (VIDC, IOC, MEMC), and sped up the CAD software used in ARM2 development. Wilson subsequently rewrote BBC BASIC in ARM assembly language. The in-depth knowledge gained from designing the instruction set enabled the code to be very dense, making ARM BBC BASIC an extremely good test for any ARM emulator.


ARM2

The results of the simulations on the ARM1 boards led to the late 1986 introduction of the ARM2 design running at 8 MHz, and the early 1987 speed-bumped version at 10 to 12 MHz. A significant change in the underlying architecture was the addition of a Booth multiplier, whereas formerly multiplication had to be carried out in software. Further, a new Fast Interrupt reQuest mode, FIQ for short, allowed registers 8 through 14 to be replaced as part of the interrupt itself. This meant FIQ requests did not have to save out their registers, further speeding interrupts.

The first use of the ARM2 was in the ARM Evaluation System, supplied as a second processor for BBC Micro and Master machines from July 1986, in internal Acorn A500 development machines, and in the Acorn Archimedes personal computer models A305, A310, and A440, launched on 6 June 1987.

According to the Dhrystone benchmark, the ARM2 was roughly seven times the performance of a typical 7 MHz 68000-based system like the Amiga or Macintosh SE. It was twice as fast as an Intel 80386 running at 16 MHz, and about the same speed as a multi-processor VAX-11/784 superminicomputer. The only systems that beat it were the Sun SPARC and MIPS R2000 RISC-based workstations. Further, as the CPU was designed for high-speed I/O, it dispensed with many of the support chips seen in these machines; notably, it lacked any dedicated direct memory access (DMA) controller, which was often found on workstations. The graphics system was also simplified based on the same set of underlying assumptions about memory and timing. The result was a dramatically simplified design, offering performance on par with expensive workstations but at a price point similar to contemporary desktops.

The ARM2 featured a 32-bit data bus, a 26-bit address space and 27 32-bit registers, of which 16 are accessible at any one time (including the PC). The ARM2 had a transistor count of just 30,000, compared to Motorola's six-year-older 68000 model with around 68,000. Much of this simplicity came from the lack of microcode, which represents about one-quarter to one-third of the 68000's transistors, and, as with most CPUs of the day, the lack of a cache. This simplicity enabled the ARM2 to have low power consumption and simpler thermal packaging by having fewer powered transistors. Nevertheless, ARM2 offered better performance than the contemporary 1987 IBM PS/2 Model 50, which initially utilised an Intel 80286, offering 1.8 MIPS at 10 MHz, and later in 1987, the 2 MIPS of the PS/2 70, with its Intel 386 DX at 16 MHz.

A successor, ARM3, was produced with a 4 KB cache, which further improved performance. The address bus was extended to 32 bits in the ARM6, but program code still had to lie within the first 64 MB of memory in 26-bit compatibility mode, due to the reserved bits for the status flags.
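The register figures quoted above (27 registers, 16 visible at once, with FIQ replacing registers 8 through 14) can be illustrated with a small model of register banking. This is a sketch under stated assumptions: the text only confirms the FIQ bank, while the private R13 and R14 for the IRQ and supervisor modes, the mode names, and the `physical_register` helper are assumptions drawn from the usual description of the 26-bit ARM programmer's model.

```python
# Registers that each privileged mode replaces with its own private copies.
# Only the FIQ entry is stated in the text; the IRQ and SVC entries are
# assumptions made for this illustration.
BANKED = {
    "fiq": range(8, 15),    # R8..R14
    "irq": range(13, 15),   # R13, R14
    "svc": range(13, 15),   # R13, R14
}
MODES = ("usr", "fiq", "irq", "svc")


def physical_register(mode: str, n: int) -> str:
    """Name the physical register that Rn refers to in the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n in BANKED.get(mode, ()):
        return f"R{n}_{mode}"   # private copy, swapped in on mode entry
    return f"R{n}"              # shared with user mode


def all_physical_registers() -> set[str]:
    """Collect every distinct physical register across all modes."""
    return {physical_register(m, n) for m in MODES for n in range(16)}


total = len(all_physical_registers())            # 16 + 7 + 2 + 2
visible = len({physical_register("fiq", n) for n in range(16)})
```

In this model an FIQ handler sees its own R8 to R14 the moment the mode changes, so it can use seven registers without first saving the interrupted program's values, which is the speed-up the text attributes to FIQ.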


Advanced RISC Machines Ltd. – ARM6

In the late 1980s, Apple Computer and VLSI Technology started working with Acorn on newer versions of the ARM core. In 1990, Acorn spun off the design team into a new company named Advanced RISC Machines Ltd., which became ARM Ltd. when its parent company, Arm Holdings plc, floated on the London Stock Exchange and Nasdaq in 1998. The new Apple–ARM work would eventually evolve into the ARM6, first released in early 1992. Apple used the ARM6-based ARM610 as the basis for their Apple Newton PDA.


Early licensees

In 1994, Acorn used the ARM610 as the main central processing unit (CPU) in their RiscPC computers. DEC licensed the ARMv4 architecture and produced the StrongARM. At 233 MHz, this CPU drew only one watt (newer versions draw far less). This work was later passed to Intel as part of a lawsuit settlement, and Intel took the opportunity to supplement their i960 line with the StrongARM. Intel later developed its own high-performance implementation named XScale, which it has since sold to Marvell. The transistor count of the ARM core remained essentially the same throughout these changes; ARM2 had 30,000 transistors, while ARM6 grew only to 35,000.


Market share

In 2005, about 98% of all mobile phones sold used at least one ARM processor. In 2010, producers of chips based on ARM architectures reported shipments of 6.1 billion ARM-based processors, representing 95% of smartphones, 35% of digital televisions and set-top boxes, and 10% of mobile computers. In 2011, the 32-bit ARM architecture was the most widely used architecture in mobile devices and the most popular 32-bit one in embedded systems. In 2013, 10 billion were produced and "ARM-based chips are found in nearly 60 percent of the world's mobile devices".


Licensing


Core licence

Arm Holdings's primary business is selling IP cores, which licensees use to create
microcontroller A microcontroller (MC, uC, or μC) or microcontroller unit (MCU) is a small computer on a single integrated circuit. A microcontroller contains one or more CPUs (processor cores) along with memory and programmable input/output peripherals. Pro ...
s (MCUs), CPUs, and systems-on-chips based on those cores. The original design manufacturer combines the ARM core with other parts to produce a complete device, typically one that can be built in existing semiconductor fabrication plants (fabs) at low cost and still deliver substantial performance. The most successful implementation has been the ARM7TDMI with hundreds of millions sold. Atmel has been a precursor design center in the ARM7TDMI-based embedded system. The ARM architectures used in smartphones, PDAs and other
mobile device A mobile device or handheld device is a computer small enough to hold and operate in hand. Mobile devices are typically battery-powered and possess a flat-panel display and one or more built-in input devices, such as a touchscreen or keypad. ...
s range from ARMv5 to . In 2009, some manufacturers introduced netbooks based on ARM architecture CPUs, in direct competition with netbooks based on
Intel Atom Intel Atom is a line of IA-32 and x86-64 instruction set ultra-low-voltage processors by Intel Corporation designed to reduce electric consumption and power dissipation in comparison with ordinary processors of the Intel Core series. Atom is m ...
. Arm Holdings offers a variety of licensing terms, varying in cost and deliverables. Arm Holdings provides to all licensees an integratable hardware description of the ARM core as well as complete software development toolset (
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
,
debugger A debugger is a computer program used to test and debug other programs (the "target" programs). Common features of debuggers include the ability to run or halt the target program using breakpoints, step through code line by line, and display ...
,
software development kit A software development kit (SDK) is a collection of software development tools in one installable package. They facilitate the creation of applications by having a compiler, debugger and sometimes a software framework. They are normally specific t ...
), and the right to sell manufactured
silicon Silicon is a chemical element; it has symbol Si and atomic number 14. It is a hard, brittle crystalline solid with a blue-grey metallic lustre, and is a tetravalent metalloid (sometimes considered a non-metal) and semiconductor. It is a membe ...
containing the ARM CPU. SoC packages integrating ARM's core designs include Nvidia Tegra's first three generations, CSR plc's Quatro family, ST-Ericsson's Nova and NovaThor, Silicon Labs's Precision32 MCU, Texas Instruments's OMAP products, Samsung's Hummingbird and
Exynos The Samsung Exynos (stylized as SΛMSUNG Exynos), formerly Hummingbird (), is a series of ARM architecture, Arm-based System on a chip, system-on-chips developed by Samsung Electronics' System LSI division and manufactured by Samsung Foundry. I ...
products, Apple's A4, A5, and A5X, and NXP's i.MX.
Fabless licensees, who wish to integrate an ARM core into their own chip design, are usually only interested in acquiring a ready-to-manufacture verified semiconductor intellectual property core. For these customers, Arm Holdings delivers a gate netlist description of the chosen ARM core, along with an abstracted simulation model and test programs to aid design integration and verification. More ambitious customers, including integrated device manufacturers (IDM) and foundry operators, choose to acquire the processor IP in synthesisable RTL (Verilog) form. With the synthesisable RTL, the customer has the ability to perform architectural level optimisations and extensions. This allows the designer to achieve exotic design goals not otherwise possible with an unmodified netlist (high clock speed, very low power consumption, instruction set extensions, etc.). While Arm Holdings does not grant the licensee the right to resell the ARM architecture itself, licensees may freely sell manufactured products such as chip devices, evaluation boards and complete systems. Merchant foundries can be a special case; not only are they allowed to sell finished silicon containing ARM cores, they generally hold the right to re-manufacture ARM cores for other customers.

Arm Holdings prices its IP based on perceived value. Lower performing ARM cores typically have lower licence costs than higher performing cores. In implementation terms, a synthesisable core costs more than a hard macro (blackbox) core. Complicating price matters, a merchant foundry that holds an ARM licence, such as Samsung or Fujitsu, can offer fab customers reduced licensing costs. In exchange for acquiring the ARM core through the foundry's in-house design services, the customer can reduce or eliminate payment of ARM's upfront licence fee.

Compared to dedicated semiconductor foundries (such as TSMC and UMC) without in-house design services, Fujitsu and Samsung charge two to three times more per manufactured wafer. For low- to mid-volume applications, a design service foundry offers lower overall pricing (through subsidisation of the licence fee). For high-volume mass-produced parts, the long-term cost reduction achievable through lower wafer pricing reduces the impact of ARM's NRE (non-recurring engineering) costs, making the dedicated foundry a better choice.

Companies that have developed chips with cores designed by Arm include Amazon.com's Annapurna Labs subsidiary,
Analog Devices, Apple, AppliedMicro (now: MACOM Technology Solutions), Atmel, Broadcom, Cavium, Cypress Semiconductor, Freescale Semiconductor (now NXP Semiconductors), Huawei, Intel, Maxim Integrated, Nvidia, NXP, Qualcomm, Renesas, Samsung Electronics, ST Microelectronics, Texas Instruments, and Xilinx.


Built on ARM Cortex Technology licence

In February 2016, ARM announced the Built on ARM Cortex Technology licence, often shortened to Built on Cortex (BoC) licence. This licence allows companies to partner with ARM and make modifications to ARM Cortex designs. These design modifications will not be shared with other companies. These semi-custom core designs also have brand freedom, meaning the licensee can market them under its own name, for example Kryo 280. Companies that are current licensees of Built on ARM Cortex Technology include Qualcomm.


Architectural licence

Companies can also obtain an ARM ''architectural licence'' for designing their own CPU cores using the ARM instruction sets. These cores must comply fully with the ARM architecture. Companies that have designed cores that implement an ARM architecture include Apple, AppliedMicro (now: Ampere Computing), Broadcom, Cavium (now: Marvell), Digital Equipment Corporation, Intel, Nvidia, Qualcomm, Samsung Electronics, Fujitsu, and NUVIA Inc. (acquired by Qualcomm in 2021).


ARM Flexible Access

On 16 July 2019, ARM announced ARM Flexible Access. ARM Flexible Access provides unlimited access to included ARM intellectual property (IP) for development. Per-product licence fees are required once a customer reaches foundry tapeout or prototyping. 75% of ARM's most recent IP over the last two years is included in ARM Flexible Access. As of October 2019:
* CPUs: Cortex-A5, Cortex-A7, Cortex-A32, Cortex-A34, Cortex-A35, Cortex-A53, Cortex-R5, Cortex-R8, Cortex-R52, Cortex-M0, Cortex-M0+, Cortex-M3, Cortex-M4, Cortex-M7, Cortex-M23, Cortex-M33
* GPUs: Mali-G52, Mali-G31. Includes Mali Driver Development Kits (DDK).
* Interconnect: CoreLink NIC-400, CoreLink NIC-450, CoreLink CCI-400, CoreLink CCI-500, CoreLink CCI-550, ADB-400 AMBA, XHB-400 AXI-AHB
* System Controllers: CoreLink GIC-400, CoreLink GIC-500, PL192 VIC, BP141 TrustZone Memory Wrapper, CoreLink TZC-400, CoreLink L2C-310, CoreLink MMU-500, BP140 Memory Interface
* Security IP: CryptoCell-312, CryptoCell-712, TrustZone True Random Number Generator
* Peripheral Controllers: PL011 UART, PL022 SPI, PL031 RTC
* Debug & Trace: CoreSight SoC-400, CoreSight SDC-600, CoreSight STM-500, CoreSight System Trace Macrocell, CoreSight Trace Memory Controller
* Design Kits: Corstone-101, Corstone-201
* Physical IP: Artisan PIK for Cortex-M33 TSMC 22ULL including memory compilers, logic libraries, GPIOs and documentation
* Tools & Materials: Socrates IP Tooling, ARM Design Studio, Virtual System Models
* Support: Standard ARM Technical support, ARM online training, maintenance updates, credits toward onsite training and design reviews


Cores

Arm provides a list of vendors who implement ARM cores in their designs (application-specific standard products (ASSPs), microprocessors, and microcontrollers).


Example applications of ARM cores

ARM cores are used in a number of products, particularly PDAs and smartphones. Some computing examples are Microsoft's first-generation Surface, Surface 2 and Pocket PC devices (following 2002), Apple's iPads, Asus's Eee Pad Transformer tablet computers, and several Chromebook laptops. Others include Apple's
iPhone smartphones and iPod portable media players, Canon PowerShot digital cameras, the Nintendo Switch hybrid, the Wii security processor and 3DS handheld game consoles, and TomTom turn-by-turn navigation systems.

In 2005, Arm took part in the development of Manchester University's computer SpiNNaker, which used ARM cores to simulate the human brain.

ARM chips are also used in Raspberry Pi, BeagleBoard, BeagleBone, PandaBoard, and other single-board computers, because they are very small, inexpensive, and consume very little power.


32-bit architecture

The 32-bit ARM architecture (ARM32), such as ARMv7-A (implementing AArch32; see section on Armv8-A for more on it), was the most widely used architecture in mobile devices. Since 1995, various versions of the ''ARM Architecture Reference Manual'' have been the primary source of documentation on the ARM processor architecture and instruction set, distinguishing interfaces that all ARM processors are required to support (such as instruction semantics) from implementation details that may vary. The architecture has evolved over time, and version seven of the architecture, ARMv7, defines three architecture "profiles":
* A-profile, the "Application" profile, implemented by 32-bit cores in the Cortex-A series and by some non-ARM cores
* R-profile, the "Real-time" profile, implemented by cores in the Cortex-R series
* M-profile, the "Microcontroller" profile, implemented by most cores in the Cortex-M series
Although the architecture profiles were first defined for ARMv7, ARM subsequently defined the ARMv6-M architecture (used by the Cortex-M0 / M0+ / M1) as a subset of the ARMv7-M profile with fewer instructions.


CPU modes

Except in the M-profile, the 32-bit ARM architecture specifies several CPU modes, depending on the implemented architecture features. At any moment in time, the CPU can be in only one mode, but it can switch modes due to external events (interrupts) or programmatically.
* ''User mode:'' The only non-privileged mode.
* ''FIQ mode:'' A privileged mode that is entered whenever the processor accepts a fast interrupt request.
* ''IRQ mode:'' A privileged mode that is entered whenever the processor accepts an interrupt.
* ''Supervisor (svc) mode:'' A privileged mode entered whenever the CPU is reset or when an SVC instruction is executed.
* ''Abort mode:'' A privileged mode that is entered whenever a prefetch abort or data abort exception occurs.
* ''Undefined mode:'' A privileged mode that is entered whenever an undefined instruction exception occurs.
* ''System mode (ARMv4 and above):'' The only privileged mode that is not entered by an exception. It can only be entered by executing an instruction that explicitly writes to the mode bits of the Current Program Status Register (CPSR) from another privileged mode (not from user mode).
* ''Monitor mode (ARMv6 and ARMv7 Security Extensions, ARMv8 EL3):'' A mode introduced to support the TrustZone extension in ARM cores.
* ''Hyp mode (ARMv7 Virtualization Extensions, ARMv8 EL2):'' A hypervisor mode that supports Popek and Goldberg virtualization requirements for the non-secure operation of the CPU.
* ''Thread mode (ARMv6-M, ARMv7-M, ARMv8-M):'' A mode which can be specified as either privileged or unprivileged. Whether the Main Stack Pointer (MSP) or Process Stack Pointer (PSP) is used can also be specified in the CONTROL register with privileged access. This mode is designed for user tasks in an RTOS environment, but it is also typically used in bare-metal code for the super-loop.
* ''Handler mode (ARMv6-M, ARMv7-M, ARMv8-M):'' A mode dedicated to exception handling (except reset, which is handled in Thread mode). Handler mode always uses the MSP and works at the privileged level.
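As a rough illustration of the A- and R-profile modes listed above, the following C sketch maps the five mode bits of a CPSR value to a mode name. The numeric encodings are the ones used by the 32-bit architecture; the function names `arm_mode_name` and `arm_mode_is_privileged` are invented for this example and do not belong to any ARM toolchain.

```c
#include <stdint.h>

/* Hypothetical helper: name of the CPU mode selected by the low
 * five bits (the M field) of a 32-bit ARM CPSR value. */
const char *arm_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0x1Fu) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x16: return "Monitor";
    case 0x17: return "Abort";
    case 0x1A: return "Hyp";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "reserved";   /* encodings not used by this list */
    }
}

/* Hypothetical helper: User mode is the only non-privileged mode
 * in the list, so every other mode value is treated as privileged. */
int arm_mode_is_privileged(uint32_t cpsr)
{
    return (cpsr & 0x1Fu) != 0x10;
}
```

For instance, a CPSR value whose low byte is 0xD3 has mode bits 0x13, which this sketch reports as Supervisor mode.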


Instruction set

The original (and subsequent) ARM implementation was hardwired without microcode, like the much simpler 8-bit 6502 processor used in prior Acorn microcomputers.

The 32-bit ARM architecture (and the 64-bit architecture for the most part) includes the following RISC features:
* Load–store architecture.
* No support for unaligned memory accesses in the original version of the architecture. ARMv6 and later, except some microcontroller versions, support unaligned accesses for half-word and single-word load/store instructions with some limitations, such as no guaranteed atomicity.
* Uniform 16 × 32-bit register file (including the program counter, stack pointer and the link register).
* Fixed instruction width of 32 bits to ease decoding and pipelining, at the cost of decreased code density. Later, the Thumb instruction set added 16-bit instructions and increased code density.
* Mostly single clock-cycle execution.
To compensate for the simpler design, compared with processors like the Intel 80286 and Motorola 68020, some additional design features were used:
* Conditional execution of most instructions reduces branch overhead and compensates for the lack of a branch predictor in early chips.
* Arithmetic instructions alter condition codes only when desired.
* 32-bit barrel shifter can be used without performance penalty with most arithmetic instructions and address calculations.
* Has powerful indexed addressing modes.
* A link register supports fast leaf function calls.
* A simple, but fast, 2-priority-level interrupt subsystem has switched register banks.


Arithmetic instructions

ARM includes integer arithmetic operations for add, subtract, and multiply; some versions of the architecture also support divide operations. ARM supports 32-bit × 32-bit multiplies with either a 32-bit result or 64-bit result, though Cortex-M0 / M0+ / M1 cores do not support 64-bit results. Some ARM cores also support 16-bit × 16-bit and 32-bit × 16-bit multiplies.

The divide instructions are only included in the following ARM architectures:
* Armv7-M and Armv7E-M architectures always include divide instructions.
* Armv7-R architecture always includes divide instructions in the Thumb instruction set, but optionally in its 32-bit instruction set.
* Armv7-A architecture optionally includes the divide instructions. The instructions might not be implemented, or implemented only in the Thumb instruction set, or implemented in both the Thumb and ARM instruction sets, or implemented if the Virtualization Extensions are included.
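The practical consequence of the points above can be sketched in C: a 32 × 32 multiply with a 64-bit result is a single operation where long multiplies exist, while a core without divide instructions has to perform division in software. The routine below is a generic shift-and-subtract division, not the code any particular compiler runtime uses, and both function names are made up for this example.

```c
#include <stdint.h>

/* 32 x 32 -> 64-bit unsigned multiply, the kind of result a long
 * multiply instruction produces in one instruction. */
uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* Illustrative software division for cores without a hardware
 * divide: classic shift-and-subtract, one quotient bit per step.
 * Division by zero returns 0 and sets the remainder to n; that
 * behaviour is an arbitrary choice made for this sketch. */
uint32_t soft_udiv32(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    uint64_t r = 0;                 /* wide enough that the shift cannot overflow */

    if (d == 0) {
        *rem = n;
        return 0;
    }
    for (int bit = 31; bit >= 0; bit--) {
        r = (r << 1) | ((n >> bit) & 1u);   /* bring down the next dividend bit */
        if (r >= d) {                       /* divisor fits: subtract, set quotient bit */
            r -= d;
            q |= (uint32_t)1 << bit;
        }
    }
    *rem = (uint32_t)r;
    return q;
}
```

The loop always runs 32 iterations, which gives a feel for why a hardware divide instruction is attractive where it is available.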


Registers

Registers R0 through R7 are the same across all CPU modes; they are never banked. Registers R8 through R12 are the same across all CPU modes except FIQ mode. FIQ mode has its own distinct R8 through R12 registers. R13 and R14 are banked across all privileged CPU modes except system mode. That is, each mode that can be entered because of an exception has its own R13 and R14. These registers generally contain the stack pointer and the return address from function calls, respectively.

Aliases:
* R13 is also referred to as SP, the stack pointer.
* R14 is also referred to as LR, the link register.
* R15 is also referred to as PC, the program counter.

The Current Program Status Register (CPSR) has the following 32 bits:
* M (bits 0–4) are the processor mode bits.
* T (bit 5) is the Thumb state bit.
* F (bit 6) is the FIQ disable bit.
* I (bit 7) is the IRQ disable bit.
* A (bit 8) is the imprecise data abort disable bit.
* E (bit 9) is the data endianness bit.
* IT (bits 10–15 and 25–26) are the if-then state bits.
* GE (bits 16–19) are the greater-than-or-equal-to bits.
* DNM (bits 20–23) are the do not modify bits.
* J (bit 24) is the Java state bit.
* Q (bit 27) is the sticky overflow bit.
* V (bit 28) is the overflow bit.
* C (bit 29) is the carry/borrow/extend bit.
* Z (bit 30) is the zero bit.
* N (bit 31) is the negative/less than bit.
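The bit layout listed above translates directly into shifts and masks. The following C sketch unpacks most of the fields from a raw 32-bit CPSR value (the IT and DNM fields are left out for brevity); the struct and function names are illustrative assumptions rather than anything defined by ARM.

```c
#include <stdint.h>

/* Illustrative decoded view of a CPSR value. */
struct cpsr_fields {
    unsigned mode;        /* M,  bits 0-4   */
    unsigned t;           /* T,  bit 5      */
    unsigned f;           /* F,  bit 6      */
    unsigned i;           /* I,  bit 7      */
    unsigned a;           /* A,  bit 8      */
    unsigned e;           /* E,  bit 9      */
    unsigned ge;          /* GE, bits 16-19 */
    unsigned j;           /* J,  bit 24     */
    unsigned q;           /* Q,  bit 27     */
    unsigned v, c, z, n;  /* bits 28 to 31  */
};

/* Extract `width` bits of `word` starting at bit `lo` (width < 32). */
unsigned cpsr_bits(uint32_t word, unsigned lo, unsigned width)
{
    return (unsigned)((word >> lo) & ((1u << width) - 1u));
}

struct cpsr_fields cpsr_decode(uint32_t cpsr)
{
    struct cpsr_fields out;
    out.mode = cpsr_bits(cpsr, 0, 5);
    out.t    = cpsr_bits(cpsr, 5, 1);
    out.f    = cpsr_bits(cpsr, 6, 1);
    out.i    = cpsr_bits(cpsr, 7, 1);
    out.a    = cpsr_bits(cpsr, 8, 1);
    out.e    = cpsr_bits(cpsr, 9, 1);
    out.ge   = cpsr_bits(cpsr, 16, 4);
    out.j    = cpsr_bits(cpsr, 24, 1);
    out.q    = cpsr_bits(cpsr, 27, 1);
    out.v    = cpsr_bits(cpsr, 28, 1);
    out.c    = cpsr_bits(cpsr, 29, 1);
    out.z    = cpsr_bits(cpsr, 30, 1);
    out.n    = cpsr_bits(cpsr, 31, 1);
    return out;
}
```

Decoding the value 0x600000D3 with this sketch gives Z and C set, N and V clear, both interrupt disable bits set, Thumb state clear, and mode bits 0x13.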


Conditional execution

Almost every ARM instruction has a conditional execution feature called predication, which is implemented with a 4-bit condition code selector (the predicate). To allow for unconditional execution, one of the four-bit codes causes the instruction to be always executed. Most other CPU architectures only have condition codes on branch instructions.

Though the predicate takes up four of the 32 bits in an instruction code, and thus cuts down significantly on the encoding bits available for displacements in memory access instructions, it avoids branch instructions when generating code for small if statements. Apart from eliminating the branch instructions themselves, this preserves the fetch/decode/execute pipeline at the cost of only one cycle per skipped instruction.

An algorithm that provides a good example of conditional execution is the subtraction-based Euclidean algorithm for computing the greatest common divisor. In the C programming language, the algorithm can be written as:

    int gcd(int a, int b) {
        while (a != b) {    // loop until both values are equal
            if (a > b)
                a -= b;     // a is larger: subtract b from it
            else
                b -= a;     // b is larger: subtract a from it
        }
        return a;
    }

The same algorithm can be rewritten in a way closer to target ARM instructions as:

    loop:
        // Compare a and b
        GT = a > b;
        LT = a < b;
        NE = a != b;

        // Perform operations based on flag results
        if (GT) a -= b;     // Subtract *only* if greater-than
        if (LT) b -= a;     // Subtract *only* if less-than
        if (NE) goto loop;  // Loop *only* if compared values were not equal
        return a;

and coded in assembly language as:

    ; assign a to register r0, b to r1
    loop:   CMP    r0, r1       ; set condition "NE" if (a ≠ b),
                                ;               "GT" if (a > b),
                                ;            or "LT" if (a < b)
            SUBGT  r0, r0, r1   ; if "GT" (Greater Than), then a = a − b
            SUBLT  r1, r1, r0   ; if "LT" (Less Than), then b = b − a
            BNE    loop         ; if "NE" (Not Equal), then loop
            BX     lr           ; return

which avoids the branches around the then and else clauses. If r0 and r1 are equal then neither of the SUB instructions will be executed, eliminating the need for a conditional branch to implement the while check at the top of the loop, for example had SUBLE (less than or equal) been used.

One of the ways that Thumb code provides a more dense encoding is to remove the four-bit selector from non-branch instructions.
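To make the predicate mechanism more concrete, the following C sketch models how a 4-bit condition code is checked against the N, Z, C and V flags before an instruction is allowed to execute. It uses the standard condition encodings (EQ = 0 through AL = 14); the function name is invented for this example, and treating code 15 as "always" is a simplification made here, since that encoding has had different meanings across architecture versions.

```c
/* Hypothetical model of the condition check: returns 1 if an
 * instruction with predicate `cond` would execute given the current
 * flag values (each flag is 0 or 1), and 0 if it would be skipped. */
int arm_cond_passes(unsigned cond, int n, int z, int c, int v)
{
    switch (cond & 0xFu) {
    case 0x0: return z;                 /* EQ: equal                      */
    case 0x1: return !z;                /* NE: not equal                  */
    case 0x2: return c;                 /* CS/HS: carry set               */
    case 0x3: return !c;                /* CC/LO: carry clear             */
    case 0x4: return n;                 /* MI: negative                   */
    case 0x5: return !n;                /* PL: positive or zero           */
    case 0x6: return v;                 /* VS: overflow                   */
    case 0x7: return !v;                /* VC: no overflow                */
    case 0x8: return c && !z;           /* HI: unsigned higher            */
    case 0x9: return !c || z;           /* LS: unsigned lower or same     */
    case 0xA: return n == v;            /* GE: signed greater or equal    */
    case 0xB: return n != v;            /* LT: signed less than           */
    case 0xC: return !z && (n == v);    /* GT: signed greater than        */
    case 0xD: return z || (n != v);     /* LE: signed less than or equal  */
    default:  return 1;                 /* AL; 0xF treated as always here */
    }
}
```

In the GCD loop, comparing 5 with 3 leaves N = 0, Z = 0, C = 1, V = 0, so the GT and NE predicates pass while LT fails: only SUBGT and the BNE take effect in that iteration.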


Other features

Another feature of the instruction set is the ability to fold shifts and rotates into the ''data processing'' (arithmetic, logical, and register-register move) instructions, so that, for example, the statement in C language:

    a += (j << 2);

could be rendered as a one-word, one-cycle instruction:

    ADD     Ra, Ra, Rj, LSL #2

This results in the typical ARM program being denser than expected with fewer memory accesses; thus the pipeline is used more efficiently.

The ARM processor also has features rarely seen in other RISC architectures, such as PC-relative addressing (indeed, on the 32-bit ARM the PC is one of its 16 registers) and pre- and post-increment addressing modes.

The ARM instruction set has increased over time. Some early ARM processors (before ARM7TDMI), for example, have no instruction to store a two-byte quantity.
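The folded shift can be modelled in C as a small function applied to the second operand before the arithmetic takes place. This is a simplified sketch: it covers only the four basic shift types with immediate amounts of 1 to 31, passes a zero amount through unchanged (the real encoding gives some zero amounts special meanings), and all names are invented for the example.

```c
#include <stdint.h>

enum shift_kind { SHIFT_LSL, SHIFT_LSR, SHIFT_ASR, SHIFT_ROR };

/* Simplified model of the barrel shifter acting on the second operand. */
uint32_t shift_operand(uint32_t value, enum shift_kind kind, unsigned amount)
{
    amount &= 31u;
    if (amount == 0)
        return value;                       /* simplification, see lead-in */

    switch (kind) {
    case SHIFT_LSL:                         /* logical shift left */
        return value << amount;
    case SHIFT_LSR:                         /* logical shift right */
        return value >> amount;
    case SHIFT_ASR: {                       /* arithmetic shift right */
        uint32_t fill = (value & UINT32_C(0x80000000))
                      ? (uint32_t)~(UINT32_C(0xFFFFFFFF) >> amount)
                      : 0u;                 /* copies of the sign bit */
        return (value >> amount) | fill;
    }
    case SHIFT_ROR:                         /* rotate right */
        return (value >> amount) | (value << (32u - amount));
    }
    return value;
}

/* Model of: ADD Rd, Rn, Rm, <shift> #amount */
uint32_t add_shifted(uint32_t rn, uint32_t rm, enum shift_kind kind, unsigned amount)
{
    return rn + shift_operand(rm, kind, amount);
}
```

With a = 10 and j = 3, `add_shifted(10, 3, SHIFT_LSL, 2)` mirrors the ADD instruction shown above and yields 22, the same as `a += (j << 2)`.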


Pipelines and other implementation issues

The ARM7 and earlier implementations have a three-stage pipeline, the stages being fetch, decode, and execute. Higher-performance designs, such as the ARM9, have deeper pipelines: Cortex-A8 has thirteen stages. Additional implementation changes for higher performance include a faster adder and more extensive branch prediction logic. The difference between the ARM7DI and ARM7DMI cores, for example, was an improved multiplier; hence the added "M".
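As a back-of-the-envelope illustration of pipeline depth, the C sketch below computes how many cycles an idealised pipeline needs to finish a run of instructions. It assumes one instruction enters per cycle with no stalls, flushes, or multi-cycle instructions, which real cores do not guarantee; the function name is made up for this example.

```c
/* Cycles for an idealised pipeline with `stages` stages to complete
 * `instructions` instructions: `stages` cycles for the first one to
 * drain, then one more cycle for each instruction after it. */
unsigned long ideal_pipeline_cycles(unsigned stages, unsigned long instructions)
{
    if (stages == 0 || instructions == 0)
        return 0;
    return stages + instructions - 1;
}
```

Under these assumptions, 100 instructions take 102 cycles on a three-stage pipeline and 112 cycles on a thirteen-stage one. The deeper pipeline only pays off because its shorter stages permit a higher clock rate, and it loses more work whenever a mispredicted branch forces a refill.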


Coprocessors

The ARM architecture (pre-Armv8) provides a non-intrusive way of extending the instruction set using "coprocessors" that can be addressed using MCR, MRC, MRRC, MCRR, and similar instructions. The coprocessor space is divided logically into 16 coprocessors with numbers from 0 to 15, coprocessor 15 (cp15) being reserved for some typical control functions like managing the caches and MMU operation on processors that have one. In ARM-based machines, peripheral devices are usually attached to the processor by mapping their physical registers into ARM memory space, into the coprocessor space, or by connecting to another device (a bus) that in turn attaches to the processor. Coprocessor accesses have lower latency, so some peripherals—for example, an XScale interrupt controller—are accessible in both ways: through memory and through coprocessors. In other cases, chip designers only integrate hardware using the coprocessor mechanism. For example, an image processing engine might be a small ARM7TDMI core combined with a coprocessor that has specialised operations to support a specific set of HDTV transcoding primitives.
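The coprocessor number and register fields described above are all packed into the 32-bit instruction word. As a sketch, the following C function assembles the ARM-state encoding of an MCR or MRC instruction with the "always" condition; the function and parameter names are invented for this example.

```c
#include <stdint.h>

/* Build the ARM-state encoding of MRC (is_read != 0, coprocessor to
 * ARM register) or MCR (is_read == 0, ARM register to coprocessor),
 * with the condition field fixed to AL. */
uint32_t encode_mcr_mrc(int is_read, unsigned coproc, unsigned opc1,
                        unsigned rt, unsigned crn, unsigned crm,
                        unsigned opc2)
{
    return (UINT32_C(0xE) << 28)                  /* condition: always              */
         | (UINT32_C(0xE) << 24)                  /* coprocessor register transfer  */
         | ((uint32_t)(opc1 & 7u) << 21)          /* first coprocessor opcode       */
         | ((uint32_t)(is_read ? 1u : 0u) << 20)  /* 1 = MRC, 0 = MCR               */
         | ((uint32_t)(crn & 0xFu) << 16)         /* coprocessor register CRn       */
         | ((uint32_t)(rt & 0xFu) << 12)          /* ARM register                   */
         | ((uint32_t)(coproc & 0xFu) << 8)       /* coprocessor number, 0 to 15    */
         | ((uint32_t)(opc2 & 7u) << 5)           /* second coprocessor opcode      */
         | (UINT32_C(1) << 4)                     /* fixed bit for MCR/MRC          */
         | (uint32_t)(crm & 0xFu);                /* coprocessor register CRm       */
}
```

For example, `encode_mcr_mrc(1, 15, 0, 0, 0, 0, 0)` produces 0xEE100F10, the encoding of `MRC p15, 0, r0, c0, c0, 0`, which reads an identification register from cp15 into r0.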


Debugging

All modern ARM processors include hardware debugging facilities, allowing software debuggers to perform operations such as halting, stepping, and breakpointing of code starting from reset. These facilities are built using JTAG support, though some newer cores optionally support ARM's own two-wire "SWD" protocol. In ARM7TDMI cores, the "D" represented JTAG debug support, and the "I" represented presence of an "EmbeddedICE" debug module. For ARM7 and ARM9 core generations, EmbeddedICE over JTAG was a de facto debug standard, though not architecturally guaranteed. The ARMv7 architecture defines basic debug facilities at an architectural level. These include breakpoints, watchpoints and instruction execution in a "Debug Mode"; similar facilities were also available with EmbeddedICE. Both "halt mode" and "monitor" mode debugging are supported. The actual transport mechanism used to access the debug facilities is not architecturally specified, but implementations generally include JTAG support. There is a separate ARM "CoreSight" debug architecture, which is not architecturally required by ARMv7 processors.


Debug Access Port

The Debug Access Port (DAP) is an implementation of an ARM Debug Interface. There are two different supported implementations, the Serial Wire JTAG Debug Port (SWJ-DP) and the Serial Wire Debug Port (SW-DP). CMSIS-DAP is a standard interface that describes how various debugging software on a host PC can communicate over USB to firmware running on a hardware debugger, which in turn talks over SWD or JTAG to a CoreSight-enabled ARM Cortex CPU.


DSP enhancement instructions

To improve the ARM architecture for digital signal processing and multimedia applications, DSP instructions were added to the instruction set. These are signified by an "E" in the name of the ARMv5TE and ARMv5TEJ architectures. E-variants also imply T, D, M, and I. The new instructions are common in digital signal processor (DSP) architectures. They include variations on signed multiply–accumulate, saturated add and subtract, and count leading zeros. First introduced in 1999, this extension of the core instruction set contrasted with ARM's earlier DSP coprocessor known as Piccolo, which employed a distinct, incompatible instruction set whose execution involved a separate program counter. Piccolo instructions employed a distinct register file of sixteen 32-bit registers, with some instructions combining registers for use as 48-bit accumulators and other instructions addressing 16-bit half-registers. Some instructions were able to operate on two such 16-bit values in parallel. Communication with the Piccolo register file involved ''load to Piccolo'' and ''store from Piccolo'' coprocessor instructions via two buffers of eight 32-bit entries. Described as reminiscent of other approaches, notably Hitachi's SH-DSP and Motorola's 68356, Piccolo did not employ dedicated local memory and relied on the bandwidth of the ARM core for DSP operand retrieval, impacting concurrent performance. Piccolo's distinct instruction set also proved not to be a "good compiler target".
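The saturated add and count-leading-zeros operations mentioned above can be modelled in a few lines. This is only a behavioural sketch of what such instructions compute on 32-bit values; the helper names `saturating_add32` and `count_leading_zeros32` are hypothetical and do not correspond to any ARM-provided API.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def saturating_add32(a, b):
    """Add two signed 32-bit values, clamping instead of wrapping.

    Returns (result, saturated), where saturated reports whether clamping occurred.
    """
    total = a + b
    if total > INT32_MAX:
        return INT32_MAX, True
    if total < INT32_MIN:
        return INT32_MIN, True
    return total, False


def count_leading_zeros32(value):
    """Count the zero bits above the most significant set bit of a 32-bit word."""
    value &= 0xFFFFFFFF
    return 32 - value.bit_length()


print(saturating_add32(INT32_MAX, 1))
print(count_leading_zeros32(1))
```

Clamping rather than wrapping is what makes saturated arithmetic useful for audio and similar signals, where an overflow that wraps around would produce a large error of the opposite sign.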


SIMD extensions for multimedia

Introduced in the ARMv6 architecture, this was a precursor to Advanced SIMD, also named Neon.


Jazelle

Jazelle DBX (Direct Bytecode eXecution) is a technique that allows Java bytecode to be executed directly in the ARM architecture as a third execution state (and instruction set) alongside the existing ARM and Thumb modes. Support for this state is signified by the "J" in the ARMv5TEJ architecture, and in ARM9EJ-S and ARM7EJ-S core names. Support for this state is required starting in ARMv6 (except for the ARMv7-M profile), though newer cores only include a trivial implementation that provides no hardware acceleration.


Thumb

To improve compiled code density, processors since the ARM7TDMI (released in 1994) have featured the ''Thumb'' compressed instruction set, which has its own state. (The "T" in "TDMI" indicates the Thumb feature.) When in this state, the processor executes the Thumb instruction set, a compact 16-bit encoding for a subset of the ARM instruction set. Most of the Thumb instructions are directly mapped to normal ARM instructions. The space saving comes from making some of the instruction operands implicit and limiting the number of possibilities compared to the ARM instructions executed in the ARM instruction set state. In Thumb, the 16-bit opcodes have less functionality. For example, only branches can be conditional, and many opcodes are restricted to accessing only half of all of the CPU's general-purpose registers. The shorter opcodes give improved code density overall, even though some operations require extra instructions. In situations where the memory port or bus width is constrained to less than 32 bits, the shorter Thumb opcodes allow increased performance compared with 32-bit ARM code, as less program code may need to be loaded into the processor over the constrained memory bandwidth. Unlike processor architectures with variable length (16- or 32-bit) instructions, such as the Cray-1 and Hitachi SuperH, the ARM and Thumb instruction sets exist independently of each other. Embedded hardware, such as the Game Boy Advance, typically has a small amount of RAM accessible with a full 32-bit datapath; the majority is accessed via a 16-bit or narrower secondary datapath. In this situation, it usually makes sense to compile Thumb code and hand-optimise a few of the most CPU-intensive sections using full 32-bit ARM instructions, placing these wider instructions into the memory accessible over the 32-bit bus. The first processor with a Thumb instruction decoder was the ARM7TDMI. All processors supporting 32-bit instruction sets, starting with ARM9, and including XScale, have included a Thumb instruction decoder. The Thumb instruction set includes instructions adopted from the Hitachi SuperH (1992), which was licensed by ARM. ARM's smallest processor families (Cortex-M0 and Cortex-M1) implement only the 16-bit Thumb instruction set for maximum performance in lowest-cost applications. ARM processors that do not support 32-bit addressing also omit Thumb.


Thumb-2

''Thumb-2'' technology was introduced in the ''ARM1156 core'', announced in 2003. Thumb-2 extends the limited 16-bit instruction set of Thumb with additional 32-bit instructions to give the instruction set more breadth, thus producing a variable-length instruction set. A stated aim for Thumb-2 was to achieve code density similar to Thumb with performance similar to the ARM instruction set on 32-bit memory. Thumb-2 extends the Thumb instruction set with bit-field manipulation, table branches and conditional execution. At the same time, the ARM instruction set was extended to maintain equivalent functionality in both instruction sets. A new "Unified Assembly Language" (UAL) supports generation of either Thumb or ARM instructions from the same source code; versions of Thumb seen on ARMv7 processors are essentially as capable as ARM code (including the ability to write interrupt handlers). This requires a bit of care, and use of a new "IT" (if-then) instruction, which permits up to four successive instructions to execute based on a tested condition, or on its inverse. When compiling into ARM code, this is ignored, but when compiling into Thumb it generates an actual instruction. For example:

    ; if (r0 == r1)
    CMP r0, r1
    ITE EQ        ; ARM: no code ... Thumb: IT instruction
    ; then r0 = r2;
    MOVEQ r0, r2  ; ARM: conditional; Thumb: condition via ITE 'T' (then)
    ; else r0 = r3;
    MOVNE r0, r3  ; ARM: conditional; Thumb: condition via ITE 'E' (else)
    ; recall that the Thumb MOV instruction has no bits to encode "EQ" or "NE".

All ARMv7 chips support the Thumb instruction set. All chips in the Cortex-A series that support ARMv7, all Cortex-R series, and all ARM11 series support both "ARM instruction set state" and "Thumb instruction set state", while chips in the Cortex-M series support only the Thumb instruction set.
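To make the "IT" (if-then) instruction described above more concrete, the sketch below computes the 16-bit Thumb encoding of an IT instruction from its then/else pattern and base condition. The bit layout (a fixed 0xBF prefix, a 4-bit first condition and a 4-bit mask ending in a terminating 1) is recalled from the Armv7 instruction encoding rather than from the text above, and the names `CONDITION_CODES` and `encode_it` are hypothetical, so the details should be treated as assumptions.

```python
# Condition-code numbers as used in ARM condition fields (AL omitted here).
CONDITION_CODES = {
    "EQ": 0, "NE": 1, "CS": 2, "CC": 3, "MI": 4, "PL": 5, "VS": 6,
    "VC": 7, "HI": 8, "LS": 9, "GE": 10, "LT": 11, "GT": 12, "LE": 13,
}


def encode_it(pattern, condition):
    """Encode IT{pattern} condition, e.g. encode_it("E", "EQ") for ITE EQ.

    pattern holds up to three 'T'/'E' letters for the 2nd to 4th instructions;
    the first instruction in the block is always a 'then'.
    """
    if len(pattern) > 3 or any(letter not in "TE" for letter in pattern):
        raise ValueError("pattern must be at most three letters from 'T' and 'E'")
    first_cond = CONDITION_CODES[condition]
    low_bit = first_cond & 1
    mask = 0
    position = 3
    for letter in pattern:
        # 'T' repeats the low bit of the condition, 'E' inverts it.
        bit = low_bit if letter == "T" else low_bit ^ 1
        mask |= bit << position
        position -= 1
    mask |= 1 << position  # terminating 1 marks the end of the block
    return 0xBF00 | (first_cond << 4) | mask


print(hex(encode_it("E", "EQ")))  # the ITE EQ used in the example above
```

Because the first instruction is implicit and the mask has four bits including the terminator, at most four instructions can be covered, which matches the limit stated above.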


Thumb Execution Environment (ThumbEE)

''ThumbEE'' (erroneously called ''Thumb-2EE'' in some ARM documentation), which was marketed as Jazelle RCT (Runtime Compilation Target), was announced in 2005 and deprecated in 2011. It first appeared in the ''Cortex-A8'' processor. ThumbEE is a fourth instruction set state, making small changes to the Thumb-2 extended instruction set. These changes make the instruction set particularly suited to code generated at runtime (e.g. by JIT compilation) in managed ''Execution Environments''. ThumbEE is a target for languages such as Java, C#, Perl, and Python, and allows JIT compilers to output smaller compiled code without reducing performance. New features provided by ThumbEE include automatic null pointer checks on every load and store instruction, an instruction to perform an array bounds check, and special instructions that call a handler. In addition, because it utilises Thumb-2 technology, ThumbEE provides access to registers r8–r15 (where the Jazelle/DBX Java VM state is held). Handlers are small sections of frequently called code, commonly used to implement high level languages, such as allocating memory for a new object. These changes come from repurposing a handful of opcodes, and knowing the core is in the new ThumbEE state. On 23 November 2011, Arm deprecated any use of the ThumbEE instruction set, and Armv8 removes support for ThumbEE.


Floating-point (VFP)

''VFP'' (Vector Floating Point) technology is a floating-point unit (FPU) coprocessor extension to the ARM architecture (implemented differently in Armv8, where coprocessors are not defined). It provides low-cost single-precision and double-precision floating-point computation fully compliant with the ''ANSI/IEEE Std 754-1985 Standard for Binary Floating-Point Arithmetic''. VFP provides floating-point computation suitable for a wide spectrum of applications such as PDAs, smartphones, voice compression and decompression, three-dimensional graphics and digital audio, printers, set-top boxes, and automotive applications. The VFP architecture was intended to support execution of short "vector mode" instructions but these operated on each vector element sequentially and thus did not offer the performance of true single instruction, multiple data (SIMD) vector parallelism. This vector mode was therefore removed shortly after its introduction, to be replaced with the much more powerful Advanced SIMD, also named Neon. Some devices such as the ARM Cortex-A8 have a cut-down ''VFPLite'' module instead of a full VFP module, and require roughly ten times more clock cycles per float operation. Pre-Armv8 architecture implemented floating-point/SIMD with the coprocessor interface. Other floating-point and/or SIMD units found in ARM-based processors using the coprocessor interface include FPA, FPE, iwMMXt, some of which were implemented in software by trapping but could have been implemented in hardware. They provide some of the same functionality as VFP but are not opcode-compatible with it. FPA10 also provides extended precision, but implements correct rounding (required by IEEE 754) only in single precision.
* VFPv1: Obsolete.
* VFPv2: An optional extension to the ARM instruction set in the ARMv5TE, ARMv5TEJ and ARMv6 architectures. VFPv2 has 16 64-bit FPU registers.
* VFPv3 or VFPv3-D32: Implemented on most Cortex-A8 and A9 ARMv7 processors. It is backward-compatible with VFPv2, except that it cannot trap floating-point exceptions. VFPv3 has 32 64-bit FPU registers as standard, adds VCVT instructions to convert between scalar, float and double, and adds an immediate mode to VMOV so that constants can be loaded into FPU registers.
* VFPv3-D16: As above, but with only 16 64-bit FPU registers. Implemented on Cortex-R4 and R5 processors and the Tegra 2 (Cortex-A9).
* VFPv3-F16: Uncommon; it supports IEEE 754-2008 half-precision (16-bit) floating point as a storage format.
* VFPv4 or VFPv4-D32: Implemented on Cortex-A12 and A15 ARMv7 processors; the Cortex-A7 optionally has VFPv4-D32 in the case of an FPU with Neon. VFPv4 has 32 64-bit FPU registers as standard, and adds both half-precision support as a storage format and fused multiply-accumulate instructions to the features of VFPv3.
* VFPv4-D16: As above, but it has only 16 64-bit FPU registers. Implemented on Cortex-A5 and A7 processors in the case of an FPU without Neon.
* VFPv5-D16-M: Implemented on Cortex-M7 when the single and double-precision floating-point core option exists.
In Debian Linux and derivatives such as Ubuntu and Linux Mint, armhf (ARM hard float) refers to the ARMv7 architecture including the additional VFPv3-D16 floating-point hardware extension (and Thumb-2) above. Software packages and cross-compiler tools use the armhf vs. arm/armel suffixes to differentiate.
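The idea of half precision as a storage format, mentioned above for VFPv3-F16 and VFPv4, can be illustrated on any host without ARM hardware. The sketch below uses Python's `struct` module, whose `e` format code packs an IEEE 754 binary16 value; the helper names are hypothetical, and the example only models storing and reloading a value, not any VFP instruction.

```python
import struct


def store_as_half(value):
    """Pack a float into the 2-byte IEEE 754 half-precision storage format."""
    return struct.pack("<e", value)


def load_from_half(raw):
    """Widen a stored half-precision value back to a Python float."""
    return struct.unpack("<e", raw)[0]


def half_roundtrip(value):
    """Store a value as half precision and load it again."""
    return load_from_half(store_as_half(value))


original = 0.1
reloaded = half_roundtrip(original)
print(len(store_as_half(original)), "bytes; error after round trip:",
      abs(reloaded - original))
```

Values are widened before arithmetic and narrowed again only when stored, so the format saves memory and bandwidth at the cost of the rounding error the round trip exposes.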


Advanced SIMD (Neon)

The ''Advanced SIMD'' extension (also known as ''Neon'' or "MPE" Media Processing Engine) is a combined 64- and 128-bit SIMD instruction set that provides standardised acceleration for media and signal processing applications. Neon is included in all Cortex-A8 devices, but is optional in Cortex-A9 devices. Neon can execute MP3 audio decoding on CPUs running at 10 MHz, and can run the GSM adaptive multi-rate (AMR) speech codec at 13 MHz. It features a comprehensive instruction set, separate register files, and independent execution hardware. Neon supports 8-, 16-, 32-, and 64-bit integer and single-precision (32-bit) floating-point data and SIMD operations for handling audio and video processing as well as graphics and gaming processing. In Neon, the SIMD supports up to 16 operations at the same time. The Neon hardware shares the same floating-point registers as used in VFP. Devices such as the ARM Cortex-A8 and Cortex-A9 support 128-bit vectors, but will execute with 64 bits at a time, whereas newer Cortex-A15 devices can execute 128 bits at a time. A quirk of Neon in Armv7 devices is that it flushes all subnormal numbers to zero, and as a result the GCC compiler will not use it unless an option that allows losing denormals is turned on. "Enhanced" Neon defined since Armv8 does not have this quirk, but the same option is still required to enable Neon instructions. On the other hand, GCC does consider Neon safe on AArch64 for Armv8. Project Ne10 is ARM's first open-source project (from its inception; while they acquired an older project, now named Mbed TLS). The Ne10 library is a set of common, useful functions written in both Neon and C (for compatibility). The library was created to allow developers to use Neon optimisations without learning Neon, but it also serves as a set of highly optimised Neon intrinsic and assembly code examples for common DSP, arithmetic, and image processing routines. The source code is available on GitHub.
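The figure of up to 16 simultaneous operations follows from dividing a 128-bit vector into 8-bit elements. The sketch below works through that arithmetic and models one lane-wise operation in plain Python; it is a behavioural illustration only, and the names `lane_count` and `add_u8_lanes` are hypothetical rather than Neon intrinsics.

```python
def lane_count(vector_bits, element_bits):
    """Number of elements that fit in one SIMD vector."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector size")
    return vector_bits // element_bits


def add_u8_lanes(a, b):
    """Add two 128-bit vectors as 16 independent unsigned 8-bit lanes.

    Each lane wraps modulo 256 and never carries into its neighbour.
    """
    if len(a) != 16 or len(b) != 16:
        raise ValueError("each operand must hold exactly 16 bytes")
    return bytes((x + y) & 0xFF for x, y in zip(a, b))


for element_bits in (8, 16, 32, 64):
    print(element_bits, "bit elements:", lane_count(128, element_bits), "lanes")

result = add_u8_lanes(bytes(range(16)), bytes([250] * 16))
print(list(result))
```

In hardware the 16 additions are performed by one instruction, whereas this model loops over the lanes one at a time.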


ARM Helium technology

Helium is the M-Profile Vector Extension (MVE). It adds more than 150 scalar and vector instructions.


Security extensions


TrustZone (for Cortex-A profile)

The Security Extensions, marketed as TrustZone Technology, are available in ARMv6KZ and later application profile architectures. They provide a low-cost alternative to adding another dedicated security core to an SoC, by providing two virtual processors backed by hardware-based access control. This lets the application core switch between two states, referred to as ''worlds'' (to reduce confusion with other names for capability domains), to prevent information leaking from the more trusted world to the less trusted world. This world switch is generally orthogonal to all other capabilities of the processor, thus each world can operate independently of the other while using the same core. Memory and peripherals are then made aware of the operating world of the core and may use this to provide access control to secrets and code on the device. Typically, a rich operating system is run in the less trusted world, with smaller security-specialised code in the more trusted world, aiming to reduce the attack surface. Typical applications include DRM functionality for controlling the use of media on ARM-based devices, and preventing any unapproved use of the device. In practice, since the specific implementation details of proprietary TrustZone implementations have not been publicly disclosed for review, it is unclear what level of assurance is provided for a given threat model, but they are not immune from attack. Open Virtualization is an open source implementation of the trusted world architecture for TrustZone. AMD has licensed and incorporated TrustZone technology into its Secure Processor Technology. AMD's APUs include a Cortex-A5 processor for handling secure processing, which is enabled in some, but not all products. In fact, the Cortex-A5 TrustZone core had been included in earlier AMD products, but was not enabled due to time constraints. Samsung Knox uses TrustZone for purposes such as detecting modifications to the kernel, storing certificates and attesting keys.


TrustZone for Armv8-M (for Cortex-M profile)

The Security Extension, marketed as TrustZone for Armv8-M Technology, was introduced in the Armv8-M architecture. While containing similar concepts to TrustZone for Armv8-A, it has a different architectural design, as world switching is performed using branch instructions instead of using exceptions. It also supports safe interleaved interrupt handling from either world regardless of the current security state. Together these features provide low latency calls to the secure world and responsive interrupt handling. ARM provides a reference stack of secure world code in the form of Trusted Firmware for M and PSA Certified.


No-execute page protection

As of ARMv6, the ARM architecture supports no-execute page protection, which is referred to as ''XN'', for ''eXecute Never''.


Large Physical Address Extension (LPAE)

The Large Physical Address Extension (LPAE), which extends the physical address size from 32 bits to 40 bits, was added to the Armv7-A architecture in 2011. The physical address size may be even larger in processors based on the 64-bit (Armv8-A) architecture. For example, it is 44 bits in Cortex-A75 and Cortex-A65AE.
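The practical effect of the wider physical addresses quoted above is easiest to see as byte counts. The following small sketch computes the addressable range for 32, 40 and 44 address bits; the helper names are hypothetical.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given physical address width."""
    return 1 << address_bits


def format_size(size):
    """Express an exact multiple of 1024 in the largest fitting binary unit."""
    index = 0
    while size >= 1024 and size % 1024 == 0 and index < len(UNITS) - 1:
        size //= 1024
        index += 1
    return f"{size} {UNITS[index]}"


for bits in (32, 40, 44):
    print(bits, "bits:", format_size(addressable_bytes(bits)))
```

Each extra address bit doubles the range, so going from 32 to 40 bits multiplies the addressable physical memory by 256.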


Armv8-R and Armv8-M

The Armv8-R and Armv8-M architectures, announced after the Armv8-A architecture, share some features with Armv8-A. However, Armv8-M does not include any 64-bit AArch64 instructions, and Armv8-R originally did not include any AArch64 instructions; those instructions were added to Armv8-R later.


Armv8.1-M

The Armv8.1-M architecture, announced in February 2019, is an enhancement of the Armv8-M architecture. It brings new features including:
* A new vector instruction set extension. The M-Profile Vector Extension (MVE), or Helium, is for signal processing and machine learning applications.
* Additional instruction set enhancements for loops and branches (Low Overhead Branch Extension).
* Instructions for half-precision floating-point support.
* Instruction set enhancement for TrustZone management for the Floating Point Unit (FPU).
* New memory attribute in the Memory Protection Unit (MPU).
* Enhancements in debug including the Performance Monitoring Unit (PMU), the Unprivileged Debug Extension, and additional debug support focused on signal processing application development.
* Reliability, Availability and Serviceability (RAS) extension.


64/32-bit architecture


Armv8


Armv8-A

Announced in October 2011, Armv8-A (often called ARMv8, although Armv8-R is also available) represents a fundamental change to the ARM architecture. It supports two ''Execution states'': a 64-bit state named ''AArch64'' and a 32-bit state named ''AArch32''. In the AArch64 state, a new 64-bit ''A64'' instruction set is supported; in the AArch32 state, two instruction sets are supported: the original 32-bit instruction set, named ''A32'', and the 32-bit Thumb-2 instruction set, named ''T32''. AArch32 provides user-space compatibility with Armv7-A. The processor state can change on an Exception level change; this allows 32-bit applications to be executed in AArch32 state under a 64-bit OS whose kernel executes in AArch64 state, and allows a 32-bit OS to run in AArch32 state under the control of a 64-bit hypervisor running in AArch64 state. ARM announced their Cortex-A53 and Cortex-A57 cores on 30 October 2012. Apple was the first to release an Armv8-A compatible core in a consumer product (Apple A7 in iPhone 5S). AppliedMicro, using an FPGA, was the first to demo Armv8-A. The first Armv8-A SoC from Samsung is the Exynos 5433 used in the Galaxy Note 4, which features two clusters of four Cortex-A57 and Cortex-A53 cores in a big.LITTLE configuration; but it runs only in AArch32 mode. For both AArch32 and AArch64, Armv8-A makes VFPv3/v4 and Advanced SIMD (Neon) standard. It also adds cryptography instructions supporting AES, SHA-1/SHA-256 and finite field arithmetic. AArch64 was introduced in Armv8-A and its subsequent revision. AArch64 is not included in the 32-bit Armv8-R and Armv8-M architectures. An ARMv8-A processor can support one or both of AArch32 and AArch64; it may support AArch32 and AArch64 at lower Exception levels and only AArch64 at higher Exception levels. For example, the ARM Cortex-A32 supports only AArch32, the ARM Cortex-A34 supports only AArch64, and the ARM Cortex-A72 supports both AArch64 and AArch32. An ARMv9-A processor must support AArch64 at all Exception levels, and may support AArch32 at EL0.
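The two scenarios described above (32-bit applications under a 64-bit kernel, and a 32-bit OS under a 64-bit hypervisor) share one pattern: a 64-bit level may sit above 32-bit levels, but not the other way round. The sketch below checks an assignment of Execution states to Exception levels against that pattern. The rule as coded is an assumption drawn from general knowledge of the architecture rather than from the text, and the function name is hypothetical.

```python
VALID_STATES = ("AArch64", "AArch32")


def is_consistent_assignment(states_by_level):
    """Check a mapping {exception_level: state} for the assumed ordering rule.

    Assumed rule: once some Exception level runs in AArch32, every lower
    Exception level must also run in AArch32.
    """
    seen_aarch32 = False
    for level in sorted(states_by_level, reverse=True):
        state = states_by_level[level]
        if state not in VALID_STATES:
            raise ValueError(f"unknown execution state: {state!r}")
        if state == "AArch32":
            seen_aarch32 = True
        elif seen_aarch32:
            return False  # a 64-bit level below a 32-bit level
    return True


# 64-bit kernel at EL1 running a 32-bit application at EL0.
print(is_consistent_assignment({1: "AArch64", 0: "AArch32"}))
# 64-bit hypervisor at EL2 hosting a 32-bit OS and its applications.
print(is_consistent_assignment({2: "AArch64", 1: "AArch32", 0: "AArch32"}))
# A 32-bit kernel cannot host a 64-bit application under the assumed rule.
print(is_consistent_assignment({1: "AArch32", 0: "AArch64"}))
```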


Armv8-R

Optional AArch64 support was added to the Armv8-R profile, with the first ARM core implementing it being the Cortex-R82. It adds the A64 instruction set.


Armv9


Armv9-A

Announced in March 2021, the updated architecture places a focus on secure execution and compartmentalisation. The first ARMv9-A processors were released later that year, including the Cortex-A510, Cortex-A710 and Cortex-X2.


Arm SystemReady

Arm SystemReady is a compliance program that helps ensure the interoperability of an operating system on Arm-based hardware from datacenter servers to industrial edge and IoT devices. The key building blocks of the program are the specifications for minimum hardware and firmware requirements that the operating systems and hypervisors can rely upon. These specifications are:
* Base System Architecture (BSA) and the market segment specific supplements (e.g., Server BSA supplement)
* Base Boot Requirements (BBR) and Base Boot Security Requirements (BBSR)
These specifications are co-developed by Arm and its partners in the System Architecture Advisory Committee (SystemArchAC). The Architecture Compliance Suite (ACS) is a set of test tools that help check compliance with these specifications. The Arm SystemReady Requirements Specification documents the requirements of the certifications. This program was introduced by Arm in 2020 at the first DevSummit event. Its predecessor, Arm ServerReady, was introduced in 2018 at the Arm TechCon event. This program currently includes two bands:
* SystemReady Band: this band focuses on operating system interoperability for Advanced Configuration and Power Interface (ACPI) environments, where generic operating systems can be installed on either new or old hardware without modification. This band is relevant for systems using Windows, Linux, VMware, and BSD environments.
* SystemReady Devicetree Band: this band optimizes install and boot for embedded systems where devicetree is the preferred method of describing hardware, with a focus on forward compatibility. This applies to Linux distributions and BSD environments specifically.


PSA Certified

PSA Certified, formerly named Platform Security Architecture, is an architecture-agnostic security framework and evaluation scheme. It is intended to help secure Internet of things (IoT) devices built on system-on-a-chip (SoC) processors. It was introduced to increase security where a full trusted execution environment is too large or complex. The architecture was introduced by Arm in 2017 at the annual TechCon event. Although the scheme is architecture agnostic, it was first implemented on Arm Cortex-M processor cores intended for microcontroller use. PSA Certified includes freely available threat models and security analyses that demonstrate the process for deciding on security features in common IoT products. It also provides freely downloadable application programming interface (API) packages, architectural specifications, open-source firmware implementations, and related test suites. Following the development of the architecture security framework in 2017, the PSA Certified assurance scheme launched two years later at Embedded World in 2019. PSA Certified offers a multi-level security evaluation scheme for chip vendors, OS providers and IoT device makers. The Embedded World presentation introduced chip vendors to Level 1 Certification. A draft of Level 2 protection was presented at the same time. Level 2 certification became a usable standard in February 2020. The certification was created by PSA Joint Stakeholders to enable a security-by-design approach for a diverse set of IoT products. PSA Certified specifications are implementation and architecture agnostic; as a result, they can be applied to any chip, software or device. The certification also removes industry fragmentation for IoT product manufacturers and developers.


Operating system support


32-bit operating systems


Historical operating systems

The first 32-bit ARM-based personal computer, the Acorn Archimedes, was originally intended to run an ambitious operating system called ARX. The machines shipped with RISC OS, which was also used on later ARM-based systems from Acorn and other vendors. Some early Acorn machines were also able to run a Unix port called RISC iX. (Neither is to be confused with RISC/os, a contemporary Unix variant for the MIPS architecture.)


Embedded operating systems

The 32-bit ARM architecture is supported by a large number of embedded and real-time operating systems, including:
* A2
* Android
* ChibiOS/RT
* Deos
* DRYOS
* eCos
* embOS
* FreeBSD
* FreeRTOS
* INTEGRITY
* Linux
* Micro-Controller Operating Systems
* Mbed
* MINIX 3
* MQX
* Nucleus PLUS
* NuttX
* OKL4
* Operating System Embedded (OSE)
* OS-9
* Pharos
* Plan 9
* PikeOS
* QNX
* RIOT
* RTEMS
* RTXC Quadros
* SCIOPTA
* ThreadX
* TizenRT
* T-Kernel
* VxWorks
* Windows Embedded Compact
* Windows 10 IoT Core
* Zephyr


Mobile device operating systems

The 32-bit ARM architecture was formerly the primary hardware environment for most mobile device operating systems. As of March 2024, many of these platforms, such as Android and Apple iOS, have moved to the 64-bit ARM architecture. Operating systems that have run on 32-bit ARM include:
* Android
* ChromeOS
* Mobian
* Sailfish
* postmarketOS
* Tizen
* Ubuntu Touch
* webOS

Formerly supported, but now discontinued:
* Bada
* BlackBerry OS/BlackBerry 10
* Firefox OS
* MeeGo
* Newton OS
* iOS 10 and earlier
* Symbian
* Windows 10 Mobile
* Windows RT
* Windows Phone
* Windows Mobile


Desktop and server operating systems

The 32-bit ARM architecture is supported by RISC OS and by multiple Unix-like operating systems including:
* FreeBSD
* NetBSD
* OpenBSD
* OpenSolaris
* several Linux distributions, such as:
** Debian
** Armbian
** Gentoo
** Ubuntu
** Raspberry Pi OS (formerly Raspbian)
** Slackware


64-bit operating systems


Embedded operating systems

* INTEGRITY
* OSE
* SCIOPTA
* seL4
* Pharos
* FreeRTOS
* QNX
* VxWorks
* Zephyr


Mobile device operating systems

* Android supports Armv8-A in Android Lollipop (5.0) and later.
* iOS supports Armv8-A in iOS 7 and later on 64-bit Apple SoCs. iOS 11 and later, and iPadOS, only support 64-bit ARM processors and applications.
* HarmonyOS NEXT was developed specifically for ARM processors, starting from its launch in 2024.
* Mobian
* postmarketOS
* Arch Linux ARM
* Manjaro


Desktop and server operating systems

* Support for Armv8-A was merged into the Linux kernel version 3.7 in late 2012. Armv8-A is supported by a number of Linux distributions, such as:
** Debian
** Armbian
** Alpine Linux
** Ubuntu
** Fedora
** NixOS
** openSUSE
** SUSE Linux Enterprise
** Red Hat Enterprise Linux (RHEL)
** Raspberry Pi OS (formerly Raspbian)
* Support for Armv8-A was merged into FreeBSD in late 2014.
* OpenBSD has Armv8 support.
* NetBSD has had Armv8 support since early 2018.
* Windows: Windows 10 runs 32-bit "x86 and 32-bit ARM applications", as well as native ARM64 desktop apps; Windows 11 runs native ARM64 apps and can also run x86 and x86-64 apps via emulation. Support for 64-bit ARM apps in the Microsoft Store has been available since November 2018.
* macOS has supported ARM since late 2020; the first release to do so is macOS Big Sur. Rosetta 2 adds support for x86-64 applications but not virtualization of x86-64 computer platforms.


Porting to 32- or 64-bit ARM operating systems

Windows applications recompiled for ARM and linked with Winelib, from the Wine project, can run on 32-bit or 64-bit ARM in Linux, FreeBSD, or other compatible operating systems. x86 binaries that have not been specially compiled for ARM have been demonstrated on ARM using QEMU with Wine (on Linux and other systems), but they do not run at full speed or with the same capability as with Winelib.
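When porting software, it is often useful to check at run time whether the host is a 32-bit or 64-bit ARM system. The following Python sketch illustrates one way to do this. The function name classify_arm and the set ARM64_NAMES are hypothetical names introduced for this example, and the list of machine strings is an assumption based on commonly reported values ("aarch64" or "arm64" for 64-bit ARM, names such as "armv7l" for 32-bit ARM); it is not exhaustive.

```python
import platform

# Machine strings commonly reported for 64-bit ARM (assumed, not exhaustive).
ARM64_NAMES = {"aarch64", "arm64"}


def classify_arm(machine: str) -> str:
    """Return 'arm64', 'arm32', or 'other' for a machine string.

    Hypothetical helper: the input is expected to look like the value
    returned by platform.machine() or `uname -m`.
    """
    name = machine.strip().lower()
    if name in ARM64_NAMES or name.startswith("aarch64"):
        return "arm64"
    # 32-bit ARM names typically look like 'arm', 'armv6l', 'armv7l', 'armv8l'.
    if name.startswith("arm"):
        return "arm32"
    return "other"


if __name__ == "__main__":
    host = platform.machine()
    print(host, "->", classify_arm(host))
```

A 32-bit userland running on a 64-bit kernel may report a 32-bit name such as "armv8l", so the result describes the environment the process sees rather than the underlying core.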


See also

* Amber – an open-source ARM-compatible processor core
* AMULET – an asynchronous implementation of the ARM architecture
* Apple silicon
* ARM Accredited Engineer – certification program
* ARM big.LITTLE – ARM's heterogeneous computing architecture
** DynamIQ
* ARMulator – an instruction set simulator
* Comparison of ARM processors
* Meltdown (security vulnerability)
* Reduced instruction set computer (RISC)
* RISC-V
* Spectre (security vulnerability)
* Unicore – a 32-register architecture based heavily on a 32-bit ARM

