
The ARM Cortex-A78 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set, designed by ARM Ltd.'s Austin centre and set to be distributed amongst high-end devices in 2020–2021.


Design

The ARM Cortex-A78 is the successor to the ARM Cortex-A77. It can be paired with the ARM Cortex-X1 and/or ARM Cortex-A55 CPUs in a DynamIQ configuration to deliver both performance and efficiency. ARM also claims as much as 50% energy savings over its predecessor.

The Cortex-A78 is a 4-wide decode out-of-order superscalar design with a 1.5K macro-op (MOP) cache. It can fetch 4 instructions and 6 MOPs per cycle, and rename and dispatch 6 MOPs and 13 µops per cycle. The out-of-order window size is 160 entries, and the backend has 13 execution ports with a pipeline depth of 13 stages; the execution latencies consist of 10 stages.

The processor is built on a standard Cortex-A roadmap and offers a 2.1 GHz (5 nm) chipset, which makes it better than its predecessor in the following ways:

* 7% better performance
* 4% lower power consumption
* 5% smaller, meaning 15% more area saved in a quad-core cluster, which can go to an extra GPU or NPU

There is also extended scalability, with extra support from the DynamIQ Shared Unit (DSU) on the chipset. A smaller 32 KB L1 cache is available as an option in place of the 64 KB L1 cache configuration. To offset this smaller L1 memory, the branch predictor is better at covering irregular search patterns and is capable of following two taken branches per cycle, which results in fewer L1 cache misses and helps hide pipeline bubbles to keep the core well supplied. The pipeline is one cycle longer compared to the A77, which ensures that the A78 hits a clock frequency target of around 3 GHz.

The A78 is a 6 instruction per cycle design. ARM also introduced a second integer multiply unit in the execution unit and an additional load Address Generation Unit (AGU) to increase both the data load and bandwidth by 50%. Other optimizations include fused instructions and efficiency improvements to the instruction schedulers, register renaming structures, and the re-order buffer.

L2 cache is available up to 512 KB and has double the bandwidth to maximize performance, while the shared L3 cache is available up to 4 MB, double that of previous generations. The DSU also allows for an 8 MB L3 configuration with the ARM Cortex-X1.
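
The figures quoted in this section can be collected into a short Python sketch. It is illustrative only and is not taken from ARM documentation. The function `front_end_width` assumes a simple linear blend between the 4-wide decode path and the 6-MOP-per-cycle MOP cache path, treats one instruction as one macro-op, and ignores stalls. The class `CacheConfig` and its field names are hypothetical and encode only the cache limits stated above.

```python
from dataclasses import dataclass
from typing import List

# Figures quoted in the Design section.
DECODE_WIDTH = 4      # instructions decoded per cycle
MOP_CACHE_WIDTH = 6   # macro-ops fetched per cycle from the MOP cache


def front_end_width(mop_cache_hit_rate: float) -> float:
    """Average macro-ops per cycle under a linear blend (simplifying assumption).

    Assumes one instruction decodes to one macro-op and ignores stalls.
    """
    if not 0.0 <= mop_cache_hit_rate <= 1.0:
        raise ValueError("hit rate must be between 0 and 1")
    return (mop_cache_hit_rate * MOP_CACHE_WIDTH
            + (1.0 - mop_cache_hit_rate) * DECODE_WIDTH)


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical container; field names are illustrative, not an ARM API."""
    l1_kb: int
    l2_kb: int
    l3_mb: int
    with_cortex_x1: bool = False

    def problems(self) -> List[str]:
        """Return the limits from the text that this configuration violates."""
        found = []
        if self.l1_kb not in (32, 64):
            found.append("L1 must be 32 KB or 64 KB")
        if self.l2_kb > 512:
            found.append("L2 is available up to 512 KB")
        # The text gives 4 MB as the L3 ceiling, or 8 MB alongside a Cortex-X1.
        l3_limit_mb = 8 if self.with_cortex_x1 else 4
        if self.l3_mb > l3_limit_mb:
            found.append(f"L3 is available up to {l3_limit_mb} MB")
        return found
```

Under this model, `front_end_width(0.0)` gives the decode width of 4 and `front_end_width(1.0)` gives the MOP cache width of 6, which is one way to reconcile the "4-wide decode" and "6 instruction per cycle" descriptions. `CacheConfig(64, 512, 8).problems()` reports the L3 limit, while the same sizes with `with_cortex_x1=True` pass.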


Licensing

The Cortex-A78 is available as a SIP core to licensees, and its design makes it suitable for integration with other SIP cores (e.g. GPU, display controller, DSP, image processor) into one die constituting a system on a chip (SoC).


Usage

The Cortex-A78 was first used in the Samsung Exynos 1080 and 2100 SoCs, introduced in November and December 2020 respectively. The custom Kryo 680 Gold core used in the Snapdragon 888 SoC is based on the Cortex-A78 microarchitecture. The Cortex-A78 is also used in the MediaTek Dimensity 8000 series.


See also

* ARM Cortex-X1, related high-performance microarchitecture
* ARM Cortex-A77, predecessor
* Comparison of ARMv8-A cores, ARMv8 family

