SuperH (or SH) is a
32-bit reduced instruction set computing (RISC)
instruction set architecture (ISA) developed by
Hitachi and currently produced by
Renesas. It is implemented by
microcontrollers and
microprocessors for
embedded systems.
At the time of introduction, SuperH was notable for having fixed-length 16-bit instructions in spite of its 32-bit architecture. Using smaller instructions had consequences: the
register file was smaller and instructions generally used a two-operand format. However, for the market the SuperH was aimed at, this was a small price to pay for the improved memory and
processor cache efficiency.
Later versions of the design, starting with SH-5, included both 16- and 32-bit instructions, with the 16-bit versions mapping onto the 32-bit version inside the CPU. This allowed the
machine code to continue using the shorter instructions to save memory, while not demanding the amount of instruction decoding logic needed if they were completely separate instructions. This concept is now known as a
compressed instruction set and is also used by other companies, the most notable example being
ARM for its
Thumb instruction set.
In 2015, many of the original
patents for the SuperH architecture expired and the SH-2 CPU was reimplemented as
open source hardware under the name
J2.
History
SH-1 and SH-2
The SuperH processor core family was first developed by
Hitachi in the early 1990s. The design concept was for a single
instruction set
(ISA) that would be
upward compatible across a series of
CPU cores.
In the past, this sort of design problem would have been solved using
microcode
, with the low-end models in the series performing non-implemented instructions as a series of more basic instructions. For instance, a "long multiply" (multiplying two 32-bit registers to produce a 64-bit product) might be implemented in hardware on high-end models but instead be performed as a series of additions on low-end models.
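As a rough illustration of the "series of additions" fallback described above, the sketch below forms a 64-bit product from two unsigned 32-bit values using only shifts and additions. It is a generic shift-and-add routine, not Hitachi's microcode or any real SuperH trap handler; the name `long_multiply_u32` is hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def long_multiply_u32(a, b):
    """Multiply two unsigned 32-bit values into a 64-bit product
    using only shifts and additions (no multiply operator)."""
    a &= MASK32
    b &= MASK32
    product = 0
    for bit in range(32):
        if (b >> bit) & 1:
            # Add the multiplicand, shifted to the weight of this multiplier bit.
            product = (product + (a << bit)) & MASK64
    return product


print(hex(long_multiply_u32(0xFFFFFFFF, 0xFFFFFFFF)))
```

The loop runs once per multiplier bit, which is why such a fallback is far slower than a hardware multiplier.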
One of the key realizations during the development of the
RISC
concept was that the microcode had a finite decoding time, and as processors became faster, this represented an unacceptable performance overhead. To address this, Hitachi instead developed a single ISA for the entire line, with unsupported instructions causing traps on those implementations that did not include hardware support. For instance, the initial models in the line, the SH-1 and SH-2, differed only in their support for 64-bit multiplication; the SH-2 supported the corresponding multiply instructions, whereas the SH-1 would cause a trap if these were encountered.
The SH-1 was the basic model, supporting a total of 56 instructions. The SH-2 added 64-bit multiplication and a few additional commands for branching and other duties, bringing the total to 62 supported instructions. The SH-1 and the SH-2 were used in the
Sega Saturn
,
Sega 32X and
Capcom CPS-3.
The ISA uses
16-bit
instructions for better code density than 32-bit instructions, which was important at the time due to the high cost of
main memory and the implementation cost of cache. As of 2023, code density is still important for small embedded systems and massively multicore processors. The downside of this approach was that fewer bits were available to encode a register number or a constant value. In the original SuperH ISA, there were only 16 general registers, requiring four bits for the source and another four for the destination; however, some instructions have an implied R0, R15, or a system register as an extra operand. The instruction opcode is four, eight, twelve, or sixteen bits long, and the remaining four-bit fields are used for register or immediate operands in various ways: there are twelve classes of instructions, for a total of 142 instructions in SH-2.
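To make the four-bit field layout concrete, the following sketch splits a 16-bit instruction word into its four nibbles and builds one register-to-register add. The bit pattern `0011 nnnn mmmm 1100` used here for an add of Rm into Rn is an assumption based on the publicly available SuperH programming manuals and should be checked against them; the function names are hypothetical.

```python
def split_fields(word):
    """Split a 16-bit instruction word into four 4-bit fields,
    most significant field first."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must fit in 16 bits")
    return ((word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF)


def encode_add_reg(rm, rn):
    """Encode 'add Rm into Rn', ASSUMING the pattern 0011 nnnn mmmm 1100."""
    if not (0 <= rm <= 15 and 0 <= rn <= 15):
        raise ValueError("register numbers must be in 0..15")
    return 0x3000 | (rn << 8) | (rm << 4) | 0xC


word = encode_add_reg(rm=1, rn=2)
print(hex(word), split_fields(word))
```

With eight of the sixteen bits spent on two register numbers, only eight bits remain for the opcode, which is why the format favours two operands and sixteen registers.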
Both the SH-1 and the SH-2 use delayed branches. Unconditional branch instructions have one
delay slot.
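The effect of a single delay slot can be sketched with a toy interpreter: the instruction that follows a branch still executes before control transfers to the target. The instruction names (`add`, `bra`, `halt`) and the tuple format are invented for this illustration and do not model real SuperH encodings or timing.

```python
def run(program, max_steps=100):
    """Run a toy program in which 'bra' has exactly one delay slot.

    Instructions are tuples: ('add', reg, value), ('bra', target), ('halt',).
    """
    regs = {}
    pc = 0
    pending_target = None  # target of a branch whose delay slot has not run yet
    for _ in range(max_steps):
        op = program[pc]
        in_slot = pending_target is not None
        # After a delay-slot instruction, control goes to the branch target.
        next_pc = pending_target if in_slot else pc + 1
        pending_target = None
        if op[0] == "add":
            regs[op[1]] = regs.get(op[1], 0) + op[2]
        elif op[0] == "bra":
            if in_slot:
                raise ValueError("branch inside a delay slot is not modelled")
            pending_target = op[1]
        elif op[0] == "halt":
            return regs
        pc = next_pc
    raise RuntimeError("step limit reached")


program = [
    ("add", "r0", 1),    # 0
    ("bra", 4),          # 1: branch to index 4 ...
    ("add", "r0", 10),   # 2: ... but this delay-slot instruction still runs
    ("add", "r0", 100),  # 3: skipped
    ("halt",),           # 4
]
result = run(program)
print(result)
```

Here `r0` ends at 11: the add in the delay slot runs, and the add at index 3 is skipped.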
SH-3
A few years later, the SH-3 core was added to the family; new features included another
interrupt
concept, a
memory management unit (MMU), and a modified cache concept. These features required an extended instruction set, adding six new instructions for a total of 68. The SH-3 was
bi-endian, running in either big-endian or little-endian byte ordering.
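The difference between the two byte orderings can be shown with a short sketch that stores the same 32-bit value both ways. This is generic Python and says nothing about how the SH-3 selects its byte order.

```python
value = 0x12345678

# Big-endian: most significant byte at the lowest address.
big = value.to_bytes(4, byteorder="big")
# Little-endian: least significant byte at the lowest address.
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```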
The SH-3 core also added a
DSP extension, then called SH-3-DSP. With extended data paths for efficient DSP processing, special accumulators and a dedicated
MAC-type DSP engine, this core unified the DSP and the RISC processor world. A derivative of the DSP was also used with the original SH-2 core.
Between 1994 and 1996, 35.1 million SuperH devices were shipped worldwide.
SH-4
In 1997, Hitachi and
STMicroelectronics (STM) started collaborating on the design of the SH-4 for the
Dreamcast. SH-4 featured
superscalar (2-way) instruction execution and a
vector floating-point unit
(particularly suited to
3D graphics). Standard chips based on the SH-4 were introduced around 1998.
Licensing
In early 2001, Hitachi and STM formed the IP company SuperH, Inc., which was going to license the SH-4 core to other companies and was developing the SH-5 architecture, the first move of SuperH into the 64-bit area. The earlier SH-1 through 3 remained the property of Hitachi.
In 2003, Hitachi and Mitsubishi Electric formed a joint venture called Renesas Technology, with Hitachi controlling 55% of it. In 2004, Renesas Technology bought STMicroelectronics's share of ownership in SuperH, Inc. and with it the licence to the SH cores. Renesas Technology later became Renesas Electronics, following its merger with NEC Electronics.
The SH-5 design supported two modes of operation: SHcompact mode, which is equivalent to the user-mode instructions of the SH-4 instruction set; and SHmedia mode, which is very different in that it uses 32-bit instructions with sixty-four 64-bit integer registers and SIMD instructions. In SHmedia mode the destination of a branch (jump) is loaded into a branch register separately from the actual branch instruction. This allows the processor to prefetch instructions for a branch without having to snoop the instruction stream. The combination of a compact 16-bit instruction encoding with a more powerful 32-bit instruction encoding is not unique to SH-5; ARM processors have a 16-bit Thumb mode (ARM licensed several patents from SuperH for Thumb) and MIPS processors have a MIPS-16 mode. However, SH-5 differs because its backward compatibility mode is the 16-bit encoding rather than the 32-bit encoding.
The last evolutionary step happened around 2003, when the cores from SH-2 up to SH-4 were unified into a superscalar SH-X core, which formed a superset of the instruction sets of the previous architectures and added support for symmetric multiprocessing.
Continued availability
Since 2010, the SuperH CPU cores, architecture and products have been with Renesas Electronics, and the architecture is consolidated around the SH-2, SH-2A, SH-3, SH-4 and SH-4A platforms. The system-on-chip products based on SH-3, SH-4 and SH-4A microprocessors were subsequently replaced by newer generations based on licensed CPU cores from Arm Ltd., with many of the existing models still marketed and sold until March 2025 through the Renesas Product Longevity Program.
As of 2021, the SH72xx microcontrollers based on the SH-2A continue to be marketed by Renesas with guaranteed availability until February 2029, along with newer products based on several other architectures including Arm, RX, and RH850.
J Core
The last of the SH-2 patents expired in 2014. At LinuxCon Japan 2015, j-core developers presented a cleanroom reimplementation of the SH-2 ISA with extensions (known as the "J2 core" due to the unexpired trademarks). Subsequently, a design walkthrough was presented at ELC 2016.
The open source BSD-licensed VHDL
code for the J2 core has been proven on Xilinx
FPGAs and on ASICs manufactured on TSMC's 180 nm process, and is capable of booting μClinux. J2 is backwards ISA compatible with SH-2 and is implemented as a 5-stage pipeline with separate instruction and data memory interfaces and a machine-generated instruction decoder supporting the densely packed and complex (relative to other RISC machines) ISA. Additional instructions are easy to add. J2 implements instructions for dynamic shift (using the SH-3 and later instruction patterns), extended atomic operations (used for threading primitives) and locking/interfaces for symmetric multiprocessor support. There are plans to implement the SH-2A (as "J2+") and SH-4 (as "J4") instruction sets as the relevant patents expire in 2016–2017.
Several features of SuperH have been cited as motivations for designing new cores based on this architecture:
* High code density compared to other 32-bit RISC ISAs such as ARM or MIPS, which is important for cache and memory bandwidth performance
* Existing compiler and operating system support (Linux, Windows Embedded, QNX)
* Extremely low ASIC fabrication costs now that the patents are expiring (around for a dual-core J2 core on TSMC's 180 nm process).
* Patent- and royalty-free (BSD-licensed) implementation
* Full and vibrant community support
* Availability of low cost hardware development platform for zero cost FPGA tools
* CPU and SoC RTL generation and integration tools, producing FPGA and ASIC portable RTL and documentation
* Clean, modern design with open source design, generation, simulation and verification environment
Models
The family of SuperH CPU cores includes:
* SH-1 – used in microcontrollers for deeply embedded applications (CD-ROM drives, major appliances, etc.)
* SH-2 – used in microcontrollers with higher performance requirements, networking applications, and also in video game consoles, like the Sega Saturn and Sega 32X add-on. The SH-2 has also found a home in many automotive engine control unit applications, including those of Subaru, Mitsubishi, and Mazda.
* SH-2A – an extension of the SH-2 core that adds a few extra instructions and, most importantly, moves to a superscalar architecture (it is capable of executing more than one instruction in a single cycle) with two five-stage pipelines. It also incorporates 15 register banks to facilitate an interrupt latency of 6 clock cycles. It is strong in motor control applications as well as in multimedia, car audio, powertrain, automotive body control, and office and building automation
* SH-DSP – initially developed for the mobile phone market, used later in many consumer applications requiring DSP performance for JPEG compression etc.
* SH-3 – used for mobile and handheld applications such as the Jornada; strong in Windows CE applications and, for many years, in the car navigation market. The Cave CV1000, whose CPU is similar to that of the Sega NAOMI hardware, also made use of this CPU. The Korg Electribe EMX and ESX music production units also use the SH-3.
* SH-3-DSP – used mainly in multimedia terminals and networking applications, also in printers and fax machines
* SH-4 – used whenever high performance is required, such as in car multimedia terminals, video game consoles (most notably the Dreamcast), or set-top boxes
* SH-5 – used in high-end 64-bit multimedia applications
* SH-X – mainstream core used in various flavours (with or without a DSP or FPU) in engine control units, car multimedia equipment, set-top boxes or mobile phones
* SH-Mobile – SuperH Mobile Application Processor; designed to offload application processing from the baseband LSI
SH-2
The SH-2 is a 32-bit RISC architecture with a fixed 16-bit instruction length for high code density. It features a hardware multiply–accumulate (MAC) block for DSP algorithms and has a five-stage pipeline.
The SH-2 has a cache on all ROM-less devices.
It provides 16 general-purpose registers, a vector-base register, global-base register, and a procedure register.
Today the SH-2 family stretches from 32 KB of on-board flash up to ROM-less devices. It is used in a variety of different devices with differing peripherals such as CAN, Ethernet, motor-control timer unit, fast ADC and others.
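The multiply–accumulate operation mentioned in this section is the inner step of many DSP algorithms, such as dot products and FIR filters. The sketch below shows only the arithmetic pattern; it does not model the SH-2's MAC registers, operand widths, or any saturation behaviour, and the name `mac_dot` is hypothetical.

```python
def mac_dot(samples, coeffs):
    """Return sum(samples[i] * coeffs[i]), built from repeated
    multiply-accumulate steps into a single accumulator."""
    if len(samples) != len(coeffs):
        raise ValueError("samples and coeffs must have the same length")
    acc = 0
    for x, c in zip(samples, coeffs):
        acc += x * c  # one MAC step: multiply, then add into the accumulator
    return acc


total = mac_dot([1, 2, 3, 4], [4, 3, 2, 1])
print(total)
```

A hardware MAC block performs the multiply and the add as a single operation, which is what makes loops of this shape cheap on DSP-oriented cores.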
SH-2A
The SH-2A is an upgrade to the SH-2 core that added some 32-bit instructions. It was announced in early 2006.
New features on the SH-2A core include:
* Superscalar architecture: execution of 2 instructions simultaneously
* Harvard architecture
* Two 5-stage pipelines
* Mixed 16-bit and 32-bit instructions
* 15 register banks for interrupt response in 6 cycles.
* Optional FPU
The SH-2A family today spans a wide memory field from 16 KB up to and includes many ROM-less variations. The devices feature standard peripherals such as CAN, Ethernet
, USB and more as well as more application-specific peripherals such as motor control timers, TFT controllers and peripherals dedicated to automotive powertrain applications.
SH-3
SH-4
The SH-4 is a RISC CPU developed primarily for use in multimedia applications, such as Sega's Dreamcast and NAOMI game systems. It includes a much more powerful floating-point unit and additional built-in functions, along with the standard 32-bit integer processing and 16-bit instruction size.
SH-4 features include:
* FPU with four floating-point multipliers, supporting 32-bit single-precision and 64-bit double-precision floats
* 4D floating-point dot-product operation and matrix–vector multiplication
* 128-bit floating-point bus allowing 3.2 GB/sec transfer rate from the data cache
* 64-bit external data bus with 32-bit memory addressing, allowing a maximum of 4 GB addressable memory (see Byte addressing) with a transfer rate of 800 MB/sec
* Built-in interrupt, DMA, and power management controllers
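The bus figures in the list above can be cross-checked with simple arithmetic: dividing a transfer rate by the bus width in bytes gives the implied number of transfers per second, and 32 address bits give 2^32 addressable bytes. The sketch assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) for the transfer rates; the text does not state clock frequencies, so the results are implied values only.

```python
def implied_transfers_per_second(bytes_per_second, bus_width_bits):
    """Transfers per second needed to reach a given rate on a bus
    that moves bus_width_bits per transfer."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_second // bytes_per_transfer


# 128-bit floating-point bus at 3.2 GB/sec (decimal units assumed).
fp_bus_rate = implied_transfers_per_second(3_200_000_000, 128)
# 64-bit external data bus at 800 MB/sec (decimal units assumed).
ext_bus_rate = implied_transfers_per_second(800_000_000, 64)
# 32-bit byte addressing.
addressable_bytes = 2 ** 32

print(fp_bus_rate, ext_bus_rate, addressable_bytes)
```

Under these assumptions the figures correspond to 200 million and 100 million transfers per second, and the "4 GB" of addressable memory is 4,294,967,296 bytes.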
There is no FPU in the custom SH-4 made for Casio, the SH7305.
SH-5
The SH-5 is a 64-bit RISC CPU.
Almost no non-simulated SH-5 hardware was ever released, and, unlike the still-live SH-4, support for SH-5 was dropped from GCC and Linux.
SH-6
SH-6 was an announced but never implemented further development. It was supposed to achieve over 2 GIPS, over 7 GFLOPS and over 24 GOPS.
External links
* Renesas SuperH: Products, Tools, Manuals, App. Notes, Information
* J-core Open Processor
* Linux SuperH development list
* In-progress Debian port for SH4
* A 15-part series on programming for the microprocessor.