The Scalable Coherent Interface or Scalable Coherent Interconnect (SCI) is a high-speed interconnect standard for shared memory multiprocessing and message passing. The goal was to scale well and to provide system-wide memory coherence and a simple interface; i.e. a standard to replace existing buses in multiprocessor systems with one that has no inherent scalability and performance limitations.
The IEEE Std 1596-1992, IEEE Standard for Scalable Coherent Interface (SCI) was approved by the IEEE standards board on March 19, 1992.
It saw some use during the 1990s, but never became widely used and was superseded by other systems from the early 2000s onward.
History
Soon after the Fastbus (IEEE 960) follow-on Futurebus (IEEE 896) project in 1987, some engineers predicted it would already be too slow for the high-performance computing marketplace by the time it would be released in the early 1990s.
In response, a "Superbus" study group was formed in November 1987.
Another working group of the standards association of the Institute of Electrical and Electronics Engineers (IEEE) spun off to form a standard targeted at this market in July 1988.
It was essentially a subset of Futurebus features that could be easily implemented at high speed, along with minor additions to make it easier to connect to other systems, such as VMEbus. Most of the developers had backgrounds in high-speed computer buses. Representatives from companies in the computer industry and research community included Amdahl, Apple Computer, BB&N, Hewlett-Packard, CERN, Dolphin Server Technology, Cray Research, Sequent, AT&T, Digital Equipment Corporation, McDonnell Douglas, National Semiconductor, Stanford Linear Accelerator Center, Tektronix, Texas Instruments, Unisys, University of Oslo, and University of Wisconsin.
The original intent was a single standard for all buses in the computer.
The working group soon came up with the idea of using point-to-point communication in the form of insertion rings. This avoided the lumped capacitance, the physical length limits imposed by the speed of light, and the stub reflections of a shared bus, and it also allowed parallel transactions. The use of insertion rings is credited to Manolis Katevenis, who suggested it at one of the early meetings of the working group. The working group for developing the standard was led by David B. Gustavson (chair) and David V. James (vice chair).
David V. James was a major contributor to the specifications, including the executable C code. Stein Gjessing’s group at the University of Oslo used formal methods to verify the coherence protocol, and Dolphin Server Technology implemented a node controller chip including the cache coherence logic.
Different versions and derivatives of SCI were implemented by companies like Dolphin Interconnect Solutions, Convex, Data General AViiON (using cache controller and link controller chips from Dolphin), Sequent and Cray Research. Dolphin Interconnect Solutions implemented a PCI and PCI-Express connected derivative of SCI that provides non-coherent shared memory access. This implementation was used by Sun Microsystems for its high-end clusters, Thales Group and several others, including volume applications for message passing within HPC clustering and medical imaging.
SCI was often used to implement non-uniform memory access architectures.
It was also used by Sequent Computer Systems as the processor memory bus in their NUMA-Q systems. Numascale developed a derivative to connect with coherent HyperTransport.
The standard
The standard defined two interface levels:
*The physical level that deals with electrical signals, connectors, mechanical and thermal conditions.
*The logical level that describes the address space, data transfer protocols, cache coherence mechanisms, synchronization primitives, control and status registers, and initialization and error recovery facilities.
This structure allowed new developments in physical interface technology to be easily adapted without any redesign on the logical level.
Scalability for large systems is achieved through a distributed directory-based cache coherence model. (The other popular models for cache coherency are based on system-wide eavesdropping (snooping) of memory transactions, a scheme which is not very scalable.) In SCI, each node contains a directory with a pointer to the next node in a linked list that shares a particular cache line.
SCI defines a 64-bit flat address space (16 exabytes) where 16 bits are used for identifying a node (65,536 nodes) and 48 bits for the address within the node (256 terabytes). A node can contain many processors and/or memory. The SCI standard defines a packet-switched network.
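The 16/48-bit split described above can be illustrated with a short sketch. It assumes that the node identifier occupies the most significant 16 bits of the 64-bit address; the function and constant names are hypothetical and are not taken from the standard.

```python
from typing import Tuple

NODE_BITS = 16                      # bits identifying the node
OFFSET_BITS = 48                    # bits addressing memory within the node
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def split_address(addr: int) -> Tuple[int, int]:
    """Split a 64-bit address into (node_id, offset).

    Assumption: the node identifier sits in the upper 16 bits.
    """
    if not 0 <= addr < (1 << (NODE_BITS + OFFSET_BITS)):
        raise ValueError("address does not fit in 64 bits")
    return addr >> OFFSET_BITS, addr & OFFSET_MASK


def join_address(node_id: int, offset: int) -> int:
    """Combine a node identifier and an in-node offset into one address."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id does not fit in 16 bits")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in 48 bits")
    return (node_id << OFFSET_BITS) | offset
```

With these widths, `1 << NODE_BITS` gives the 65,536 addressable nodes and `1 << OFFSET_BITS` gives the 256 terabytes (binary) available within each node.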
Topologies
SCI can be used to build systems with different types of switching topologies from centralized to fully distributed switching:
*With a central switch, each node is connected to the switch with a ringlet (in this case a two-node ring).
*In distributed switching systems, each node can be connected to a ring of arbitrary length and either all or some of the nodes can be connected to two or more rings.
The most common way to describe these multi-dimensional topologies is k-ary n-cubes (or tori). The SCI standard specification mentions several such topologies as examples.
The 2-D torus is a combination of rings in two dimensions. Switching between the two dimensions requires a small switching capability in the node. This can be expanded to three or more dimensions. The concept of folding rings can also be applied to the torus topologies to avoid any long connection segments.
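As a rough illustration of how a k-ary n-cube decomposes into rings, the following sketch enumerates the rings of such a topology. It assumes nodes are labelled by n coordinates in the range 0 to k-1 and that each ring connects the k nodes that differ only in one coordinate; the names `ring_through` and `all_rings` are hypothetical.

```python
from itertools import product
from typing import List, Tuple

Coord = Tuple[int, ...]


def ring_through(node: Coord, dim: int, k: int) -> List[Coord]:
    """Return the k nodes on the ring passing through `node` along `dim`."""
    return [node[:dim] + (i,) + node[dim + 1:] for i in range(k)]


def all_rings(k: int, n: int) -> List[List[Coord]]:
    """Enumerate every ring of a k-ary n-cube (n * k**(n-1) rings in total)."""
    rings = []
    for dim in range(n):
        # One ring per choice of the other n-1 coordinates.
        for rest in product(range(k), repeat=n - 1):
            base = rest[:dim] + (0,) + rest[dim:]
            rings.append(ring_through(base, dim, k))
    return rings
```

For a 2-D torus with four nodes per ring, `all_rings(4, 2)` yields eight rings of four nodes each, and every node lies on exactly two of them, one per dimension. This is where the node needs the switching capability mentioned above.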
Transactions
SCI sends information in packets. Each packet consists of an unbroken sequence of 16-bit symbols, and each symbol is accompanied by a flag bit. A transition of the flag bit from 0 to 1 indicates the start of a packet. A transition from 1 to 0 occurs 1 symbol (for echoes) or 4 symbols before the packet end. A packet contains a header with address, command and status information, a payload (from 0 through optional lengths of data) and a CRC check symbol. The first symbol in the packet header contains the destination node address. If the address is not within the domain handled by the receiving node, the packet is passed to the output through the bypass FIFO. Otherwise, the packet is fed to a receive queue and may be transferred to a ring in another dimension. All packets are marked when they pass the scrubber (one node is established as the scrubber when the ring is initialized). Packets without a valid destination address are removed when they pass the scrubber for the second time, so that the ring does not fill with packets that would otherwise circulate indefinitely.
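The forwarding and scrubbing behaviour described above can be sketched as a small simulation. This is a simplified model, not the standard's packet format: each node is assumed to handle exactly one address (rather than an address domain), the `marked` field stands in for whatever marking the scrubber applies, and the names `Packet` and `circulate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    dest: int             # destination node address (first header symbol)
    marked: bool = False  # hypothetical flag set by the scrubber on first pass


def circulate(ring: List[int], scrubber: int, packet: Packet,
              start: int = 0, max_hops: int = 100) -> Tuple[str, int]:
    """Follow one packet around a ring of node addresses.

    The packet is injected by the node at index `start` and visits the
    following nodes in order. Returns (outcome, hops).
    """
    pos = start
    for hops in range(1, max_hops + 1):
        pos = (pos + 1) % len(ring)
        node = ring[pos]
        if node == packet.dest:
            # Address handled here: packet goes to the receive queue.
            return ("delivered", hops)
        if node == scrubber:
            if packet.marked:
                # Second pass through the scrubber: remove the packet.
                return ("scrubbed", hops)
            packet.marked = True
        # Otherwise the packet passes through the bypass FIFO to the output.
    return ("circulating", max_hops)
```

For example, on the ring `[10, 11, 12, 13]` with node 12 as scrubber, a packet addressed to node 13 is delivered after three hops, while a packet addressed to a non-existent node 99 is marked on its first pass through node 12 and removed on its second.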
Cache coherence
Cache coherence ensures data consistency in multiprocessor systems. The simplest form applied in earlier systems was based on clearing the cache contents between context switches and disabling the cache for data that were shared between two or more processors. These methods were feasible when the performance difference between the cache and memory was less than one order of magnitude. Modern processors with caches that are more than two orders of magnitude faster than main memory would not perform anywhere near optimal without more sophisticated methods for data consistency. Bus-based systems use eavesdropping (snooping) methods since buses are inherently broadcast. Modern systems with point-to-point links use broadcast methods with snoop filter options to improve performance. Since broadcast and eavesdropping are inherently non-scalable, these are not used in SCI.
Instead, SCI uses a distributed directory-based cache coherence protocol with a linked list of nodes containing processors that share a particular cache line. Each node holds a directory for the main memory of the node with a tag for each line of memory (same line length as the cache line). The memory tag holds a pointer to the head of the linked list and a state code for the line (three states: home, fresh, gone). Associated with each node is also a cache for holding remote data with a directory containing forward and backward pointers to nodes in the linked list sharing the cache line. The tag for the cache has seven states (invalid, only fresh, head fresh, only dirty, head dirty, mid valid, tail valid).
The distributed directory is scalable. The overhead for the directory-based cache coherence is a constant percentage of the node’s memory and cache: on the order of 4% for the memory and 7% for the cache.
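The linked-list directory described above can be sketched as a data structure. This is a heavily simplified model covering only read sharing. It assumes that a new reader is attached at the head of the list and that the previous head then becomes a mid or tail entry; the actual protocol involves further transactions and the dirty states, which are omitted here. The state names follow the text; everything else (`MemoryTag`, `CacheTag`, `add_reader`, `sharers`) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class MemState(Enum):
    HOME = "home"
    FRESH = "fresh"
    GONE = "gone"


class CacheState(Enum):
    INVALID = "invalid"
    ONLY_FRESH = "only fresh"
    HEAD_FRESH = "head fresh"
    ONLY_DIRTY = "only dirty"
    HEAD_DIRTY = "head dirty"
    MID_VALID = "mid valid"
    TAIL_VALID = "tail valid"


@dataclass
class MemoryTag:
    state: MemState = MemState.HOME
    head: Optional[int] = None      # node at the head of the sharing list


@dataclass
class CacheTag:
    state: CacheState = CacheState.INVALID
    forward: Optional[int] = None   # next node, toward the tail
    backward: Optional[int] = None  # previous node, toward the head


def add_reader(mem: MemoryTag, caches: Dict[int, CacheTag], node: int) -> None:
    """Attach `node` (assumed not yet a sharer) at the head of the list."""
    new = caches.setdefault(node, CacheTag())
    old_head = mem.head
    if old_head is None:
        new.state = CacheState.ONLY_FRESH
    else:
        old = caches[old_head]
        old.backward = node
        # The previous head is now in the middle or at the tail of the list.
        old.state = (CacheState.TAIL_VALID if old.forward is None
                     else CacheState.MID_VALID)
        new.forward = old_head
        new.state = CacheState.HEAD_FRESH
    mem.head = node
    mem.state = MemState.FRESH


def sharers(mem: MemoryTag, caches: Dict[int, CacheTag]) -> List[int]:
    """Walk the list from the head and return the sharing nodes in order."""
    out = []
    node = mem.head
    while node is not None:
        out.append(node)
        node = caches[node].forward
    return out
```

Because the memory tag stores only the head pointer and each cache tag stores two pointers, the storage per line does not grow with the number of sharers, which is consistent with the constant-percentage overhead noted above.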
Legacy
SCI is a standard for connecting the different resources within a multiprocessor computer system, and it is not as widely known to the public as, for example, the Ethernet family for connecting different systems. Different system vendors implemented different variants of SCI for their internal system infrastructure. These implementations interface to very intricate mechanisms in processors and memory systems, and each vendor has to preserve some degree of compatibility for both hardware and software.
Gustavson led a group called the Scalable Coherent Interface and Serial Express Users, Developers, and Manufacturers Association and maintained a web site for the technology starting in 1996.
A series of workshops was held through 1999.
After the first 1992 edition, follow-on projects defined shared data formats in 1993, a version using low-voltage differential signaling in 1996, and a memory interface known as Ramlink later in 1996.
In January 1998, the SLDRAM corporation was formed to hold patents on an attempt to define a new memory interface that was related to another working group called SerialExpress or Local Area Memory Port.
However, by early 1999 the new memory standard was abandoned.
In 1999 a series of papers was published as a book on SCI.
An updated specification was published in July 2000 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) as ISO/IEC 13961.
See also
*Dolphin Interconnect Solutions
*List of device bandwidths
*NUMAlink
*QuickRing
*HIPPI
*IEEE 1355
*RapidIO
*Myrinet
*QsNet
*Futurebus
*InfiniBand