
A supercomputer operating system is an operating system intended for supercomputers. Since the end of the 20th century, supercomputer operating systems have undergone major transformations, as fundamental changes have occurred in supercomputer architecture. While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been moving away from in-house operating systems and toward some form of Linux, which as of November 2017 ran on all of the supercomputers on the TOP500 list. In 2021, the top 10 computers ran, for instance, Red Hat Enterprise Linux (RHEL) or some variant of it, or another Linux distribution such as Ubuntu.

Given that modern massively parallel supercomputers typically separate computations from other services by using multiple types of nodes, they usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as Compute Node Kernel (CNK) or Compute Node Linux (CNL) on compute nodes, but a larger system such as a Linux derivative on server and input/output (I/O) nodes.''An Evaluation of the Oak Ridge National Laboratory Cray XT3'' by Sadaf R. Alam et al., International Journal of High Performance Computing Applications, February 2008, vol. 22, no. 1, pp. 52–80.

While in a traditional multi-user computer system job scheduling is in effect a tasking problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources, as well as gracefully deal with the inevitable hardware failures when tens of thousands of processors are present.''Open Job Management Architecture for the Blue Gene/L Supercomputer'' by Yariv Aridor et al. in ''Job scheduling strategies for parallel processing'' by Dror G. Feitelson 2005, pp. 95–101. Although most modern supercomputers use the Linux operating system, each manufacturer has made its own specific changes to the Linux derivative it uses, and no industry standard exists, partly because the differences in hardware architectures require changes to optimize the operating system to each hardware design.


Context and overview

In the early days of supercomputing, the basic architectural concepts were evolving rapidly, and system software had to follow hardware innovations that usually took rapid turns. In the early systems, operating systems were custom tailored to each supercomputer to gain speed, yet in the rush to develop them, serious software quality challenges surfaced and in many cases the cost and complexity of system software development became as much an issue as that of hardware. In the 1980s the cost for software development at Cray came to equal what they spent on hardware, and that trend was partly responsible for a move away from in-house operating systems to the adaptation of generic software.''Knowing machines: essays on technical change'' by Donald MacKenzie 1998, pp. 149–151.

The first wave in operating system changes came in the mid-1980s, as vendor-specific operating systems were abandoned in favor of Unix. Despite early skepticism, this transition proved successful. By the early 1990s, major changes were occurring in supercomputing system software.''Encyclopedia of Parallel Computing'' by David Padua 2011, pp. 426–429. By this time, the growing use of Unix had begun to change the way system software was viewed. The use of a high-level language (C) to implement the operating system, and the reliance on standardized interfaces, was in contrast to the assembly-language-oriented approaches of the past. As hardware vendors adapted Unix to their systems, new and useful features were added to Unix, e.g., fast file systems and tunable process schedulers. However, all the companies that adapted Unix made unique changes to it, rather than collaborating on an industry standard to create "Unix for supercomputers". This was partly because differences in their architectures required these changes to optimize Unix to each architecture.

Thus, as general-purpose operating systems became stable, supercomputers began to borrow and adapt the critical system code from them, and relied on the rich set of secondary functions that came with them, without having to reinvent the wheel. However, at the same time the size of the code for general-purpose operating systems was growing rapidly. By the time Unix-based code had reached 500,000 lines, its maintenance and use was a challenge. This resulted in the move to microkernels, which used a minimal set of operating system functions. Systems such as Mach at Carnegie Mellon University and ChorusOS at INRIA were examples of early microkernels.

The separation of the operating system into separate components became necessary as supercomputers developed different types of nodes, e.g., compute nodes versus I/O nodes. Thus modern supercomputers usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as CNK or CNL on compute nodes, but a larger system such as a Linux derivative on server and I/O nodes.


Early systems

The CDC 6600, generally considered the first supercomputer in the world, ran the Chippewa Operating System, which was then deployed on various other CDC 6000 series computers.''The computer revolution in Canada'' by John N. Vardalas 2001, p. 258. The Chippewa was a rather simple job-control-oriented system derived from the earlier CDC 3000, but it influenced the later KRONOS and SCOPE systems.

The first Cray-1 was delivered to the Los Alamos Lab with no operating system, or any other software.''Targeting the computer: government support and international competition'' by Kenneth Flamm 1987, pp. 81–83. Los Alamos developed both the application software for it and the operating system. The main timesharing system for the Cray-1, the Cray Time Sharing System (CTSS), was then developed at the Livermore Labs as a direct descendant of the Livermore Time Sharing System (LTSS) for the CDC 6600 operating system from twenty years earlier.

In developing supercomputers, rising software costs soon became dominant, as evidenced by the 1980s cost for software development at Cray growing to equal their cost for hardware. That trend was partly responsible for a move away from the in-house Cray Operating System to the Unix-based UNICOS system. In 1985, the Cray-2 was the first system to ship with the UNICOS operating system.Lester T. Davis, ''The balance of power, a brief history of Cray Research hardware architectures'' in ''High performance computing: technology, methods, and applications'' by J. J. Dongarra 1995, p. 12.

Around the same time, the EOS operating system was developed by ETA Systems for use in their ETA10 supercomputers.Lloyd M. Thorndyke, ''The Demise of the ETA Systems'' in ''Frontiers of Supercomputing II'' by Karyn R. Ames, Alan Brenner 1994, pp. 489–497. Written in Cybil, a Pascal-like language from Control Data Corporation, EOS highlighted the problems of developing a stable operating system for supercomputers, and eventually a Unix-like system was offered on the same machine. The lessons learned from developing ETA system software included the high level of risk associated with developing a new supercomputer operating system, and the advantages of using Unix with its large extant base of system software libraries.

By the mid-1990s, despite the extant investment in older operating systems, the trend was toward the use of Unix-based systems, which also facilitated the use of interactive graphical user interfaces (GUIs) for scientific computing across multiple platforms. The move toward a ''commodity OS'' had opponents, who cited the fast pace and focus of Linux development as a major obstacle against adoption; as one author wrote, "Linux will likely catch up, but we have large-scale systems now". Nevertheless, that trend continued to gain momentum, and by 2005 virtually all supercomputers used some Unix-like OS.''Getting up to speed: the future of supercomputing'' by Susan L. Graham, Marc Snir, Cynthia A. Patterson, National Research Council 2005, p. 136. These variants of Unix included IBM AIX, the open source Linux system, and other adaptations such as UNICOS from Cray. By the end of the 20th century, Linux was estimated to command the highest share of the supercomputing pie.


Modern approaches

The IBM Blue Gene supercomputer uses the CNK operating system on the compute nodes, but uses a modified Linux-based kernel called the I/O Node Kernel (INK) on the I/O nodes.''Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference'' 2004, by Marco Danelutto, Marco Vanneschi and Domenico Laforenza, p. 835.''Euro-Par 2006 Parallel Processing: 12th International Euro-Par Conference'' 2006, by Wolfgang E. Nagel, Wolfgang V. Walter and Wolfgang Lehner. CNK is a lightweight kernel that runs on each node and supports a single application running for a single user on that node. For the sake of efficient operation, the design of CNK was kept simple and minimal, with physical memory being statically mapped and CNK neither needing nor providing scheduling or context switching. CNK does not even implement file I/O on the compute node, but delegates that to dedicated I/O nodes. However, given that on the Blue Gene multiple compute nodes share a single I/O node, the I/O node operating system does require multi-tasking, hence the selection of the Linux-based operating system.
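
This delegation of file I/O is often described as "function shipping": the compute node packages each I/O request and ships it to a daemon on its I/O node, which performs the real operation and returns the result. The following is a minimal Python sketch of that pattern; the wire format, names, and socket transport here are invented for illustration and are not IBM's actual protocol or code.

    import pickle
    import socket
    import threading

    def ionode_daemon(srv):
        """Runs on the I/O node: performs each shipped file operation
        on behalf of a compute node and returns the result."""
        while True:
            conn, _ = srv.accept()
            with conn:
                op, args = pickle.loads(conn.recv(65536))
                if op == "write":
                    path, data = args
                    with open(path, "a") as f:
                        result = f.write(data)
                elif op == "read":
                    (path,) = args
                    with open(path) as f:
                        result = f.read()
                conn.sendall(pickle.dumps(result))

    def compute_node_io(addr, op, *args):
        """Stub linked into the compute-node application: looks like a
        local call, but executes remotely on the I/O node."""
        with socket.create_connection(addr) as conn:
            conn.sendall(pickle.dumps((op, args)))
            conn.shutdown(socket.SHUT_WR)   # request fully sent
            buf = b""
            while chunk := conn.recv(4096):
                buf += chunk
            return pickle.loads(buf)

    if __name__ == "__main__":
        srv = socket.socket()
        srv.bind(("127.0.0.1", 0))          # stand-in for an I/O node address
        srv.listen()
        threading.Thread(target=ionode_daemon, args=(srv,), daemon=True).start()
        addr = srv.getsockname()
        compute_node_io(addr, "write", "/tmp/cnk_demo.txt", "shipped from a compute node\n")
        print(compute_node_io(addr, "read", "/tmp/cnk_demo.txt"))

Because the compute-node side needs only a thin forwarding stub like this, the kernel there can remain entirely free of file system code.
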
While in traditional multi-user computer systems and early supercomputers job scheduling was in effect a task scheduling problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources. Tuning task scheduling, and the operating system itself, for each configuration of a supercomputer is essential. A typical parallel job scheduler has a master scheduler which instructs some number of slave schedulers to launch, monitor, and control parallel jobs, and periodically receives reports from them about the status of job progress.
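
A toy sketch of that master/slave structure follows, with a master dispatching jobs to per-node slave schedulers that report status back through a shared queue. All names are invented for illustration; real schedulers such as PBS Pro or Slurm are far more elaborate.

    import queue
    import threading
    import time

    report_q = queue.Queue()      # slaves -> master: status reports

    def slave_scheduler(node_id, work_q):
        """Runs per node: launches assigned jobs and reports progress."""
        while True:
            job = work_q.get()
            if job is None:       # shutdown signal from the master
                return
            report_q.put((node_id, job, "started"))
            time.sleep(0.01)      # stand-in for the job's execution
            report_q.put((node_id, job, "finished"))

    def master_scheduler(jobs, n_nodes):
        """Dispatches jobs to slave schedulers and collects their reports."""
        work_qs = [queue.Queue() for _ in range(n_nodes)]
        slaves = [threading.Thread(target=slave_scheduler, args=(i, q))
                  for i, q in enumerate(work_qs)]
        for t in slaves:
            t.start()
        for i, job in enumerate(jobs):       # naive round-robin placement
            work_qs[i % n_nodes].put(job)
        for q in work_qs:                    # tell every slave to shut down
            q.put(None)
        for t in slaves:
            t.join()
        while not report_q.empty():
            print(report_q.get())

    master_scheduler(jobs=[f"job-{i}" for i in range(6)], n_nodes=3)
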
Some, but not all, supercomputer schedulers attempt to maintain locality of job execution. The PBS Pro scheduler used on the Cray XT3 and Cray XT4 systems does not attempt to optimize locality on their three-dimensional torus interconnect, but simply uses the first available processor. On the other hand, IBM's scheduler on the Blue Gene supercomputers aims to exploit locality and minimize network contention by assigning tasks from the same application to one or more midplanes of an 8x8x8 node group.''Job Scheduling Strategies for Parallel Processing'' by Eitan Frachtenberg and Uwe Schwiegelshohn 2010, pp. 138–144. The Slurm Workload Manager scheduler uses a best-fit algorithm and performs Hilbert curve scheduling to optimize locality of task assignments. Several modern supercomputers, such as the Tianhe-2, use Slurm, which arbitrates contention for resources across the system. Slurm is open source, Linux-based, very scalable, and can manage thousands of nodes in a computer cluster with a sustained throughput of over 100,000 jobs per hour.Jette, M. and M. Grondona, ''SLURM: Simple Linux Utility for Resource Management'', in the Proceedings of ClusterWorld Conference, San Jose, California, June 2003.
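
The sketch below illustrates the idea behind Hilbert curve scheduling, simplified to a 2D grid with invented function names (Slurm's actual implementation works on the machine's real topology): nodes are ordered along a space-filling curve, so that runs that are contiguous in curve order are also physically close together, and a best-fit search then picks the smallest free run that satisfies a request.

    def hilbert_index(n, x, y):
        """Distance of grid point (x, y) along the Hilbert curve filling
        an n-by-n grid, n a power of two (standard iterative conversion)."""
        d = 0
        s = n // 2
        while s > 0:
            rx = 1 if x & s else 0
            ry = 1 if y & s else 0
            d += s * s * ((3 * rx) ^ ry)
            if ry == 0:               # rotate the quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            s //= 2
        return d

    def best_fit_nodes(free_coords, want, n=8):
        """Best fit over curve order: return the smallest contiguous run
        of free nodes (in Hilbert order) with at least `want` nodes."""
        order = sorted(free_coords, key=lambda p: hilbert_index(n, *p))
        idx = [hilbert_index(n, *p) for p in order]
        runs, start = [], 0
        for i in range(1, len(idx) + 1):   # split where curve order breaks
            if i == len(idx) or idx[i] != idx[i - 1] + 1:
                runs.append(order[start:i])
                start = i
        fitting = [r for r in runs if len(r) >= want]
        return min(fitting, key=len)[:want] if fitting else None

    free = [(x, y) for x in range(4) for y in range(4)]  # 16 idle nodes
    print(best_fit_nodes(free, want=5))

Allocating along the curve keeps most communicating neighbors a short distance apart without an exhaustive search over all possible placements.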



See also

* Distributed operating system
* Supercomputer architecture
* Usage share of supercomputer operating systems

