In
distributed computing, a single system image (SSI) cluster is a
cluster
may refer to:
Science and technology Astronomy
* Cluster (spacecraft), constellation of four European Space Agency spacecraft
* Asteroid cluster, a small asteroid family
* Cluster II (spacecraft), a European Space Agency mission to study th ...
of machines that appears to be one single system.
The concept is often considered synonymous with that of a
distributed operating system, but a single image may be presented for more limited purposes, just
job scheduling for instance, which may be achieved by means of an additional layer of software over conventional
operating system images running on each
node.
The interest in SSI clusters is based on the perception that they may be simpler to use and administer than more specialized clusters.
Different SSI systems may provide a more or less complete illusion of a single system.
Features of SSI clustering systems
Different SSI systems may, depending on their intended usage, provide some subset of these features.
Process migration
Many SSI systems provide
process migration.
Processes may start on one
node and be moved to another node, possibly for
resource balancing or administrative reasons.
[for example it may be necessary to move long running processes off a node that is to be closed down for maintenance] As processes are moved from one node to another, other associated resources (for example
IPC
IPC may refer to:
Computing
* Infrastructure protection centre or information security operations center
* Instructions per cycle or instructions per clock, an aspect of central-processing performance
* Inter-process communication, the sharin ...
resources) may be moved with them.
Process checkpointing
Some SSI systems allow
checkpointing
Checkpointing is a technique that provides fault tolerance for computing systems. It basically consists of saving a snapshot of the application's state, so that applications can restart from that point in case of failure. This is particularly imp ...
of running processes, allowing their current state to be saved and reloaded at a later date.
[Checkpointing is particularly useful in clusters used for high-performance computing, avoiding lost work in case of a cluster or node restart.]
Checkpointing can be seen as related to migration, as migrating a process from one node to another can be implemented by first checkpointing the process, then restarting it on another node. Alternatively checkpointing can be considered as ''migration to disk''.
Single process space
Some SSI systems provide the illusion that all processes are running on the same machine - the process management tools (e.g. "ps", "kill" on
Unix like systems) operate on all processes in the cluster.
Single root
Most SSI systems provide a single view of the file system. This may be achieved by a simple
NFS server, shared disk devices or even file replication.
The advantage of a single root view is that processes may be run on any available node and access needed files with no special precautions. If the cluster implements process migration a single root view enables direct accesses to the files from the node where the process is currently running.
Some SSI systems provide a way of "breaking the illusion", having some node-specific files even in a single root.
HP TruCluster provides a "context dependent symbolic link" (CDSL) which points to different files depending on the node that accesses it.
HP VMScluster provides a search list logical name with node specific files occluding cluster shared files where necessary. This capability may be necessary to deal with ''heterogeneous'' clusters, where not all nodes have the same configuration. In more complex configurations such as multiple nodes of multiple architectures over multiple sites, several local disks may combine to form the logical single root.
Single I/O space
Some SSI systems allow all nodes to access the I/O devices (e.g. tapes, disks, serial lines and so on) of other nodes. There may be some restrictions on the kinds of accesses allowed (For example,
OpenSSI can't mount disk devices from one node on another node).
Single IPC space
Some SSI systems allow processes on different nodes to communicate using
inter-process communication
In computer science, inter-process communication or interprocess communication (IPC) refers specifically to the mechanisms an operating system provides to allow the processes to manage shared data. Typically, applications can use IPC, categori ...
s mechanisms as if they were running on the same machine. On some SSI systems this can even include
shared memory (can be emulated with
Software Distributed shared memory).
In most cases inter-node IPC will be slower than IPC on the same machine, possibly drastically slower for shared memory. Some SSI clusters include special hardware to reduce this slowdown.
Cluster IP address
Some SSI systems provide a "
cluster IP A cluster IP is a term in cloud computing to refer to a proxy that represents a computer cluster with a single IP address. It is a term used by the cloud computing system Kubernetes
Kubernetes (, commonly stylized as K8s) is an open-source conta ...
address", a single address visible from outside the cluster that can be used to contact the cluster as if it were one machine. This can be used for load balancing inbound calls to the cluster, directing them to lightly loaded nodes, or for redundancy, moving the cluster address from one machine to another as nodes join or leave the cluster.
["leaving a cluster" is often a euphemism for crashing]
Examples
Examples here vary from commercial platforms with scaling capabilities, to packages/frameworks for creating distributed systems, as well as those that actually implement a single system image.
See also
*
Computer clusters
A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.
The comp ...
*
Diskless Shared Root Cluster {{no footnotes, date=August 2016
A diskless shared-root cluster is a way to manage several machines at the same time. Instead of each having its own operating system (OS) on its local disk, there is only one image of the OS available on a server, a ...
*
Distributed lock manager
*
Distributed cache
*
Parallel Virtual Machine - multiple system image alternative
*
Message Passing Interface - multiple system image alternative
Notes
References
{{DEFAULTSORT:Single System Image
Cluster computing