Continuous availability is an approach to computer system and application design that protects users against downtime, whatever the cause, and ensures that users remain connected to their documents, data files and business applications. Continuous availability describes the information technology methods used to ensure business continuity.
In the early days of computing, availability was not considered business critical. With the increasing use of mobile computing, global access to online business transactions and business-to-business communication, continuous availability has become increasingly important because of the need to support customer access to information systems.
Solutions for continuous availability exist in different forms and implementations depending on the software and hardware manufacturer. The goal of the discipline is to reduce user or business application downtime, which can have a severe impact on business operations. Such downtime can lead to loss of productivity, loss of revenue and customer dissatisfaction, and can ultimately damage a company's reputation.
Degrees of availability
The terms high availability, continuous operation, and continuous availability are generally used to express how available a system is. The following is a definition of each of these terms.
High availability refers to the ability to avoid unplanned outages by eliminating single points of failure. This is a measure of the reliability of the hardware, operating system, middleware, and database manager software. Another measure of high availability is the ability to minimize the effect of an unplanned outage by masking the outage from the end users. This can be accomplished by providing redundancy or quickly restarting failed components.
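Masking an outage through redundancy can be illustrated with a minimal sketch. It assumes each redundant component is a callable that raises `ConnectionError` when it is down; the names `call_with_failover`, `failed_primary` and `healthy_standby` are hypothetical and chosen only for this illustration, not taken from any particular product.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no redundant component could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant replica in order and return the first successful answer."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            # The outage of this replica is hidden from the caller;
            # the request simply moves on to the next replica.
            failures += 1
    raise AllReplicasFailed(f"all {failures} replicas failed")


def failed_primary(request: str) -> str:
    # Hypothetical component that is currently down.
    raise ConnectionError("primary is down")


def healthy_standby(request: str) -> str:
    # Hypothetical redundant component that is still available.
    return f"handled: {request}"


print(call_with_failover([failed_primary, healthy_standby], "read order 42"))
```

The caller receives an answer even though the first component failed, which is the sense in which the outage is masked. A real system would also need health checks, timeouts and state replication, which this sketch leaves out.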
Availability is usually expressed as a percentage of uptime in a given year. When defining such a percentage, it must be specified whether it applies to the hardware, the IT infrastructure, or the business application on top.
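To make the percentages concrete, the following sketch converts an availability percentage into the downtime it permits per year. It assumes a 365-day year (8,760 hours); the function name `downtime_hours_per_year` is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (an assumption)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the downtime per year, in hours, permitted by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for percent in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(percent)
    print(f"{percent}% uptime permits about {hours:.2f} hours "
          f"({hours * 60:.1f} minutes) of downtime per year")
```

For example, 99.9% availability permits roughly 8.76 hours of downtime per year, while 99.999% permits only about five minutes. The same figure means very different things depending on whether it is measured for the hardware alone or for the business application as users experience it.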
Continuous operation refers to the ability to avoid planned outages.
For continuous operation there must be ways to perform necessary administrative work, like hardware and software maintenance, upgrades, and platform refreshes while the business application remains available to the end users. This is accomplished by providing multiple servers and switching end users to an available server at times when one server is made unavailable.
Note that a system running in continuous operation is not necessarily operating with high availability because an excessive number of unplanned outages could compromise this.
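The idea of switching users to an available server during planned work can be sketched as a rolling maintenance loop. This is a simplified illustration under stated assumptions: servers are identified by name, `maintain` stands in for whatever administrative work is needed, and the function name `rolling_maintenance` and the server names are hypothetical.

```python
from typing import Callable, List


def rolling_maintenance(servers: List[str], maintain: Callable[[str], None]) -> int:
    """Maintain servers one at a time; return the lowest number left in service."""
    if len(servers) < 2:
        raise ValueError("continuous operation needs at least two servers")
    in_service = set(servers)
    lowest = len(in_service)
    for server in servers:
        in_service.discard(server)  # stop routing end users to this server
        lowest = min(lowest, len(in_service))
        maintain(server)            # planned work happens while peers serve users
        in_service.add(server)      # return the server to the pool
    return lowest


maintained: List[str] = []
lowest = rolling_maintenance(["app-1", "app-2", "app-3"], maintained.append)
print(f"maintained {maintained}; at least {lowest} servers stayed in service")
```

Because only one server is taken out at a time, at least one other server always remains available to end users during the planned work. The sketch says nothing about unplanned failures, which is why continuous operation alone does not imply high availability.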
Continuous availability combines the characteristics of high availability and continuous operation to provide the ability to keep the business application running without any noticeable downtime.
Types of outages
Planned outages are deliberate and are scheduled at a convenient time. These involve such activities as:
- Hardware installation or maintenance
- Software maintenance or upgrades of the operating system, the middleware, the database server or the business application
- Database administration such as offline backup, or offline reorganization
Unplanned outages are unexpected outages that are caused by the failure of any system component.
They include hardware failures, software issues, or people and process issues.
History
Various commercially viable examples exist for hardware/software implementations. These include:
- BIND
- BitTorrent
- Ceph
- CockroachDB
- Dovecot
- IBM Parallel Sysplex
- MariaDB Xpand
- Stratus
- Tandem NonStop Computers
- YugabyteDB
See also
- Business continuity planning
- Disaster recovery
- High-availability cluster
- Fault-tolerant system
- Service Availability Forum
External links
- Continuity Central
- Continuous Availability Blog
- Business Continuity for SAP on IBM System z
- TechRepublic: IT should establish realistic availability requirements
- US Patent 5027269, "Method and apparatus for providing continuous availability of applications in a computer network", 1991; IBM