
In engineering, redundancy is the intentional duplication of critical components or functions of a system with the goal of increasing the reliability of the system, usually in the form of a backup or fail-safe, or to improve actual system performance, such as in the case of GNSS receivers or multi-threaded computer processing.
In many safety-critical systems, such as fly-by-wire and hydraulic systems in aircraft, some parts of the control system may be triplicated, which is formally termed triple modular redundancy (TMR). An error in one component may then be out-voted by the other two. In a triply redundant system, the system has three subcomponents, all three of which must fail before the system fails. Since each one rarely fails, and the subcomponents are expected to fail independently, the probability of all three failing is calculated to be extraordinarily small; it is often outweighed by other risk factors, such as human error. Redundancy may also be known by the terms "majority voting systems" or "voting logic".

Redundancy sometimes produces less, rather than greater, reliability: it creates a more complex system that is prone to various issues, it may lead to human neglect of duty, and it may lead to higher production demands which, by overstressing the system, may make it less safe.
Forms of redundancy
In computer science, there are four major forms of redundancy:
* Hardware redundancy, such as dual modular redundancy and triple modular redundancy
* Information redundancy, such as error detection and correction methods
* Time redundancy, performing the same operation multiple times, such as multiple executions of a program or multiple copies of data transmitted
* Software redundancy, such as N-version programming
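Information redundancy can be illustrated with a repetition code, one of the simplest error-correcting schemes. The sketch below is only an illustration: the function names `encode` and `decode` and the choice of three copies per bit are assumptions, not something taken from a particular standard.

```python
def encode(bits, copies=3):
    """Repeat every bit `copies` times before transmission."""
    return [b for b in bits for _ in range(copies)]


def decode(received, copies=3):
    """Recover each bit by majority vote over its group of copies."""
    decoded = []
    for start in range(0, len(received), copies):
        group = received[start:start + copies]
        # With an odd number of copies a tie is impossible.
        decoded.append(1 if sum(group) * 2 > len(group) else 0)
    return decoded


message = [1, 0, 1, 1]
sent = encode(message)

# Simulate noise: flip one copy in the first group and one in the last.
corrupted = sent.copy()
corrupted[1] ^= 1
corrupted[9] ^= 1

recovered = decode(corrupted)
```

Each group tolerates one flipped copy; two flips in the same group would be decoded incorrectly, which is why practical systems usually rely on stronger codes.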
A modified form of software redundancy, applied to hardware, may be:
* Distinct functional redundancy, such as both mechanical and hydraulic braking in a car. Applied to software, this means code that is written independently and is distinctly different but produces the same results for the same inputs.
Structures are usually designed with redundant parts as well, ensuring that if one part fails, the entire structure will not collapse. A structure without redundancy is called fracture-critical, meaning that a single broken component can cause the collapse of the entire structure. Bridges that failed due to lack of redundancy include the Silver Bridge and the Interstate 5 bridge over the Skagit River.
Parallel and combined systems demonstrate different levels of redundancy. The models are the subject of studies in reliability and safety engineering.
Dissimilar redundancy
Unlike traditional redundancy, which uses more than one of the same thing, dissimilar redundancy uses different things. The idea is that the different things are unlikely to contain identical flaws. The voting method may involve additional complexity if the two things take different amounts of time. Dissimilar redundancy is often used with software, because identical software contains identical flaws.
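As a minimal sketch of dissimilar redundancy in software, the example below computes an integer square root through two unrelated code paths and compares the results. All function names are hypothetical, and the choice of problem is only for illustration.

```python
import math


def isqrt_library(n: int) -> int:
    """Channel A: the standard library implementation."""
    return math.isqrt(n)


def isqrt_bisect(n: int) -> int:
    """Channel B: an independently written binary search."""
    lo, hi = 0, n + 1  # invariant: lo*lo <= n < hi*hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo


def checked_isqrt(n: int) -> int:
    """Run both channels and refuse to answer if they disagree."""
    a, b = isqrt_library(n), isqrt_bisect(n)
    if a != b:
        raise RuntimeError(f"channels disagree: {a} != {b}")
    return a


result = checked_isqrt(10)
```

With only two channels a disagreement can be detected but not out-voted; a third dissimilar channel would be needed to keep operating through a single fault.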
The chance of failure is reduced by using at least two different types of each of the following:
* processors,
* operating systems,
* software,
* sensors,
* actuators (electric, hydraulic, pneumatic, manual mechanical, etc.),
* communications protocols,
* communications hardware,
* communications networks,
* communications paths
Geographic redundancy
Geographic redundancy corrects the vulnerabilities of co-located redundant devices by geographically separating the backup devices. It reduces the likelihood that events such as power outages, floods, HVAC failures, lightning strikes, tornadoes, building fires, wildfires, and mass shootings would disable the system.
Geographic redundancy locations can be
* more than continental,
* more than 62 miles apart and less than apart (H. M. Brotherton and J. Eric Dietz, Data Center Site Redundancy, Computer Information Technology, Purdue University),
* less than 62 miles apart, but not on the same campus, or
* different buildings that are more than apart on the same campus.
The following methods can reduce the risk of damage from a fire conflagration:
* large buildings at least apart (National Research Council, Canada, Division of Building Research, Spatial Separation of Buildings, November 1959)
* high-rise buildings at least apart
* open spaces clear of flammable vegetation within on each side of objects (Howard E. Moore, Protecting Residences From Wildfires, General Technical Report PSW-50, page 30, item 10)
* different wings on the same building, in rooms that are separated by more than
* different floors on the same wing of a building in rooms that are horizontally offset by a minimum of with fire walls between the rooms that are on different floors
* two rooms separated by another room, leaving at least a 70-foot gap between the two rooms
* there should be a minimum of two separated fire walls and on opposite sides of a corridor
The Distant Early Warning Line was an example of geographic redundancy. Those radar sites were a minimum of apart, but provided overlapping coverage.
Functions of redundancy
The two functions of redundancy are passive redundancy and active redundancy. Both use extra capacity to prevent performance decline from exceeding specification limits without human intervention.
Passive redundancy uses excess capacity to reduce the impact of component failures. One common form of passive redundancy is the extra strength of cabling and struts used in bridges. This extra strength allows some structural components to fail without bridge collapse. The extra strength used in the design is called the margin of safety.
Eyes and ears provide working examples of passive redundancy. Vision loss in one eye does not cause blindness, but depth perception is impaired. Hearing loss in one ear does not cause deafness, but directionality is lost. Performance decline is commonly associated with passive redundancy when a limited number of failures occur.
Active redundancy eliminates performance declines by monitoring the performance of individual devices, and this monitoring is used in voting logic. The voting logic is linked to switching that automatically reconfigures the components. Error detection and correction and the Global Positioning System (GPS) are two examples of active redundancy.
Electrical power distribution provides an example of active redundancy. Several power lines connect each generation facility with customers. Each power line includes monitors that detect overload. Each power line also includes circuit breakers. The combination of power lines provides excess capacity. Circuit breakers disconnect a power line when monitors detect an overload. Power is redistributed across the remaining lines. At the Toronto Airport, there are 4 redundant electrical lines. Each of the 4 lines supplies enough power for the entire airport. A spot network substation uses reverse current relays to open breakers to lines that fail, but lets power continue to flow to the airport.
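The redistribution step can be sketched as a small simulation. This is a deliberately simplified model: the function name `redistribute`, the equal sharing of load between lines, and the megawatt figures are all assumptions made for illustration, and real power flow depends on line impedances rather than equal shares.

```python
def redistribute(load_mw, capacities_mw, failed):
    """Share `load_mw` equally over the lines that are not in `failed`.

    Any line whose share exceeds its capacity is tripped, as a circuit
    breaker would do, and the load is shared again over what remains.
    Returns {line index: carried load}; an empty dict means a blackout.
    """
    active = [i for i in range(len(capacities_mw)) if i not in failed]
    while active:
        share = load_mw / len(active)
        overloaded = [i for i in active if share > capacities_mw[i]]
        if not overloaded:
            return {i: share for i in active}
        active = [i for i in active if i not in overloaded]
    return {}


# Hypothetical figures: four lines, each able to carry the whole 90 MW load.
capacities = [100.0, 100.0, 100.0, 100.0]
normal = redistribute(90.0, capacities, failed=set())
degraded = redistribute(90.0, capacities, failed={0, 1, 2})
```

In this model the load is still served with three of the four lines out of service, because each line alone has enough capacity. If the lines were undersized, tripping one could overload the rest in a cascade, which the loop also captures.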
Electrical power systems use power scheduling to reconfigure active redundancy. Computing systems adjust the production output of each generating facility when other generating facilities are suddenly lost. This prevents blackout conditions during major events such as an earthquake.
Fire alarms, burglar alarms, telephone central office exchanges, and other similar critical systems operate on DC power.
Disadvantages
Charles Perrow, author of Normal Accidents, has said that sometimes redundancies backfire and produce less, not more, reliability. This may happen in three ways. First, redundant safety devices result in a more complex system, more prone to errors and accidents. Second, redundancy may lead to shirking of responsibility among workers. Third, redundancy may lead to increased production pressures, resulting in a system that operates at higher speeds, but less safely.
Voting logic
Voting logic uses performance monitoring to determine how to reconfigure individual components so that operation continues without violating specification limitations of the overall system. Voting logic often involves computers, but systems composed of items other than computers may be reconfigured using voting logic. Circuit breakers are an example of a form of non-computer voting logic.
The simplest voting logic in computing systems involves two components: primary and alternate. They both run similar software, but the output from the alternate remains inactive during normal operation. The primary monitors itself and periodically sends an activity message to the alternate as long as everything is OK. All outputs from the primary stop, including the activity message, when the primary detects a fault. The alternate activates its output and takes over from the primary after a brief delay when the activity message ceases. Errors in voting logic can cause both outputs to be active or inactive at the same time, or cause outputs to flutter on and off.
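The primary/alternate arrangement can be sketched with a simulated clock. The class names, the tick-based timing, and the three-tick timeout below are assumptions chosen for illustration; a real system would exchange activity messages over a link and use wall-clock timers.

```python
class Primary:
    def __init__(self):
        self.healthy = True

    def tick(self):
        """Send an activity message while healthy; go silent on a fault."""
        return "alive" if self.healthy else None


class Alternate:
    def __init__(self, timeout_ticks=3):
        self.timeout_ticks = timeout_ticks
        self.missed = 0
        self.active = False

    def tick(self, heartbeat):
        """Count missed activity messages and take over after the timeout."""
        if heartbeat is not None:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.timeout_ticks:
                self.active = True  # latched: the alternate stays in control
        return self.active


primary, alternate = Primary(), Alternate(timeout_ticks=3)
history = []
for t in range(8):
    if t == 2:
        primary.healthy = False  # simulated fault detected at tick 2
    history.append(alternate.tick(primary.tick()))
```

The timeout is the "brief delay" in the description: too short and a single lost message triggers a needless takeover, too long and the system runs without any active output.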
A more reliable form of voting logic involves an odd number of devices, three or more. All perform identical functions, and the outputs are compared by the voting logic. The voting logic establishes a majority when there is a disagreement, and the majority acts to deactivate the output from any device that disagrees. A single fault will not interrupt normal operation. This technique is used with avionics systems, such as those responsible for operation of the Space Shuttle.
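A majority voter of this kind can be sketched in a few lines. The function name `majority_vote` and its return shape are hypothetical, and the sketch assumes channel outputs can be compared exactly; real avionics voters typically compare within a tolerance.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (majority value, indices of channels that disagree with it)."""
    if len(outputs) < 3 or len(outputs) % 2 == 0:
        raise ValueError("need an odd number of channels, three or more")
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority among channels")
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters


# Channel 2 has a fault and is out-voted by channels 0 and 1.
value, dissenters = majority_vote([42, 42, 41])
```

The returned list of dissenting channels is what a supervising layer would use to deactivate the faulty outputs.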
Calculating the probability of system failure
Each duplicate component added to the system decreases the probability of system failure according to the formula:
$$p = \prod_{i=1}^{n} p_i$$
where:
* $n$ is the number of components
* $p_i$ is the probability of component $i$ failing
* $p$ is the probability of all components failing (system failure)
This formula assumes independence of failure events. That means that the probability of a component B failing given that a component A has already failed is the same as that of B failing when A has not failed. There are situations where this is unreasonable, such as using two power supplies connected to the same socket in such a way that if one power supply failed, the other would too.
It also assumes that only one component is needed to keep the system running.
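A short numerical sketch makes both assumptions concrete. The first function multiplies the individual failure probabilities, as the formula does. The second is an extension not stated in the text: a standard binomial calculation for the case where at least `m` of `n` identical, independent components must keep working. The function names and the failure probability of 0.001 are illustrative assumptions.

```python
import math


def all_fail_probability(probabilities):
    """Probability that every component fails, assuming independence."""
    return math.prod(probabilities)


def fail_if_fewer_than_m_work(n, m, p):
    """Failure probability when at least m of n components must work.

    Assumes identical, independent components that each fail with
    probability p. The system fails when n - m + 1 or more of them fail.
    """
    return sum(
        math.comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(n - m + 1, n + 1)
    )


p_component = 1e-3  # hypothetical per-component failure probability

# One working component is enough: all three must fail (about 1e-9).
p_one_needed = all_fail_probability([p_component] * 3)

# A 2-of-3 majority is required: two failures already defeat the system.
p_two_needed = fail_if_fewer_than_m_work(3, 2, p_component)
```

Under these assumed numbers, requiring a two-out-of-three majority gives a failure probability of roughly 3e-6, about three thousand times higher than the one-survivor case, which shows how much the result depends on the assumption that a single component suffices.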
External links
Secure Propulsion using Advanced Redundant Control
Using powerline as a redundant communication channel