Resilience Engineering

Resilience engineering is a subfield of safety science research that focuses on understanding how complex adaptive systems cope when encountering a surprise. The term ''resilience'' in this context refers to the capabilities that a system must possess in order to deal effectively with unanticipated events. Resilience engineering examines how systems build, sustain, degrade, and lose these capabilities.

Resilience engineering researchers have studied multiple safety-critical domains, including aviation, anesthesia, fire safety, space mission control, military operations, power plants, air traffic control, rail engineering, health care, and emergency response to both natural and industrial disasters. Resilience engineering researchers have also studied the non-safety-critical domain of software operations.

Whereas other approaches to safety (e.g., behavior-based safety, probabilistic risk assessment) focus on designing controls to prevent or mitigate specific known hazards (e.g., hazard analysis), or on assuring that a particular system is safe (e.g., safety cases), resilience engineering looks at a more general capability of systems to deal with hazards that were not known before they were encountered. In particular, resilience engineering researchers study how people are able to cope effectively with complexity to ensure safe system operation, especially when they are experiencing time pressure.

Under the resilience engineering paradigm, accidents are not attributable to human error. Instead, the assumption is that humans working in a system are always faced with goal conflicts and limited resources, requiring them to constantly make trade-offs while under time pressure. When failures happen, they are understood as being due to the system temporarily being unable to cope with complexity. Hence, resilience engineering is related to other perspectives in safety that have reassessed the nature of human error, such as the "new look", the "new view", "safety differently", and Safety-II.

Resilience engineering researchers ask questions such as:
* What can organizations do in order to be better prepared to handle unforeseeable challenges?
* How do organizations adapt their structure and behavior to cope effectively when faced with an unforeseen challenge?
Because incidents often involve unforeseen challenges, resilience engineering researchers often use incident analysis as a research method.


Resilience engineering symposia

The first symposium on resilience engineering was held in October 2004 in Söderköping, Sweden. It brought together fourteen safety science researchers with an interest in complex systems. A second symposium on resilience engineering was held in November 2006 in Sophia Antipolis, France. The symposium had eighty participants. The Resilience Engineering Association, an association of researchers and practitioners with an interest in resilience engineering, continues to hold bi-annual symposia. These symposia led to a series of books being published (see Books section below).


Themes

This section discusses aspects of the resilience engineering perspective that are different from traditional approaches to safety.


Normal work leads to both success and failure

The resilience engineering perspective assumes that the work people do within a system that contributes to an accident is fundamentally the same as the work that contributes to successful outcomes. As a consequence, if work practices are examined only after an accident and are interpreted only in the context of that accident, the result of the analysis is subject to selection bias.
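The selection-bias argument above can be made concrete with a small sketch. Everything in it is an assumption for illustration: the `WorkEpisode` type, the "workaround" practice, and the episode counts are invented and do not come from any study. The sketch only shows why a practice that appears in most accidents may appear just as often in successful work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkEpisode:
    """One hypothetical episode of everyday work."""
    used_workaround: bool      # illustrative work practice, not from the source
    ended_in_accident: bool


def workaround_rate(episodes):
    """Return the fraction of the given episodes in which the workaround was used."""
    episodes = list(episodes)
    if not episodes:
        raise ValueError("no episodes to analyse")
    return sum(e.used_workaround for e in episodes) / len(episodes)


def build_illustrative_log():
    """Build a made-up log of 1000 episodes; all counts are assumptions."""
    return (
        [WorkEpisode(True, False)] * 792     # workaround, successful outcome
        + [WorkEpisode(False, False)] * 198  # no workaround, successful outcome
        + [WorkEpisode(True, True)] * 8      # workaround, accident
        + [WorkEpisode(False, True)] * 2     # no workaround, accident
    )


if __name__ == "__main__":
    log = build_illustrative_log()
    accidents = [e for e in log if e.ended_in_accident]
    successes = [e for e in log if not e.ended_in_accident]
    # Looking only at accidents hides the comparison with normal work.
    print(f"workaround rate in accidents: {workaround_rate(accidents):.2f}")
    print(f"workaround rate in successes: {workaround_rate(successes):.2f}")
```

In this invented log the workaround appears in the same share of accidents and of successful episodes, so an analysis restricted to the accidents would overstate its role.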


Fundamental surprise

The resilience engineering perspective posits that a significant number of failure modes are literally inconceivable before they happen, because the environments that systems operate in are highly dynamic and the perspectives of the people within the system are always inherently limited. These sorts of events are sometimes referred to as fundamental surprise. Contrast this with the approach of probabilistic risk assessment, which focuses on evaluating conceivable risks.


Human performance variability as an asset

The resilience engineering perspective holds that human performance variability has positive effects as well as negative ones, and that safety is increased by amplifying the positive effects of human variability as well as by adding controls to mitigate the negative effects. For example, the ability of humans to adapt their behavior to novel circumstances is a positive effect that ''creates'' safety. As a consequence, adding controls to mitigate the effects of human variability can reduce safety in certain circumstances.


The centrality of expertise and experience

Expert operators are an important source of resilience within systems. These operators become experts through previous experience in dealing with failures.


Risk is unavoidable

Under the resilience engineering perspective, operators are always required to trade off risks. As a consequence, in order to create safety, it is sometimes necessary for a system to take on some risk.


Bringing existing resilience to bear vs generating new resilience

The researcher Richard Cook distinguishes two separate kinds of work that tend to be conflated under the heading ''resilience engineering'':


Bringing existing resilience to bear

The first type of resilience engineering work is determining how best to take advantage of the resilience that is already present in the system. Cook uses the example of setting a broken bone as this type of work: the resilience is already present in the physiology of bone, and setting the bone uses this resilience to achieve better healing outcomes. Cook notes that this first type of resilience work does not require a deep understanding of the underlying mechanisms of resilience: humans were setting bones long before the mechanism by which bone heals was understood.


Generating new resilience

The second type of resilience engineering work involves altering mechanisms in the system in order to increase the amount of resilience. Cook uses the example of new drugs such as Abaloparatide and Teriparatide, which mimic parathyroid hormone-related protein and are used to treat osteoporosis. Cook notes that this second type of resilience work requires a much deeper understanding of the underlying existing resilience mechanisms in order to create interventions that can effectively increase resilience.


Hollnagel perspective

The safety researcher Erik Hollnagel views resilient performance as requiring four systemic potentials:
# The potential to respond
# The potential to monitor
# The potential to learn
# The potential to anticipate
This has been described in a white paper from Eurocontrol on Systemic Potentials Management (https://skybrary.aero/bookshelf/systemic-potentials-management-building-basis-resilient-performance).


Woods perspective

The safety researcher David Woods considers the following two concepts in his definition of resilience:
* ''graceful extensibility:'' the ability of a system to develop new capabilities when faced with a surprise that cannot be dealt with effectively using the system's existing capabilities
* ''sustained adaptability:'' the ability of a system to keep adapting to surprises over long periods of time
These two concepts are elaborated in Woods's theory of graceful extensibility. Woods contrasts resilience with ''robustness'', which is the ability of a system to deal effectively with potential challenges that were anticipated in advance.

The safety researcher Richard Cook argued that bone should serve as the ''archetype'' for understanding what resilience is in the Woods perspective. Cook notes that bone has both ''graceful extensibility'' (it has a soft boundary at which it can extend function) and ''sustained adaptability'' (bone is constantly adapting through a dynamic balance between creation and destruction that is directed by mechanical strain).

In Woods's view, there are three common patterns to the failure of complex adaptive systems:
# decompensation: exhaustion of capacity when encountering a disturbance
# working at cross purposes: when individual agents in a system behave in a way that achieves local goals but goes against global goals
# getting stuck in outdated behaviors: relying on strategies that were previously adaptive but are no longer so due to changes in the environment
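The first of these patterns, decompensation, can be illustrated with a deliberately simple toy model. This is a sketch under stated assumptions and not Woods's own formalization: the function `steps_until_decompensation`, the single numeric "capacity", and the fixed recovery rate are all hypothetical simplifications chosen to show capacity being exhausted when demands arrive faster than the system can recover.

```python
def steps_until_decompensation(capacity, recovery_per_step, demands):
    """Return the index of the step at which capacity is exhausted, or None.

    Toy model (an assumption, not taken from the source): each demand draws
    on a shared reserve, and the reserve recovers by a fixed amount after
    each step, never exceeding its original capacity.
    """
    remaining = capacity
    for step, demand in enumerate(demands):
        remaining -= demand
        if remaining < 0:
            # The disturbance has outpaced the system's ability to compensate.
            return step
        remaining = min(capacity, remaining + recovery_per_step)
    return None


if __name__ == "__main__":
    # Demands that match the recovery rate can be absorbed indefinitely.
    print(steps_until_decompensation(10, 2, [2] * 20))
    # Slightly larger demands drain the reserve a little more on every step.
    print(steps_until_decompensation(10, 2, [3] * 20))
```

The point of the sketch is that the system looks healthy for several steps while its reserve is quietly shrinking, which is why decompensation can appear sudden to observers.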


Resilient health care

In 2012, the growing interest in resilience engineering gave rise to the sub-field of Resilient Health Care. This led to a series of annual conferences on the topic, which are still ongoing, as well as a series of books on Resilient Health Care, and in 2022 to the establishment of the Resilient Health Care Society, registered in Sweden (https://rhcs.se/).


Books

* ''Resilience Engineering: Concepts and Precepts'' by David Woods, Erik Hollnagel, and Nancy Leveson, 2006.
* ''Resilience Engineering in Practice: A Guidebook'' by Jean Pariès, John Wreathall, and Erik Hollnagel, 2013.
* ''Resilient Health Care, Volume 1'' by Erik Hollnagel, Jeffrey Braithwaite, and Robert L. Wears (eds.), 2015.
* ''Resilient Health Care, Volume 2: The Resilience of Everyday Clinical Work'' by Erik Hollnagel, Jeffrey Braithwaite, and Robert Wears (eds.), 2015.
* ''Resilient Health Care, Volume 3: Reconciling Work-as-Imagined and Work-as-Done'' by Jeffrey Braithwaite, Robert Wears, and Erik Hollnagel (eds.), 2016.
* ''Resilience Engineering Perspectives, Volume 1: Remaining Sensitive to the Possibility of Failure'' by Erik Hollnagel, Christopher Nemeth, and Sidney Dekker (eds.), 2016.
* ''Resilience Engineering Perspectives, Volume 2: Preparation and Restoration'' by Christopher Nemeth, Erik Hollnagel, and Sidney Dekker (eds.), 2016.
* ''Governance and Control of Financial Systems: A Resilience Engineering Perspective'' by Gunilla Sundström and Erik Hollnagel, 2018.

