Alarm management

Alarm management is the application of human factors and ergonomics, along with instrumentation engineering and systems thinking, to manage the design of an alarm system to increase its usability. Most often the major usability problem is that too many alarms are annunciated during a plant upset, commonly referred to as an alarm flood (similar to an interrupt storm), by analogy with a flood caused by excessive rainfall input against a basically fixed drainage output capacity. However, an alarm system can also have other problems, such as poorly designed alarms, improperly set alarm points, ineffective annunciation, and unclear alarm messages. Poor alarm management is one of the leading causes of unplanned downtime, contributing to over $20B in lost production every year, and of major industrial incidents such as the one in Texas City. Developing good alarm management practices is not a discrete activity but a continuous process (i.e., it is more of a journey than a destination).


Alarm problem history

From their conception, large chemical, refining, power generation, and other processing plants required a control system to keep the process operating successfully and producing products. Because the control components were fragile compared with the process, these control systems often required a control room to protect them from the elements and process conditions. Early control rooms used what were referred to as "panel boards", which were loaded with control instruments and indicators. These were tied to sensors located in the process streams and on the outside of process equipment. The sensors relayed their information to the control instruments via analogue signals, such as a 4-20 mA current loop carried over twisted pair wiring. At first these systems merely yielded information, and a well-trained operator was required to make adjustments, either by changing flow rates or by altering energy inputs, to keep the process within its designed limits.
Alarms were added to alert the operator to a condition that was about to exceed a design limit, or had already exceeded one. Additionally, Emergency Shut Down (ESD) systems were employed to halt a process that was in danger of exceeding safety, environmental, or monetarily acceptable process limits. Alarms were indicated to the operator by annunciator horns and by lights of different colours (for instance, green meant OK, yellow meant not OK, and red meant bad). Panel boards were usually laid out in a manner that replicated the process flow in the plant, so instrumentation for the operating units within the plant was grouped together for ease of recognition and problem solving. It was a simple matter to look at the entire panel board and discern whether any section of the plant was running poorly. This was due to both the design of the instruments and the implementation of the alarms associated with them.
Instrumentation companies put a lot of effort into the design and individual layout of the instruments they manufactured. To do this they employed behavioural psychology practices, which revealed how much information a human being could collect in a quick glance. More complex plants had more complex panel boards, and therefore often more human operators or controllers. Thus, in the early days of panel board systems, alarms were regulated by both size and cost. In essence, they were limited by the amount of available board space and by the cost of running wiring and hooking up an annunciator (horn), an indicator (light), and the switches used to acknowledge an alarm and to clear a resolved one. It was often the case that if a new alarm was needed, an old one had to be given up.
As technology developed, control systems and control methods were tasked with delivering a higher degree of plant automation with each passing year. Highly complex material processing called for highly complex control methodologies. Global competition also pushed manufacturing operations to increase production while using less energy and producing less waste. In the days of the panel boards, a special kind of engineer was required, one who understood the electronic equipment associated with process measurement and control, the control algorithms necessary to control the process (PID basics), and the actual process used to make the products.
Around the mid-1980s, the industry entered the digital revolution. Distributed control systems (DCS) were a boon to the industry. The engineer could now control the process without having to understand the equipment necessary to perform the control functions. Panel boards were no longer required, because all of the information that once came across analogue instruments could be digitised, stored in a computer, and manipulated to achieve the same control actions once performed with amplifiers and potentiometers.
As a side effect, alarms also became easy and cheap to configure and deploy. An engineer simply typed in a location and a value to alarm on, and set the alarm to active. The unintended result was that soon people alarmed everything. Initial installers set alarms at 80% and 20% of the operating range of any variable simply out of habit. The integration of programmable logic controllers, safety instrumented systems, and packaged equipment controllers has been accompanied by an overwhelming increase in associated alarms.
Another unfortunate consequence of the digital revolution was that what once covered several square yards of panel space now had to fit on a 17-inch computer monitor. Multiple pages of information were therefore needed to replicate the information on the replaced panel board. Alarms were used to tell an operator to go and look at a page that was not currently in view. Alarms were used to tell an operator that a tank was filling. Every mistake made in operations usually resulted in a new alarm. With the implementation of the OSHA 1910 regulations, HAZOP studies usually requested several new alarms. Alarms were everywhere. Incidents began to accrue as too much data collided with too little useful information.


Alarm management history

Recognizing that alarms were becoming a problem, industrial control system users banded together and, in 1990, formed the Alarm Management Task Force (AMTF), a customer advisory board led by Honeywell. The AMTF included participants from chemical, petrochemical, and refining operations. They gathered and wrote a document on the issues associated with alarm management. This group quickly realised that alarm problems were simply a subset of a larger problem, and formed the Abnormal Situation Management Consortium (ASM is a registered trademark of Honeywell). The ASM Consortium developed a research proposal and was granted funding from the National Institute of Standards and Technology (NIST) in 1994. The focus of this work was the complex human-system interaction and the factors that influence successful performance by process operators. Automation solutions have often been developed without consideration of the human who needs to interact with the solution. In particular, alarms are intended to improve situation awareness for the control room operator, but a poorly configured alarm system does not achieve this goal.
The ASM Consortium has produced documents on best practices in alarm management, as well as on operator situation awareness, operator effectiveness, and other operator-oriented issues. These documents were originally for ASM Consortium members only, but the consortium has since offered them publicly. The ASM Consortium also participated in the development of an alarm management guideline published by the Engineering Equipment & Materials Users' Association (EEMUA) in the UK. The consortium provided data from its member companies and contributed to the editing of the guideline. The result is EEMUA 191, "Alarm Systems - A Guide to Design, Management and Procurement".
Several institutions and societies are producing standards on alarm management to assist their members in the best-practice use of alarms in industrial manufacturing systems. Among them are the ISA (ISA 18.2), API (API 1167), and NAMUR (NA 102). Several companies also offer software packages to assist users in dealing with alarm management issues. Among them are DCS manufacturers and third-party vendors who offer add-on systems.


Concepts

The fundamental purpose of alarm annunciation is to alert the operator to deviations from normal operating conditions, i.e. abnormal operating situations. The ultimate objective is to prevent, or at least minimise, physical and economic loss through operator intervention in response to the condition that was alarmed. For most digital control system users, losses can result from situations that threaten environmental safety, personnel safety, equipment integrity, economy of operation, and product quality control, as well as plant throughput. A key factor in operator response effectiveness is the speed and accuracy with which the operator can identify the alarms that require immediate action.
By default, the assignment of alarm trip points and alarm priorities constitutes basic alarm management. Each individual alarm is designed to provide an alert when its process indication deviates from normal. The main problem with basic alarm management is that these features are static: the resultant alarm annunciation does not respond to changes in the mode of operation or the operating conditions. When a major piece of process equipment such as a charge pump, compressor, or fired heater shuts down, many alarms become unnecessary. These alarms are no longer independent exceptions from normal operation. In that situation they indicate secondary, non-critical effects and no longer provide the operator with important information. Similarly, during start-up or shutdown of a process unit, many alarms are not meaningful, often because the static alarm conditions conflict with the required operating criteria for start-up and shutdown. In all cases of major equipment failure, start-up, and shutdown, the operator must search the alarm annunciation displays and work out which alarms are significant. This wastes valuable time when the operator needs to make important operating decisions and take swift action.
If the resultant flood of alarms becomes too great for the operator to comprehend, then the basic alarm management system has failed as a system that allows the operator to respond quickly and accurately to the alarms that require immediate action. In such cases, the operator has virtually no chance to minimise, let alone prevent, a significant loss. In short, the objectives of alarm management need to be extended beyond the basic level. It is not sufficient to use multiple priority levels, because priority itself is often dynamic. Likewise, disabling alarms based on unit association, or suppressing audible annunciation based on priority, does not provide dynamic, selective alarm annunciation. The solution must be an alarm management system that can dynamically filter the process alarms based on the current plant operation and conditions, so that only the currently significant alarms are annunciated.
The fundamental purpose of dynamic alarm annunciation is to alert the operator to relevant abnormal operating situations. They include situations that have a necessary or possible operator response to ensure:
* Personnel and environmental safety
* Equipment integrity
* Product quality control
The ultimate objectives are no different from those of basic alarm annunciation management. Dynamic alarm annunciation management focuses the operator's attention by eliminating extraneous alarms, providing better recognition of critical problems, and ensuring a swifter, more accurate operator response.
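As a rough illustration of dynamic filtering by operating mode, the following Python sketch annunciates only those active alarms that are marked as relevant in the current mode. Everything in it is an assumption made for illustration: the `Mode` values, the `Alarm` record, the `annunciate` function, and the instrument tags are hypothetical and do not come from any particular control system or standard.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """Hypothetical operating modes of a process unit."""
    NORMAL = "normal"
    STARTUP = "startup"
    SHUTDOWN = "shutdown"


@dataclass(frozen=True)
class Alarm:
    tag: str                   # hypothetical instrument tag and description
    priority: int              # 1 = highest priority
    relevant_modes: frozenset  # modes in which the alarm is meaningful


def annunciate(active, mode):
    """Return the active alarms relevant in `mode`, highest priority first."""
    relevant = [a for a in active if mode in a.relevant_modes]
    return sorted(relevant, key=lambda a: a.priority)


ALL_MODES = frozenset(Mode)
RUNNING_ONLY = frozenset({Mode.NORMAL})

# Alarms that are currently in their alarm state (illustrative data).
active_alarms = [
    Alarm("FI-101 low feed flow", 2, RUNNING_ONLY),
    Alarm("TI-205 low heater outlet temperature", 3, RUNNING_ONLY),
    Alarm("PI-310 high vessel pressure", 1, ALL_MODES),
]

for alarm in annunciate(active_alarms, Mode.SHUTDOWN):
    print(alarm.priority, alarm.tag)
```

With the unit in `Mode.SHUTDOWN`, only the pressure alarm is presented, because the low flow and low temperature alarms are expected consequences of the shutdown. A real system would also need to decide how mode changes are detected and how filtered alarms are recorded.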


The need for alarm management

Alarm management is usually necessary in a process manufacturing environment that is controlled by an operator using a supervisory control system, such as a DCS, a SCADA system, or a programmable logic controller (PLC). Such a system may have hundreds of individual alarms that, until very recently, were probably designed with only limited consideration of the other alarms in the system. Since humans can only do one thing at a time and can pay attention to a limited number of things at a time, there needs to be a way to ensure that alarms are presented at a rate that can be assimilated by a human operator, particularly when the plant is upset or in an unusual condition. Alarms also need to be capable of directing the operator's attention to the most important problem that he or she needs to act upon, for instance by using a priority to indicate degree of importance or rank.
Ensuring continuous production, seamless service, and consistent quality at any time of day or night requires an organisation in which several teams of people handle, one after the other, the events that occur. This is commonly called on-call management. On-call management relies on a team of one or more persons (site manager, maintenance staff) or on an external organisation (guards, telesurveillance centre). To avoid dedicating a full-time person to monitoring a single process or level, information and event transmission is mandatory. This transmission makes the on-call staff more mobile and more efficient, and allows them to perform other tasks at the same time.
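Whether alarms arrive at a rate an operator can assimilate can be checked against an alarm journal. The Python sketch below counts alarms in consecutive 10-minute windows and flags windows that exceed a threshold. The window length and the threshold of 10 alarms are assumptions chosen for illustration and should be taken from the site's own alarm philosophy; the function names and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # assumed analysis window
FLOOD_THRESHOLD = 10            # assumed limit of alarms per window


def alarms_per_window(timestamps, window=WINDOW):
    """Count alarms in consecutive fixed windows starting at the first alarm."""
    if not timestamps:
        return {}
    ordered = sorted(timestamps)
    start = ordered[0]
    # Integer division of two timedeltas gives the window index.
    counts = Counter((t - start) // window for t in ordered)
    return {start + i * window: n for i, n in sorted(counts.items())}


def flood_windows(timestamps, threshold=FLOOD_THRESHOLD, window=WINDOW):
    """Return the start times of windows whose alarm count exceeds `threshold`."""
    per_window = alarms_per_window(timestamps, window)
    return [w for w, n in per_window.items() if n > threshold]


# Illustrative journal: a burst of 12 alarms, then three isolated alarms.
t0 = datetime(2024, 1, 1, 8, 0)
events = [t0 + timedelta(seconds=30 * i) for i in range(12)]
events += [t0 + timedelta(minutes=m) for m in (25, 26, 40)]

for window_start, count in alarms_per_window(events).items():
    print(window_start.time(), count)
print("flood windows:", [w.time() for w in flood_windows(events)])
```

Fixed windows are the simplest choice; a sliding window would catch bursts that straddle a window boundary, at the cost of slightly more bookkeeping.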


Some improvement methods

The techniques for achieving rate reduction range from the extremely simple, such as reducing nuisance and low-value alarms, to redesigning the alarm system in a holistic way that considers the relationships among individual alarms.


Design guide

This step involves documenting the methodology or philosophy of how to design alarms. It can include things such as what to alarm, standards for alarm annunciation and text messages, and how the operator will interact with the alarms.


Rationalization and Documentation

This phase is a detailed review of all alarms to document their design purpose, and to ensure that they are selected and set properly and meet the design criteria. Ideally this stage results in a reduction in the number of alarms, but it does not always.


Advanced methods

The above steps will often still fail to prevent an alarm flood in an operational upset, so advanced methods such as alarm suppression under certain circumstances are then necessary. As an example, shutting down a pump will always cause a low flow alarm on the pump outlet flow, so the low flow alarm may be suppressed if the pump was shut down: it adds no value for the operator, who already knows it was caused by the pump being shut down. This technique can of course get very complicated and requires considerable care in design. In the above case, for instance, it can be argued that the low flow alarm does add value, as it confirms to the operator that the pump has indeed stopped. Process boundaries (Boundary Management) must also be taken into account.
Alarm management becomes more and more necessary as the complexity and size of manufacturing systems increase. Much of the need for alarm management also arises because alarms can be configured on a DCS at nearly zero incremental cost, whereas on the physical control panel systems of the past, which consisted of individual pneumatic or electronic analogue instruments, each alarm required expenditure and control panel area, so more thought usually went into the need for an alarm. Numerous disasters, such as Three Mile Island, the Chernobyl accident, and the Deepwater Horizon, have established a clear need for alarm management.
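The pump example can be sketched in Python as follows. To reflect the argument that the low flow alarm still confirms the pump has stopped, this sketch records a suppressed alarm in an event journal instead of discarding it. That is one possible design choice, not a prescribed one, and the `Disposition` values and the `low_flow_disposition` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Disposition(Enum):
    """Hypothetical outcomes for one evaluation of the low flow alarm."""
    NORMAL = "no alarm condition"
    ANNUNCIATE = "annunciate to operator"
    LOG_ONLY = "suppressed, recorded in event journal"


def low_flow_disposition(flow, low_limit, pump_running):
    """Decide what to do with the pump outlet low flow alarm.

    The alarm is annunciated only when flow is low while the pump is
    supposed to be running; low flow after a pump stop is an expected
    consequence and is only logged.
    """
    if flow >= low_limit:
        return Disposition.NORMAL
    if pump_running:
        return Disposition.ANNUNCIATE
    return Disposition.LOG_ONLY


# Illustrative values: a low limit of 5.0 in arbitrary flow units.
print(low_flow_disposition(flow=1.2, low_limit=5.0, pump_running=True).value)
print(low_flow_disposition(flow=0.0, low_limit=5.0, pump_running=False).value)
```

The care in design mentioned above shows up even in this small case: the pump status signal itself must be trustworthy, since a false "stopped" indication would hide a genuine low flow alarm.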


The seven steps to alarm management

Step 1: Create and adopt an alarm philosophy. A comprehensive design and guideline document is produced which defines a plant standard employing a best-practice alarm management methodology.
Step 2: Alarm performance benchmarking. Analyze the alarm system to determine its strengths and deficiencies, and map out a practical solution to improve it.
Step 3: "Bad actor" alarm resolution. From experience, it is known that around half of the entire alarm load usually comes from relatively few alarms. The methods for making them work properly are documented, and can be applied with minimum effort and maximum performance improvement.
Step 4: Alarm documentation and rationalisation (D&R). A full overhaul of the alarm system to ensure that each alarm complies with the alarm philosophy and the principles of good alarm management.
Step 5: Alarm system audit and enforcement. DCS alarm systems are notoriously easy to change and generally lack proper security. Methods are needed to ensure that the alarm system does not drift from its rationalised state.
Step 6: Real-time alarm management. More advanced alarm management techniques are often needed to ensure that the alarm system properly supports, rather than hinders, the operator in all operating scenarios. These include alarm shelving, state-based alarming, and alarm flood suppression technologies.
Step 7: Control and maintain alarm system performance. Proper management of change, longer-term analysis, and KPI monitoring are needed to ensure that the gains achieved by performing the steps above do not dwindle away over time. Otherwise they will; the principle of "entropy" definitely applies to an alarm system.
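The bad actor analysis in Step 3 amounts to a frequency ranking of the alarm journal. The Python sketch below returns the smallest set of most frequent alarm tags that together account for a chosen share of the total alarm load. The `bad_actors` function, the default share of one half, and the sample tags and counts are assumptions made for illustration, not data from a real plant.

```python
from collections import Counter


def bad_actors(alarm_tags, share=0.5):
    """Return (tag, count) pairs for the most frequent alarms.

    Tags are taken in descending order of frequency until they cover
    at least `share` of all alarm occurrences in the journal.
    """
    counts = Counter(alarm_tags)
    total = sum(counts.values())
    selected, covered = [], 0
    for tag, n in counts.most_common():
        if covered >= share * total:
            break
        selected.append((tag, n))
        covered += n
    return selected


# Illustrative journal: one entry per alarm occurrence (hypothetical tags).
journal = (
    ["LI-12"] * 40 + ["FI-3"] * 25 + ["TI-7"] * 20
    + ["PI-9"] * 10 + ["AI-4"] * 5
)

for tag, n in bad_actors(journal):
    print(f"{tag}: {n} alarms ({n / len(journal):.0%} of load)")
```

In this sample, two of the five tags account for more than half of the load, which is the pattern the step describes. Re-running the same ranking periodically is one simple way to support the KPI monitoring in Step 7.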


See also

* List of human-computer interaction topics, since most control systems are computer-based
* Design, especially interaction design
* Detection theory
* Physical security
* Annunciator panel
* Alarm fatigue
* Fault management


References

* SSM InfoTech Solutions Pvt. Ltd. - Alarm Management System
* EPRI (2005) - Advanced Control Room Alarm System: Requirements and Implementation Guidance. Palo Alto, CA. EPRI report 1010076.
* EEMUA 191 - Alarm Systems - A Guide to Design, Management and Procurement - Edition 3 (2013)
* PAS - The Alarm Management Handbook - Second Edition (2010)
* ASM Consortium (2009) - Effective Alarm Management Practices
* ANSI/ISA-18.2-2009 - Management of Alarm Systems for the Process Industries
* IEC 62682 - Management of alarm systems for the process industries
* Ako-Tec AG - Description of a modern Alarm Management System
* Alarm Management and ISA-18 - A Journey, Not a Destination
* RFC 8632 - A YANG Data Model for Alarm Management


External links


"Principles for alarm system design" YA-711 Norwegian Petroleum Directorate