
MAS for Alarm Management System in Emergencies

Ana Cristina Bicharra García 1, Luiz Andre P. Paes Leme 1, Fernando Pinto 1, and Nayat Sanchez-Pi 2

1 Computer Science Institute, Fluminense Federal University.

Rua Passo da Pátria, 156. Niterói. RJ. 24210-240. Brazil
{cristina,luis.andre,fernando}@addlabs.uff.br

2 Computer Science Department, Carlos III University of Madrid.

Avda de la Universidad Carlos III, 22. Madrid. 28270. Spain
[email protected]

Abstract. Due to the imminent danger involved in the petroleum operation domain, only well-trained workers are allowed to operate offshore oil process plants. Despite their vast experience, human errors may happen during emergency situations as a result of the overwhelming amount of information generated by a great number of triggered alarms. Alarm devices have become very cheap, leading petroleum equipment manufacturers to overuse them and thereby transfer safety responsibility to operators. Accident reports frequently cite operators’ poor understanding of the actual plant status due to too many active alarms. In this paper, we present an alarm management system focused on guiding offshore platform operators’ attention to the essential information that calls for immediate action during emergency situations. We use a multi-agent based approach as the basis of our alarm management system for assisting operators in making sense of alarm avalanche scenarios.

Keywords: multi-agent systems, emergencies, alarm management, oil industry, fault detection, sense making

1 Introduction

Alarm management in emergency scenarios has become a topic of great concern in different economic sectors, such as the nuclear, aeronautics and offshore oil industries, due to the frequent accidents that have occurred in the last decades caused by inappropriate alarm management systems. Although great effort has been devoted to plant automation and cheap alarm device development, operators still play an important role, mastering all information and adjusting equipment behavior as needed. The observations of our research are domain independent but, in this paper, we focus on the offshore oil process plant domain.

An alarm informs operators of the status of a process plant unit and might require immediate action. As safety norms become more stringent and sensor devices become inexpensive and easier to embed, manufacturers have overused alarms in their equipment. Trained operators handle a few alarms at a time quite well, but not many. During a non-planned process plant shutdown, operators face an avalanche of alarm information, frequently over 1000 alarms/minute, that must be understood, prioritized and reasoned about to decide which action, if any, to take. This information overflow has been reported as one of the major causes of several serious accidents in the last decades, such as the Milford Haven refinery accident in the UK, on 24 July 1994, which resulted in a loss of 48 million pounds and two months of non-operation. The report of the Health and Safety Executive [1] identified as the cause of the accident the refinery operators’ inability to identify what was really happening behind the large number of triggered alarms. The accident could have been avoided if the alarm system had identified the cause of the problem, sorted the alarm information and displayed only the most important items so that the operators could act properly.

A process plant of a petroleum offshore unit is a complex artifact composed of independent pieces of equipment that interact with each other to receive petroleum from subsea reservoirs, treat it, and export gas and oil to land refineries. Each piece of equipment behaves and reacts to accomplish its own goal, such as compressing gas or treating oil, while keeping its behavior within the safe operating range. Sensors and actuators are essential devices embedded in the equipment to monitor and control its behavior, respectively. These complementary devices are orchestrated by the process plant automation system, which receives sensor information and triggers actuator actions, such as closing a valve.

Inspired by the distributed and encapsulated nature of the physical model of the process plant, we propose a multi-agent-based alarm management system to synthesize the process plant status during emergency situations. Each agent represents a piece of equipment and knows its expected and unexpected behavior within the process plant. During emergency scenarios, alarms related to expected behavior can be suppressed to direct operators’ attention to unexpected behavior. Distinguishing expected from unexpected behavior during emergency scenarios and using this information to filter what to display to operators is the basis of our intelligent alarm management system proposal. In addition to proposing a model, we have implemented a prototype version and tested it in a controlled environment. We are currently deploying a version of the system to work within a Brazilian offshore petroleum process plant unit. The rest of the paper is structured as follows. The next section presents related work in the area of alarm management systems, focused on the offshore oil process plant domain. Next, our multi-agent alarm management system is laid out, followed by a case study. Finally, results are presented and conclusions are outlined.

2 Related Work

We see three strategies to face the problem of information overflow: 1) mitigating wrong decisions, 2) predicting problems to avoid avalanches and 3) real-time analysis of the avalanche to filter unimportant alarms and automatically diagnose problems. We adopt the third strategy. Firstly, mitigating bad decisions is not always possible or economically feasible. Secondly, predictive methods are usually based on labeled examples or mathematical models, which were not available. Moreover, predicting problems could not avoid all shutdowns and, therefore, it would still be necessary to manage emergencies. Most existing alarm management approaches focus either on post-crisis information analysis (first strategy) or on more automation procedures (second strategy) [2–4]. None addresses the challenge of assisting operators in making sense of the situation DURING crisis scenarios.

Since our goal is to provide assistance during a crisis, our system must be designed as a real-time application. We have investigated many different approaches in different industries (oil and electric power), such as conventional centralized structures [5–8], decentralized applications [9] and multi-agent systems [2–4, 10, 11]; the last approach showed the best results concerning response time and flexibility to grow the application.

3 MAS for Alarm Management System

An emergency shutdown (SD) in a process plant unit is a safety measure, in general triggered by automation systems (AS), to protect the equipment, the system, the people operating the system and the environment. Each SD triggers a set of events designed by the automation designers to protect the unit. For example, during the shutdown of a specific piece of equipment, it is expected that this equipment be contained and isolated from the rest. To accomplish this, the automation procedure imposes the closing of upstream and downstream valves. When shutting down a system or even the entire unit, many more events are triggered. These events interact with each other, and further automation procedures must specify the desired interactions. While sensors monitor parameter values for undesirable situations, sometimes these undesirable situations are the expected ones. For instance, a fast pressure drop downstream of a pump might indicate a risk of pump cavitation. However, the same information, when associated with a pump turn-off, means correct and desirable behavior. As noted, in an emergency situation a great deal of information is generated. It is extremely useful to establish priorities that determine which alarms are actually important to display to operators.

This scenario is well suited to an agent architecture: each piece of equipment, system or even component can be modeled as an autonomous agent. A sensor would be an agent, as would valves, pumps, compressors, etc. An agent must know the effects of its good or ill behavior. If something unexpected occurs, the agent can report the problem together with the alarms that support its conclusion. There can also be other agents responsible for more complex diagnoses based on the reasoning of other agents. For example, a vessel would be considered contained if all inlet and outlet valves are completely closed. In this case, each valve corresponds to a different agent responsible for diagnosing the actual state of that valve, and the vessel corresponds to an agent responsible for diagnosing the actual state of the vessel. The vessel agent’s reasoning rests on the diagnoses of the valve agents.
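The layered diagnosis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `downstream_flow` reading and the valve named "inlet" are hypothetical and do not come from the system itself.

```python
from dataclasses import dataclass, field


@dataclass
class ValveAgent:
    """Diagnoses the actual state of one valve from its latest observations."""
    name: str
    declared_closed: bool    # the valve's own position feedback
    downstream_flow: float   # hypothetical flow reading, arbitrary units

    def diagnose(self) -> str:
        # Assumption: a valve counts as closed only if it declares itself
        # closed and no flow is observed downstream of it.
        if self.declared_closed and self.downstream_flow == 0.0:
            return "closed"
        return "open"


@dataclass
class VesselAgent:
    """Diagnoses containment from the diagnoses of its valve agents."""
    name: str
    valves: list = field(default_factory=list)

    def diagnose(self) -> tuple:
        open_valves = [v.name for v in self.valves if v.diagnose() != "closed"]
        state = "contained" if not open_valves else "not contained"
        return state, open_valves


vessel = VesselAgent("SG1", [
    ValveAgent("inlet", declared_closed=True, downstream_flow=0.0),
    ValveAgent("SDV1", declared_closed=True, downstream_flow=2.5),
])
print(vessel.diagnose())  # ('not contained', ['SDV1'])
```

The vessel agent never inspects raw sensor data; it only consumes the conclusions of the valve agents, which is the dependency the paragraph describes.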

The differences between configurations of process plants are another key aspect that points towards agent technology. Although the set of equipment of a process plant cannot be dynamically changed, it can be changed over time or operated with different sets of active equipment. Moreover, two petroleum offshore units can be very different in terms of their process plants. Since oil companies have to manage many different petroleum offshore units, systems for asset management and control should be easily configurable and scalable. Agent technology fits this kind of engineering problem well due to the main characteristics of agents: autonomous reasoning, proactivity, configurability and scalability. Over the years, researchers have come to the conclusion that reactivity is also a very important characteristic that an intelligent agent should possess [12]. Reactivity is suitable for changing environments, producing an appropriate response to changes that have been recognized and perceived [13].

The objective of our approach is to work as an alarm information filter, receiving and sorting information sent by the automation supervisor system (AS) during a serious non-planned process plant shutdown, called here simply STOP. This STOP situation causes an avalanche of alarm information. It is humanly impossible for process plant operators to understand not only what is happening with the process plant but also, and more importantly, whether it is moving to a safe state, i.e. whether the plant is being properly turned off. Thus, our proposed system can be seen as an assisted-stop system. The operator should only receive information related to unexpected alarms or to danger degree escalation that may compromise the safety of the unit.

3.1 MAS Architecture

The agent paradigm provides an excellent modeling abstraction for our intelligent alarm management system due to agents’ human-like characteristics, including reasoning, proactivity, communication and adaptive behavior. Moreover, alarm management systems for emergency situations call for technologies that are transparent, so that the functional behavior in an emergency situation can be easily understood by operators.

Fig. 1. Multi-Agent Architecture

Our model, illustrated in Figure 1, represents an intelligent support system for alarm management in the offshore oil production domain. The MAS architecture is composed of five types of agents: Environment Agent, Automation Agent, Log Handler Agent, Log Analyser Agent and Blackboard Agent. The agents’ main functionalities are the following:

– Environment Agent: This agent monitors information from nature. It also manages information regarding the Oil Production Platform status, such as the identification of an SD.

– Automation Agent: This agent represents the automation system (AS). Events and alarms continuously monitored and identified by sensors embedded in the equipment are sent to the AS. The AS then triggers actuators (pumps, valves, compressors, etc.) to perform actions (open/close, turn-on/turn-off). This agent creates a log of events, which is sent to the Log Handler Agent.

– Log Handler Agent: This agent reads and parses the log of events in the blackboard to create structured information that can be further analyzed.

– Log Analyser Agent: This agent is actually a set of agents, each of which is an expert on one piece of equipment. Each agent selects, from the structured information stored in the blackboard, only the information that concerns its expertise. Its knowledge is written as production rules describing expected and unexpected behaviors. An expected behavior triggers an alarm suppression action, that is, the removal of the corresponding information from the blackboard.

– Blackboard Agent: This agent handles the information that will be displayed to operators. It must handle synchronization, since many agents read from and write to its structure. It invokes the GUI where alarm information is shown to the operator.
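As a rough illustration of the blackboard role in the list above, the following Python sketch shows a shared alarm store protected by a lock, from which an analyser removes an alarm it judges to be expected. The class, its method names and the alarm tags (`SG1-LAHH`, `P1-PAL`) are hypothetical; the actual system keeps the blackboard in a relational table.

```python
import threading


class Blackboard:
    """Shared alarm store; a lock serializes concurrent readers and writers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._alarms = {}  # alarm id -> description

    def post(self, alarm_id, description):
        with self._lock:
            self._alarms[alarm_id] = description

    def suppress(self, alarm_id):
        """Remove an alarm judged to be expected; return True if it was present."""
        with self._lock:
            if alarm_id in self._alarms:
                del self._alarms[alarm_id]
                return True
            return False

    def visible(self):
        """Snapshot of what would be handed to the operator display."""
        with self._lock:
            return dict(self._alarms)


board = Blackboard()
board.post("SG1-LAHH", "SG1 level very high")
board.post("P1-PAL", "low pressure downstream of pump P1")
# A pump analyser knows P1 was turned off by the shutdown,
# so the pressure drop is expected behavior and can be suppressed.
board.suppress("P1-PAL")
print(board.visible())  # {'SG1-LAHH': 'SG1 level very high'}
```

Only the alarm that no analyser could explain remains visible, which is the filtering effect the architecture aims for.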

3.2 Ontology

We model the process plant domain using an ontology that emphasizes the component and monitoring characteristics of the artifact. Modeling this environment involves representing entities and the relationships among them. The main concepts of the ontology and their descriptions are as follows.

– Equipment: It is a component of a process plant.

– Actuator: It is a device, such as a valve, pump or compressor, which controls the equipment behavior.

– Equipment behavior: It represents the way the equipment behaves in order to achieve a desired functionality. Equipment behavior is measured through sensors.

– Event: It is an action on an actuator that might cause a change in the state of an alarm. For instance, the event “close” on the actuator “valve” should cause a decrease in oil flow in the pipeline.

– Alarm: It represents an abnormal state of the equipment behavior. The possible values are High (H), Very High (HH), Low (L) and Very Low (LL). HH and LL lead to the shutdown of the equipment or even of the entire unit.

– Sensor: It is a device that measures control variables.

– Control Variable Status: It indicates the variation of a measurement. A control variable status indicates, for instance, whether the temperature is increasing.

A piece of equipment is part of a process plant. Equipment achieves its goals through a set of behaviors, such as oil level, pressure, temperature and flow, that are monitored by sensors. An alarm is a special type of sensor that indicates that an equipment behavior has crossed a danger threshold. There are four types of alarms: very high, high, low and very low. There are also analog sensors that measure the exact value of a given behavior. Automation controls equipment behaviors through events that change actuator status, such as a pump turn-off. Valves, pumps and compressors are examples of actuators.
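The four alarm values and their relation to analog readings can be made concrete with a small Python sketch. The threshold numbers, the inclusive comparisons and all names below are assumptions chosen for illustration; the ontology itself only fixes the four levels and the fact that HH and LL lead to a shutdown.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AlarmLevel(Enum):
    LL = "very low"
    L = "low"
    H = "high"
    HH = "very high"


@dataclass(frozen=True)
class Thresholds:
    very_low: float
    low: float
    high: float
    very_high: float


def classify(value: float, t: Thresholds) -> Optional[AlarmLevel]:
    """Map an analog reading to an alarm level, or None in the normal band."""
    if value <= t.very_low:
        return AlarmLevel.LL
    if value <= t.low:
        return AlarmLevel.L
    if value >= t.very_high:
        return AlarmLevel.HH
    if value >= t.high:
        return AlarmLevel.H
    return None


def triggers_shutdown(level: Optional[AlarmLevel]) -> bool:
    """HH and LL lead to a shutdown; H and L only warn."""
    return level in (AlarmLevel.HH, AlarmLevel.LL)


# Hypothetical thresholds for a vessel level, in percent of vessel height.
sg1_level = Thresholds(very_low=10.0, low=20.0, high=80.0, very_high=90.0)
for reading in (5.0, 50.0, 85.0, 95.0):
    level = classify(reading, sg1_level)
    print(reading, level, triggers_shutdown(level))
```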

3.3 Stored Procedures

The automation agent continuously harvests data from environment agents to identify emergencies and trigger shutdowns. It also sends the harvested data to the Log Handler agent in the form of text messages.

The Log Handler agent interprets data from the automation system, filters useful data and publishes the treated data in a blackboard, which is stored in a relational table (Figure 2.a). Log Handler agents are platform specific because the syntactic rules of messages may vary between offshore units. The Log Handler discards repeated messages, extracts structured data from unstructured messages, validates the chronology of messages and identifies the element (valve, pump, etc.) that each message refers to.
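The four Log Handler steps can be sketched as follows. The message syntax (`timestamp;element;state`) is invented for this example, since the real syntax is platform specific, and dropping out-of-order messages is just one possible way to validate chronology.

```python
import re
from datetime import datetime

# Hypothetical message syntax: "<date> <time>;<element tag>;<state>"
LINE = re.compile(r"^(?P<ts>[^;]+);(?P<element>[A-Z0-9]+);(?P<state>[A-Z_]+)$")


def handle_log(lines):
    """Return (timestamp, element, state) records from raw log lines."""
    seen = set()
    last_ts = None
    records = []
    for line in lines:
        line = line.strip()
        if line in seen:          # discard repeated messages
            continue
        seen.add(line)
        match = LINE.match(line)  # extract structured data
        if match is None:
            continue
        try:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue
        if last_ts is not None and ts < last_ts:  # validate chronology
            continue
        last_ts = ts
        # match["element"] identifies the valve, pump, etc. the message refers to
        records.append((ts, match["element"], match["state"]))
    return records


raw = [
    "2010-01-22 23:42:25.40;SG1;LEVEL_HH",
    "2010-01-22 23:42:25.40;SG1;LEVEL_HH",          # repeated message
    "2010-01-22 23:42:26.10;SDV1;CLOSED",
    "free text that does not follow the syntax",
    "2010-01-22 23:42:25.90;SG1;LEVEL_DECREASING",  # arrives out of order
]
for record in handle_log(raw):
    print(record)
```

With this input only the `SG1` very-high-level message and the `SDV1` closing message survive; a production handler would more likely reorder or flag late messages than silently drop them.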

There is one analyser agent for each environment agent that should be monitored; for example, a valve has a corresponding analyser which diagnoses its failures. If the valve does not close or open when it should, its analyser publishes diagnoses and suggested actions for operators in the blackboard. The blackboard agent then filters what is important.

The rules are propositional implications in conjunctive normal form, where each clause states variable transitions or states of alarms, actuators and variables. They are stored in a relational table (Figure 2.b) as well. The inferences currently performed are checks of the actual state of valves, pumps, compressors, etc. and of danger degree escalation. Danger degree escalation occurs when a fire or gas leak happens in the same area as the initiator of the SD. The inferences of each agent can be accomplished through a set of relational operations over the two mentioned relations, as shown in Figure 2.c.

Fig. 2. Agents processing
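To give a feel for how such inferences reduce to relational operations, here is a minimal sketch using Python's built-in `sqlite3` module. The table layouts, column names and rule rows are assumptions made for illustration; the actual schemas are those of Figure 2, which are not reproduced here, and the real system runs as stored procedures rather than client-side queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blackboard (element TEXT, variable TEXT, state TEXT);
    CREATE TABLE rules (agent TEXT, kind TEXT, element TEXT, variable TEXT, state TEXT);
""")
# Snapshot after a shutdown: SDV1 declares itself closed, yet the SG1 level falls.
conn.executemany("INSERT INTO blackboard VALUES (?, ?, ?)", [
    ("SDV1", "position", "closed"),
    ("SG1", "level", "decreasing"),
])
# Illustrative rules for the SDV1 analyser: UB = unexpected, EB = expected behavior.
conn.executemany("INSERT INTO rules VALUES (?, ?, ?, ?, ?)", [
    ("SDV1", "UB", "SDV1", "position", "open"),
    ("SDV1", "UB", "SG1", "level", "decreasing"),
    ("SDV1", "EB", "SDV1", "position", "closed"),
    ("SDV1", "EB", "SDV1-downstream", "pressure", "decreasing"),
])

MATCH = "r.element = b.element AND r.variable = b.variable AND r.state = b.state"

# Unexpected behaviors that were observed: rules JOIN blackboard.
violated = conn.execute(
    f"SELECT r.element, r.variable, r.state FROM rules r JOIN blackboard b ON {MATCH} "
    "WHERE r.agent = 'SDV1' AND r.kind = 'UB'"
).fetchall()

# Expected behaviors that were NOT observed: anti-join via LEFT JOIN ... IS NULL.
missing = conn.execute(
    f"SELECT r.element, r.variable, r.state FROM rules r LEFT JOIN blackboard b ON {MATCH} "
    "WHERE r.agent = 'SDV1' AND r.kind = 'EB' AND b.element IS NULL"
).fetchall()

diagnosis = "closed" if not violated and not missing else "open"
print(diagnosis, violated, missing)
```

Here the join finds one observed unexpected behavior (the falling level) and the anti-join finds one expected behavior that never appeared (the downstream pressure drop), so the valve is diagnosed as open.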

The rules for all agents are provided by automation experts through a special knowledge acquisition interface (not included in the main model) and stored in the knowledge base. There are two types of rules: those depending only on the status of each equipment behavior and those that depend on a combination of equipment behaviors. Therefore, there is a rule chain.

The log handler, analyser and blackboard agents are all implemented using stored procedures. This strategy allows the system designer to easily distribute processing as needed, taking advantage of database replication. To do so, the designer has to set up the replication of the blackboard and rules relations and distribute the analyser agent procedures across the database replicas as needed. The log handler and blackboard agents can also be duplicated to increase robustness and decrease the probability of losing event log data.

4 Case Study

We now describe the agents’ behavior in more detail. The environment and automation agents will not be covered because they do not differ from existing automation systems. As an example of how to automatically manage alarms, consider an atmospheric separator SG1 (Figure 3), which is part of a process plant. It is a piece of equipment in which petroleum is separated into oil and gas. It is equipped with three sensors and four alarms that keep track of the level inside the vessel, and with a valve SDV1 (Figure 3) in the oil outlet of SG1. The sensors and alarms of the separator detect and indicate whether the level is very high, high, normal, low or very low. When the level is very high or very low, the automation system triggers a shutdown, i.e. the AS takes a set of actions that stops the production process of the platform. One of the actions is to close SDV1. It may happen that, because of wear of the valve or malfunctioning of the closing mechanism, the valve closes but does not completely seal the oil outlet, or even does not close as commanded by the automation system.

Fig. 3. Agents processing

The alarm management system then has to extract from operational data (event logs) clues that indicate whether the valve is closed after a shutdown is initiated; otherwise the platform would be in an unsafe state until the valve is closed. Of course, there will be other actions, such as closing the oil inlet valve and the gas outlet valve, but these actions are monitored similarly by other agents.

Analysing the schema of the separator, it can be seen that if the valve is closed, it is unexpected that the valve will not declare itself closed. In the platform control system we were dealing with, all actuators declare their state. It is unexpected, as well, that the level inside SG1 will decrease. On the other hand, it is expected that the level will not decrease, i.e. it will increase or hold, depending on the state of the inlet valve. It is also expected that the pressure downstream of SDV1 (Figure 3) decreases. The agent can then reason on observations of what is unexpected and expected and diagnose the actual state of the valve. If nothing unexpected has happened and everything expected has happened, the valve is considered closed; otherwise it might be open.

We propose to specify both unexpected and expected behavior rules. To understand why, note that in the previous example the valve could be partially open while an inlet flow compensates the outlet flow so that the level remains the same, i.e. the level does not decrease, as the unexpected behavior rules require. In this case, the valve opening would only be detected by checking the pressure downstream of the valve in the expected behavior rule. Moreover, in terms of knowledge acquisition for configuring the agents’ reasoning rules, this approach makes it easier to remember what should be checked. If we want to ensure that something was done, it is natural to think in terms of what was unexpected and what was expected to happen.
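The compensating-inflow case can be worked through in a short Python sketch. The observation keys and state strings are hypothetical; the rule wording follows the SDV1 example, and "possibly open" mirrors the statement that the valve might be open whenever any check fails.

```python
# Unexpected behavior rules: any that holds is evidence against "closed".
UNEXPECTED = {
    "valve does not declare itself closed":
        lambda obs: obs["sdv1_declared"] != "closed",
    "SG1 level decreases":
        lambda obs: obs["sg1_level"] == "decreasing",
}
# Expected behavior rules: every one must hold for the valve to count as closed.
EXPECTED = {
    "SG1 level increases or holds":
        lambda obs: obs["sg1_level"] in ("increasing", "stable"),
    "pressure downstream of SDV1 decreases":
        lambda obs: obs["sdv1_downstream_pressure"] == "decreasing",
}


def diagnose(obs):
    """Return (state, violated unexpected rules, unmet expected rules)."""
    violated = [name for name, rule in UNEXPECTED.items() if rule(obs)]
    unmet = [name for name, rule in EXPECTED.items() if not rule(obs)]
    state = "closed" if not violated and not unmet else "possibly open"
    return state, violated, unmet


# Partially open valve whose leak is compensated by inflow: the level holds,
# so no unexpected rule fires, but the downstream pressure does not drop.
compensated = {
    "sdv1_declared": "closed",
    "sg1_level": "stable",
    "sdv1_downstream_pressure": "stable",
}
print(diagnose(compensated))
# ('possibly open', [], ['pressure downstream of SDV1 decreases'])
```

With unexpected behavior rules alone this scenario would pass as "closed"; it is the unmet expected behavior rule that exposes the leak.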

The Log Handler agent receives messages from the automation agent, interprets them and publishes the state transitions of the elements in the blackboard. Figure 2.a is a snapshot of the blackboard. Notice that the valve SDV1 is declared closed and that the level inside SG1 decreased. This is an unexpected behavior because, if the valve were properly closed, the level could not have decreased.

The reasoning rules can be stored in a relational table, as in Figure 2.b. The first two lines represent the unexpected rules for the SDV1 analyser agent (UB means unexpected behavior) and the next two represent the expected rules (EB means expected behavior). Figure 2.c shows the set of relational operations on the blackboard and rules relations that are processed by the SDV1 analyser agent to detect that the valve is actually open. The agent then corrects the state of the valve on the blackboard and adds the information about the alarms that led to the conclusion.

We have tested the system in an environment where the automation agent generated up to 30 messages/second during a shutdown and there were 200 analyser agents. The measured load capacity of the log handler agent was 40 messages/second and the average processing time for each analyser agent was 7 milliseconds.
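A quick back-of-the-envelope reading of these figures is given below. The sequential-execution scenario is an assumption made only to obtain an upper bound; how the analysers are actually scheduled across database replicas is not stated.

```python
peak_rate = 30        # messages/second generated during a shutdown
capacity = 40         # messages/second the log handler can absorb
analysers = 200       # number of analyser agents
avg_ms = 7            # average processing time per analyser, in milliseconds

utilization = peak_rate / capacity            # fraction of log handler capacity used
serial_pass_s = analysers * avg_ms / 1000     # one pass if analysers ran one after another

print(f"log handler utilization at peak: {utilization:.0%}")   # 75%
print(f"one sequential analyser pass: {serial_pass_s:.1f} s")  # 1.4 s
```

Under these numbers the log handler keeps a 25% margin at peak load, and even a fully sequential pass over all analysers would complete in under a second and a half.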

                        SHUT DOWN EXECUTION        ALARMS
ESD  Date     ESD type  Begin        End          Total  Suppressed  Presented  % Suppressed
1    22/1/10            23:42:25.40  23:44:01.26     71          65          6        91.55%
2    22/1/10            23:45:36.06  23:47:59.22     58          52          6        89.66%
3    22/1/10  ESD-2     23:50:22.40  23:52:46.05     33          33          0       100.00%
4    30/3/10  ESD-2     13:34:15.84  13:35:59.34     27          27          0       100.00%
5    30/3/10  ESD-2     13:39:28.64  13:41:09.18     12          12          0       100.00%
6    19/4/10            19:21:52.85  19:23:21.77     86          81          5        94.19%
7    20/4/10            00:07:30.38  00:09:13.24     41          37          4        90.24%
8    19/5/10            10:57:40.56  10:59:51.36     46          38          8        82.61%
9    18/6/10            20:16:00.19  20:21:20.20    113         108          5        95.58%

ESD type of shutdowns 1, 2 and 6–9: four ESD-3T and two ESD-2 (assignment to individual rows not recoverable).

Fig. 4. Results. Alarms suppressed

Figure 4 shows the results of the analyses of data from 9 SDs. The last column represents the percentage of alarm suppression. As shown, there are scenarios of 93.76%
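The suppression percentages can be recomputed from the alarm counts reported in Fig. 4. The short Python check below uses the (total, suppressed) pairs of the nine shutdowns; the variable names are ours. It suggests that the 93.76 figure corresponds to the mean of the nine per-shutdown percentages, while pooling all alarms gives a slightly lower value.

```python
# (total alarms, suppressed alarms) for shutdowns 1 to 9, as reported in Fig. 4
shutdowns = [
    (71, 65), (58, 52), (33, 33), (27, 27), (12, 12),
    (86, 81), (41, 37), (46, 38), (113, 108),
]

per_sd = [100.0 * suppressed / total for total, suppressed in shutdowns]
mean_pct = sum(per_sd) / len(per_sd)

total_alarms = sum(total for total, _ in shutdowns)
total_suppressed = sum(suppressed for _, suppressed in shutdowns)
pooled_pct = 100.0 * total_suppressed / total_alarms

print([round(p, 2) for p in per_sd])
print(f"mean of per-shutdown percentages: {mean_pct:.2f}%")  # 93.76%
print(f"pooled over all {total_alarms} alarms: {pooled_pct:.2f}%")  # 93.02%
```

Either way, roughly 6 to 7 percent of the alarms remain to be presented to the operator.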

5 Conclusions

In this paper, we have presented an alarm management system that provides a solution for improving operators’ situation awareness during emergency situations on offshore oil platforms. An oil process plant is a complex artifact composed of independent subparts that interact with each other. The results of initial experiments run in our research lab using actual data from SD scenarios have shown that only 6 percent of the total alarms were displayed to the operator, which is an outstanding result. Additionally, operators confirmed that the suppressed information was unnecessary.

References

1. Health and Safety Executive: The Explosion and Fires at the Texaco Refinery, Milford Haven, 24 July 1994 (Incident Report). HSE Books (1997)

2. Mendoza, B., Xu, P., Song, L.: A multi-agent model for fault diagnosis in petrochemical plants. In: 2011 IEEE Sensors Applications Symposium, IEEE (February 2011) 203–208

3. Sayda, A.F., Taylor, J.H.: Toward a Practical Multi-agent System for Integrated Control and Asset Management of Petroleum Production Facilities. In: 2007 IEEE 22nd International Symposium on Intelligent Control, IEEE (2007) 511–517

4. Dheedan, A., Papadopoulos, Y.: Model-Based Distributed On-line Safety Monitoring. In: EMERGING 2011, The Third International Conference on Emerging Network Intelligence, Lisbon, Portugal (2011) 1–7

5. Aizpurua, O., Galan, R., Jimenez, A.: A new cognitive-based massive alarm management system in electrical power administration. In: 2008 7th International Caribbean Conference on Devices, Circuits and Systems, IEEE (April 2008) 1–6

6. Skogdalen, J.E., Vinnem, J.E.: Combining precursor incidents investigations and QRA in oil and gas industry. Reliability Engineering & System Safety 101 (May 2012) 48–58

7. Zarri, G.P.: Knowledge representation and inference techniques to improve the management of gas and oil facilities. Knowledge-Based Systems 24(7) (October 2011) 989–1003

8. Heydt, G.T., Vittal, V., Phadke, A.G.: The strategic power infrastructure defense (SPID) system. A conceptual design. IEEE Control Systems 20(4) (August 2000) 40–52

9. Cochran, T., Bullemer, P., Nimmo, I.: Managing abnormal situations in the process industries parts 1, 2, 3. NIST Proceedings of the Motor Vehicle Manufacturing Technology (MVMT) Workshop (1997)

10. Hossack, J.A., Menal, J., McArthur, S.D.J., McDonald, J.R.: A multiagent architecture for protection engineering diagnostic assistance. In: 2003 IEEE Power Engineering Society General Meeting (IEEE Cat. No.03CH37491). Volume 2, IEEE (2003) 640

11. McArthur, S.D.J., Strachan, S.M., Jahn, G.: The Design of a Multi-Agent Transformer Condition Monitoring System. IEEE Transactions on Power Systems 19(4) (November 2004) 1845–1852

12. Kornelije, R.: A Combination of Reactive and Deliberative Agents in Hospital Logistics. In: Proceedings of 17th International Conference on Information and Intelligent Systems, Croatia (2006) 63–70

13. Rabuzin, K., Malekovic, M., Cubrilo, M.: Resolving Physical Conflicts in Multiagent Systems. In: 2008 The Third International Multi-Conference on Computing in the Global Information Technology (iccgi 2008), IEEE (July 2008) 193–199