understand the importance of process safety management

UNDERSTAND THE IMPORTANCE OF PROCESS SAFETY MANAGEMENT SYSTEMS Rethink Your Process Safety Procedures Get the Best Out of Incident Data OCTOBER 2020





CONTENTS

Understand the Importance of Process Safety Management Systems
Public acceptance and adequate plant safety require assessment of a number of issues and factors

Get the Best Out of Incident Data
Let machine learning and artificial intelligence do the heavy lifting

Effectively Share Insights from Incidents
A proven approach can ensure lessons get communicated and acted upon

Consider Industrial Detonations in Vapor Cloud Explosions
Risk assessments often overlook the issue but should not

Rethink Your Process Safety Procedures
Interdisciplinary approach supports workers, strengthens companies

Boost Process Plant Resilience
Measures can strengthen the ability to recover from an incident

Abandon-in-Place Must End
Leaving equipment derelict instead of demolishing it can prove costly

The Emerging Hydrogen Economy Demands Attention
Recent accidents underscore the need for inherently safer designs

October 2020 / MKO Process Safety Journal-2-

Editor in Chief

Mark [email protected]

Executive Editor

Stewart Behie, stewart_behie@exchange.tamu.edu

Associate Editor

Noor [email protected]

Contributing Editor

Dirk [email protected]

Art Director

Jennifer [email protected]

Production Manager

Daniel [email protected]

Publisher

Brian [email protected]

MKO Process Safety Journal is published jointly by the Mary Kay O’Connor Process Safety Center, Texas A&M University, Jack E. Brown Chemical Engineering Building, 3122, 100 Spence St., College Station, TX 77843, and Putman Media, 1501 E. Woodfield Road, Suite 400N, Schaumburg, IL 60173.

Copyright 2020, Mary Kay O’Connor Process Safety Center, Texas A&M University and Putman Media. All rights reserved. The contents of this publication may not be reproduced in whole or in part without the consent of the copyright owners.


Understand the Importance of Process Safety Management Systems

Public acceptance and adequate plant safety require assessment of a number of issues and factors

By Stewart Behie, Texas A&M University

INTRODUCTION

A recent article, “Guidance to Improve the Effectiveness of Process Safety Management Systems in Operating Facilities,” published in the Journal of Loss Prevention in the Process Industries, summarizes the significant number of loss of process containment (LOPC) incidents that occurred in Texas in 2019 and 2020. These incidents caused substantial damage to the plants in question, significantly affected the environment and damaged the reputations of the operating companies involved in the eyes of the public.

The article identified a number of factors that led to these events and offered an overall approach for operating companies to improve the effectiveness of their process safety management systems and so reduce the potential for future significant incidents. A few fundamental questions that undoubtedly are on the minds of the public living in the affected regions are:

• Are these plants safe?
• Am I safe to continue living near these plants?
• What systems are in place to prevent a similar accident from happening in the future?

This article examines the relevant issues and factors that need to be assessed to provide robust answers to the questions posed above and addresses other important questions that need to be considered in determining what constitutes a “safe” operation.

Answering the questions requires understanding the concepts of “hazards,” “risk,” “risk acceptability” and “safety.” Every operating plant is faced with managing a number of hazards related to the materials handled and the chemicals used to process feedstock into marketed products. Plants also must deal with hazards associated with the equipment and the conditions in the plant. How a facility manages these hazards and the associated risks will determine whether the operation is “safe.” To begin, let’s take a step back and examine some of the issues and factors, starting with defining several relevant terms.

DEFINING THE TERMS

Hazard: A chemical or physical condition that has the potential to cause damage to a receptor such as the public, plant personnel, equipment, the environment or company reputation. A hazard is a material property. For example, a knife’s hazards are its sharp point and cutting edge, while toxicity is the hazard posed by a toxic gas. The knife and gas in and of themselves do not constitute hazards. The management of their properties will dictate the hazard.

Risk: The measure of the potential damage a hazard or group of hazards can cause to a receptor, taking into account both the magnitude of the potential damage and the probability of its occurrence.

Risk = Consequence × Probability

Safety: A judgement of the acceptability of the risks related to the hazards associated with a facility’s operation (in this context). A facility is considered safe only if the remaining risks associated with its operation are acceptable. There are degrees of risk and hence degrees of safety.

Risk Acceptability: A criterion established by a company or industry, or by the risk attitude or appetite of the risk receptors, such as the people living in an operation’s vicinity. A risk matrix often is used to assess risks on a qualitative basis by assigning each event scenario a consequence level and an estimated probability of occurrence.

Figure 1 is an example of a 5 × 5 risk matrix that has five consequence levels ranging from negligible (1) to catastrophic (5) in the consequence categories of “Public injury” and “Asset damage” and five probability levels ranging from very unlikely (A) to frequent (E). The probability level is determined by the number of layers of protection available. The greater the number of protection layers, the lower the probability.

Table 1 shows risk matrix consequence or severity levels in the two categories noted above.

Table 2 is an example of risk matrix probability or likelihood categories. The risk level determined for each event scenario is at the intersection of the predicted consequence and the estimated probability. The predicted risk levels range from negligible (green), which requires no risk reduction action, to unacceptable (red), which requires that operations be terminated and mandatory risk reduction measures be implemented immediately. The intermediate risk level (yellow) is acceptable with precautions, while the major risk level (orange) requires a detailed safety review to be undertaken and approved by management.

POTENTIAL IMPACT

Table 1. Example of Risk Matrix Consequence Levels

Rank | Category | Personnel Injury | Asset Damage
5 | Catastrophic | Fatality, permanent disabling injury, life-threatening events | Loss of equipment or significant damage to the equipment; equipment failure results in catastrophic loss of containment; severe fire or explosion
4 | Critical | Major exposure or severe injury requiring physician or hospitalization | Upset results in major leak or spill; minor equipment damage; minor fire event
3 | Moderate | Minor injury requiring medical treatment | Unplanned deviation requires equipment shutdown; minor leak or spill
2 | Marginal | Minor exposure or minor injury | Unplanned deviation requires intervention to correct
1 | Negligible | No expected effects | Minor upset

RISK MATRIX

Risk Ranking (rows: severity; columns: likelihood)

Severity | A | B | C | D | E
5 | 5A | 5B | 5C | 5D | 5E
4 | 4A | 4B | 4C | 4D | 4E
3 | 3A | 3B | 3C | 3D | 3E
2 | 2A | 2B | 2C | 2D | 2E
1 | 1A | 1B | 1C | 1D | 1E

Figure 1. This matrix shows five levels of severity and the likelihood of their occurring.
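As an illustration of how a matrix of this kind can be applied, the short Python sketch below looks up a risk level from a severity rank (1 to 5) and a likelihood rank (A to E). The color assigned to each individual cell of Figure 1 is not recoverable from the text, so the score thresholds used here (RED_AT, ORANGE_AT, YELLOW_AT) are hypothetical placeholders, as are the function names; a company would substitute its own criteria.

```python
# Illustrative 5 x 5 risk matrix lookup. Rank names follow Tables 1 and 2.
SEVERITY = {1: "Negligible", 2: "Marginal", 3: "Moderate",
            4: "Critical", 5: "Catastrophic"}
LIKELIHOOD = {"A": "Very unlikely", "B": "Remote", "C": "Probable",
              "D": "Likely", "E": "Frequent"}

# ASSUMPTION: these thresholds are hypothetical. The article does not state
# which cells of the matrix are green, yellow, orange or red.
RED_AT, ORANGE_AT, YELLOW_AT = 15, 10, 5


def risk_score(severity: int, likelihood: str) -> int:
    """Return consequence rank times probability rank (A=1 ... E=5)."""
    if severity not in SEVERITY:
        raise ValueError(f"unknown severity rank: {severity!r}")
    if likelihood not in LIKELIHOOD:
        raise ValueError(f"unknown likelihood rank: {likelihood!r}")
    return severity * (sorted(LIKELIHOOD).index(likelihood) + 1)


def risk_level(severity: int, likelihood: str) -> str:
    """Map a matrix cell such as (4, 'C') to one of the four risk levels."""
    score = risk_score(severity, likelihood)
    if score >= RED_AT:
        return "unacceptable (red)"
    if score >= ORANGE_AT:
        return "major (orange)"
    if score >= YELLOW_AT:
        return "intermediate (yellow)"
    return "negligible (green)"
```

Calling risk_level(4, "C"), for example, evaluates cell 4C of the matrix. Using a simple product keeps the sketch consistent with Risk = Consequence × Probability; real matrices often assign colors cell by cell instead.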

To be considered safe, a facility must identify all hazards associated with the materials handled and the equipment in operation, assess the risks associated with each hazard, implement protective measures to reduce the potential for LOPC upsets, implement mitigative measures to limit the escalation of impacts of LOPC events should they occur, and monitor plant operations continuously to identify and react to the incipient stages of upset conditions.

It should also be noted that the larger the potential consequence of an upset, the lower the acceptable probability. In addition, threats that cannot be sensed directly, such as radioactivity, are given a much higher risk factor than upsets that can be sensed, such as an explosion.

HAZARD EVALUATION AND RISK ASSESSMENT (DESIGN STAGE)

The evaluation of hazards and societal risks starts at the plant design stage. During a project’s concept development and preliminary engineering design stages, hazard identification studies are undertaken. The study results frame the nature of the anticipated plant equipment’s design. The potential impacts of upsets leading to LOPC determine the approach to managing these upsets in terms of safety systems design, such as pressure relief systems, process control, and emergency control and shutdown systems.

LIKELIHOOD

Table 2. Example of Risk Matrix Likelihood Levels

Rank | Category | Frequency definition | Definition based on number of barriers
E | Frequent | This event/upset reported more than once in a year or in the project lifecycle | No engineering barriers and/or only one administrative barrier
D | Likely | This event reported once in a year anywhere; event is likely to occur for new research | One preventive or mitigative engineering barrier; or combination of no more than two administrative barriers
C | Probable | This event reported once in the last 10 years anywhere; event is probable for new research | Only one preventive and one mitigative engineering barrier and at least one administrative barrier; or one preventive or mitigative engineering barrier and at least three administrative barriers
B | Remote | The event is remotely probable, but not reported anywhere | At least two preventive and one mitigative engineering barriers; or one preventive and one mitigative engineering barrier, good inherently safer design and at least one administrative barrier
A | Very unlikely | This event is very unlikely and not reported anywhere | At least two preventive and two mitigative engineering barriers; good inherently safer design and at least some administrative barriers

Once the design has progressed to a stage at which engineering drawings of the process flow and the anticipated equipment are needed, process hazards analysis (PHA) studies are initiated, including failure modes and effects analysis (FMEA) and fault tree analysis (FTA). The most important PHA is the hazard and operability (HAZOP) study, which interrogates the design and identifies potential causes of deviations from the design intent. HAZOP studies are completed at the preliminary design stage and at the completion of the front-end engineering design (FEED) stage and are updated as the design progresses to the final stage ready for construction.

In addition to identifying causes of deviation from design intent, HAZOP studies should uncover hidden design flaws that need to be addressed through design changes, modifications and operating procedures. The HAZOP process is critical and must be facilitated by an engineer with the requisite experience in the process as well as in HAZOP studies and conducted by a team of experts in all fields relevant to the design. The bottom line is that changes in the plant design and associated control systems at the HAZOP stage (i.e., made on paper) avoid the costly changes that otherwise would be identified and made during the construction or startup phases. The HAZOP process is time-consuming but extremely valuable and cannot be rushed.

An effective HAZOP process asks and answers the five questions of risk assessment shown in Figure 2 based on plausible event scenarios the assessment team identified for every part of the design (node). The components listed with each question (in red text in the original figure) are the corresponding terms in risk assessment terminology. The combination of the responses to questions 2 and 3 provides the risk associated with the event scenario because, as noted above, risk is the product of the event scenario’s anticipated consequence and its estimated probability.

Several consequence categories are assessed during the risk assessment process, including personnel injury, offsite/public impacts, environmental impact, equipment damage and loss of revenue. A company-developed risk matrix is used to estimate the risk of each event scenario as the intersection of the predicted consequence and the estimated probability. The risk matrix defines the action required should the predicted risk fall in the unacceptable range as defined by the company’s criteria.

RISK MANAGEMENT COMPONENTS

1. What can go wrong? (Hazard identification)
2. What are the adverse impacts or consequences? (Consequence analysis)
3. How likely is it to happen? (Frequency analysis)
4. Do I need to do anything about it? (Risk evaluation/risk assessment)
5. What should I do about it? (Risk control)

Figure 2. These five risk assessment questions are part of an effective HAZOP process.

When the estimated risk level of an event scenario falls into the unacceptable range, the HAZOP team must identify additional control measures or layers of protection to reduce the predicted risk so that the resultant risk level falls into the acceptable range. Most companies have a rigorous process for determining which protection measures are acceptable. An example of such a program developed by a large oil and gas company is described in Behie et al. (2016). The team documents the reassessed residual risk level with the proposed measures in place. In addition to proposing prevention measures to reduce predicted risk levels, the HAZOP team identifies measures to mitigate the impact of an event scenario should it occur. These preventive and mitigative measures are recorded in the HAZOP worksheets.
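The reassessment step can be sketched in a few lines of Python. The sketch below assumes, purely for illustration, that each additional independent layer of protection lowers the likelihood by one rank on the A to E scale and that acceptability is judged against a maximum score of severity rank times likelihood rank. Neither assumption comes from the article, and the names reassess_likelihood and layers_needed are hypothetical.

```python
LIKELIHOOD_RANKS = "ABCDE"  # A = very unlikely ... E = frequent


def reassess_likelihood(initial: str, added_layers: int) -> str:
    """Return the likelihood rank after adding protection layers.

    ASSUMPTION: one added independent layer lowers likelihood by one rank,
    and the rank cannot fall below A.
    """
    if added_layers < 0:
        raise ValueError("added_layers must be non-negative")
    index = LIKELIHOOD_RANKS.index(initial)
    return LIKELIHOOD_RANKS[max(0, index - added_layers)]


def layers_needed(severity: int, initial: str, max_acceptable_score: int) -> int:
    """Return the fewest added layers that bring the scenario into range.

    ASSUMPTION: the score is severity rank (1-5) times likelihood rank
    (A=1 ... E=5), and max_acceptable_score is a company-defined limit.
    """
    for added in range(len(LIKELIHOOD_RANKS)):
        rank = reassess_likelihood(initial, added)
        score = severity * (LIKELIHOOD_RANKS.index(rank) + 1)
        if score <= max_acceptable_score:
            return added
    # Added layers alone cannot reach the limit; the consequence itself
    # would have to be reduced, for example by an inherently safer design.
    raise ValueError("risk cannot be made acceptable by added layers alone")
```

A scenario that still fails at rank A signals that preventive layers are not enough and that the consequence itself must be reduced, which is where mitigative measures and inherently safer design enter the HAZOP worksheet.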

An important step in the overall process is a third-party verification of the risk assessment studies to ensure that all components and risk aspects have been addressed. Even when HAZOP and FMEA team members are well-regarded and have sufficient time for the analysis, there is solid evidence that hazards are overlooked. Also, there are examples of residual risk materializing.

An effective risk assessment process will ensure that the design as proposed is safe, which means that the risks associated with the possible event scenarios (i.e., upset conditions) fall within the overall acceptable range.

PREVENTION BARRIERS

Among the critical components of an overall plant design are the measures built in to function as barriers or protective measures to prevent the conditions that have the potential to lead to a LOPC event. The conditions that must be avoided in the operation of process vessels or piping systems include:

• overpressure or vacuum conditions, which can result from a variety of mechanisms
• excessive wall corrosion
• vehicle collision
• water hammer

The design features that prevent these conditions include:

• pressure relief systems on vessels and piping systems designed to relieve pressures at levels well below the maximum allowable working pressures
• corrosion-monitoring systems to monitor the rate of wall thickness degradation so action can be taken well before minimum wall thickness is reached
• corrosion coupons placed in areas of anticipated high corrosion to provide an indication of the rate of degradation, allowing corrective action to be planned

These measures are defined as layers of protection designed to keep processing fluids and chemicals from escaping from the process units and associated piping. In other words, these systems are designed to maintain the mechanical integrity of plant systems. Should the systems fail to perform as designed or perform only partially, LOPC events can occur, resulting in the release of processing fluids (gases, liquids and chemicals) with potential impacts on personnel, equipment and the environment.
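To make the corrosion-monitoring point concrete, the Python sketch below estimates a corrosion rate from two wall thickness readings and the time left before the minimum wall thickness is reached. It uses the common linear estimate (thickness margin divided by corrosion rate); the function names and the example readings are hypothetical, and an actual program would follow the applicable inspection code and company practice.

```python
def corrosion_rate(t_previous_mm: float, t_current_mm: float,
                   years_between: float) -> float:
    """Return the average wall loss in mm per year between two readings."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (t_previous_mm - t_current_mm) / years_between


def remaining_life_years(t_current_mm: float, t_min_mm: float,
                         rate_mm_per_year: float) -> float:
    """Return the years until minimum wall thickness, assuming a constant rate.

    ASSUMPTION: corrosion continues linearly at the measured rate.
    """
    if rate_mm_per_year <= 0:
        return float("inf")  # no measurable wall loss
    return max(0.0, (t_current_mm - t_min_mm) / rate_mm_per_year)


# Hypothetical readings: 12.0 mm four years ago, 11.0 mm today, 8.0 mm minimum.
rate = corrosion_rate(12.0, 11.0, 4.0)
life = remaining_life_years(11.0, 8.0, rate)
```

With these illustrative readings the wall is thinning by a quarter of a millimeter per year, so corrective action can be planned years ahead rather than after the minimum thickness has been reached.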

FIRE AND GAS DETECTION AND PROTECTION SYSTEMS

Should a LOPC occur, it is critically important that releases are detected immediately and warnings given by way of alarms to alert plant operators to the release. Undetected releases could escalate into a significant incident if corrective action is not taken immediately and effectively. Proper design of the detection systems that provide these warnings is critical to maintaining safe operations. The main early-warning systems from a process safety perspective are the fire and gas detection systems, which are designed to detect flammable and toxic gas releases as well as releases of excessive heat (i.e., fire) indicating a potential upset condition in a process unit or piping.

The system responds by sounding an alarm to alert operators to a gas release or a potential fire and identifies the specific area of the plant where the event occurred. The system design includes a range of protection responses, from alarm only to executive action that could initiate an automatic discharge of a water system such as a vessel deluge. The design of fire and gas detection systems is critical and requires expertise in the instrumentation as well as experience in system design and installation. The design of fire and gas detection and protection systems is supported by sophisticated modeling to determine the optimal placement of detectors as well as the number, type and voting logic to be used to initiate a response.
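Voting logic of the kind mentioned above can be illustrated with a minimal Python sketch. It assumes a zone of flammable gas detectors reporting in percent of the lower explosive limit (LEL), a single detector in alarm raising an alarm only, and two or more detectors in alarm triggering executive action. The 20% LEL setpoint, the two-vote rule and the function names are illustrative assumptions, not values from the article.

```python
from typing import Sequence


def detectors_in_alarm(readings_pct_lel: Sequence[float],
                       setpoint_pct_lel: float) -> int:
    """Count the detectors whose reading is at or above the alarm setpoint."""
    return sum(1 for reading in readings_pct_lel if reading >= setpoint_pct_lel)


def zone_response(readings_pct_lel: Sequence[float],
                  setpoint_pct_lel: float = 20.0,  # ASSUMPTION: illustrative setpoint
                  action_votes: int = 2) -> str:   # ASSUMPTION: 2-out-of-N voting
    """Return the response for one detection zone.

    One vote alerts the operators; action_votes or more votes initiate
    executive action such as an automatic deluge.
    """
    votes = detectors_in_alarm(readings_pct_lel, setpoint_pct_lel)
    if votes >= action_votes:
        return "executive action"
    if votes >= 1:
        return "alarm only"
    return "normal"
```

Requiring more than one vote before executive action reduces spurious trips from a single faulty detector, while the single-vote alarm still gives operators early warning.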

MITIGATION MEASURES

The plant emergency response team should go immediately to the site of the alarm to control the situation and limit the potential for the event to escalate into a major incident. A number of incidents have occurred in which the detection system did not identify a gas release or a fire until the event had escalated to the point that substantial impacts occurred. Other important detection systems that provide an indication of a trend toward failure are condition monitoring systems and risk-based inspection programs.

PROJECT CONSTRUCTION AND STARTUP PHASE

This phase’s objective is to construct the plant as per the design depicted in the detailed drawings (be they PFDs, P&IDs, C&E, etc.) that reflect the modifications recommended during the detailed risk assessment process. In addition, the construction team ensures that all equipment and components are built and installed per the required engineering specifications and codes. Inspection teams visit equipment assembly sites to ensure that equipment and components are built to the design specifications and are coded/stamped as such.

During this phase, while plant construction is under way, operations management assembles the operations team to start planning for operations by setting up the various management systems needed to ensure robust and safe operations. These systems include the HSE management plan/system, the risk management plan, the process safety management system, the asset integrity plan, the equipment inspection plan, the emergency response plan, etc.

Managing the risks from construction to initial operations is discussed in detail elsewhere (Behie et al., 2008). From a safety perspective, it is critical to have these systems set up and functional so that the operations risk assessment team can conduct the risk assessments required to determine the readiness of the equipment components and plant systems to be started up. This process also will help confirm that all design requirements have been met. Equipment components and operating systems are started up in the proper sequence to confirm that they meet the warranty requirements the equipment suppliers guarantee. Once the startup of all the systems is complete, the entire plant is brought online and performance specifications are confirmed.


OPERATIONS PHASE

Risks arise during the operations phase in many different ways: human failure, process control failures, wear-out of rotating equipment, heavy pipe vibration, unplanned large concentrations of people, etc. The management of risks and safety during a facility’s operational phase is best accomplished by establishing a comprehensive process safety management (PSM) program that incorporates and integrates all of the relevant programs such as risk management, mechanical integrity, inspection, testing and maintenance, emergency response, etc.

This program focuses on “critical equipment.” Equipment that meets the criterion of providing protection against major LOPC events is identified by risk studies and usually includes the fire and gas detection systems that provide warning of release events. The asset integrity team, which includes the equipment inspection team, establishes the initial frequency of component testing and vessel inspection for critical plant equipment and adjusts the frequency based on results. Risk-based inspection and condition monitoring inspection programs offer advanced techniques to improve the performance of mechanical integrity programs and reduce the potential for LOPC events. These programs are critically important as plants age, requiring adjustments in the frequency and intensity of checks on the health of protective barriers.

CHALLENGES OF MAINTAINING AN EFFECTIVE PSM PROGRAM

Behie (2020) outlined a number of challenges to improving the effectiveness of PSM systems at operating plants in view of the significant incidents that have occurred in the southeastern U.S. over the past few years. These challenges include:

• Ensuring senior management is given adequate risk-based information on which to make decisions and ensuring that decisions are referred to the correct management level based on overall risk level
• Adjusting operations staffing to accommodate the dynamic workforce changes that have resulted in a substantial drop in experience levels, at the facility level in particular
• Maintaining the effectiveness of process safety training and knowledge at all levels of the organization
• Adjusting to meet the needs of aging plants
• Ensuring programs are in place to monitor the health of protective barriers
• Improving all aspects of emergency preparedness, response and recovery plans
• Improving communications with external stakeholders

Of particular importance is the need to educate, train and raise process safety awareness at all levels of the organization. From frontline operators to the topmost senior management of the company, the importance of safety should be understood and made second nature. In that regard, training targeted to each level of the organization should be provided. Technical colleges still largely fall short in providing sufficient process safety knowledge to their students, who may end up working in process facilities. Curriculum development focused on the needs of such colleges, as well as universities, can help fill the gap. Bootcamps designed for executives can help senior management make more informed risk-based decisions. The general public, with whom this article began, can be made risk-aware through good communication and education. Understanding the hazards of a chemical facility in their vicinity and the role of barriers in containing those hazards can influence public acceptance criteria.

THE ROLE OF GOVERNMENT REGULATION

In response to major incidents with significant impacts, governments around the world have instituted regulations that set out minimum standards for operating companies. Examples of government oversight driven by major incidents are:

EU Seveso Directives (1982, 1996 and 2012): These directives were implemented in response to major chemical plant incidents, the first of which was at the township of Seveso in Italy in 1976. The incident in Seveso had a devastating environmental impact on surrounding lands, killing livestock and wild animals, when highly toxic dioxin was released and dispersed after a runaway reaction at a nearby plant. An amendment to the Directive in 2003 also followed several incidents: a cyanide spill in the Danube river, a fireworks warehouse explosion that killed firefighters and nearby residents in the Netherlands and an ammonium nitrate detonation in France. The final Seveso III Directive came out in 2012.

United Kingdom HSE: The undertakings of the UK HSE primarily followed from 1974’s Flixborough incident in Britain, where the lack of process safety expertise and insufficient appreciation of the failure consequences led to an explosion that killed 28 people and injured 36. HSE ensures the implementation of its legislation, Control of Major Accident Hazards (COMAH), which was derived from the EU Seveso Directive I. The performance-based regulation requires facilities to take responsibility for protecting their employees and the public by reducing their risk of operation to “as low as reasonably practicable” (ALARP). The Offshore Installations (Safety Case) Regulations were introduced in 1992 in response to the Piper Alpha disaster, which resulted in 167 deaths from fires and explosions on a large offshore platform in the North Sea.

United States Occupational Safety and Health Administration (OSHA): Established in 1971, OSHA promulgated the PSM regulations in 1992 in response to a series of industrial accidents, beginning with the toxic methyl isocyanate (MIC) release from the Union Carbide facility in Bhopal, India, in 1984. In 1985, Union Carbide had another release of toxic chemicals, this time within the United States at its West Virginia facility, that injured 135 people. Other incidents included the Phillips incident in Pasadena, Texas, in 1989, which resulted in 23 fatalities and the complete destruction of a large chemical plant from several powerful explosions and associated fires after a massive LOPC of volatile liquids and flammable gases occurred during regular maintenance work.

The regulations that came about in response to these major incidents established minimum standards for industry to achieve in programs related to the safety and process safety of their operating facilities. These regulations, supported by regulatory audits of operating facilities, have been significant in driving safety performance. For operations that do not have the resources or expertise to implement their own corporate standards, compliance with regulatory requirements guides their operating performance. This overall approach is an outside-in paradigm, which by its very nature gives rise to mixed results.

The Voluntary Protection Programs (VPP) promoted by OSHA encourage the development of effective work-site safety and health programs through voluntary cooperation among management, labor and OSHA. Operations that can demonstrate these cooperative relationships in the workplace and have implemented a comprehensive safety and health management system can apply for VPP status. Approval into VPP is OSHA’s official recognition of the outstanding efforts of employees and employers who have achieved exemplary occupational health and safety.

OSHA also runs other programs, such as the OSHA Strategic Partnerships and the Alliances Program, to promote safe and healthy work environments through cooperation among the agency, industry and other stakeholders.

United States Environmental Protection Agency (EPA): In the initial years of EPA, Superfund sites were established following the disastrous consequences of unregulated hazardous waste dumping in the community of Love Canal. In 1994, two years after the launch of OSHA’s PSM, EPA published its first List of Substances and Threshold Quantities, and in 1996 it issued the Risk Management Plan (RMP) rule.

While OSHA focused on employee safety within a facility through its PSM, the purpose of EPA’s RMP was to protect communities beyond a facility’s fence line. The rule required the facility owner or operator to conduct a hazard assessment, including offsite consequence analysis; develop prevention programs that ran parallel to OSHA’s PSM; have emergency response programs; and submit a risk management plan to EPA.

United States Bureau of Safety and Environmental Enforcement (BSEE): Following the 2010 Deepwater Horizon incident, also known as the Macondo disaster, the then Minerals Management Service (MMS) was reorganized into the Bureau of Ocean Energy Management, Regulation and Enforcement (BOEMRE), from which BSEE was later formed. BSEE enacted regulatory reforms to achieve both improved drilling safety and worker safety through the Safety and Environmental Management Systems (SEMS). The wide-scale environmental impact, together with the loss of 11 lives and the entire drilling rig at the Macondo well, had a significant effect on the entire oil and gas industry after 2010.

ROLE OF INDUSTRY

Most larger corporations, however, have developed internal stan-dards and corporate performance requirements that go well beyond regulatory compliance. Examples of these integrated performance driv-ing standard include ExxonMobil’s Operational Integrity Manage-ment System (OIMS) program, DuPont’s Process Safety Man-agement System and Chevron’s Operational Excellence Man-agement System. These programs integrate all of the components of process safety, mechanical integ-rity, integrity management, etc., into one over-arching program that drives operations toward the goal of achieving operational excellence.

Responsible Care (RC) Initiative: In addition to the corporate stan-dards and programs mentioned

October 2020 / MKO Process Safety Journal-12-

above, industry groups have developed standards to provide further guidance to their member companies to drive facility per-formance toward the long-term goal of operational excellence. Responsible Care is an example of such an industry initiative designed to drive improvements in the health and safety perfor-mance related to employees, the environment and the communi-ties in which they operate.

This initiative, which started in Canada in 1984, has grown into a global initiative now adopted in 68 countries. In the United States, the American Chemistry Council has made participation in the RC program a condition of membership. The commitment to the guiding principles is an inside-out paradigm, which by its very nature drives overall performance from the C-suite to the operating floor. Member companies welcome the input of the communities in which they operate by holding open houses and proudly outlining the many programs in place to protect the health and welfare of their local communities. With programs like RC, industry is seeking to earn the right to continue to operate in its current areas through overall transparency.

OVERALL SAFETY OF OPERATING FACILITIES

Operating plants that have developed and maintained effective process safety programs are indeed safe to work in and live near. However, there are no guarantees unless proper process safety measures are taken and appreciated. While the probability of a LOPC event occurring and escalating into a significant incident with offsite impacts is extremely remote, things will go wrong unless adequate barriers are put in place to stop the trajectory of an initiating event. To gain a high level of confidence, members of the public are encouraged to form their own opinions by attending plant open houses and asking questions related to the safety of plant operations. Statistically, one of the safest places to be is in a well-managed and well-operated processing plant.

Process safety implementation that begins with the inception of a facility and continues throughout its life cycle can ensure that incidents such as those that occurred in Bhopal, Texas City and Macondo are not repeated. Operational discipline through commitment to safe operation, not only by frontline operators but by all levels of the organization, is of utmost importance. This can be achieved through targeted training and education on process safety.

The future of process safety is indeed promising. As companies move toward digitalization and data on plant operations becomes readily available, relevant parties will be able to monitor the quality of the safety management system by following trends in data that can indicate imminent danger, termed leading indicators. Past incident data can be analyzed to learn and take proactive measures that prevent the onset of future incidents. Some companies already are on that route. For the first time, this will give safety chiefs more to go on than just gut feeling.

DR. STEWART BEHIE is the Interim Director of the Mary Kay O’Connor Process Safety Center and Professor of Practice in the Department of Chemical Engineering at Texas A&M University. He has more than 40 years of experience in the oil and gas industry in various roles related to process safety, risk assessment and management in North America and the Middle East. While with Dolphin Energy in Qatar, Dr. Behie filled in as Chief Emergency Officer for a year, responsible for preparing the fire response and medical teams for full operations. His last role was HES and Safety Manager for onshore and offshore operations. He can be reached at [email protected].


Get the Best Out of Incident Data
Let machine learning and artificial intelligence do the heavy lifting

By Noor Quddus, Mary Kay O’Connor Process Safety Center, Texas A&M University

Digitalization and automation of industrial activities have led to data playing a much bigger role than ever before. Buzzwords such as big data, machine learning, Internet of Things, digital twin and 5G are commonplace, often without a clear understanding of how they relate to our work and how they will change the working environment in the future.

The technologies behind the buzzwords benefit not only computer science and information technology professionals but also provide a technological edge to all engineering professionals who understand and use them. These terms have infiltrated the process safety and risk assessment disciplines in diverse ways.

However, one trend recently has received a lot of interest among process safety professionals in both industry and academia: the analysis of process safety incident data to extract useful information and trends to generate valuable knowledge and actionable solutions. While the focus currently is on lagging indicators, the future focus needs to be on the development of leading indicators derived from facility safety management systems.

The ultimate goal is to understand what incident data is telling us about the safety of a process, an operation, a whole plant or an industry at large. We see efforts to uncover treasures from both very scant and very big datasets. The real challenge is to turn the data into actionable solutions using both human and artificial intelligence. Of course, incident data is generated in individual facilities, but a great deal of data is available in the public domain. To illustrate the point that I want to make, I’ll focus on the latter.

PUBLIC DOMAIN INCIDENT DATA

A large amount of incident data is available in the public domain, collected and hosted by several federal agencies such as the Occupational Safety and Health Administration (OSHA), Pipeline and Hazardous Materials Safety Administration (PHMSA) and Bureau of Safety and Environmental Enforcement (BSEE). The National Response Center (NRC) database hosted by the United States Coast Guard also collects a huge amount of release data.

A few other databases are maintained by industry bodies such as the Center for Chemical Process Safety (CCPS) and Center for Offshore Safety (COS); these are not publicly available and are accessible only to affiliated, paying members. These databases contain incident records reported by operating companies as guided by regulatory requirements or industry standards (e.g., API RP 754). Different reporting criteria, formats and data processing standards result in significant variation in the datasets. Another great source of incident data is the incident investigation reports available from the Chemical Safety and Hazard Investigation Board (CSB) and National Transportation Safety Board (NTSB), along with the other agencies mentioned above.

USING THE DATA

There is no doubt that the data is beneficial and provides valuable recommendations to improve process safety management in operating facilities, but we should be aware of the limitations of the data that currently is available. Extracting useful information from these colossal amounts of data is tedious and challenging. Because the datasets differ from one another, the techniques developed to analyze one dataset may not be directly applicable to another. In many cases, in-depth data has not been collected because of limitations of the predetermined reporting criteria.

Nevertheless, these databases and collected data have one thing in common: They all are related to hazardous materials and hence to process safety. It is possible to translate the gathered data into information and convert it into knowledge using the necessary domain expertise. However, we need appropriate computational capacity, through advanced machine learning and artificial intelligence techniques, to analyze these large sets of data, extract valuable information and convert it into useful knowledge and actionable solutions.

The greatest learning from incident data is identifying an incident’s root cause and obtaining insight on which proactive measures will prevent future incidents. However, typical incident databases report and categorize the direct causes of incidents and only sometimes provide hints of underlying causes. Interactions among the causes or factors that contributed to the incidents can be identified only from in-depth analysis or incident investigation reports.

It is possible to walk back from an incident to identify the deviations of process variables, inadequate measures, and human and organizational issues, all the way to the policy, standards and regulatory pieces that contributed to it. Such information is instrumental for incident prevention and mitigation measures, for adapting process safety management policies for the future and for helping us identify components of process safety management elements that need to be improved.

TURNING DATA INTO KNOWLEDGE

We often call it “learning from incidents” in the process safety literature. Sometimes these learnings can be complex but comprehensible. More important, abundant relevant information extracted from these data sources is available in various well-known resources. It all becomes knowledge when someone can connect the dots among different pieces of information and understand the underlying mechanism that led to similar incidents.

However, the challenge is that the relevant data, information and knowledge are not organized in an easy, accessible and structured manner. No formal and uniform structure or classification system has been developed that can guide us to distinguish among data, information and knowledge. This overall situation is part of the reason why companies have a poor history of learning from past incidents, whether their own or those from their industry sector.

Data-Information-Knowledge-Wisdom (DIKW), a hierarchical model, can be useful to explain some structural and functional relationships among data, information, knowledge and wisdom. From the perspective of incident data analysis, data represents gathered facts and figures relevant to material (e.g., flammability limit and amount of hazardous material), processes and equipment (e.g., pressure and temperature), personnel (e.g., experience and training) and organization and industry (e.g., standards and policy), to name a few.

Any higher level of observation that establishes an association among the data or with the incident can be considered information. Typically such information is used to characterize most of the incidents. Examples include fire, toxic release, equipment failure, lack of training and inadequate standards. The next level of conclusions that can be drawn from such information is termed knowledge. A few examples of extracted knowledge are determining the overpressure of an explosion from incident data, understanding issues leading to a deflagration-to-detonation transition (DDT) event or failure from corrosion in a pipeline system, and understanding the causes behind an operator’s failure to adhere to a procedure or the safety culture maintained in a facility.

Information gathered from various sources needs to be processed and analyzed before a conclusion (i.e., knowledge) can be drawn. Sometimes this information is compromised by subjective observations and interpretation. This is one of the reasons that wide variations in gathered knowledge exist. It is relatively easy to make a distinction between data and information, but it is harder to do the same between information and different knowledge levels.

METHODOLOGY

Some knowledge can be explicit while some is tacit. It is important that we establish a methodology that allows one to convert information into knowledge in an objective manner:

• Causal inference, the process of drawing a conclusion about a causal connection between events based on the conditions under which an effect occurs, can be useful in establishing relationships among the information at hand.

• The ladder of causality provides different levels of insight by converting data and information into knowledge in a more objective fashion.

The ladder has three levels of causation: association, intervention and counterfactual.

Basic level. At the basic level, associations invoke purely statistical relationships, defined by the data. For instance, from past incident data we know that the extent of vapor cloud congestion somehow plays an important role in DDT incidents, or that the pH value of a medium is important for a corrosion mechanism to propagate. Such associations can be inferred directly from the gathered information. However, these associations are purely statistical relations and do not necessarily provide any cause-and-effect relationship among the variables. Hence, these associations are placed at the bottom of the hierarchy.
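The associational level can be illustrated with a minimal Python sketch. The records below are invented for illustration, and the function name `conditional_frequency` is hypothetical rather than taken from any incident database. The sketch only shows that a conditional frequency is a statistical statement; it says nothing about cause and effect.

```python
# Hypothetical incident records: (congestion level, whether DDT occurred).
# All values are invented for illustration only.
records = [
    ("high", True), ("high", True), ("high", False), ("high", True),
    ("low", False), ("low", False), ("low", True), ("low", False),
]

def conditional_frequency(records, level):
    """Fraction of records at the given congestion level in which DDT occurred."""
    outcomes = [ddt for congestion, ddt in records if congestion == level]
    return sum(outcomes) / len(outcomes)

p_high = conditional_frequency(records, "high")
p_low = conditional_frequency(records, "low")
print(f"P(DDT | high congestion) = {p_high:.2f}")
print(f"P(DDT | low congestion)  = {p_low:.2f}")
```

A difference between the two frequencies signals an association worth investigating, but on its own it cannot say what would happen if congestion were deliberately changed.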

Second level. The second level, intervention, ranks higher than association because it can answer what happens if there is any change in one variable. This level should be able to answer: What happens (effect) if we change the level of congestion (cause)? How does it affect the onset of DDT? If we increase the pH value (cause), how does it affect the corrosion rate (effect)?

Answers to such questions may or may not come from incident data alone. More information may be needed, from external sources, to answer these questions. Moreover, the past explosion incidents might have occurred under varying conditions, or the pipe failures may have taken place at different operating conditions. Models at the interventional level will not be able to answer questions about conditions for which they were not developed.

Top level. The top level is called counterfactuals. A typical question in the counterfactual category is “What if I were to act differently?” and thus necessitates retrospective reasoning. What if the obstruction has a different geometrical shape? What if we use corrosion inhibitors? Counterfactuals are placed at the top of the hierarchy because they subsume interventional and associational questions.

If we have a model that can answer counterfactual queries, we also can answer questions about interventions and associations. For example, a robust corrosion model at the counterfactual level will be able to answer all mechanistic questions regarding corrosion even if some conditions change. Such a model will be able to determine, for instance, the cathodic protection or inhibition necessary to control the rate of corrosion.

However, models at the interventional or associational levels will not be able to answer such queries. Corrosion rate cannot be predicted simply by knowing that pH plays an important role in a corrosion mechanism, which is associational-level knowledge. No counterfactual question involving retrospection can be answered from purely interventional information because the effect may depend on multiple causes, and changes in any of those causes will change the effect.
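The gap between the associational and interventional levels can be made concrete with a small sketch. It assumes a hypothetical three-variable model in which a background condition `z` influences both the variable `x` and the outcome `y`; every probability is invented, and the function names are illustrative only. The sketch compares what passively collected data would show with what a deliberate change of `x` would produce.

```python
# Hypothetical structural model with three binary variables (all numbers invented):
#   z: a background condition that influences both x and y (a confounder)
#   x: the variable we might intervene on (e.g., "pH is low")
#   y: the outcome (e.g., "corrosion rate is high")
P_Z = {1: 0.5, 0: 0.5}            # P(z)
P_X_GIVEN_Z = {1: 0.9, 0: 0.1}    # P(x=1 | z)

def p_y_given_xz(x, z):
    """P(y=1 | x, z) for the assumed mechanism."""
    return 0.1 + 0.2 * x + 0.6 * z

def observed(x):
    """Association: P(y=1 | x) as it would appear in passively collected data."""
    num = den = 0.0
    for z in (0, 1):
        p_x = P_X_GIVEN_Z[z] if x == 1 else 1 - P_X_GIVEN_Z[z]
        joint = P_Z[z] * p_x
        num += joint * p_y_given_xz(x, z)
        den += joint
    return num / den

def intervened(x):
    """Intervention: P(y=1 | do(x)), where x is set regardless of z."""
    return sum(P_Z[z] * p_y_given_xz(x, z) for z in (0, 1))

print("observed difference:  ", round(observed(1) - observed(0), 2))
print("intervened difference:", round(intervened(1) - intervened(0), 2))
```

In this toy model the observed difference overstates the effect of changing `x`, because part of the association comes from `z`. That is the reason associational knowledge alone cannot answer interventional questions.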

MIND VS. MACHINE

The human mind can understand immediately what is cause and what is effect; mathematics and computers cannot. On the other hand, our mind is limited in its ability to comprehend the multiplicity of causes and effects that exist in a system at one time, whereas computers can handle large quantities of operations fast. To resolve this situation and apply a computer’s capability, our target should be to develop a data-centric model at the counterfactual level that can answer counterfactual queries.

Because ladder of causation models can be developed using a formal mathematical structure, it is possible to develop them in a more objective manner. Moreover, machine learning algorithms are available to scavenge and classify large datasets for information. However, currently it is only possible to develop such models for small systems such as DDT determination and corrosion rate measurement. Developing such a model is very challenging for a large complex system such as an entire chemical plant, oil refinery or offshore drilling rig.

Machine learning and artificial intelligence techniques and tools can be very helpful when converting data into information and knowledge. They are beneficial for all three levels of the hierarchy of causal models. Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large datasets. It is particularly useful in reducing the number of variables to be used for further processing when a dataset has a large number of fields for each incident record.
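As a rough illustration of association rule learning, the sketch below computes the three standard rule measures (support, confidence and lift) for a candidate rule over a handful of invented incident records. The record contents and function names are assumptions made for this example; a production analysis would use a dedicated rule-mining algorithm over far larger datasets.

```python
# Hypothetical incident records, each a set of categorical attributes (invented data).
incidents = [
    {"corrosion", "pipeline", "release"},
    {"corrosion", "pipeline", "release"},
    {"corrosion", "tank"},
    {"overpressure", "pipeline", "release"},
    {"overpressure", "tank", "fire"},
]

def support(itemset, records):
    """Fraction of records that contain every item in the itemset."""
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent, records):
    """Of the records containing the antecedent, the fraction also containing the consequent."""
    return support(antecedent | consequent, records) / support(antecedent, records)

def lift(antecedent, consequent, records):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent, records) / support(consequent, records)

rule_if, rule_then = {"corrosion", "pipeline"}, {"release"}
print("support:   ", support(rule_if | rule_then, incidents))
print("confidence:", confidence(rule_if, rule_then, incidents))
print("lift:      ", round(lift(rule_if, rule_then, incidents), 2))
```

Fields that never appear in rules with meaningful lift are candidates for dropping before further model development.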

Once the most relevant variables are identified, several other machine learning techniques can be used effectively for further model development, such as classification techniques (e.g., decision tree, random forest and support vector machine), clustering techniques (e.g., k-means and density-based) and self-learning (neural network). The latter, artificial neural networks, now come in many variations for different tasks, but in general they have superior data-fitting capability and are suitable for pattern recognition. Some are more effective than others at identifying the relationships among the variables and supporting prediction, although they don’t have much power outside the data range of interest. Artificial neural networks vary in ease of use, data screening requirements, robustness and accuracy, and it might not be wise to rank one over another in terms of effectiveness.

The major limitation of this class of techniques is that the models are useful only as long as they are used for the same system from which the data was collected. Consider a dataset collected for corrosion failure of a pipeline system, from which a model is developed with an excellent accuracy level. If the operator introduces a new inspection mechanism or replaces a reinforced old steel pipeline with plastic pipes, the model’s prediction capability will deteriorate because no relevant data was used to train the model. A self-learning model will regain its effectiveness over time as it acquires data from the new system. In other words, such models do not get past the second rung of the ladder of causation, i.e., intervention, to become counterfactual.

MACHINE LEARNING TECHNIQUES

Bayesian network. A solution to this can be the use of a Bayesian network (BN). This probabilistic graphical model represents the conditional dependency among all identified variables with a directed acyclic graph and as such forms a causal structure. It is acyclic because an effect cannot be its own cause, although pertinent feedback can be included. It applies Bayes’ theorem, which formalizes how prior statistical information can be updated with new information.
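The updating step at the heart of Bayes’ theorem can be written in a few lines. In this sketch the function name `bayes_update` and all numbers are assumptions chosen for illustration: a prior belief that a pipe segment is degraded is revised after an inspection alarm, given an assumed alarm rate for degraded and for sound segments.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after observing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Invented numbers: 10% prior that the segment is degraded; the inspection
# alarms on 90% of degraded segments and on 20% of sound ones.
posterior = bayes_update(prior=0.10, p_evidence_if_true=0.90, p_evidence_if_false=0.20)
print(f"P(degraded | alarm) = {posterior:.3f}")
```

The posterior from one observation can serve as the prior for the next, which is how a network keeps absorbing new data.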

A BN has the advantage of meeting the requirements of all three levels of the ladder of causation. Given data, the network can determine the probabilities of other dependent variables, hence effects that in turn can be causes for others, and it can take advantage of expert knowledge where data is absent. However, building a robust BN requires precise knowledge of a system’s causation structure, as in a fault tree or an event tree, although algorithms are available to build the most probable network from given data. Overall, it is safe to say that BNs have specific advantages over other machine learning techniques both for incident analysis and for learning from incidents.
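A very small Bayesian network can be evaluated by brute-force enumeration, which is enough to show how evidence on an effect changes the probability of its causes. The structure (coating damage and low pH both influence corrosion, which influences a leak), the probability tables and the function names below are all hypothetical; real networks would be built with a dedicated library and validated tables.

```python
from itertools import product

# Hypothetical network: coating_damage -> corrosion <- low_ph, corrosion -> leak.
# All probabilities are invented for illustration.
VARS = ("coating_damage", "low_ph", "corrosion", "leak")
P_CORROSION = {(0, 0): 0.05, (0, 1): 0.30, (1, 0): 0.40, (1, 1): 0.80}

def joint(a):
    """Joint probability of one full assignment, factored along the graph."""
    p = 0.2 if a["coating_damage"] else 0.8
    p *= 0.3 if a["low_ph"] else 0.7
    p_k = P_CORROSION[(a["coating_damage"], a["low_ph"])]
    p *= p_k if a["corrosion"] else 1 - p_k
    p_l = 0.5 if a["corrosion"] else 0.01
    p *= p_l if a["leak"] else 1 - p_l
    return p

def probability(query, evidence=None):
    """P(query | evidence) by summing the joint over all consistent assignments."""
    evidence = evidence or {}
    num = den = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        a = dict(zip(VARS, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue
        p = joint(a)
        den += p
        if all(a[k] == v for k, v in query.items()):
            num += p
    return num / den

print("P(leak)                  =", round(probability({"leak": 1}), 4))
print("P(coating damage | leak) =", round(probability({"coating_damage": 1}, {"leak": 1}), 4))
```

Observing a leak raises the probability of coating damage above its prior, which is the diagnostic, effect-to-cause reasoning that makes BNs attractive for incident analysis.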

Natural language processing. Another tool useful for learning from incident findings is natural language processing (NLP), in particular for analyzing incident descriptions and incident investigation reports. Very often, incidents are reported in a structured format, and if there is additional information that does not fit in that structure, then the only means to report it is through incident narratives. It is almost impossible to examine thousands of such incident narratives manually and extract useful information. NLP uses different types of machine learning algorithms to analyze the narratives and extract useful information that can be used for further analysis. NLP also can be used to analyze incident investigation reports for thematic study. The data extracted by NLP then can be used to develop a counterfactual model via a BN.
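The first step of most narrative analysis is turning free text into countable terms. The sketch below is a deliberately simple stand-in for a full NLP pipeline: the narratives, the stopword list and the function names are invented, and it uses only the Python standard library to count in how many narratives each term appears.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "was", "on", "at",
             "from", "during", "after", "led", "under"}

# Hypothetical incident narratives (invented text).
narratives = [
    "Operator observed a leak from the flange during startup.",
    "Corrosion under insulation led to a leak in the transfer line.",
    "Relief valve lifted after overpressure in the reactor.",
]

def tokenize(text):
    """Lowercase the text, keep alphabetic words and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def document_frequency(texts):
    """Count the number of narratives in which each term appears at least once."""
    counts = Counter()
    for text in texts:
        counts.update(set(tokenize(text)))
    return counts

print(document_frequency(narratives).most_common(3))
```

Terms that recur across many narratives can then be turned into structured fields for the classification or network models discussed above.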

At the Mary Kay O’Connor Process Safety Center (MKOPSC), machine learning techniques have been implemented at all three levels of causation for various application areas. For predicting material properties, corrosion failure in pipelines and procedural complexity, interventional models have been developed using algorithms such as random forest, support vector machines, decision tree, k-nearest neighbors and artificial neural network. BNs have been used to predict pipeline failure, microbial corrosion and offshore fire incidents, and for developing leading indicators, performing fault diagnosis and identifying weak signals, to name a few. NLP has been used to extract information from both incident records and incident investigation reports.

Although there are a few excellent incident databases and incident investigation resources, there still is scope for improvement in data collection methodologies and subsequent analysis techniques. We continue to observe similar incidents occurring repeatedly and wonder why we are not learning. The knowledge gathered from the incident data is scattered and inaccessible to stakeholders when it is needed. We need to ponder how we can build a system that will not only translate the data and information into knowledge, but also identify the required knowledge and deliver it at the right time in the right place to the right people. Perhaps we can call that wisdom. Will the future AI have that wisdom?

NOOR QUDDUS, PHD, is a research scientist at the Mary Kay O’Connor Process Safety Center, Texas A&M University. He can be reached at [email protected].


Effectively Share Insights from Incidents
A proven approach can ensure lessons get communicated and acted upon

By Mike Bearrow, Rolls-Royce Controls and Data Services, and Kim Turner, consultant

We all are continually learning. Learning from our own mistakes as well as from the experiences of others makes us better prepared to deal with future events. Organizations are no different: successful ones use insights gained from mistakes, incidents, accidents and other undesirable occurrences to improve.

Identifying those lessons you really must learn, regardless of their source, and putting into action a process to take advantage of the learning is the very definition of continuous improvement that can make your organization safer, smarter and more sustainable. The opportunities those lessons afford can play a key role in remaining competitive, effective, efficient and profitable.

Every site in your organization spots local opportunities. However, some ideas, e.g., ones related to operational improvement or process safety, that could have broader applicability often don’t get entered into a corporate knowledge- or incident-management system because site staff don’t appreciate their wider value to the organization.

Only a small percentage of opportunities identified at a facility will be important to its larger business unit. Even fewer will have enterprise-wide relevance (Figure 1). However, those that do must be managed in an efficient and effective way.

An organization should look for more than internal ideas. It should check for opportunities uncovered elsewhere. Regulatory bodies often spell out general lessons from major events. So, review high-profile incident investigations and findings (e.g., on the explosion at BP’s Texas City refinery, the Buncefield fuel-depot disaster in the U.K., and the Fukushima nuclear catastrophe in Japan) for opportunities that might apply to your organization. Changes in reporting requirements by the U.S. Environmental Protection Agency and Occupational Safety and Health Administration, and the U.K. Health and Safety Executive’s Control of Major Accident Hazards (COMAH) regulations also require facilities to react and adapt.

In addition, you must consider how your organization manages global opportunities today. Often, it involves casual or informal sharing. Unfortunately, this usually is inadequate because the appropriate people in your organization may not see or understand the opportunities. Companies are using different techniques today to address this tough problem, with mixed results. We recommend an approach called HUAA that has proven effective.

A MORE EFFECTIVE APPROACH

HUAA stands for Heard, Understood, Acknowledged and Actioned. It consists of the following core steps:

1. Identifying opportunities (Heard);

2. Entering them into an electronic system;

3. Having experts review them (Understood);

4. Accepting or rejecting them (Acknowledged); and

5. Assigning each to leadership to resolve and track to closure (Actioned).
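The five steps above can be sketched as a small state machine in which an opportunity may only move forward along permitted transitions and every move is logged. This is one possible reading of the steps, not the authors’ system: the class, stage and field names are hypothetical, and treating rejection as a terminal state reached from review is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum

class Stage(Enum):
    HEARD = "heard"                # identified and entered into the system
    UNDERSTOOD = "understood"      # reviewed by an expert
    ACKNOWLEDGED = "acknowledged"  # accepted into the system
    REJECTED = "rejected"          # rejected, with feedback to the initiator
    ACTIONED = "actioned"          # assigned to a leader and tracked to closure

# Permitted transitions (an assumption about how the steps chain together).
TRANSITIONS = {
    Stage.HEARD: {Stage.UNDERSTOOD},
    Stage.UNDERSTOOD: {Stage.ACKNOWLEDGED, Stage.REJECTED},
    Stage.ACKNOWLEDGED: {Stage.ACTIONED},
    Stage.REJECTED: set(),
    Stage.ACTIONED: set(),
}

@dataclass
class Opportunity:
    title: str
    stage: Stage = Stage.HEARD
    history: list = field(default_factory=list)

    def advance(self, new_stage, note=""):
        """Move to a new stage if permitted, recording the step for later audit."""
        if new_stage not in TRANSITIONS[self.stage]:
            raise ValueError(f"cannot move from {self.stage.value} to {new_stage.value}")
        self.history.append((self.stage, new_stage, note))
        self.stage = new_stage

opp = Opportunity("Apply external incident finding to site relief systems")
opp.advance(Stage.UNDERSTOOD, "reviewed by PSM specialist")
opp.advance(Stage.ACKNOWLEDGED, "accepted, medium priority")
opp.advance(Stage.ACTIONED, "assigned to plant manager")
print(opp.stage.value, len(opp.history))
```

Keeping the history on the record is what makes the later reporting and audit questions answerable.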

A good HUAA process provides visibility to leadership about the opportunities being identified, and reports on the progress in real time (Figure 2).

BROADER APPLICABILITY
Figure 1. Events at sites lead to local learning and corrective or preventive action (CAPA) but some may apply more widely, to the business unit (BU) or entire enterprise. [Diagram labels: most IMS value is at the site level, where the risk is and the events happen; site: 85%, events, local learning and CAPA; region: 7%, BU learning and CAPA; global company: 3%, global learning and CAPA.]

The approach offers a number of other significant benefits:

• Efficient collection. Important opportunities from many sources (internal and external) are collected and entered into the HUAA management system, streamlining the recording process and getting the opportunities quickly to decision-makers.

• Expert review. These opportunities are vetted, parsed and resolved for inclusion or exclusion, with the decision documented and communicated.

• Accountable assignment. Accepted opportunities are systematically assigned to leaders and managed across the enterprise at the executive level.

• Action. Leaders are held accountable for their actions. Moreover, results of their actions as well as inaction can be seen in real time, which spurs action.

• Results. The organization will gain competitive advantage by active listening, learning from its experience as well as the experience of others, and by taking action. The results of the process can be audited, judged and continually improved.

The HUAA process can underpin an organization’s knowledge-management strategy and provide the most important part: action!

HUAA STEPS

The management system is a blend of people, technology and process. All the pieces must fit together and work efficiently. Probably the most important design element is the visible accountability of everyone involved, which allows leadership to constantly review progress and ask questions about global opportunities and their closure. Let’s take a deeper look at each of the HUAA steps.

HUAA APPROACH
Figure 2. This formal process involves several distinct steps and requires both feedback to the idea initiator and action. [Diagram labels: external and internal opportunities (global concerns, incident learnings, CSB findings, API standards, audit best practice); Heard: opportunity collected and submitted to the HUAA system; Understood: analysis of the opportunity, root causes, lessons learned, review of applicability; Acknowledged: acceptance or rejection of the opportunity into the HUAA system, prioritize the opportunity; Action: assign and report, measure performance; feedback to initiator; action for senior management; communication and reporting along the HUAA continuum.]

Heard. First, opportunities must be heard. That means key staff members (idea generators) must be on the lookout for opportunities at all times in all places. They must be active attendees at events that might produce an opportunity. For instance, a key mechanical engineer might be charged with going to meetings of the American Society of Mechanical Engineers and even participating in groups developing industry guidance. A process safety management (PSM) specialist might attend meetings of AIChE’s Center for Chemical Process Safety and the Mary Kay O’Connor Process Safety Center International Symposium and even volunteer to chair a working group. This PSM person also must keep up-to-date on, e.g., safety incidents occurring in the industry as well as changes to PSM and risk-management-plan rules, suggested or on the horizon.

After identifying a potential opportunity, the idea generator enters it into the HUAA management system. This involves explaining what the opportunity is and why it’s important to the organization (upside and downside). The person should be gathering opportunities and entering them all the time.

Understood. Once an idea is captured, the HUAA process must initiate a quality review by the right person at the right time. Inaction should generate escalation so ideas don’t await review for too long.

The reviewer acts as gatekeeper, evaluating the idea for applicability (understanding), and either rejecting or accepting it into the system. The person also provides a value and priority for any accepted idea. The next step is the acknowledgment of the idea.

Acknowledged. The gatekeeper communicates the decision to the initiator and documents it in the system. If the idea is rejected, recording closure comments and sending these to the initiator closes the communication loop; the initiator has spent time and energy entering the idea in the first place and believes it has merit, and so deserves such feedback. Documenting the reason for rejection is a great way of monitoring involvement and the quality of the process.

The gatekeeper then selects a leader, like a plant manager, to be responsible for the accepted idea. Assigning a senior manager is an essential part of this process. The individual must be someone with the resources, both human and monetary, and influence to address the opportunity. More importantly, the person must be accountable to see that these important opportunities are handled appropriately.

The gatekeeper also must ensure the idea is assigned a priority (importance) that takes into account severity, frequency, etc. Setting that priority simply may involve checking a box for big, medium or small, or may leverage the corporate risk-ranking matrix with severity and frequency. Severity must consider safety, environment, reputation, assets, etc., to properly compare one opportunity to another. Ideally, both current and future risk should be identified for each opportunity.
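A risk-ranking matrix of the kind mentioned here can be reduced to a few lines of code. The sketch below is illustrative only: the 1-to-5 scales, the score thresholds and the function name `priority` are assumptions, and a real organization would substitute its own corporate matrix. Severity is taken as the worst rating across the consequence categories so that opportunities can be compared on a common basis.

```python
def priority(severity_by_category, frequency):
    """Rank an opportunity as big, medium or small from severity and frequency.

    severity_by_category: ratings from 1 (minor) to 5 (severe) per category,
        e.g., safety, environment, reputation, assets (assumed scale).
    frequency: rating from 1 (rare) to 5 (frequent) (assumed scale).
    """
    severity = max(severity_by_category.values())  # worst case governs
    score = severity * frequency
    if score >= 15:      # assumed threshold
        return "big"
    if score >= 6:       # assumed threshold
        return "medium"
    return "small"

print(priority({"safety": 5, "environment": 2, "reputation": 3, "assets": 2}, frequency=3))
print(priority({"safety": 2, "assets": 2}, frequency=2))
```

Running the same function on the expected ratings after the action is complete gives the future risk alongside the current one.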

Up to this point, there’s been motion but no action. Knowledge or wisdom without action is wasted.

Actioned. This step is where we reap the rewards of the HUAA process. Creating an action plan to address the opportunity, assigning actions and tracking those actions to ensure everything is achieved on time and to quality allow realizing the opportunity quickly, effectively and efficiently. The responsible person can develop an action plan on how to address the assigned idea in as much detail as desired, and can make as many subordinates as needed responsible for specific actions. The person must set firm deadlines for all actions to underscore that things must get done. Monitoring progress becomes easy, as does confirming the value is realized. Continual monitoring ensures the opportunity remains satisfied. (Maybe we should add another “A” for auditing to the HUAA process; that would make it HUAAA!)

The audit step covers two angles: ensuring changes continue to be embedded into the operation of a facility; and understanding who is making recommendations for opportunities, their quality, and acceptance and rejection rate. This allows a holistic view not only of the volume but also of the quality of opportunities and reviews. It can assist in identifying any weaknesses in your HUAA process. For instance:

• Are opportunities being documented effectively to allow the reviewer to understand?

• Are the reviewers rejecting opportunities because of a weakness in their knowledge of a specific area?

• Are there people who you would have expected to enter opportunities who never have?

• Are there people who consistently miss their deadlines for actions?

All these findings can help make your HUAA process even more effective.
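Several of the audit questions above lend themselves to simple metrics. The sketch below is one possible way to compute them from exported records; the record layouts (tuples of reviewer and decision, or owner, due date and completion date) and the function names are assumptions for illustration, not features of any specific opportunity-management tool.

```python
from collections import Counter
from datetime import date


def rejection_rates(reviews):
    """reviews: iterable of (reviewer, accepted) pairs -> rejection rate per reviewer."""
    totals, rejected = Counter(), Counter()
    for reviewer, accepted in reviews:
        totals[reviewer] += 1
        if not accepted:
            rejected[reviewer] += 1
    return {reviewer: rejected[reviewer] / totals[reviewer] for reviewer in totals}


def missed_deadlines(actions, today):
    """actions: iterable of (owner, due, completed_or_None) -> late or overdue count per owner."""
    missed = Counter()
    for owner, due, completed in actions:
        # An open action is judged against today's date.
        finished = completed if completed is not None else today
        if finished > due:
            missed[owner] += 1
    return dict(missed)


def silent_contributors(expected, submitters):
    """People expected to enter opportunities who never have."""
    return sorted(set(expected) - set(submitters))


reviews = [("ann", True), ("ann", False), ("bob", False), ("bob", False)]
actions = [
    ("cy", date(2020, 9, 1), date(2020, 9, 3)),    # completed late
    ("cy", date(2020, 10, 1), None),               # still open and overdue
    ("di", date(2020, 9, 15), date(2020, 9, 10)),  # completed on time
]
print(rejection_rates(reviews))
print(missed_deadlines(actions, today=date(2020, 10, 15)))
print(silent_contributors(["ann", "bob", "eve"], ["ann", "bob"]))
```

A high rejection rate for one reviewer does not by itself show a knowledge gap; it only flags where a closer look at the rejected opportunities may be worthwhile.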

AVOID CONFUSION

Many people use the terms data, information, knowledge, wisdom and ideas interchangeably but they have very different meanings.

• Data are numbers on a spreadsheet, maybe without context or units of measure.

• Information has context like units of measure. Data are turned into information by organizing them so one can easily draw conclusions. Data and information deal with the past.

• Knowledge has the complexity of experience, which comes about by seeing it from different perspectives. Information is static, knowledge is dynamic. Knowledge deals with the present.

• Wisdom is the ultimate level of understanding (Figure 3). We can share our experiences that create the building blocks for wisdom. However, imparting wisdom involves more than just such sharing; it requires putting knowledge into the personal context of the audience.

• Ideas are thoughts or suggestions as to a possible course of action. We refer to ideas in this discussion as “opportunities;” opportunities can be information, knowledge or wisdom.

REACHING THE PINNACLE

[Figure: levels shown from top to bottom: Wisdom, Knowledge, Information, Data]

Figure 3. Data provide the foundation but putting knowledge into appropriate context is the key to gaining wisdom.

A good HUAA process relies on people. People identify, review and implement the opportunities. You must choose the right listeners and properly motivate them. If they are too busy, not interested, not experienced enough, too experienced or lack an innovative spirit, the process will flounder. People without adequate experience, education or curiosity won’t spot global learnings, concerns and opportunities. Those with the right credentials must be vigilant and feed the HUAA process as if the organization’s future depends upon it. It may!

The gatekeepers filtering and accepting/rejecting opportunities must be exceptional people who can be trusted to separate the wheat from the chaff. They should be held accountable for inclusion or exclusion. Likewise, it’s vital to hold members of the leadership team personally accountable for disposition of global opportunities assigned to them. Measurement by corporate-level executives and board members is essential to ensure you get value from the HUAA process. What gets measured really does get done.

THE KEYS TO SUCCESS

To have a successful HUAA process, you must:

• Have the right automation for the job. It must be easy to use, intuitive, follow good business processes, enable reporting and auditing, and engage all users in the process.

• Empower the right people to be able to identify opportunities. Curiosity, innovation and a desire to improve, along with being given the time and space to go hunting for opportunities, are key characteristics. Efficient identification and collection of opportunities is the foundation of HUAA.

• Empower the right people to make decisions on where the value lies in the opportunities identified. For example, a mechanical engineer reviews a new American Petroleum Institute recommended practice on mechanical integrity, while a PSM expert evaluates opportunities arising from findings of a U.S. Chemical Safety Board incident report. Properly prioritizing the opportunities means your business is kept safe, productive and profitable.

• Empower the right people to make the decision on how to implement the opportunity to get the best value. Using the people who best know the particular aspect impacted and have the authority to take action to manage the process is the most efficient way to make the opportunity a reality.

• Make everyone in the HUAA process responsible for his or her part in it. Responsibility breeds interest, involvement and commitment.

• Measure and continually improve. Getting better and better will allow you to identify opportunities.

A HUAA opportunity-management process helps ensure that vital initiatives and important opportunities are consistently and systematically identified, evaluated and prioritized. Imagine how much more efficiently and effectively an organization would run if we were taking advantage of all the knowledge around us.

If properly designed and implemented, a HUAA management system will ensure your overall organization’s risk profile is known, visible and manageable at the lowest level possible. Not having a HUAA management system may be the most expensive mistake you ever make.

MIKE BEARROW, PE, is principal consultant, process safety management, for Rolls-Royce Controls and Data Services, Houston. KIM TURNER is a consultant based in Nottingham, U.K. E-mail them at Michael.Bearrow@controlsdata.com and KimnTurner@hotmail.com.


CONTINUING EDUCATION

The Mary Kay O’Connor Process Safety Center offers continuing education courses year-round both online and in Houston. The continuing education classes are taught by experienced engineers with years of industrial, chemical, research, and process safety knowledge. The Center strives to deliver the courses and topics that are important and vital to the ever-changing environment and industrial audiences. These courses can be taken for continuing education credit and can be applied toward the Safety Practice Certificate.

PROCESS SAFETY PRACTICE CERTIFICATE FOR INDUSTRY

The Process Safety Practice Certificate is a program that allows engineers in industry to gain greater knowledge in process safety. The certificate requires 125 Professional Development Hours (PDHs) for completion within a three-year timeframe.

COST OF CERTIFICATE

The approximate cost to complete the certificate is $5,400-$6,470.

- 1 day course: $495 (7 PDHs)

- 2 day course: $990 (14 PDHs)

- 3 day course: $1,485 (21 PDHs)

- Semester long SENG Courses: $1,800 (42 PDHs)

ONLINE COURSES

Visit psc.tamu.edu for more Information


For questions email us at: [email protected] here: tx.ag/MKOpspcert

Existing online courses are available now

- SENG 655 Process Safety Engineering

- SENG 660 Quantitative Risk Assessment

- SENG 674 System Safety Engineering

- SENG 670 Industrial Safety Engineering

- SENG 677 Fire Protection Engineering

In-person courses are being offered as online courses

New courses are being included in the program

Consider Industrial Detonations in Vapor Cloud Explosions

Risk assessments often overlook the issue but should not

By Cassio B. Ahumada, Texas A&M University

Unwanted detonations are powerful events with enormous destructive potential if not controlled adequately. The recent explosion of approximately 2,700 tons of ammonium nitrate at Beirut’s port, which shook the entire city and claimed more than 200 lives, reminded us how dangerous uncontrolled detonations can be and why we must ensure these types of hazards are managed effectively.

Although detonations can originate through different means, a common route is through flame propagation in gaseous mixtures. This phenomenon, known as deflagration-to-detonation transition (DDT), occurs when the right conditions are in place to create and sustain high-speed flame fronts. Gaseous detonations are intrinsically different from solid-phase explosions, such as the one in Beirut, because the energy is more dispersed throughout flammable gas clouds. However, this reduction in energy density does not make gaseous detonation less dangerous.

VAPOR CLOUD EXPLOSIONS

In recent decades, we have seen devastating damage from industrial vapor cloud explosions (VCEs) resulting from releases of explosive mixtures in processing facilities. A well-known example in the process safety community is the series of explosions at a hydrocarbon storage facility on December 11, 2005, in Buncefield, United Kingdom, after a gasoline storage tank overflowed. Even though multiple blast events occurred during the Buncefield incident, the first and most energetic one involved a flame front propagating more than 100 m that created high-pressure loads destroying onsite buildings, vehicles and equipment. Post-incident investigations later hypothesized this first explosion to be a detonation wave.

Industrial detonations from VCEs often are overlooked in risk assessment reports because they are believed to be possible only with more energetic substances. However, the latest VCE research programs have challenged this assumption and provided science-based evidence showing that DDT in industrial vapor clouds is more common than previously believed and involves a range of materials. In addition, confinement and congestion are known to enhance flame acceleration, creating conditions favorable to detonation onset. Such conditions are challenging to avoid in industrial plants, especially offshore units, where space constraints on equipment arrangement result in congestion.

From the perspective of industrial safety, DDTs are catastrophic events with consequences that can go beyond the facility boundaries. Therefore, evaluating detonation hazards in process plants becomes crucial to facility siting and land use planning, explosion prevention and mitigation actions as well as effective emergency response plans. These components are pillars of good hazard identification and risk management plans.

RESEARCH UPDATES

Current research in the area of detonations at the Mary Kay O’Connor Process Safety Center (MKOPSC) at Texas A&M University is focused on understanding how the facility layout affects flame acceleration and, ultimately, the DDT process.

More importantly, our goal is to identify potential layout modifications or mitigative measures that can be employed to achieve inherently safer facilities that avoid DDT impacts. We also are searching for VCE assessment methods that reliably quantify DDT likelihood to support site layout decisions.

For example, in a joint project conducted with Gexcon in 2014, researchers from MKOPSC expanded the application of the computational fluid dynamics (CFD) commercial software FLACS to include DDT prediction for various fuels, including hydrogen, ethylene, propane and natural gas. The authors validated their proposed methodology using experimental data from multiple scenarios, ranging from confined lab-scale setups to semiconfined large-scale geometries.

A more recent study published this year reviewed easy-to-use empirical VCE correlations and evaluated their ability to assess the likelihood of DDT and fast deflagrations for less energetic fuels such as propane and methane in large, unconfined structures. The six models analyzed included the TNO Multi-Energy method, the Baker-Strehlow-Tang (BST) method and Shell’s Congestion Assessment Method (CAM). The review demonstrated that simplified VCE methodologies can be applied to indicate DDT at scales relevant to industrial applications with relatively good accuracy after minor model modifications.

Both studies demonstrate initial progress toward including detonation hazards in industrial risk assessments. Overall, the steps necessary for estimating DDT likelihood based on empirical correlations can be summarized as follows:

1. Define the project’s scope. At this initial stage, the safety professional should identify and establish the physical boundaries of the process under investigation.

2. Collect relevant process information, including chemical inventory, equipment list, process variables (temperature and pressure), weather conditions, etc.

3. Identify potential release sources and perform dispersion studies to estimate the dimensions of flammable clouds formed.

4. Separate congested areas based on their degree of confinement and equipment density. In this step, it is essential to highlight regions that could result in flame acceleration should a flammable cloud be formed nearby.

5. For each explosion scenario identified, estimate the maximum flame speed and/or overpressure generation applying at least two empirical VCE models for comparison purposes.

6. Subsequently, compare the predicted outcomes obtained from the VCE modeling with DDT criteria defined by the fuel type.

7. Finally, assess onsite and offsite consequences applying pressure load response curves and estimate the event severity.
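Steps 5 and 6 above amount to a comparison between model predictions and a fuel-specific criterion, which can be sketched in a few lines. Everything specific in this sketch is an assumption: the fuel labels, the criterion values (placeholders, not validated DDT limits), the use of flame Mach number as the compared quantity, and the names `Scenario` and `screen_for_ddt`. The predicted values would come from actual empirical VCE models; none is implemented here.

```python
from dataclasses import dataclass
from typing import Dict

# PLACEHOLDER criteria: flame Mach number at or above which DDT is treated as
# credible for each fuel. Illustrative numbers only, not validated limits.
DDT_MACH_CRITERION: Dict[str, float] = {"fuel_a": 0.9, "fuel_b": 0.7}


@dataclass
class Scenario:
    name: str
    fuel: str
    flame_mach: Dict[str, float]  # model name -> predicted maximum flame Mach number


def screen_for_ddt(scenario: Scenario) -> str:
    """Compare each model's prediction with the fuel's (placeholder) DDT criterion."""
    if len(scenario.flame_mach) < 2:
        raise ValueError("use at least two empirical VCE models for comparison")
    limit = DDT_MACH_CRITERION[scenario.fuel]
    exceeding = [m for m, mach in scenario.flame_mach.items() if mach >= limit]
    if len(exceeding) == len(scenario.flame_mach):
        return "DDT credible"
    if exceeding:
        return "models disagree: review"
    return "DDT not indicated"


congested_unit = Scenario(
    name="congested pipe rack",
    fuel="fuel_a",
    flame_mach={"model_1": 0.95, "model_2": 1.10},
)
print(congested_unit.name, "->", screen_for_ddt(congested_unit))
```

Returning a separate result when the models disagree keeps the comparison in step 5 visible instead of silently taking the more or less conservative prediction.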

To conclude, VCE research programs and post-incident investigations have demonstrated that industrial detonations can occur in processing facilities if the right conditions prevail. Failing to account for detonation hazards from flammable gas releases may substantially underpredict the explosion severity at a particular site, which in turn underestimates the risks of event escalation and critical building damage. Therefore, identifying the potential for DDT is crucial for effectively managing explosion risks in industrial plants and reducing the potential occurrence of such catastrophic events.

CASSIO BRUNORO AHUMADA is a doctoral candidate in the Chemical Engineering Department at Texas A&M University. His research investigates how the congestion pattern variation affects the deflagration-to-detonation transition (DDT) mechanism in flammable gaseous mixtures. He is also involved in many safety-related projects, including facility risk assessments, facility siting, and vapor cloud explosion modeling studies. He can be reached at [email protected].


Rethink Your Process Safety Procedures

Interdisciplinary approach supports workers, strengthens companies

By S. Camille Peres, Texas A&M University

In 2014, something unusual happened: Industry users of procedures and a procedure technology company approached Texas A&M University’s Mary Kay O’Connor Process Safety Center (MKOPSC) to conduct research regarding procedure design. This was motivated by the number of incidents that occurred as a result of procedural deviations. Many of them involved significant loss of process containment and were publicly visible (e.g., Macondo, BP Texas City and the Formosa Plastics vinyl chloride explosion).

This ultimately was the beginning of the Next Generation Advanced Procedures (NGAP; https://advancedprocedures.tamu.edu/) consortium, a collaborative effort between academia and industry to identify issues related to procedural systems that contribute to system failures. NGAP continues to this day and has made surprising discoveries about procedures and procedural systems. This article summarizes some of our findings to date and shares our current concerns for the industry moving forward.

HUMAN FACTORS AND THE PROCESS INDUSTRY

One of the initial charges to NGAP was to “human factor” the current procedure designs, guidelines and frameworks. Although using this term as a verb in some industries is common, most human factors professionals do not use it in this manner. Our consortium needed to better comprehend how this industry understands the term. We learned that many, if not most, in the process industry domain understood human factors as a list of issues associated with human error, such as fatigue, task complexity and quality of the interface. As Human Factors (capital “H” capital “F”) professionals, we consider these a list of “performance shaping factors” from the human reliability domain (e.g., SPAR-H [1], HFACS [2]).

Human factors (HF) is the scientific and professional domain that applies methods and theories regarding humans’ capabilities and constraints to improve the efficiency, effectiveness and safety of their performance. The approach of many HF professionals is to identify how the system’s design is more or less likely to support the desired behavior. The approach is not “how to fix the humans so they are less likely to make mistakes” (e.g., so they follow procedures). If people reliably make mistakes when using a tool designed for a particular task, then the error is with the tool’s design, not with the human using the tool. Therefore, we do not focus on reducing human error. We focus on supporting human performance.

Although this difference may seem semantic and trivial, it puts the focus on the system’s attributes that result in humans making reliable and predictable errors. This focus shifts the overall responsibility to the system’s designers and managers, who must improve those attributes to support the user’s work. This systems approach is associated with high-reliability organizations and requires a thorough understanding of not only the identified “problem” (here, the procedure) but also the users, the tasks and the contexts in which this problem occurs.

PAST FINDINGS

As with any endeavor, our efforts built on previous research that identified some consistent issues associated with procedural systems. For several incidents, the availability of procedures was a primary issue. However, other studies found additional common problems with procedures: they were incorrect or out of date; difficult to read (too technical or wordy); difficult to access; or generally of poor quality, which made them difficult to use (frequent spelling, grammar and punctuation errors). In many of our investigations we have confirmed these issues. However, we have discovered more issues that may decrease the likelihood that written procedures will support workers’ performance when they most need it.

NGAP MAJOR FINDINGS

Our findings are based on studies using multiple research methods, including interview analysis, literature reviews and controlled experiments in both a lab environment and a field training environment (we went to Shell’s Robert training facility in Robert, Louisiana). Here, we summarize the findings without necessarily discussing the specifics of the methods used to identify them. The interested reader can refer to the list of publications on our website or our webinar series for more details on these studies.

Finding 1: Guidance often is not based on empirical findings. Our first effort was to identify the guidance currently available for procedure writers and the current standards and regulations regarding procedures. We learned that most of that guidance was based on historical practice and found little to no published empirical evidence to support the guidelines in most of the regulations and standards. Further, writers’ guides, books and other guidelines regarding procedure writing and procedure system development provided little evidence. When some was provided, it often was based only on “years of experience.”

Certainly, experience is a valuable teacher and without question many of the guidelines within these documents are very good. However, as will be seen in some of the findings below regarding hazard statements, some seemingly intuitively obvious designs can result in behavior antithetical to what the procedure writer desired. It is important to have empirical research conducted for some of the most critical aspects of the procedure design to ensure that the designs support the desired behavior.

Finding 2: Units within facilities differed remarkably on the health of their procedural systems. During onsite studies at multiple facilities, we found it was common for units within the same facility to differ in their attitudes toward and reported use of procedures. Workers in the units that were more positive about procedures reported using them more regularly, as recommended by management systems. These workers also seemed to have ownership of the procedures’ content, and when they requested changes or turned in redlined procedures (a procedure that has been marked up as needing changes, typically with a red pen), they reported receiving feedback on them quickly (within days or weeks).

One experienced worker in this type of unit shared a challenge. As the unit became more engaged with procedures, its workers asked more questions to clarify attributes of the task and procedure. Although this seems fine on the face of it, for the diminishing number of experienced workers in the field, this means they get asked questions regularly and thus often are distracted from their own tasks.

Workers in units that were more negative about procedure use often reported the procedures did not help them much in performing their tasks and viewed them more as a control mechanism for management. They were more likely to report not using them as recommended, and when they submitted redlined procedures, they said that if they ever received feedback (because often they would not), it could be months or even years before that happened. This observation of units having remarkably different attitudes toward procedures has been a pervasive finding in multiple facilities around the world. Thus, we think it does not reflect an idiosyncratic facility that has a different management method.

Finding 3: Reports regarding procedure deviation and use caused concern. In interviews with workers, we found issues related to management and safety climate that were extremely concerning. When we asked them why they or another worker would deviate from a procedure, several reported that it was not uncommon to be in situations in which their direct supervisor would either overtly or indirectly instruct them to deviate from the procedure because of time pressures.

When asked about possible benefits of using digital procedures, many workers reported they might help them stick to the procedure in time pressure situations because deviations would be documented and the supervisor could not “get around that.”

Another concerning perspective was that workers always would follow procedures even if they knew they were incorrect because following the written procedure exactly would protect them from liability if something went wrong. The focus for these workers in their organizations was on protecting their jobs. Ideally, an organization should support workers’ thinking critically regarding how to perform their tasks effectively, efficiently and safely. When workers are focused exclusively on “watching their backs,” it can put the organization’s productivity and safety at risk.

Finding 4: Procedures cannot be one size fits all. One of the clear findings we have made in both experimental and observational studies is that procedure use and performance differ by experience level. Many in the process industry who write and use procedures are familiar with the dilemma of how much content is enough. The experienced workers want less content and more parsimonious, bulleted-style steps, while the less experienced workers want more content to facilitate their understanding of the task itself and how that task relates to the entire system.

From our research we have found that this is not simply a preference for these two groups; having different amounts of content impacts their performance. For those organizations still using paper procedures, this is not something that can be accommodated because managing multiple versions of the same procedure is a recipe for a procedural system failure. However, for those organizations that are adopting digitally based procedures, many vendors can develop procedures with the content presented in different formats for more or less experienced workers.
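One way a digital system can avoid the multiple-version problem is to store each step once and vary only how it is displayed. The following is a minimal sketch of that idea, not a description of any vendor's product; the `Step` structure, the `render` function and the sample steps are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    action: str       # concise instruction shown to every worker
    detail: str = ""  # background shown only in the detailed view


def render(steps: List[Step], experienced: bool) -> str:
    """Produce a concise or detailed view from a single stored procedure."""
    lines = []
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.action}")
        if not experienced and step.detail:
            lines.append(f"   {step.detail}")
    return "\n".join(lines)


# Hypothetical steps, for illustration only.
procedure = [
    Step("Close the suction valve.", "Isolates the pump from the feed line."),
    Step("Open the drain valve."),
]
print(render(procedure, experienced=True))
print(render(procedure, experienced=False))
```

Because both views are generated from the same stored steps, a revision to a step reaches experienced and less experienced workers at the same time.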

Finding 5: Written procedures likely never will be perfect. Although many organizations have as a goal to develop “perfect” procedures that need to be reviewed or updated only every three to four years, what we have seen and heard from workers is that this likely is not a realistic goal for many situations. Many process industries are highly complex socio-technical systems and constantly are undergoing subtle changes that can create the need for changes in the procedures. For instance, upgrading or replacing a pump, identifying a more efficient method of performing a task or integrating a new monitoring system all may require procedure changes to ensure clarity and accuracy.

One of the major challenges many organizations face is having sufficient resources for ongoing procedure revision processes. We have identified two major organizational components needed to maintain a healthy procedural system:

• A robust and efficient procedure revision process to support safe, effective and efficient operations, which will increase the likelihood that procedures are correct. Incorrect procedures historically have been one of the biggest worker complaints regarding procedural systems.

• Investments into the procedure change process, which may also impact or be a reflection of the safety climate in facilities. When workers see that the organization is not only putting effort into having correct procedures but also into incorporating their feedback to update those procedures, this may facilitate the workers’ ownership and use of the procedures (as we saw in those units that had a more positive attitude toward procedural systems in “Finding 2”).

Finding 6: Efficacy in communicating safety information was surprising. One of the goals of procedures, which many regulatory agencies require, is to communicate hazard information regarding the task itself. We found some interesting and surprising issues regarding current practices for communicating this information in written procedures. The first was, contrary to evidence in the consumer product domain, when hazard statements are embedded in procedures, the more “stuff” (e.g., shading and icons) they have around them, the less likely workers are to attend to the content of the hazard statement. We documented this with behavioral studies using existing workers as well as with eye-tracking studies (Figure 1).

SAFETY NOTIFICATIONS

[Figure: two designs of the same hazard statement.
Good Design: “1. Electrical shock hazard. Can cause serious bodily injury or death. Power off the equipment to prevent electrocution.”
Bad Design: “Electrical shock hazard. Can cause serious bodily injury or death. Power off the equipment to prevent electrocution.”]

Figure 1. These two designs were used in an eye-tracking study. The top one is the design that had the longest gaze duration and was associated with the best performance.

Another surprising finding was that most workers equated the meaning of the signal words CAUTION and WARNING. Although WARNING generally is supposed to communicate a more dangerous hazard than CAUTION, workers thought the two words communicated the same level of hazards. This study involved workers who used procedures that had leveraged these words to differentiate between different levels of hazards. These findings suggest that instead of using signal words to communicate the level of the hazard in procedures, it is better to simply communicate the specific hazard and the method for avoiding or mitigating the hazard.

Finding 7: Procedure usefulness requires high-quality attributes. As mentioned previously, poor procedure quality historically has been found to be a prominent reason for workers to deviate from or not use procedures. Improving procedure quality is challenging given the number of procedures many facilities have. The good news is that it is a relatively straightforward task. However, in a survey of workers who use procedures, we found that a procedure’s usefulness was as strongly related to use and deviation as its quality was. Further, both variables were related to the number of incidents or near misses per year. This indicates that it is important for a procedure to have the attributes of quality (meaning it is easy to understand, has no typos, contains current information, has steps in the correct order and is well-organized) and to help the workers do their jobs.

Finding 8: Adoption of digital procedures is important and complex. Many of the findings and recommendations from our research indicate that using a digital procedure system will allow for more flexibility with regard to the procedure’s presentation, revision and likely usability. Further, in interviews with workers using digital procedures, one of the benefits was not having to handle paper that would get dirty, wet, torn, etc., while they were performing the task.

Before adopting digital procedure systems, it is extremely important that organizations leverage methods of assessing their readiness to adopt and accept this type of technology, as it is not just about the hand-held digital procedures themselves. It also requires ensuring sufficient bandwidth in facilities for the system to be used effectively; providing sufficient resources for procedural conversion, including the engagement of workers during the conversion process; and planning an effective integration of digital procedures (e.g., not during a startup). If an organization does not prepare for the conversion to digital procedures properly, it may never see the benefit of that conversion, or, if so, it may be at a much larger cost than originally expected.

SUMMARY AND CONCLUSION

We have found through interviews, experiments and worker observations that work as imagined often is different from work as done. In interviews with workers and managers at the same facility, the issues that managers believed workers were having with procedures were different from the issues workers were reporting.

Overall, workers perceive procedures as a good tool for training, for less experienced workers and for tasks done infrequently. For more frequent critical tasks, particularly for experienced workers, it is likely that well-designed checklists would be a better solution than step-by-step procedures. These checklists also could indicate steps that have to be done in a particular order.

When procedures are designed primarily for documenting workers’ accountability or regulatory compliance, they are not necessarily going to be effective at supporting the workers’ tasks. This limited usability will decrease the likelihood that workers will adhere to them. Accountability is important in a working environment, but we found the procedure is not an effective method for documenting it.

A healthy procedural system can play a keystone role in an organization’s safety system. The healthy system consists of pressure (information and guidance) coming from both top down and bottom up, which correlates to the right and left sides of the arch (Figure 2). For an organization with a healthy safety system, the procedural system should be a place where the top-down and bottom-up

influences converge (the keystone) to create a strong structure that can last for years. However, if the keystone is weak or has too much pressure from one side or the other, the structure falls apart.

IMPORTANT COLLABORATIONS

AND COLLABORATORS

Our findings over the past six years have contributed to a better understanding of how proce-dures and procedural systems can support workers better while they perform their tasks and also improve safety in these high-risk industries. Two important hallmarks of NGAP have made these findings possible: the inter-disciplinary team of scientists conducting the research and the tight collaborations between industry and academia regarding what questions needed to be asked and how they should be asked.

Interdisciplinary research. Although the original charge to NGAP was to “human factor” procedures (or apply the principles, theories and methods of HF to procedural systems), the work we have done to date has been an excellent example of interdisciplinary research. My approach to HF is from the information processing paradigm as my training is primarily cognitive psychology. However, many, if not most, of our interesting findings came from intersections between my approach and that of another discipline or from another discipline entirely:

• Chemical Engineering (ChemE): The collaborators from ChemE were my first regular collaborators, and they brought deep knowledge of risk analysis as well as the classic engineering approach. With their influence, we looked first to the documents that guide industry practice: standards and regulatory documents as well as existing procedure writing guides. Further, they have explored methods of systematically identifying complexity using natural language processing given their comfort and prowess with many computer programs.

• Industrial and Systems Engineering (ISEN): The collaborators from ISEN often are closest to me scientifically in that they also are HF researchers. However, their training and research paradigms are different, as they are engineers and not psychologists. For instance, one colleague sees issues with procedures through the lens of a systems approach. He and his team have used rigorous methods to conduct and analyze interviews of workers to understand the issues and strengths of procedural systems better. Another colleague used machine learning methods to identify attributes of steps in written procedures that are associated with the successful completion of that step. Still other colleagues in ISEN consider how the workers’ state (e.g., fatigue or stress) may impact their interactions with written procedures and task performance.

ENDURING STRUCTURE

A healthy system, like a keystone, should last for years.

• Industrial and Organizational Psychology (IO): The collaborators from IO are psychologists like me, but they typically focus more on how attributes of the social or organizational system impact worker behavior or safety (whereas I investigate the tool itself, meaning the procedure). This has led us to the beginnings of an investigation into whether attributes of procedural systems (redlines and turnaround times) can serve as key performance indicators (KPIs) of safety climate. Other collaborators from IO have a strong knowledge of survey construction and analysis. This collaboration has been integral to NGAP for collecting and analyzing data from a large number of workers to understand trends and relations between variables in procedural systems better.

Collaboration between industry and academia. The second hallmark of NGAP has been the regular collaborations with industry partners. Two industry partners, Elliott Lander from ATR and Abbe Barr from Chevron, originally came to MKOPSC with the need for this type of consortium. The industry/academia (IA) collaborations in NGAP have matured over time to the following process:

1. The industry partners (the board) let us know the major current issues.

2. We identify several possible studies or approaches we could take to identify causal elements associated with these issues and develop and empirically test mitigation methods.

3. Given the resources available, the board votes on the studies it would most like to see move forward, and those are the studies we conduct for that period of time.

This process allows the academics to hold fast to what we do well, which is rigorous, empirical research to build the body of science. At the same time, it holds us accountable to do this science in a manner and on topics that are directly relevant and can be applied immediately in these high-risk industries. This translation of science to practice is something particularly exciting about NGAP, and honestly, I am pretty proud of it. The added benefit is that many of our findings regarding procedural systems have come from work funded outside the consortium (such as by federal and local funds), so NGAP has been more of a constant source of seed funding to keep the effort going, with the IA collaboration there to continue to hold us accountable to the original effort.

STILL TO COME

Like any good academic effort, the more we learn, the more questions we discover. As can be seen by the list below, a lot of work remains to be done regarding procedural systems:

1. What are the most important design guidelines for digital procedures (for example, for moving between multiple procedures and for providing feedback regarding incorrect procedures)?

2. How can we identify whether organizations are ready to adopt digital procedures? What are the measures and methods they should use to identify this?

3. What are the implications for electronic performance monitoring (EPM) with digital procedures? Previous research has found that workers’ performance can suffer if EPM is implemented badly.

4. Can attributes of procedural systems (such as number of redlines or redline turnaround time) be leveraged as KPIs for safety climate?

5. Can we develop a procedural system that adapts to user needs and profile, task attributes and the context of the working environment?

ACKNOWLEDGEMENTS

An acknowledgement section for this effort is difficult because there have been, and are, a lot of people who have contributed substantively to this effort’s success. A few deserve special thanks and acknowledgement.

First is Dr. Farzan Sasangohar, who for the past two years has been a Co-PI on NGAP and works side by side with me to manage the projects and students. He is a premier researcher with a systems engineering perspective and has contributed greatly to NGAP’s strength.

Dr. Joseph Hendricks, an IO researcher, has been the brains behind all of the statistical analysis that was beyond my reach (and that’s saying something because I know some stats). He also has ensured that any survey we use goes through a rigorous development process.

Roger Young, with NovaChemical, is an industry partner who has been with us from the beginning and provides insight and support that keep us motivated.

Wendy Schram is with Dow and worked for two years to provide us important opportunities to learn more about how workers interact with both paper and digital procedures.

We appreciate all the personnel at Shell who provided us access to the BOOST facility at the Robert training facility. Conducting the experiment there was a wonderful experience, and the staff was superlative.

Dr. Noor Quddus, from MKOPSC, was the first person from MKOPSC with whom I collaborated. He and I learned so much about the different “languages” that psychologists and engineers speak, even though we were saying the same words. Through intense and intentional listening to hear and learn from each other, we started the ethic of collaboration for NGAP that remains to this day.

Of course, as always, I thank Dr. Sam Mannan. His presence with NGAP as well as his encouragement and support for those first years were pivotal.

REFERENCES

[1] Gertman, D., Blackman, H., Marble, J., Byers, J., & Smith, C. (2005). The SPAR-H human reliability analysis method. US Nuclear Regulatory Commission, 230, 35.

[2] Shappell, S. A., & Wiegmann, D. A. (2000). The human factors analysis and classification system—HFACS.

DR. CAMILLE PERES is an Associate Professor with Environmental and Occupational Health at Texas A&M University as well as the assistant director of Human Systems Engineering with the Mary Kay O’Connor Process Safety Center. Her expertise is Human Factors and she does research regarding: procedures; Human Robotic Interaction in disasters; and team performance in Emergency Operations. She can be reached at [email protected].


Boost Process Plant Resilience

Measures can strengthen the ability to recover from an incident

By Hans J. Pasman and Changwon Son, Texas A&M University

ABSTRACT

Conducting robust risk assessment does not guarantee that all possible hazardous events have been identified. Unacceptable residual risk may materialize after risk assessments have been completed. To that end, resilience measures that warn of a threat, call for action and require emergency response and recovery preparedness should be available. Error-tolerant equipment will help, too. The paper will describe how resilience can be measured and what is being done to minimize communication and coordination errors in a complex environment. Future research priorities also will be summarized.

Keywords: resilience, disaster, process safety, emergency response

The term resilience has become increasingly known in management circles. It refers to an organization’s ability to survive and recover from a sudden substantial upset such as a market breakdown, a strike or any disastrous effect on its product output and turnover. Furthermore, during the past 20 years, the United Nations has urged national governments to work on their resilience capabilities so they are able to respond better to large-scale damage from extreme weather events such as tropical cyclones and natural calamities such as earthquakes.

In the 1990s and early 2000s, scientists attempting to describe what made an organization strong and successful mentioned resilience as one of the properties (e.g., Weick & Sutcliffe, 2011). In 2004, Erik Hollnagel launched the first in a series of symposia on what he called resilience engineering (Hollnagel et al., 2006).

These symposia focused on the functioning of the people in an organization and, quoting Hollnagel et al. (2011, p. xxix), on strengthening the “four abilities that are necessary for a system to be resilient. These are the ability to respond to events, to monitor ongoing developments, to anticipate future threats and opportunities, and to learn from past failures and successes alike.” The “engineering” here, though, is not engineering as we engineers understand it, as it misses the physical hardware component.

At process plants, in addition to business aspects such as production efficiency, process safety is an important consideration. The question of how safe we are is determined by risk assessment and management. But one has to ask whether or not a detailed quantitative risk assessment is fully reliable. The short answer is no.

Many reports indicate that tools to identify hazards and threats such as hazard and operability (HAZOP) studies are fallible, and teams using them can fail (Baybutt, 2015; Cameron et al., 2017; Casal & Olsen, 2016; Jarvis & Goddard, 2017; Lauridsen et al., 2002; Pasman et al., 2017; Suokas & Rouhiainen, 1989; Taylor, 2016). As such, unexpected externally or internally initiated hazardous events with serious consequences are possible. Despite all safety measures, a disastrous event can occur without warning, causing a great deal of damage and interruption to business (such as COVID-19 for the airline industry!). Resilience measures should help to avoid such events, which are not foreseeable in traditional risk assessment, or to reduce the damage and accelerate recovery to restore performance when the events occur, as shown in Figure 1.

Failures of hazard identification methods have led to impressive efforts attempting to improve process hazard analysis. Two of these efforts, based on a socio-technical system (STS) approach, will be mentioned briefly. An STS encompasses the entire hierarchical line from regulators down via board and plant management to the various work floor levels to the plant equipment and technology. STS theory was introduced to the process industry by Rasmussen (1997) for accident investigations and further developed by Leveson (2004, 2011), who also turned it into a predictive analysis tool (system-theoretic process analysis, or STPA).

Recognizing that safety is a control problem, STPA considers all process control loops, organizational and technical, acting on internal and external disturbances. By asking four questions, a team can find out what may go wrong. The other effort, which is highly automated, is called blended HAZID, or BLHAZID. It focuses on plant, people and procedures and their interactions. BLHAZID makes use of massive digitization for equipment and generates causal models (Németh & Cameron, 2018).

Even though STS provides a comprehensive system view and promises improved risk assessment, in safety the devil often is in the details. And in the myriad possible details, hazards still can be overlooked. In addition, there may be unknown threats, and accepted residual risks still may materialize unexpectedly, so resilience has an important role to play.

Starting around 2009, the Mary Kay O’Connor Process Safety Center (MKOPSC) began studying the concept of resilience as a means to guard against unexpected mishaps from the STS point of view. To date, two MKO students have completed their Ph.D. studies on the topic of plant resilience: Dinh (2011) and Jain (2018). Pasman et al. (2020) presented a summary and review of their work that includes not only references to their dissertations but also references to their published journal articles.

The work by Dinh (2011) and Jain (2018) covers mainly resilience aspects of process operations. A co-author of this article, Changwon Son, who is near his Ph.D. graduation, focused on organizational resilience engineering and elaborated how resilience of emergency response organizations is achieved through interactions among cognitive system elements such as humans and technologies.

PERFORMANCE RESILIENCE DURING A DISASTROUS EVENT

Figure 1. Effective resilience measures can have a demonstrable effect on performance during an unexpected disastrous event. (Plot of performance versus time, annotated where resilience measures are applied.)

This article will discuss methods to strengthen resilience, the current state of resilience, the analysis of emergency response operations with respect to organizational resilience, future priorities and conclusions.

MEASURES TO STRENGTHEN RESILIENCE

Dinh (2011) focused on avoiding process upsets and identified a number of resilience principles:

• Minimization of failure
• Early detection
• Flexibility
• Controllability
• Minimization of effects
• Administrative controls and procedures

All these principles were elucidated and illustrated. In particular, process flexibility and controllability are much influenced by process and plant design, but the other principles have their effects. Further, besides the design factor, Dinh et al. (2012) proposed additional contributing factors including warning signal detection potential, emergency response capability, human factors and safety management system effectiveness. This was followed by case studies.

Jain (2018) followed this line of thought by condensing the resilience requisites into four elements that will be explained further:

• Error-tolerant design
• Detection of early warning signals
• Plasticity of thinking
• Preparation of recoverability

Error-tolerant design of process and plant includes inherently safer design, but its scope is broader. It means that human-machine interaction is optimized, e.g., maintenance operations are facilitated and, in control, human capability in decision-making speed is taken into account. Also, as much as possible the design should be “forgiving” so that when an error is made, consequences are limited and/or recovery is possible.

Detection of early warning signals of disturbance is an important element in trying to avert and avoid mishaps. It includes the recognition of incident precursors, knowing their root causes and correcting them before a serious event occurs. Recognition means that one has learned from previous incidents.

Warning signal detection goes far beyond this, particularly when cyberattacks on installations are possible. Even physical attacks cannot be excluded. Warnings also should be given of extreme weather conditions, such as hurricanes and flooding from the effects of climate change. A company’s business intelligence, when used for signal detection and interpretation, also could warn in case of heightened risk.

Early detection and identification of signals is not enough. If a fast response is required, it is critical that the organization’s top management be informed and understand the need to alert the entire organization and take the necessary actions. The required mental attitude from management down throughout the organization is called plasticity of thinking. It means having the flexibility to divert attention from ongoing business to quickly grasping the seriousness of a threat and taking the right action. However, a company needs to avoid jumping to conclusions and embrace the concept of resistive flexibility. History has seen many instances in which clear signals were available but management response was inadequate or nonexistent.

Emergency response capability is fully recognized as a necessary asset of any organization. In case of a major upset event such as an explosion, fire or toxic chemical release, keeping delays in implementing mitigative actions to a minimum is crucial. Many mechanisms incorporated in the layers of protection of a plant are available.

Ultimately, well-prepared and trained plant and community fire brigades will do their jobs to minimize damage. However, recoverability encompasses more. It requires a detailed plan of what should be done to repair damage quickly, to get a supply of required materials and to maintain availability of a specialized workforce to repair the plant. In case of fully halted production, to avoid losing market share, customers could be supplied with products from elsewhere, while additional management capacity may be needed to handle the disaster aftermath. Most important, reserve financial liquidity should be available.

HOW RESILIENT ARE WE?

By starting from the resilience concept and the resulting strategies to build resilience and then following the principles, Dinh (2011) sorted out the factors that influence resilience and their weights and composed an algorithm to determine factor index values.

Jain (2018) went further and developed the first process resilience analysis framework (PRAF). Through prediction, PRAF should be able to shed light on the influencing factors and effectiveness of the three phases of resilience: avoidance, survival and recovery. The basis of PRAF is an extensive risk assessment, considering the system architecture of people, plant and procedures with inputs about processes, safety requirements and costs. However, the crucial step beyond conventional risk assessment is the identification of uncertainties, their quantification and the determination of safety constraints. All this is supported by process models (and, after process digitization in the future, by digital twins).

This is realized by using various statistical methods (analytics) to treat and mine data, dynamic model simulation and congruence analysis. Congruence analysis is about the extent of dependencies in an STS determined by the coordination of (temporal) tasks to run the process and the communication and coordination dependencies as determined by the organizational structure (Cataldo et al., 2008). The greater the intensity of dependencies, the greater the chance something goes wrong. The analysis is completed by economic optimization under defined limits of product quality, safety, environmental impact and sustainability.
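To make the idea of congruence concrete, the following is a minimal sketch in the spirit of the socio-technical congruence literature cited above. It assumes, purely for illustration, that coordination requirements are derived from a people-to-task assignment matrix and a task dependency matrix, and that congruence is the share of required coordination links that are actually covered by communication. The matrices, the three roles and the exact formula are illustrative assumptions, not the computation used in PRAF.

```python
def coordination_requirements(assign, depends):
    """Return the people-by-people matrix A * D * A^T.

    assign[p][t]  = 1 if person p works on task t (assumed input)
    depends[t][u] = 1 if task t depends on task u (assumed input)
    """
    n_people, n_tasks = len(assign), len(depends)
    # A * D: which tasks each person's work depends on
    ad = [[sum(assign[p][k] * depends[k][t] for k in range(n_tasks))
           for t in range(n_tasks)]
          for p in range(n_people)]
    # (A * D) * A^T: which people each person needs to coordinate with
    return [[sum(ad[p][t] * assign[q][t] for t in range(n_tasks))
             for q in range(n_people)]
            for p in range(n_people)]


def congruence(assign, depends, actual):
    """Fraction of required coordination links that actually exist."""
    required = coordination_requirements(assign, depends)
    n = len(required)
    needed = [(p, q) for p in range(n) for q in range(n)
              if p != q and required[p][q] > 0]
    if not needed:
        return 1.0  # nothing to coordinate, so nothing is missing
    met = sum(1 for p, q in needed if actual[p][q] > 0)
    return met / len(needed)


# Hypothetical example: operator, maintenance technician, shift lead.
ASSIGN = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # one task per person
DEPENDS = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # task 0 needs 1, task 1 needs 2
ACTUAL = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]   # only operator and technician talk

score = congruence(ASSIGN, DEPENDS, ACTUAL)
print(f"congruence = {score:.2f}")
```

In this toy case two coordination links are required but only one is in place, so half of the need is unmet. A low value would point to dependencies that the organizational structure does not cover.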

After the development of PRAF, the next question is how to obtain data that indicate plant performance with respect to resilience. To this end, Jain, Mentzer et al. (2018) introduced 26 measurable indicators based on the PRAF factors, such as alarm rates, number of trips per month, process safety action item closure and number of mock drills for emergency situations per year. These indicators come in three groups according to the three PRAF phases: avoidance, survival and recovery.

A survey to determine the weighting of proposed indicators was given to respondents from the oil, gas, and chemical industries; academia; risk analysts; and those with expertise in process operations, process safety and risk assessment, as well as individuals involved in business, finance and procurement (for details see Jain, Mentzer et al., 2018). A few indicators appear in more than one phase, so altogether there were 30 questions. More than 250 responses were collected and weights determined.

The metrics are relative measures. In actual plants the metrics would be monitored and trends analyzed. Each indicator metric also was linked with one of the four resilience elements: design, warning, plasticity and recovery.
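As a rough illustration of how survey-derived weights and relative metrics might be combined, the sketch below computes a weighted average score per PRAF phase. The indicator names come from the examples mentioned above; the phase assignments, weights, normalized scores and the weighted-average formula itself are hypothetical placeholders, not values or methods reported by Jain, Mentzer et al. (2018).

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    name: str
    phase: str     # "avoidance", "survival" or "recovery"
    weight: float  # hypothetical survey-derived weight
    score: float   # hypothetical relative score in [0, 1], 1 = best


def phase_index(indicators, phase):
    """Weighted average of the indicator scores belonging to one phase."""
    selected = [i for i in indicators if i.phase == phase]
    total_weight = sum(i.weight for i in selected)
    if total_weight == 0:
        raise ValueError(f"no weighted indicators for phase {phase!r}")
    return sum(i.weight * i.score for i in selected) / total_weight


# Illustrative values only.
INDICATORS = [
    Indicator("alarm rate", "avoidance", weight=3.0, score=0.5),
    Indicator("trips per month", "avoidance", weight=1.0, score=1.0),
    Indicator("action item closure", "survival", weight=2.0, score=0.25),
    Indicator("mock drills per year", "recovery", weight=2.0, score=0.75),
]

for phase in ("avoidance", "survival", "recovery"):
    print(phase, round(phase_index(INDICATORS, phase), 3))
```

Because the metrics are relative, the value of such an index would lie in its trend over time at one plant rather than in its absolute level.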

Next, Jain, Rogers, Pasman, Keim et al. (2018) and Jain, Rogers, Pasman and Mannan (2018) developed the resilience-based integrated process system hazard analysis (RIPSHA). This is a HAZOP-type protocol conducted by a team consisting of a facilitator, subject matter professionals and a scribe, covering the operation in question holistically, hence including simultaneous and transient operations (startup, turnaround and shutdown). The approach is bi-layered, meaning it distinguishes management and plant system layers, each with three subsystems, which in turn consist of parameters (functions and aspects).

First, the plant system layer is considered with the subsystems: process/plant equipment hazards, procedural hazards and operator/human hazards, with safeguards analysis conducted separately. Equipment failure rate and unavailability also are taken into account, both of which are substantially influenced by organizational and human performance factors.

Next, the management system layer is analyzed with process safety system, operational discipline and process safety culture and leadership subsystems. The analysis process runs similarly to the conventional HAZOP. Guide words for the plant equipment part are the same as for a HAZOP, but new guide words were defined for the people and procedures (Jain, Rogers, Pasman, Keim et al., 2018).

New guide words also have been created for the management system layer’s subsystem (Jain, Rogers, Pasman & Mannan, 2018). All guide words are linked to an indicator metric. Using the guide words, deviations are considered and causes and consequences of the deviations identified. The worksheet also captures the actions required as well as the person responsible for completing the action.
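The worksheet described above can be pictured as a simple record per deviation. The sketch below is one possible layout, assuming the fields named in the text (layer, subsystem, parameter, guide word, linked indicator metric, deviation, causes, consequences, action and responsible person). The field names, the example rows, the guide word "INCOMPLETE" and the links between guide words and metrics are hypothetical and are not taken from the RIPSHA publications.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class WorksheetRow:
    layer: str             # "plant" or "management"
    subsystem: str
    parameter: str
    guide_word: str
    indicator_metric: str  # metric the guide word is linked to
    deviation: str
    causes: list = field(default_factory=list)
    consequences: list = field(default_factory=list)
    action: str = ""       # empty if no action is required
    responsible: str = ""


def actions_by_person(rows):
    """Group the required actions by the person responsible for them."""
    grouped = defaultdict(list)
    for row in rows:
        if row.action:
            grouped[row.responsible or "unassigned"].append(row.action)
    return dict(grouped)


# Hypothetical example rows.
ROWS = [
    WorksheetRow("plant", "process/plant equipment", "flow", "NO",
                 "number of trips per month", "No cooling water flow",
                 causes=["pump trip"],
                 consequences=["reactor temperature rise"],
                 action="Review pump auto-start logic",
                 responsible="process engineer"),
    WorksheetRow("management", "operational discipline", "shift handover",
                 "INCOMPLETE", "process safety action item closure",
                 "Open action items not handed over",
                 action="Add open items to handover checklist",
                 responsible="operations manager"),
    WorksheetRow("plant", "procedural", "startup procedure", "OTHER THAN",
                 "alarm rates", "Steps performed out of order"),
]

print(actions_by_person(ROWS))
```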

A few example cases have been worked out, including LNG storage tank startup (Jain, Chakraborty et al., 2018), a PVC batch reactor upset event prediction analysis (Jain, Chakraborty et al., 2018; Jain, Diangelakis et al., 2019) and a cooling tower maintenance optimization (Jain, Pistikopoulos et al., 2019).

RESILIENCE AFTER INCIDENTS OCCUR

The fourth element of the resilience process Jain (2018) proposed is recovery. Unlike the design that prevents or tolerates process upsets and the early detection of such upsets, responding to and recovering from process incidents inevitably necessitates communication and coordination between multiple human elements.

An archetypical case that revealed the need for adaptive response to and recovery from a disastrous incident is the Deepwater Horizon (DWH) incident in 2010. The official government response to the DWH began when the U.S. Coast Guard was deployed to search for missing drilling crew members. After the DWH sank into the water, the oil spill became a serious issue that required an adaptive transition from the search and rescue to the oil spill response (National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling, 2011).

Because of the unprecedented amount of oil spilled, the DWH disaster required immediate, large-scale response efforts to mitigate the detrimental effects. More than 47,000 people were involved in the containment, recovery and dispersion of the released oil across multiple emergency response organizations (EROs) (Starbird et al., 2015; U.S. Coast Guard, 2010). Because of unknown factors about the oil spill and continued failures to shut the oil well, the response organizations’ ability to adjust their functioning, or resilience, was the key to successful management during the disaster (Birkland & DeYoung, 2011).

Realizing the growing importance of resilience in emergency management, Changwon Son started examining the current state of resilience research and conducted a series of empirical investigations. Son, Sasangohar, Neville et al. (2020a) conducted a systematic review of resilience literature in the emergency management domain. The review suggests four key factors of resilient performance of EROs: collective sensemaking, coordinated decision making, interactions between ERO members and reconciling work-as-imagined (WAI) and work-as-done (WAD).

Collective sensemaking indicates the ERO’s awareness of evolving situations by sharing relevant information in a timely manner. Based on the collective sensemaking, organizational decisions are made to bring about necessary adjustments in incident objectives and in strategic and tactical plans. As the EROs consist of multiple members with different expertise and knowledge, interactions between them are a crucial mechanism that facilitates sharing incident information and making mutually agreed-upon decisions as the disaster continues.

The uncertainty associated with situations that unfold during the disaster means that the expected courses of action (WAI) often differ from the activities that actually take place in the field (WAD). Findings from the review indicate that it is important to identify and reconcile the gaps between the two, assuming that neither WAI nor WAD is absolutely correct and hence neither should be pursued exclusively.

Acknowledging a research gap that lies in the methodology of identifying WAI and WAD in EROs, Son et al. (2018) developed an analytical method called interaction episode analysis (IEA) that incorporates cognitive systems engineering theory to represent interactions between humans and technologies to deal with tasks. The IEA enables analysts to identify three essential aspects of interactions: context (which human operators and technologies are involved in the interactions), characteristics (how often and long the interactions occur) and content (what conversation or action occurs during the interactions).

TEEX EMERGENCY OPERATIONS TRAINING CENTER

Figure 2. An observational study at the TEEX Emergency Operations Training Center was undertaken to look at how WAI and WAD were handled.

To apply the IEA to a naturalistic ERO, Son, Sasangohar, Neville et al. (2020b) conducted an observational study at the Emergency Operations Training Center of TEEX (Figure 2) and identified WAI and WAD of incident command operations. Findings from the observational study confirmed that not every expected interaction occurs and that the members of EROs strive to achieve given tasks via alternative patterns of interactions (e.g., communicating with different roles from expected ones) despite challenges (e.g., confusing information) they face during the emergency operations.
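The three aspects of an interaction episode (context, characteristics and content) and the comparison of expected with observed interactions can be sketched as follows. This is only an illustration of the idea under simple assumptions: each episode is a record of the roles involved, start and end times in seconds, and a short content note. The role names, the data and the set-difference comparison are hypothetical and do not reproduce the published IEA method.

```python
from collections import defaultdict


def summarize_interactions(episodes):
    """Count episodes and total their duration per group of parties."""
    summary = defaultdict(lambda: {"count": 0, "duration": 0})
    for ep in episodes:
        key = frozenset(ep["parties"])               # context: who is involved
        summary[key]["count"] += 1                   # characteristics: how often
        summary[key]["duration"] += ep["end"] - ep["start"]  # and how long
    return dict(summary)


def wai_wad_gaps(expected, observed):
    """Compare expected (WAI) with observed (WAD) interaction partners."""
    expected = {frozenset(p) for p in expected}
    observed = {frozenset(p) for p in observed}
    return {
        "missing": expected - observed,    # planned but never seen
        "unplanned": observed - expected,  # seen but never planned
    }


# Hypothetical incident command data.
EXPECTED = [
    ("incident commander", "planning chief"),
    ("incident commander", "operations chief"),
]
EPISODES = [
    {"parties": ("incident commander", "operations chief"),
     "start": 0, "end": 30, "content": "status update"},
    {"parties": ("operations chief", "planning chief"),
     "start": 40, "end": 70, "content": "clarify confusing report"},
    {"parties": ("operations chief", "incident commander"),
     "start": 100, "end": 160, "content": "revise objectives"},
]

summary = summarize_interactions(EPISODES)
gaps = wai_wad_gaps(EXPECTED, summary.keys())
print(summary)
print(gaps)
```

Here the expected link between the incident commander and the planning chief never occurs, while an unplanned link between the operations and planning chiefs does, which mirrors the kind of alternative interaction pattern reported in the observational study.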

To understand resilient performance of real-world EROs, Son, Sasangohar, Peres et al. (2020) and Son, Larsen et al. (2020) carried out interview studies with emergency managers of government organizations and a regional hospital that responded to Hurricane Harvey in 2017, respectively. These studies identified what has made the EROs able to function in resilient ways as well as what has hindered them from doing so.

One of the major characteristics of resilience of the EROs was creating and maintaining a common operating picture (COP), a practical concept of collective sensemaking proposed in the literature review work (Son, Sasangohar, Neville et al., 2020a). For a deeper understanding of cognition in the EROs, Moon et al. (2020) conducted an integrative review in which a number of concepts for cognition in EROs have been defined.

A well-organized and fast response will help to ease recovery, but recovery usually takes much longer and demands more attention, e.g., because of improvised action and liability cases.

WHAT ARE FUTURE RESEARCH PRIORITIES?

On error-tolerant design, including inherently safer design, work has been done at various places, but an overall integrating and resilience-oriented study is lacking.

As regards resilience in operations, a cooperative research study with industry would be welcome to see how the approach could contribute in practice using the indicator metrics. In addition, the resilience aspects of the dynamics in process operation, such as adaptation to product requirements and changing feedstock, should be investigated. In other words, how does a process’s resilience in overcoming abnormal situations change when sudden process condition modifications need to be implemented?

The previous and current efforts to identify the gaps between WAI and WAD, two essential phenomena that reveal how EROs cope with complexities imposed by disasters, are relevant to work relations in general and in particular to cases of complex operations under time pressure. A question to be tackled in the future is how we can address such gaps to strengthen resilience. Based on our current knowledge, the following are suggested as future research agenda items to better answer the question:

• Predicting the unpredictable: the gaps between WAI and WAD often result from a lack of understanding about what undesired events could happen. Therefore, future research should be focused on predicting extreme incident scenarios quickly, developing associated response plans and translating them into procedures.

• Improving organizational adaptive capacity: it is nearly impossible to consider all potential hazards and to prepare prescribed courses of action for such hazards (“thickening a rule book does not solve all the problems”). Thus, what is required to make an organization more resilient during major upset conditions is to increase adaptive capacity, that is, the ability to flexibly adjust performance to changes in the environment. An inventory of strategies used for adaptive capacity in the health care domain (Son et al., 2019) may be applicable to process procedures and training programs.

CONCLUSIONS

Resilience analysis comes on top of risk assessment, not only to increase process safety but also to deal with uncertainties, and will contribute to long-term profitability. It may influence maintenance and other work procedures so they are less vulnerable to unexpected threats and can recover more quickly if incidents occur.

REFERENCESBaybutt, P. (2015). A critique of the

Hazard and Operability (HAZOP) study. Journal of Loss Prevention in the Process Industries, 33, 52-58.

Birkland, T. A., & DeYoung, S. E. (2011). Emergency response, doctrinal confusion, and federalism in the Deepwater Horizon oil spill. Publius: The Journal of Federalism, 41(3), 471-493.

Cameron, I., Mannan, S., Németh, E., Park, S., Pasman, H., Rogers, W., & Selig-mann, B. (2017). Process hazard analysis, hazard identification and scenario definition: Are the conventional tools sufficient, or should and can we do much better? Process Safety and Environmental Protection, 110, 53-70.

Casal, A., & Olsen, H. (2016). Operational risks in QRAs. Chemical Engineering Transactions, 48, 589-594.

Cataldo, M., Herbsleb, J. D., & Carley, K. M. (2008). Socio-technical congruence: A framework for assessing the impact of technical and work dependencies on software development productivity. The 2nd ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, Kaiserslautern, Germany.

Dinh, L. T., Pasman, H., Gao, X., & Mannan, M. S. (2012). Resilience engineering of industrial processes: principles and contributing factors. Journal of Loss Prevention in the Process Industries, 25(2), 233-241.

Dinh, L. T. T. (2011). Safety-oriented resilience evaluation in chemical processes [Doctoral Dissertation, Texas A&M University]. College Station, TX.

Hollnagel, E., Paries, J., Woods, D. D., & Wreathall, J. (2011). Resilience engineering in practice: A guidebook (Vol. 3). Ashgate Publishing.

Hollnagel, E., Woods, D. D., & Leveson, N. (2006). Resilience Engineering: Concepts and Precepts. Ashgate Publishing.

Jain, P. (2018). Process Resilience Analysis Framework for Design and Operations [Doctoral Dissertation, Texas A&M University]. College Station, TX.

Jain, P., Chakraborty, A., Pistikopoulos, E. N., & Mannan, M. S. (2018). Resilience-based process upset event prediction analysis for uncertainty management using Bayesian deep learning: application to a polyvinyl chloride process system. Industrial & Engineering Chemistry Research, 57(43), 14822-14836.

Jain, P., Diangelakis, N. A., Pistikopoulos, E. N., & Mannan, M. S. (2019). Process resilience based upset events prediction analysis: Application to a batch reactor. Journal of Loss Prevention in the Process Industries, 62, 103957.

Jain, P., Mentzer, R., & Mannan, M. S. (2018). Resilience metrics for improved process-risk decision making: survey, analysis and application. Safety Science, 108, 13-28.

Jain, P., Pistikopoulos, E. N., & Mannan, M. S. (2019). Process resilience analysis based data-driven maintenance optimization: Application to cooling tower operations. Computers & Chemical Engineering, 121, 27-45.

Jain, P., Rogers, W. J., Pasman, H. J., Keim, K. K., & Mannan, M. S. (2018). A resilience-based integrated process systems hazard analysis (RIPSHA) approach: Part I plant system layer. Process Safety and Environmental Protection, 116, 92-105.

Jain, P., Rogers, W. J., Pasman, H. J., & Mannan, M. S. (2018). A resilience-based integrated process systems hazard analysis (RIPSHA) approach: Part II management system layer. Process Safety and Environmental Protection, 118, 115-124.

Jarvis, R., & Goddard, A. (2017). An analysis of common causes of major losses in the onshore oil, gas & petrochemical industries. Loss Prevention Bulletin(255).

Lauridsen, K., Kozine, I., Markert, F., Amendola, A., Christou, M., & Fiori, M. (2002). Assessment of uncertainties in risk analysis of chemical establishments: The ASSURANCE project. Final Report, Risø.

Leveson, N. G. (2004). A new accident model for engineering safer systems. Safety Science, 42(4), 237-270.

Leveson, N. G. (2011). Engineering a safer world: Systems thinking applied to safety. The MIT Press.

Moon, J., Sasangohar, F., Son, C., & Peres, S. C. (2020). Cognition in Crisis Management Teams: An Integrative Analysis of Definitions. Ergonomics, 63(9).

National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling. (2011). The Gulf Oil Disaster and the Future of Offshore Drilling: Report to the President.

Németh, E., & Cameron, I. (2018). Multi-level failure, causality and hazard insights via knowledge based systems. In Mary Kay O’Connor Process Safety Center (MKOPSC) 21st Annual International Symposium, October 23–25, College Station, TX.

Pasman, H. J., Kottawar, K., & Jain, P. (2020). Resilience of process plant: what, why, and how - How resilience can improve safety and sustainability. Sustainability, 12, 6152; doi:10.3390/su12156152

Pasman, H. J., Rogers, W. J., & Mannan, M. S. (2017). Risk assessment: What is it worth? Shall we just do away with it, or can it do a better job? Safety Science, 99, 140-155.

Rasmussen, J. (1997). Risk management in a dynamic society: a modelling problem. Safety Science, 27(2-3), 183-213.

Son, C., Larsen, E. P., Sasangohar, F., & Peres, S. C. (2020). Opportunities and Challenges for Resilient Hospital Incident Management: Case Study of a Hospital’s Response to Hurricane Harvey. Journal of Critical Infrastructure Policy, 1(1), 81-104.

Son, C., Sasangohar, F., Neville, T., Peres, S. C., & Moon, J. (2020a). Investigating resilience in emergency management: An integrative review of literature. Applied Ergonomics, 87, 103114.

Son, C., Sasangohar, F., Neville, T. J., Peres, S. C., & Moon, J. (2020b). Evaluation of work-as-done in information management of multidisciplinary incident management teams via Interaction Episode Analysis. Applied Ergonomics, 84, 103031.

Son, C., Sasangohar, F., Peres, S. C., & Moon, J. (2020). Muddling through troubled water: resilient performance of incident management teams during Hurricane Harvey. Ergonomics, 63(6), 643-659.

Son, C., Sasangohar, F., Peres, S. C., Neville, T. J., Moon, J., & Mannan, M. S. (2018). Modeling an incident management team as a joint cognitive system. Journal of Loss Prevention in the Process Industries.

Son, C., Sasangohar, F., Rao, A. H., Larsen, E. P., & Neville, T. (2019). Resilient performance of emergency department: Patterns, models and strategies. Safety Science, 120, 362-373.

Starbird, K., Dailey, D., Walker, A. H., Leschine, T. M., Pavia, R., & Bostrom, A. (2015). Social media, public participation, and the 2010 BP Deepwater Horizon oil spill. Human and Ecological Risk Assessment: An International Journal, 21(3), 605-630.

Suokas, J., & Rouhiainen, V. (1989). Quality control in safety and risk analyses. Journal of Loss Prevention in the Process Industries, 2(2), 67-77.

Taylor, R. (2016). Can process plant QRA reduce risk? Experience of ALARP from 92 QRA studies over 36 years. Chemical Engineering Transactions, 48, 811-816.

U.S. Coast Guard. (2010). National Incident Commander’s Report: MC252 Deepwater Horizon. http://www.nrt.org/production/NRT/NRTWeb.nsf/AllAttachmentsByTitle/SA-1065NICRe-port/$File/Binder1.pdf

Weick, K. E., & Sutcliffe, K. M. (2011). Managing the unexpected: Resilient performance in an age of uncertainty (Vol. 8). John Wiley & Sons.

DR. HANS J. PASMAN is a Research Professor at the Mary Kay O’Connor Process Safety Center at Texas A&M University. Together with being Emeritus Professor of Chemical Risk Management at the Delft University of Technology, and in management of TNO Industrial Safety NL, he has more than 50 years of experience in various roles related to almost all areas of process safety and risk management. He can be reached at [email protected].

CHANGWON SON is a graduate student at the Applied Cognitive Ergonomics Lab, Industrial and Systems Engineering Department, Texas A&M University, and the Mary Kay O’Connor Process Safety Center. Email: [email protected].


Abandon-in-Place Must End

Leaving equipment derelict instead of demolishing it can prove costly.

By Dirk Willard, Contributing Editor

At 2:09 p.m. on February 16, 2007, a cracked elbow in bypassed piping at the Valero-McKee refinery in Sunray, Texas, leaked propane. This leak triggered a series of events around a de-asphalting extraction column that, although serious, could have been much worse. As it was, three workers were severely burned. The U.S. Chemical Safety Board (CSB) investigated and concluded that a foreign object had probably lodged in the block valve that was supposed to isolate the dead-leg section of piping. Water, an impurity in the feed, accumulated in the elbow. In early February, temperatures dropped well below freezing; the water turned to ice and expanded, cracking the elbow. Then, when temperatures rose and the ice melted, propane escaped.

It’s interesting to speculate on who opened the block valve and why. The dead leg was properly isolated for many years. This question wasn’t addressed in the CSB video: “Fire from Ice,” http://www.chemsafety.gov/.

The CSB found the dead-leg piping originally was designed to provide propane mixed with pitch to the top of the extraction column. In the early 1990s the process was changed, presumably after a hazards and operability (HAZOP) review.

This accident highlights a common problem in our industry. Plants sideline equipment and processes for weeks, months or even years. The units are isolated, or perhaps not, and allowed to rust. I’ve seen this at refineries, chemical and food plants, and even in municipal water-treatment facilities. This approach euphemistically is called “abandon-in-place.” Regardless of what you call it, it’s a poor engineering practice.

At Anheuser-Busch, we abandoned piping because of asbestos insulation; it was cheaper to leave it in place than to remove the asbestos. In the end we took out the pipe because block valves leaked, causing product contamination. We had a similar situation when I worked at Ralston Purina.

Sometimes, abandon-in-place can involve electrical service. Derelict switchgear, it seems to me, prompts a great many so-called ground faults. At the isomerization unit of a refinery such a fault caused a power blip. As a process engineer at a chemical plant I saw a similar problem affect our distributed control system and backup programmable logic controller. These incidents, really near misses, are more serious than ones that spoil product.

When I was at Anheuser-Busch we lost several batches of yeast because water poured down inside a rusty neglected starter box, causing a ground-fault trip. If the electrician had pulled the wires as instructed, this never would have occurred.

SO HOW CAN THESE HAZARDS BE AVOIDED?

For aboveground piping, the solution is simple. Install complete isolation: a blind flange is better than a block valve; a weld cap is best. For underground piping, costs increase significantly, but digging is a lot cheaper than risking an accident or a visit from the EPA. You have more to worry about than leaks.

Dealing with wiring can pose greater challenges, especially at an older site. Drawings may not be current or accurate; this is especially true for one-line diagrams.

Regardless of the nature of the abandoned equipment, a management-of-change committee should review its disposition. Inspect the equipment after it’s been isolated to ensure it complies with the committee’s wishes. Don’t forget the drawings! Electrical and mechanical diagrams should reflect current plant operation.

While much of the responsibility falls on operating staff, design engineers aren’t completely off the hook. They should allow for potential situations that require bypassing a line. Ideally, design for double-block-and-bleed (DB&B). This is familiar to most engineers from the food-and-drug side of our business but new to refineries. DB&B enables safe removal of equipment such as pressure gauges. If DB&Bs had been used instead of a single block valve, Valero might have avoided its accident.

NOW, LET’S MOVE ON TO “TEMPORARILY” BYPASSED EQUIPMENT.

Some engineers expect such equipment to perform like new. Don’t be fooled. Take special precautions to avoid problems. Maintain records of the equipment. Completely tear down rotary equipment left standing for more than a month, even if it was properly stored with the correct lubricant (make sure the storage lubricant is compatible with the lubricant required when the equipment is running). Check and clean static equipment such as tanks. Replace all relief equipment. Inspect high-maintenance items. And, of course, pressure-test the bypassed process. As with most start-ups, a thorough checklist can prevent anything from being overlooked. Modify the checklist to include what-if provisions. This tool will provide inexperienced engineers with some insight into solving problems.
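The restart precautions listed above can be captured as a simple rule-based checklist. The following is a minimal sketch in Python, not a validated procedure: the equipment tags, the `Equipment` class, the function names and the reading of "more than a month" as 30 days are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumption: "more than a month" is read as more than 30 idle days.
ROTARY_TEARDOWN_DAYS = 30


@dataclass
class Equipment:
    tag: str                      # hypothetical equipment tag, e.g. "P-101"
    kind: str                     # "rotary", "static" or "relief"
    idle_days: int                # days the item has stood bypassed
    high_maintenance: bool = False


def restart_actions(item: Equipment) -> List[str]:
    """Return the precautions named in the text that apply to one item."""
    actions = []
    if item.kind == "rotary" and item.idle_days > ROTARY_TEARDOWN_DAYS:
        actions.append("tear down completely and confirm lubricant compatibility")
    elif item.kind == "static":
        actions.append("check and clean")
    elif item.kind == "relief":
        actions.append("replace")
    if item.high_maintenance:
        actions.append("inspect")
    return actions


def build_checklist(items: List[Equipment]) -> Dict[str, List[str]]:
    """Build a per-item checklist and add the process-wide pressure test."""
    checklist = {item.tag: restart_actions(item) for item in items}
    checklist["bypassed process"] = ["pressure-test"]
    return checklist


if __name__ == "__main__":
    plant = [
        Equipment("P-101", "rotary", idle_days=45),
        Equipment("T-201", "static", idle_days=10, high_maintenance=True),
        Equipment("PSV-3", "relief", idle_days=5),
    ]
    for tag, actions in build_checklist(plant).items():
        print(tag, "->", actions)
```

What-if provisions could be added as extra entries per tag; the sketch leaves them out because their content is site-specific.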

With good judgment and caution, you can avoid accidents involving bypassed equipment.


The Emerging Hydrogen Economy Demands Attention

Recent accidents underscore the need for inherently safer designs

By Nilesh Ade, Texas A&M University

As the world moves increasingly toward a hydrogen economy, the safe use and management of this important material is coming into sharper focus. Researchers at the Mary Kay O’Connor Process Safety Center (MKOPSC) are identifying safer design alternatives for the hydrogen economy to facilitate its further commercialization.

HYDROGEN ECONOMY

John Bockris coined the term “hydrogen economy” in the 1970s. It refers to the use of hydrogen as a fuel in a clean energy system instead of conventional hydrocarbon fuels. Hydrogen is the lightest and one of the most abundant elements on earth and is present in molecules of organic compounds and water. Hydrogen currently is used in gaseous form for various industrial applications in metallurgy, the chemical industry, the glass industry, etc.

Based on analyses by a market research firm, Markets and Markets, approximately 55% of global hydrogen demand is for ammonia synthesis, 25% is for refineries and 10% is for methanol production. The hydrogen market can be classified broadly into a “merchant hydrogen market” and a “captive hydrogen market.” The merchant hydrogen market comprises central production of hydrogen that is supplied to end-point consumers through transportation methods such as truck delivery and pipeline. In the captive hydrogen market, hydrogen is produced on-site by the consumers themselves. Currently the captive hydrogen market dominates the overall hydrogen market with a 95% market share. However, the merchant hydrogen market share is growing rapidly at a rate of 7% per year.

Interestingly, hydrogen is gaining increasing attention as a potential fuel in the energy sector, which would allow the establishment of an overall hydrogen economy. Hydrogen can be used as an energy carrier in both stationary and transport applications. This increased interest in hydrogen as an energy carrier is based on the following advantages:

• Combustion of hydrogen forms water (as steam) as its by-product, as opposed to the environmental pollutants from combusting conventional hydrocarbon fuels.

• Hydrogen is nontoxic.

• A major raw material is water (which is abundantly present on earth).

• It can be used as an energy source in fuel cells, which convert hydrogen into electricity without an intermediate combustion step, resulting in higher efficiency.

• Transmitting hydrogen over long distances is more economical than high-voltage AC power transmission.

Hydrogen typically exists in a compound form; thus, the generation methods for hydrogen involve extracting it from these compounds. Some of the current and future methods for hydrogen production include steam reforming of hydrocarbons, partial oxidation of hydrocarbons, thermal decomposition of hydrogen-containing compounds, thermochemical cycles, electrolysis of water, electrochemical decomposition, photolysis of water, photochemical decomposition of water, photo-electrochemical decomposition of water and biological decomposition. A schematic representation of the hydrogen economy is shown in Figure 1.

SAFETY INCIDENTS RELEVANT TO THE HYDROGEN ECONOMY

Although the application of hydrogen as a fuel is gaining attention, safety concerns are among the key factors inhibiting its growth. These concerns arise mainly because of hydrogen’s unique properties such as fast burning speed, high energy content, low ignition energy and wide flammability range. Since 1969, more than 200 incidents pertaining to hydrogen have been reported, and these incidents serve as a major hindrance to the hydrogen economy’s further commercialization.

[Figure 1. HYDROGEN ECONOMY. This chart shows the various stages of the hydrogen economy, from production to usage. Stages: hydrogen production (electrochemical, thermochemical and biochemical methods); hydrogen packaging (compression, liquefaction and hydrides); hydrogen distribution (pipelines, road, rail, ship); hydrogen storage (pressure and cryogenic); hydrogen usage (transportation and stationary applications).]

A study published in International Journal of Hydrogen Energy examined 32 hydrogen-related incidents that occurred before 2011. It found that approximately 44% of these incidents resulted in a fire, 31% resulted in an explosion, and 16% involved both fire and explosion. Only a small fraction of these incidents (9%) were near misses. One of the study’s key findings was that “design error” constituted the primary cause leading to these incidents.
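As a quick consistency check on the percentages quoted above, the short Python sketch below converts them back into whole incident counts. The counts are inferred here by rounding and are an assumption, not figures taken from the cited study; the variable and function names are illustrative.

```python
TOTAL_INCIDENTS = 32

# Percentages as quoted in the text (rounded values).
reported_pct = {
    "fire": 44,
    "explosion": 31,
    "fire and explosion": 16,
    "near miss": 9,
}


def implied_counts(total, pct_by_outcome):
    """Round each percentage of the total to the nearest whole incident."""
    return {k: round(total * p / 100) for k, p in pct_by_outcome.items()}


counts = implied_counts(TOTAL_INCIDENTS, reported_pct)

# The rounded counts should account for every incident in the study.
assert sum(counts.values()) == TOTAL_INCIDENTS

for outcome, n in counts.items():
    print(f"{outcome}: {n} of {TOTAL_INCIDENTS}")
```

Under this rounding assumption the split works out to 14 fires, 10 explosions, 5 combined fire-and-explosion events and 3 near misses, which together account for all 32 incidents.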

PEMFC forklift incident in May 2018. One recent hydrogen-related explosion was the forklift explosion that occurred on May 24, 2018, at a Procter & Gamble plant in Pineville, Louisiana. The forklift was based on proton exchange membrane fuel cell (PEMFC) technology. The incident led to the forklift operator’s death and injured six other plant personnel.

Although the causes behind the explosion remain unclear, a lawsuit was filed against the forklift manufacturing company, citing a design defect as one of the reasons leading to the fatality. The incident also resulted in significant economic repercussions for the forklift manufacturing company, thus hampering the growth of the hydrogen economy.

Hydrogen refueling station explosion in June 2019. Another recent hydrogen-related explosion is the hydrogen refueling station (HRS) explosion that occurred in Sandvika, Norway, on June 10, 2019. The blast triggered the airbags of cars in the station’s vicinity and led to two injuries. The incident currently is under investigation; however, the preliminary findings indicate an assembly error in the plug of the high-pressure accumulators led to a hydrogen leak and subsequent explosion. Following the incident, 10 HRSs constructed by the associated company were closed temporarily, reiterating the impact of safety incidents on further hydrogen economy implementation.

INHERENTLY SAFER DESIGN PHILOSOPHY

The inherently safer design (ISD) philosophy was put forth by Dr. Trevor Kletz in his seminal article “What You Don’t Have Can’t Leak,” which encourages designing processes that eliminate or reduce process hazards. ISD philosophy is based on the following principles:

• Intensification or minimization is reducing the amount of hazardous chemicals involved in the process.

• Substitution means replacing a hazardous chemical in the process with a safer one. The hazards associated with a chemical can be determined according to its flammability, explosiveness, toxicity and chemical reactivity.

• Attenuation or moderation refers to reducing the severity of operating conditions (such as operating temperature and pressure) of the process involving hazardous chemicals.

• Limitation of effects involves altering the design (mainly process design and operating conditions) based on the hazards associated with the process to limit the effect of hazardous chemicals.

• Simplicity is designing simpler plants with relatively fewer pieces of equipment to reduce opportunities for failures in the process.

The ISD philosophy typically is used to generate alternative process designs that improve overall safety. It is based on the understanding that modifying the process in the early design stages can be most effective in reducing the associated process hazards.

RESEARCH OBJECTIVES

A detailed literature review performed as a part of this study identified significant research gaps. First, little research in the current scientific literature addresses hydrogen safety incidents such as those discussed above or how to reduce the risk of such incidents. Second, a substantial gap existed between safety research and fundamental engineering research relevant to hydrogen.

Limited research was carried out that involved a comparative analysis between safety and performance for components of a hydrogen economy such as PEMFC and HRS. Last, although the ISD philosophy has been applied widely in onshore and offshore chemical processing facilities, its application to all components of a hydrogen economy was limited. Based on the identified research gaps, the following research objectives were proposed:

• Identify the potential causes that led to improper design of hydrogen economy components in recent incidents (forklift explosion and HRS explosion).

• Apply the ISD philosophy to investigate potential improvements to component design.

• Perform a comparative analysis between a safety metric and a performance metric for the suggested ISD alternatives so that performance is not affected negatively while improving their safety.

TOWARD AN INHERENTLY SAFER HYDROGEN ECONOMY

A two-fold study was proposed to achieve the above objectives. The first part focuses on the PEMFC’s design relating to the forklift explosion incident. The second part focuses on the HRS design relating to the HRS explosion incident. In Part I, a mathematical model was developed that relates the microscale PEMFC degradation to the probability of a macroscale explosion in a fuel cell electric vehicle (FCEV).

Using the model and the inherent safety principle of intensification, one can conclude that increasing the PEMFC system’s operating temperature can significantly improve both its safety and durability, while intensifying membrane design parameters (i.e., membrane thickness and membrane conductivity) does not provide any significant improvement. A key observation from this study is that a PEMFC system’s durability (expressed in voltage loss) and safety (expressed in explosion probability) are not correlated perfectly.

In Part II, the research team developed an integrated model using queuing theory, process synthesis, quantitative risk assessment (QRA) and economic analysis for designing HRS. Currently HRS designs are based primarily on economic considerations, to supply hydrogen at a competitive price, and their safety is evaluated through QRA as dictated by the codes and standards.

However, the lack of a relevant safety perspective in the design stage itself creates the possibility of an HRS being overdesigned in terms of safety. The application of the integrated model was demonstrated using ISD philosophy. For the base design under consideration, the results indicated that reducing liquid storage capacity can significantly reduce the risk associated with explosion while also improving HRS economics, whereas reducing the dispenser hose diameter can reduce the risk associated with jet fire at a slight detriment to HRS economics. These two studies were published in International Journal of Hydrogen Energy to disseminate the findings. The article titles are as follows:

• “Intensifying vehicular proton exchange membrane fuel cells for safer and durable, design and operation”

• “An integrated approach for safer and economical design of hydrogen refueling stations”
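To illustrate the kind of calculation the queuing-theory element of such an integrated model might involve, the sketch below estimates customer waiting at a station with several dispensers. It is a minimal sketch only: the text does not say which queuing model the researchers used, so the standard M/M/c (Erlang C) formulas are an assumption, and the arrival and service rates are hypothetical numbers chosen for the example.

```python
from math import factorial


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving vehicle has to wait in an M/M/c queue."""
    offered_load = arrival_rate / service_rate      # a = lambda / mu
    utilization = offered_load / servers            # rho = a / c
    if utilization >= 1:
        raise ValueError("queue is unstable: arrivals exceed total capacity")
    waiting_term = (offered_load ** servers / factorial(servers)) / (1 - utilization)
    idle_terms = sum(offered_load ** k / factorial(k) for k in range(servers))
    return waiting_term / (idle_terms + waiting_term)


def mean_wait(arrival_rate, service_rate, servers):
    """Mean time in queue: Wq = C(c, a) / (c * mu - lambda)."""
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical station: 6 vehicles/hour arrive, each fill takes 15 minutes.
    arrivals_per_hour = 6.0
    fills_per_hour_per_dispenser = 4.0
    for dispensers in (2, 3, 4):
        wait_min = 60 * mean_wait(arrivals_per_hour,
                                  fills_per_hour_per_dispenser,
                                  dispensers)
        print(f"{dispensers} dispensers: mean wait {wait_min:.1f} min")
```

In a design study, a result like this could feed both the economic analysis (number of dispensers) and the QRA (more dispensers and hoses mean more potential leak sources), which is the safety-versus-economics trade-off the text describes.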

CONCLUSIONS AND FUTURE WORK

It can be concluded from this study’s findings that ISD can be effective for improving hydrogen system safety. However, ISD implementation can result in both beneficial and detrimental effects on the performance/economics and overall safety of these systems. It is imperative to support such ISD improvements with a holistic analysis, incorporating both safety and performance quantification.

The current research focus is on hydrogen production within the hydrogen economy. This focus is motivated by the explosion that occurred at a hydrogen reforming facility in Catawba County, North Carolina, on April 7, 2020. The explosion damaged 60 homes in the facility’s vicinity while negatively affecting the perception of hydrogen as a fuel, thus hindering hydrogen economy growth.

The incident highlights the need to revisit the design of the traditional steam reforming process. The research team currently is investigating potential ISD-based safety improvements in the reforming process while trying to ensure that the economics of hydrogen production remain competitive with hydrocarbon-based fuels.

NILESH ADE is a Ph.D. student at the Mary Kay O’Connor Process Safety Center in the Chemical Engineering Department at Texas A&M University. He earned his bachelor’s degree in Chemical Engineering from the Institute of Chemical Technology, Mumbai. Nilesh has been involved in multiple areas of research in process safety including consequence analysis, inherent safety, reliability analysis, quantitative risk analysis, and human factors. Nilesh is currently working on the safety of the hydrogen economy as part of his dissertation. He can be reached at [email protected].