human performance - southwest power pool · 2019. 2. 8. · characteristics, performance...
TRANSCRIPT
Human Performance: The Science behind the Tools
James Merlo, PhD October 2012
2 RELIABILITY | ACCOUNTABILITY
NERC Pillars
• Reliability – to address events and identifiable risks, thereby improving the reliability of the bulk power system.
• Assurance – to provide assurance to the public, industry, and government for the reliable performance of the bulk power system.
• Learning – to promote learning and continuous improvement of operations and adapt to lessons learned for improvement of bulk power system reliability.
• Risk-based model – to focus attention, resources and actions on issues most important to bulk power system reliability.
Top Priority Reliability Issues
• Misoperations of relay protection and control systems
• Human errors by field personnel
• Ambiguous or incomplete voice communications
• Right-of-way maintenance
• Changing resource mix
• Integration of new technologies
• Preparedness for high impact, low frequency events
• Non-traditional threats via cyber security vulnerabilities
NERC President’s Top Priority Issues for Bulk Power System Reliability, http://www.nerc.com/news_pr.php?npr=723 at http://www.nerc.com/fileUploads/File/News/NERC%20President%20Top%20Priority%20BPS%20Reliability%20Issues%201-7-11.pdf
Reliability Risk Management Concept
Severity
Inverse Cost-Benefit
Reporting Threshold
Learn and Reduce
Avoid
Drifting to Failure*
Latent Error: Inconspicuous and seemingly harmless buildup of “hidden” error and organizational weaknesses
[Figure: Reliability (Lo to Hi) versus Time, with curves labeled Stated Expectations and “Normal” Practice and regions labeled Drift, Real Margin for Error, and Error]
Expectations: Desired approach to work (as imagined)
Normal Practices: Work as actually performed (allowed by mgmt!)
Error
Hidden hazards, threats, unusual conditions, & system weaknesses
* Adapted from Muschara Error Management Consulting, LLC
Peer Check
Safety Check
Too Hard?
• “Complicated Industry”
• “Come a long way”
• “Can’t get to zero”
• “Automate, technology reduces the need for human operator”
Challenge
www.airlines.org/PublicPolicy/Testimony/Pages/testimony_5-13-09Senate.aspx
Human Performance
It is not a matter of if the automation fails, it is a matter of when.
Stuff Happens
Human Performance Analysis
• We have not fully understood an event if we don’t see the actors’ actions as reasonable.
• The point of a human error investigation is to understand why people did what they did, not to judge them for what they did not do.
• The difference between an accident and a serious incident lies only in the result.
Flowchart for Human Performance
Sometimes it is a Human
Human Performance Tenets
• People are fallible, and all people make mistakes
• Error-likely situations are predictable, manageable, and preventable
• Individual behavior is influenced by organizational processes and values
• People achieve high levels of performance largely because of the encouragement and reinforcement received from leaders, peers, and subordinates
• Events can be avoided through an understanding of the reasons mistakes occur and application of the lessons learned from past events or near misses
Five Questions
• Elegant simplicity
• Know thy user
• The rat is never wrong
• Actions not words
• You can’t afford not to know the truth
Elegant simplicity
• Elegant simplicity
– Russians and the US Space Program
– How many tools in the box?
– The tool shouldn’t be harder than the task.
– Surround the truth…it is out there somewhere…
Human Performance Tools
• Two Minute rule
• Stop when unsure
• Self checking (also called STAR and touch STAR)
• Procedure use and adherence
• Three way communication
• Phonetic alphabet
• Pre-job brief
• Peer check
• Concurrent verification
• Independent verification
• Flagging operational barriers
• Place keeping
• Post job interview
• First Check
Human Performance Tools
Read All About It
Know thy user
• Know thy user
– Human Ingenuity
– Only two hands, two eyes, see the pattern?
– If you only have a minute, it only takes a minute…
– Set me up for success…please…
– Human nature
Darnell, M. J. (2006). Bad Human Factors Designs. Baddesigns.Com
Signs
The rat is never wrong
• The rat is never wrong
– Behaviorism
– Not enforcing a policy is like not having a policy at all
– Don’t have a rule that you aren’t going to enforce
The rat is never wrong
Human behavior is shaped by interaction in the world…
• Punishment stops behavior
• Reinforcement shapes and sustains behavior
Silence is Consent
Punishment vs. Negative Reinforcement
Does the behavior increase or decrease?
Get ‘er done
Actions not words
• Actions not words
– It is not important unless it is checked.
– What is your story?
– Are you telling your story up or down?
– Live the dream
Tell your story…
You can’t afford not to know the truth
• You can’t afford not to know the truth
– Root cause
– Just Culture
– Near misses
A Tale of Two Cylinders
Or…When Good Pistons Go Bad!
Why Root Cause Versus Apparent Cause?
• Facts:
– Jeep had 107k miles
– Cylinders were fine…no abrasions (whew, got lucky)
– Approximately $2,500 to completely rebuild, same block just new pistons
– Just mean time between failures for pistons…or maybe not
The Rest of the Story…
• Mechanic noticed some scalding on other pistons
• No history of ever overheating
• Jeep was hit on right side, at 70k miles
• Right fender was replaced, radiator and fan blade...no damage to engine block
• New fan blade was installed backwards!
• Jeep was running hotter than it should…just slightly…not enough to notice…and there was a new owner so there was no baseline
Peer Check
Safety Check
Can Your Organization Handle the Truth?
“Before you tell the ‘truth’ to the patient, be sure you know the ‘truth,’ and that the patient wants to hear it.”
Dr. Richard Clarke Cabot (1868-1939), Journal of Chronic Diseases (1963)
Five Questions
• Elegant simplicity
– Russians and the Space Program
• Know thy user
– Human Ingenuity
• The rat is never wrong
– Behaviorism
• Actions not words
– It is not important unless it is checked.
• You can’t afford not to know the truth
– Root Cause
Human Performance Tools
Six Human Considerations
• Attention
• Sensation
• Perception
• Cognition
• Decision making
• Action
Attention
• Spotlight metaphor
• Each modality has its strengths
• Multiple Resource Theory
Photo by isafmedia. Used with permission.
Primarily Scanning
Primarily Focusing
The sniper’s mental state is Focused. The spotter’s mental state is Scanning. Both communicate effectively with each other. The result? Situational Awareness that you can bet your life on.
Primarily Scanning
Primarily Focusing
Communicating
Sensation
• Human limitations
• Absolute Threshold
• Physiological Psychology
Regular Insulin N (for NPH Insulin).
Hindsight is 20/20
Perception
• Perception is Reality
• Bottom Up versus Top Down
• Expectations
What about the science…
• CAPITAL LETTERS
WORD
EASIER
FASTER
BUT BECOMES MORE DIFFICULT WHEN PART OF A SENTENCE BECAUSE…
We use context to read and the shape matters
Communicating
Biological Bases for Behavior
Cognition
Cognition
• What are you thinking about?
• Working memory versus Long Term Memory
• Experts versus Novices
Limited Working Memory
• Mind's short-term memory is the “workbench” for problem solving and decision-making.
• Actively involved during learning, storing, and recalling information.
• Often expressed as 7 +/- 2.
• Limitations of short-term memory are at the root of forgetfulness; forgetfulness leads to omissions when performing tasks.
• Applying place-keeping techniques while using complex procedures compensates for this human limitation.
Working Memory
• Size 7 +/- 2 chunks
• VAFBICIADODIRA
– VA FBI CIA DOD IRA
• Area codes
• Credit card numbers are divided into chunks….
• Expert memories…or really good chunkers
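The chunking idea can be sketched in a few lines of Python. This is only an illustration: the function names `chunk` and `fits_in_working_memory` and the choice of 9 as the upper bound (7 + 2) are assumptions made for the example, not part of the talk.

```python
def chunk(text, sizes):
    """Split text into consecutive chunks of the given sizes."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(text[start:start + size])
        start += size
    if start != len(text):
        raise ValueError("chunk sizes must cover the whole string exactly")
    return chunks


def fits_in_working_memory(item_count, upper_limit=9):
    """True if the item count is within the assumed 7 + 2 upper bound."""
    return item_count <= upper_limit


letters = "VAFBICIADODIRA"
groups = chunk(letters, [2, 3, 3, 3, 3])

# As single letters the string exceeds the assumed limit; as familiar
# acronyms it drops to a handful of items.
print(len(letters), "letters:", fits_in_working_memory(len(letters)))
print(len(groups), "chunks:", " ".join(groups), fits_in_working_memory(len(groups)))
```

The same string is either too many separate items or a short list of familiar ones, depending only on how it is grouped.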
Doubting me still….
• Alphabet 26 letters…
or 8 chunks?
ABCD EFG HIJK LMNOP QRS TUV WX YZ
Mental Model
• One’s understanding of a system, how it operates, its characteristics, performance parameters, couplings within itself and other systems and how one interacts with it.
• It is a representation of the surrounding world, the relationships between its various parts and a person's intuitive perception about his or her own acts and their consequences.
• Our mental models help to shape our behavior and define our approach to solving problems (a personal algorithm) and carrying out tasks, especially within a system.
• Mental models are like opinions: they can be partially or completely right or wrong, complete or incomplete, and they are most often unique to each individual.
Perfectly aligned mental model
Improper Mental Model Example
Some people believe that you can heat/cool a room faster by setting the thermostat to a higher/lower temperature than you really want, as if the thermostat were a valve for the heating/cooling system that lets more heat/cool air into the room the higher/lower you set it. In fact, the thermostat is simply an on/off switch for the heat/cool. It turns on as long as the room temperature is below/above the thermostat setting, and turns off when the thermostat setting is reached.
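A minimal simulation makes the on/off point concrete. Everything numeric here is a hypothetical assumption for illustration (a 60 degree starting temperature, a heater that adds 0.5 degrees per minute, no heat loss), and the function names are invented for the sketch.

```python
def minutes_to_reach(target, setpoint, start_temp=60.0, heat_rate=0.5, max_minutes=600):
    """Minutes until the room first reaches `target` with the thermostat at `setpoint`.

    The heater is a plain on/off switch: fully on while the room is below
    the setpoint, off otherwise. heat_rate is degrees gained per minute.
    """
    temp, minutes = start_temp, 0
    while temp < target and minutes < max_minutes:
        if temp < setpoint:  # on/off decision, no "more heat" for higher settings
            temp += heat_rate
        minutes += 1
    return minutes


def temperature_after(minutes, setpoint, start_temp=60.0, heat_rate=0.5):
    """Room temperature after the given number of minutes (no heat loss modeled)."""
    temp = start_temp
    for _ in range(minutes):
        if temp < setpoint:
            temp += heat_rate
    return temp


# Setting the thermostat to 90 does not get the room to 70 any sooner...
print(minutes_to_reach(70, setpoint=70), minutes_to_reach(70, setpoint=90))
# ...it only means the heater keeps running past 70.
print(temperature_after(60, setpoint=70), temperature_after(60, setpoint=90))
```

Under these assumptions the two settings reach 70 degrees at the same moment; the higher setting only overshoots afterwards, which is what the "valve" mental model fails to predict.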
Contextual Task Analysis
Max Whittaker for The New York Times
Hydro Quebec
Decision Making
• Information overload
• Experts vs Novices
• Heuristics and Biases
Information Overload
88 RELIABILITY | ACCOUNTABILITY
Heuristics and Biases
• Avoidance of Mental Strain – Humans are reluctant to engage in lengthy concentrated thinking, as it requires high levels of attention for extended periods. Thinking is a slow, laborious process that requires great effort. People tend to look for familiar patterns and apply well-tried solutions to a problem. The mental biases and heuristics, or shortcuts, often used to reduce mental effort and expedite decision-making include:
• Assumptions – A condition taken for granted or accepted as true without verification of the facts.
• Habit – An unconscious pattern of behavior acquired through frequent repetition.
Heuristics and Biases
• Confirmation bias – The reluctance to abandon a current solution (to change one's mind) in light of conflicting information due to the investment of time and effort in the current solution. This bias orients the mind to “see” evidence that supports the original supposition and to ignore or rationalize away conflicting data.
• Similarity bias – The tendency to recall solutions from situations that appear similar to those that have proved useful from past experience.
• Frequency bias – A gamble that a frequently used solution will work; giving greater weight to information that occurs more frequently or is more recent.
• Availability heuristic – The tendency to settle on solutions or courses of action that readily come to mind and appear satisfactory; more weight is placed on information that is available (even though it could be wrong).
Confirmation Bias Continued
• People generally seek evidence that will confirm, not falsify, a hypothesis
• Solve problems and syllogisms by applying information to pre-existing schemas
• More relevant = easier to solve
• The Bottom Line: People are not logic machines who can plug any problem into a logical formula
Availability Heuristic
• Estimating the likelihood of events based on their availability in memory
• If instances come readily to mind (perhaps because of their vividness), we presume such events are common
• We tend to be overly influenced by events that come easily to mind
Availability Heuristic
Is the letter “k” most likely to occur in the first position of a word or the third position?
Availability Heuristic
• Answer: “k” is 2-3 times more likely to be in the third position
• Most people respond that “k” is more frequent in the first position. Why does this occur?
• Because it is easier to recall words starting with “k”, people overestimate the number of words starting with “k”
Availability Heuristic
Which of the following are more frequent causes of death in the U.S.?
Rate how confident you are in your choice on a scale from 0 (guessing) to 100 (absolutely certain that your choice is correct).
1. All accidents or strokes? confidence rating?
2. Electrocution or asthma? confidence rating?
3. Homicide or diabetes? confidence rating?
4. Lightning or appendicitis? confidence rating?
5. Drowning or leukemia? confidence rating?
Availability Heuristic
Which of the following are more frequent causes of death in the U.S.?
1. All accidents (55,000) or strokes (102,000)
2. Electrocution (500) or asthma (920)
3. Homicide (9,200) or diabetes (19,000)
4. Lightning (52) or appendicitis (440)
5. Drowning (3,600) or leukemia (7,100)
Moving slow to move fast
• All human performance tools deliberately slow things down to ultimately speed things up by avoiding delays that accompany events triggered by active errors.
• When used conscientiously, these tools give the individual more time to think about the task at hand; about what is happening, what will happen, and what to do if things do not go as expected.
Action
Rasmussen’s Classifications
• Human Error Classifications
– Skill Based
– Rule Based
– Knowledge Based
• Driving example: Oftentimes a human will operate at all three levels, going back and forth in a single event.
Improper Mental Model (cont)
• Skill – Not really affected
• Rule – Usually not a factor
• Knowledge – Real problem
Six Human Considerations
• Scan: Attention, Sensation
• Focus: Perception, Cognition
• Act: Decision making, Action
Situational Awareness
• Situational awareness is defined as the accuracy of a person’s current knowledge and understanding of actual conditions compared to expected conditions at a given time (DOE).
• The perception of the elements in the environment within a volume of time and space, the comprehension of their meaning and the projection of their status in the near future.
Endsley, M. R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37(1), 32-64.
Situational Awareness
http://en.wikipedia.org/wiki/Situation_awareness
Stress
• Stress is the body’s mental and physical response to a perceived threat(s) in the environment. It is the perception one has about his or her ability to cope with the threat.
• Stress in itself is not a bad thing. Some stress is normal and healthy. Stress may result in more focused attention, which in some situations could actually be beneficial to performance.
• The problem with stress is that it can accumulate and overpower a person, thus becoming detrimental to performance. Stress increases as familiarity with a situation decreases. It can result in panic, inhibiting the ability to effectively sense, perceive, recall, think, or act. Anxiety and fear usually follow when an individual feels unable to respond successfully.
• Along with anxiety and fear, memory lapses are among the first symptoms to appear. The inability to think critically or to perform physical acts with accuracy soon follows.
Is stress always a bad thing?
Inverted-U Hypothesis
emotional arousal vs task performance
TWIN - Error Precursors
Task Demands
• Time pressure (in a hurry)
• High workload (memory requirements)
• Simultaneous, multiple tasks
• Repetitive actions (monotony)
• Unclear goals, roles, or responsibilities
• Lack of or unclear standards
• Complex / high information flow
Work Environment
• Distractions / interruptions
• Changes / departure from routine
• Confusing displays / control
• Work-arounds
• Unexpected equipment conditions
• Back shift or recent shift change
Individual Capabilities
• Unfamiliarity with task (first time)
• Lack of knowledge (faulty mental model)
• Imprecise communication habits
• Lack of proficiency; inexperience
• Overzealousness for safety critical task
• Illness or fatigue – Fitness for duty
• Lack of big picture
Human Nature
• Stress
• Habit patterns
• Assumptions
• Complacency / overconfidence
• Inaccurate risk perception
• Communication shortcuts
Latent Organizational Weaknesses
• Pre-Job Briefing
• Values & Norms
• Communications – Oral & Written
• Maintenance Processes
• Work Planning & Scheduling
• Procedure Development
• Controls, Measures and Monitoring
• Goals & Priorities
• Design & Modifications
• Organizational Structure
• Task Structure
• Roles & Responsibilities
• Written Guidance: Rules, Policies and Practices
• Training & Qualification
A review of the INPO industry event database reveals that events occur more often due to error-prone tasks and error-prone work environments than from error-prone individuals.
Error-prone tasks and work environments are typically created by latent organizational weaknesses.
Source: Reason – 1991 (modified)
Defenses
[Figure: a Hazard lined up against Defense 1, Defense 2, Defense 3, and Defense 4, with an Event beyond the last defense]
But it is possible that under the wrong set of circumstances, an event could occur….
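One rough way to put numbers on layered defenses is to multiply the chance that each layer fails. The sketch below assumes the defenses fail independently and uses made-up failure probabilities; neither the numbers nor the function name come from the talk, and latent organizational weaknesses can make real failures correlated.

```python
from math import prod


def event_probability(failure_probabilities):
    """Chance that a hazard gets past every defense.

    Assumes each defense fails independently of the others, which is a
    simplification made only for this illustration.
    """
    return prod(failure_probabilities)


# Hypothetical values: each of four defenses misses the hazard 1 time in 10.
healthy = [0.1, 0.1, 0.1, 0.1]
# Same system after one defense has quietly stopped working (always fails).
degraded = [0.1, 0.1, 0.1, 1.0]

print("all four defenses working:", event_probability(healthy))
print("one defense silently lost:", event_probability(degraded))
```

With these illustrative numbers, losing a single defense makes an event ten times more likely even though nothing visible has changed, which is the "wrong set of circumstances" the slide warns about.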
Set Me Up for Success
Those things that “set-up” a mistake to happen
• Distractions
• Interruptions
• Unplanned changes
• Task demands are greater than the worker’s abilities
• Confusing conditions make the job harder
• New techniques not used before
• Mental shortcuts
• Lack of or unclear standards
• Illness / Fatigue
Reliability
• Equipment Reliability
• Human Performance
• Human Interaction with Equipment (coupled w/Automation)
Misoperation Categories (2011 Q2-Q3)
[Bar chart: Misoperation Count by Category (total reported)]
• Unnecessary Trips during fault: 835
• Unnecessary Trips other than fault: 432
• Failure to trip: 48
• Slow trip: 20
Misoperation Causes (2011 Q2-Q3)
[Bar chart: Misoperation Count by Misoperation Cause (Total Reported); bar values 424, 305, 219, 155, 88, 91, 43, 10]
Misoperation Relay Technology (2011 Q2-Q3)
[Bar chart: Misoperations by Relay Technology (only Relay Failure/Malfunction Cause)]
• Electromechanical: 138
• Microprocessor: 96
• Solid state: 63
• Unknown: 8
Misoperation Relay Technology (2011 Q2-Q3)
[Bar chart: Misoperations by Relay Technology (only incorrect Settings/Logic/Design Error Cause)]
• Electromechanical: 68
• Microprocessor: 316
• Solid State: 19
• Unknown: 21
Mental Model of the System
• Apparent simplicity with hidden complexity
– Fitting your understanding into the system constraints (hardware and software)
– Different manufacturers, diverse applications and tools require different approaches
• Design for a person to use, set and operate
Malcolm K. Sparrow John F. Kennedy School of Government, Harvard University
Solving Problems: Untying the Knot
Initiative
The Reliability Risk Management Group (RRM) has designed, developed, and implemented the North American Electric Reliability Corporation (NERC) Causal Code Assignment Process to allow accurate, efficient trending and subsequent analysis of events, so that results can be shared in a cooperative forum focused on improving the reliability of the Bulk Power System (BPS).
NERC CCAP
North American Electric Reliability Corporation Causal Code Assignment Process
An event and data analysis tool
Purpose
• Establish NERC causal coding program to guide, assist, and inform industry
• Venue to share data across NERC
• Opportunity to collect and disseminate reliability data
• Expanded communication channel to reach parts of industry and Electric Reliability Organization
• Foster trust and collaboration between Applicable Governmental Agency, ERO and stakeholders
Cause Code Assignment Process (CCAP)
• A1 Design/Engineering Problem
• A2 Equipment/Material Problem
• A3 Individual Human Performance LTA
• A4 Management Problem
• A5 Communication LTA
• A6 Training Deficiency
• A7 Other Problem
A3 - Human Performance
Cause Code Assignment Process (CCAP)
• A1 Design/Engineering Problem
• A2 Equipment/Material Problem
• A3 Individual Human Performance
– B1 SKILL BASED ERROR
– B2 RULE BASED ERROR
– B3 KNOWLEDGE BASED ERROR
– B4 WORK PRACTICES
• A4 Management Problem
• A5 Communication LTA
• A6 Training Deficiency
• A7 Other Problem
A3 - Human Performance
Cause Code Assignment Process (CCAP)
• A3 Individual Human Performance
– B1 SKILL BASED ERROR
o C01 Check of work LTA
o C02 Step was omitted due to distraction
o C03 Incorrect performance due to mental lapse
o C04 Infrequently performed steps were performed incorrectly
o C05 Delay in time caused LTA actions
o C06 Wrong action selected based on similarity with other actions
o C07 Omission / repeating of steps due to assumptions for completion
– B2 RULE BASED ERROR
– B3 KNOWLEDGE BASED ERROR
– B4 WORK PRACTICES
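The A/B/C layering lends itself to simple parsing and roll-up for trending. The sketch below is an assumption about how such codes could be handled, not a description of NERC's tooling: the regular expression, function names, and sample event list are all hypothetical, and only the A-level titles are taken from the slides.

```python
import re
from collections import Counter

# A-level titles as listed on the CCAP slide.
A_LEVEL_TITLES = {
    "A1": "Design/Engineering Problem",
    "A2": "Equipment/Material Problem",
    "A3": "Individual Human Performance LTA",
    "A4": "Management Problem",
    "A5": "Communication LTA",
    "A6": "Training Deficiency",
    "A7": "Other Problem",
}

# Assumed shape of a code: an A level, then an optional B level,
# then an optional two-digit C level (for example A3B1C03).
CODE_PATTERN = re.compile(r"^(A[0-9A-Z])(?:(B\d)(?:(C\d{2}))?)?$")


def parse_cause_code(code):
    """Split a cause code into its levels, e.g. 'A3B1C03' -> ('A3', 'B1', 'C03')."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a recognizable cause code: {code!r}")
    return tuple(level for level in match.groups() if level is not None)


def roll_up(codes):
    """Count codes by their top (A) level."""
    return Counter(parse_cause_code(code)[0] for code in codes)


# Hypothetical coded events, for illustration only.
events = ["A3B1C03", "A3B1C01", "A4B1C04", "A2B6C01", "A4"]
for a_level, count in sorted(roll_up(events).items()):
    print(a_level, A_LEVEL_TITLES.get(a_level, "(untitled)"), count)
```

Keeping the levels separate lets the same event list be trended coarsely (A level) or examined in detail (C level) without recoding.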
NERC CCAP
Qualified Events by Quarter
[Bar chart: qualified events per quarter]
• 2010Q4: 25
• 2011Q1: 49
• 2011Q2: 66
• 2011Q3: 46
• 2011Q4: 37
• 2012Q1: 32
• 2012Q2: 27
• 2012Q3: 23
2010Q4 only contains two months of data – Field trial began in Oct 2010
2012Q3 only contains two months of data
Event Trending *
Qualified events (October 25, 2010 - August 25, 2012)
[Control chart: monthly qualified events, Nov-10 through Aug-12, on a 0 to 30 scale; labeled values 23, 19.43, and 8.30]
Monthly average = 13.86 events
* Control chart of monthly events, with control limits calculated by using 3-month Moving Average method
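The slide does not spell out the "3-month Moving Average method", so the following is only one plausible reading: a control chart centered on the overall monthly mean, with three-sigma limits estimated from the ranges of 3-month moving windows (using the standard d2 constant for subgroups of three). The function names and the monthly counts are hypothetical and are not the NERC data behind the chart.

```python
# d2 bias-correction constants from standard SPC tables,
# keyed by the moving-window size.
D2 = {2: 1.128, 3: 1.693}


def control_limits(monthly_counts, span=3):
    """Return (lower, center, upper) limits from moving ranges of `span` months."""
    if len(monthly_counts) < span:
        raise ValueError("need at least `span` months of data")
    center = sum(monthly_counts) / len(monthly_counts)
    windows = [monthly_counts[i:i + span]
               for i in range(len(monthly_counts) - span + 1)]
    mean_range = sum(max(w) - min(w) for w in windows) / len(windows)
    sigma = mean_range / D2[span]  # estimated month-to-month standard deviation
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(monthly_counts, lower, upper):
    """Indexes of months whose count falls outside the control limits."""
    return [i for i, count in enumerate(monthly_counts)
            if count < lower or count > upper]


# Hypothetical monthly event counts, for illustration only.
counts = [12, 15, 11, 14, 13, 16, 12, 14, 30, 13, 15, 12]
lower, center, upper = control_limits(counts)
print(f"center={center:.2f} lower={lower:.2f} upper={upper:.2f}")
print("months outside limits:", out_of_control(counts, lower, upper))
```

Limits built this way sit symmetrically around the mean, which is at least consistent with the chart's labeled 19.43 and 8.30 straddling the 13.86 monthly average.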
Category 1 Events *
Category 1 Events (October 25, 2010 - August 25, 2012)
[Control chart: monthly Category 1 events, Nov-10 through Aug-12, on a 0 to 20 scale; labeled values 14.16 and 4.75]
Monthly average = 9.45 Cat 1 events
* Control chart of monthly events, with control limits calculated by using 3-month Moving Average method
Category 1 Events (by Region)
Category 1 Events (October 25, 2010 - August 25, 2012)
[Chart: monthly Category 1 events, Nov-10 through Aug-12, on a 0 to 20 scale, with series for FRCC, MRO, NPCC, RFC, SERC, SPP, TRE, WECC, and Total]
Monthly average = 9.60 Cat 1 events
Category 1 Events (by Region)
[Chart: Cat 1 events normalized by 1000 miles of Transmission Line, on a 0 to 0.0300 scale, for FRCC, MRO, NPCC, RFC, SERC, SPP, TRE, WECC, and Total]
Red lines indicate timeframe of concern
Category 1 Events (by Region)
[Chart: Cat 1 events normalized by 1000MW NDC, on a 0 to 0.0030 scale, for FRCC, MRO, NPCC, RFC, SERC, SPP, TRE, WECC, and Total]
Red lines indicate timeframe of concern
Cause Coding Initial Look
The following is an initial look at cause of events NERC has currently cause coded.
Cause coding of all events captured in the NERC database has not been completed.
Cause Code Definitions
Short Title: Definition
• Design/Engineering Problem: An event or condition that can be traced to a defect in design or other factors related to configuration, engineering, layout, tolerances, calculations, etc.
• Equipment/Material Problem: An event or condition resulting from the failure, malfunction, or deterioration of equipment or parts, including instruments or material.
• Individual Human Performance LTA: An event or condition resulting from the failure, malfunction, or deterioration of the individual human performance associated with the process.
• Management Problem: An event or condition that could be directly traced to managerial actions, or methodology (or lack thereof).
• Communications LTA: Inadequate presentation or exchange of information.
• Other Problem: The problem was caused by factors beyond the control of the organization.
LTA = Less Than Adequate
Root Cause determinations
NERC has “Cause Coded” 229 out of 305 Qualified Events (75%, as of 8-25-2012). Of these events, we were able to assign some type of “Root Cause” coding for 173 events (~75%).
40% of the reports did not contain sufficient information to determine causal factors.
A-Level Cause Codes (241 entered)
Region = (All); Year = (All)
• A1 Design/Engineering Problem: 25 (10%)
• A2 Equipment/Material Problem: 28 (11%)
• A3 Individual Human Performance LTA: 4 (2%)
• A4 Management Problem: 43 (18%)
• A5 Communication LTA: 6 (2%)
• A7 Other Problem: 19 (8%)
• AN No Causes Found: 2 (1%)
• AX Overall Configuration Issue: 4 (2%)
• AZ: 110 (46%)
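The percentages on this slide can be re-derived from the counts. In the sketch below, pairing each count with a code assumes the pie values were extracted in legend order, which is an assumption, and the function name `shares` is invented for the example.

```python
# Counts read from the pie chart; the code-to-count pairing assumes
# the values appear in the same order as the legend.
A_LEVEL_COUNTS = {
    "A1": 25, "A2": 28, "A3": 4, "A4": 43, "A5": 6,
    "A7": 19, "AN": 2, "AX": 4, "AZ": 110,
}


def shares(counts, exclude=()):
    """Return (total, {code: percent}) after dropping any excluded codes."""
    kept = {code: n for code, n in counts.items() if code not in exclude}
    total = sum(kept.values())
    return total, {code: 100 * n / total for code, n in kept.items()}


total_all, pct_all = shares(A_LEVEL_COUNTS)
total_known, pct_known = shares(A_LEVEL_COUNTS, exclude=("AZ",))

print("codes entered:", total_all, "| without AZ:", total_known)
for code in pct_known:
    print(f"{code}: {pct_all[code]:.1f}% of all, {pct_known[code]:.1f}% without AZ")

# The event-level figures quoted above: 229 of 305 events coded,
# 173 of those with some type of root cause.
print(f"coded: {100 * 229 / 305:.1f}%  root cause assigned: {100 * 173 / 229:.1f}%")
```

A few of the recomputed shares differ from the slide by a point, presumably because the chart labels were adjusted to sum to 100%.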
134 RELIABILITY | ACCOUNTABILITY
2519%
2821%
43%
4333%
65%
1914%
22%
43%
Region =(All); Year =(All)
A1 Design/Engineering Problem
A2 Equipment/Material Problem
A3 Individual Human Performance LTA
A4 Management Problem
A5 Communication LTA
A7 Other Problem
AN No Causes Found
AX Overall Configuration Issue
A-Level Cause Codes (131 entered : removing AZ)
Identified Root Causes
Discounting the 46% of reports that did not contain sufficient information to determine root causes, we have assigned root cause for 131 events.
See Deeper dive Chart
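The two pie charts differ only in whether the AZ entries are counted. The sketch below recomputes both sets of shares. It assumes the count labels on the charts are listed in the same order as the legend entries (A1 through AZ); that pairing is an inference from the slide, not something the slide states, and the names `counts` and `shares` are illustrative.

```python
# Counts per A-level cause code, read from the chart labels.
# Assumption: the labels appear in the same order as the legend entries.
counts = {
    "A1": 25, "A2": 28, "A3": 4, "A4": 43, "A5": 6,
    "A7": 19, "AN": 2, "AX": 4, "AZ": 110,
}


def shares(table):
    """Map each code to its percentage of the table's total."""
    total = sum(table.values())
    return {code: 100.0 * n / total for code, n in table.items()}


# Shares over all entered codes, then over the codes left after dropping AZ.
all_codes = shares(counts)
identified = shares({c: n for c, n in counts.items() if c != "AZ"})

print(f"{'Code':<5}{'Count':>6}{'All':>8}{'No AZ':>8}")
for code, n in counts.items():
    without_az = f"{identified[code]:.1f}%" if code in identified else "n/a"
    print(f"{code:<5}{n:>6}{all_codes[code]:>7.1f}%{without_az:>8}")
```

Under that pairing the totals are 241 with AZ and 131 without it, which matches the two chart titles, and A4 (Management Problem) rises from about 18% to about 33% of the entries once AZ is removed.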
135 RELIABILITY | ACCOUNTABILITY
[Bar chart: "Management Problem" (A4) cause factors by cause code; axis 0 to 7]
Codes charted: A4B3C08, A4B5C04, A4B1C03, A4B1C04, A4B1C05, A4B1C06, A4B5C05, A4B1C08, A4B1, A4, A4B1C09, A4B3C09, A4B4, A4B5, A4B5C02, A4B5C03, A4B5C13, A4B2C07, A4B3C05
"Management Problem" Cause Factors
A4B3C08 = Job scoping did not identify special circumstances or conditions
A4B5C04 = Risks/consequences associated with change not adequately reviewed
A4B1C03 = Management direction created insufficient awareness of impact of actions on safety/reliability
A4B1C04 = Management follow-up did not identify problems
A4B1C05 = Management assessment did not determine cause of previous event or known problem
A4B1C06 = Previous industry or in-house experience was not effectively used to prevent recurrence
A4B5C05 = System interactions not considered
A4B1C08 = Corrective action responses to a known or repetitive problem were untimely
Deeper Dive into Management
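The cause codes on these slides appear to be hierarchical: an A-level category (such as A4), optionally followed by a B-level and a two-digit C-level (A4B3, A4B3C08). The sketch below splits a code into those levels so that detailed codes can be rolled up to their A-level category. It assumes that structure holds for every code; `CauseCode` and `parse_cause_code` are hypothetical names, not part of any NERC or DOE tool.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: "A" + one character, then optional "B<digits>", then optional "C<2 digits>".
CODE_PATTERN = re.compile(r"^A(?P<a>[0-9A-Z])(?:B(?P<b>\d+)(?:C(?P<c>\d{2}))?)?$")


class CauseCode(NamedTuple):
    a_level: str
    b_level: Optional[str]
    c_level: Optional[str]


def parse_cause_code(code: str) -> CauseCode:
    """Split a cause code into its cumulative A-, B-, and C-level prefixes."""
    match = CODE_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognized cause code: {code!r}")
    a_level = "A" + match["a"]
    b_level = f"{a_level}B{match['b']}" if match["b"] else None
    c_level = f"{b_level}C{match['c']}" if match["c"] else None
    return CauseCode(a_level, b_level, c_level)


for sample in ("A4B3C08", "A4B1", "A4", "AZ"):
    print(sample, "->", parse_cause_code(sample))
```

Grouping on `a_level` would reproduce the A-level view, while `b_level` and `c_level` give the finer breakdown used in the deeper-dive charts.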
136 RELIABILITY | ACCOUNTABILITY
[Bar chart: "Equipment/Material Problem" (A2) cause factors by cause code; axis 0 to 12]
Codes charted: A2B6C01, A2B6C07, A2B3C03, A2B6C04, A2B6C06, A2B5C02, A2B3C02, A2B5C04, A2B2C01
"Equipment/Material Problem" Cause Factors
A2B6C01 = Defective or failed part
A2B6C07 = Software failure
A2B3C03 = Post-maintenance / post-modification testing LTA
A2B6C04 = End-of-life failure
A2B6C06 = Contaminant
A2B5C02 = Fabricated item did not meet requirements
A2B3C02 = Inspection / testing LTA
A2B5C04 = Product acceptance requirements LTA
A2B2C01 = Preventive maintenance for equipment LTA
Deeper Dive into Equipment
137 RELIABILITY | ACCOUNTABILITY
[Bar chart: "Design/Engineering Problem" (A1) cause factors by cause code; axis 0 to 9]
Codes charted: A1B4C02, A1B2C01, A1, A1B1C02, A1B2C05, A1B3C01
"Design/Engineering Problem" Cause Factors
A1B4C02 = Testing of design/installation LTA
A1B2C01 = Design output scope LTA
A1 = Design/Engineering problem
A1B1C02 = Design input obsolete
A1B2C05 = Design input not addressed in design output
A1B3C01 = Design / documentation not complete
Deeper Dive into Design
138 RELIABILITY | ACCOUNTABILITY
Lessons Learned – Published (2012)
Region | Lessons Learned Brief Description | Date
TRE | TRE-LL-05 - Plant Onsite Material and Personnel Needed for a Winter Weather Event | 1/06/2012
TRE | TRE-LL-06 - Plant Operator Training to Prepare for a Winter Weather Event | 1/06/2012
TRE | TRE-LL-07 - Transmission Facilities and Winter Weather Operations | 1/06/2012
NPCC | LL-54 - DC Supply and AC Transients | 3/06/2012
WECC | LL-58 - Saturated Bus Auxiliary Current Transformer causes Bus Differential Operations during Line Fault | 3/06/2012
TRE | TRE-LL-34 - Rotational Load Shed | 3/06/2012
WECC | LL-59 - Auxiliary Relay Contact Contamination | 6/19/2012
WECC | LL-60 - Remote Terminal Units not on DC Sources | 6/19/2012
WECC | LL-61 - EMS Database Corruption Problem | 6/19/2012
WECC | LL-62 - Unmanned Forklift contact with Energized Bus | 6/19/2012
RFC | LL-65 - Excessive Resource Utilization | 6/19/2012
TRE | LL-66 - Alarm Interpretation Leads to Generator Stator Coil Failure | 6/19/2012
NPCC | LL-67 - Protective Relaying Digital Input Board Loading | 6/19/2012
139 RELIABILITY | ACCOUNTABILITY
Lessons Learned – Published (2012)
Region | Lessons Learned Brief Description | Date
TRE | LL-80 - Wind Farm Winter Storm Issues | 9/12/2012
TRE | LL-81 - Transformer Oil level issues during cold weather | 9/12/2012
TRE | LL-82 - Winter storm inlet air duct icing | 9/12/2012
SPP | LL-83 - Capacity Awareness during an energy emergency event | 9/12/2012
SPP | LL-84 - Electricity and Natural Gas interdependency | 9/12/2012
http://www.nerc.com/page.php?cid=5|385
140 RELIABILITY | ACCOUNTABILITY
Peer Check
Safety Check
141 RELIABILITY | ACCOUNTABILITY
Opportunities
• RRM has a tremendous opportunity for collaboration between the ERO and industry
• Working towards appropriate balance for Event reporting
• Mine all sources possible
• Near-miss database white paper: http://www.wecc.biz/committees/StandingCommittees/OC/OTS/HPWG/Shared%20Documents/Forms/DispForm.aspx?ID=106&Source=http%3A%2F%2Fwww%2Ewecc%2Ebiz%2Fcommittees%2FStandingCommittees%2FOC%2FOTS%2FHPWG%2FShared%2520Documents%2FForms%2FAllItems%2Easpx&RootFolder
142 RELIABILITY | ACCOUNTABILITY
No Vampires at NERC
143 RELIABILITY | ACCOUNTABILITY
Got Collaboration?
144 RELIABILITY | ACCOUNTABILITY
Questions and Answers
Michael Moon, Senior Director of Reliability Risk Management, 404-446-2567 office | 609-651-9693 cell
James Merlo, Associate Director, Human Performance, RRM, 404-446-2560 office | 404-387-5249 cell
Ben McMillan, Risk Analysis Engineer, RRM, 404-446-9729 office | 404-823-1362 cell