2015 risk element: monitoring and situational awareness operations webinars... · “system control...
TRANSCRIPT
RELIABILITY | ACCOUNTABILITY2
Administrative Items
• NERC Antitrust Guidelines
  - It is NERC’s policy and practice to obey the antitrust laws and to avoid all conduct that unreasonably restrains competition. This policy requires the avoidance of any conduct that violates, or that might appear to violate, the antitrust laws. Among other things, the antitrust laws forbid any agreement between or among competitors regarding prices, availability of service, product design, terms of sale, division of markets, allocation of customers or any other activity that unreasonably restrains competition.
• Notice of Open Meeting
  - Participants are reminded that this webinar is public. The access number was widely distributed. Speakers on the call should keep in mind that the listening audience may include members of the press and representatives of various governmental authorities, in addition to the expected participation by industry stakeholders.
Overview
• Purpose of Webinar Series
• Overview of Risk Elements
• Monitoring and Situational Awareness
  - Inputs into Risk Element
  - Situational Awareness
  - Monitoring
  - Areas of Focus
    o EOP-010-1, Requirement R2
    o IRO-002-2, Requirements R6, R7, and R8
    o IRO-005-3.1a, Requirement R1
    o IRO-008-1, Requirements R1 and R2
    o IRO-014-1, Requirement R1
    o PRC-001-1.1, Requirement R6
    o TOP-002-2.1b, Requirements R4, R11, and R19
Purpose of Webinar Series
• Educate stakeholders on the role of risk elements in compliance monitoring
• Provide resources and good industry practices related to the Reliability Standards associated with each risk element
Webinar Series
Subject                                   Date
Monitoring and Situational Awareness      May 21, 2015
Infrastructure Maintenance                June 18, 2015
Protection System Misoperation            July 16, 2015
Workforce Capability                      August 20, 2015
Long Term Planning and System Analysis    September 17, 2015
Extreme Physical Events                   October 15, 2015
Threats to Cyber Systems                  November 19, 2015
What are Risk Elements?
Risk-based Compliance Oversight Framework (Framework)
Monitoring and Situational Awareness Inputs
• ERO Priorities: RISC Updates and Recommendations
  - Unavailability of monitoring tools can increase the magnitude of other small problems
• ERO Top Priority Reliability Risks 2014-2017 report
  - Precursor event or contributing cause to events
• Cyber Attack Task Force final report
  - Compromised situational awareness can be a component of cyber attacks
Situational Awareness
“… is defined as the accuracy of a person’s current knowledge and understanding of actual conditions compared to expected conditions at a given time.” Department of Energy
“The perception of the elements in the environment within a volume of time and space, the comprehension of their meaning and the projection of their status in the near future.” Endsley, M. R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37(1), 32-64.
Why is Situational Awareness Important?
• July 19, 1965: “System control centers should be equipped with display and recording equipment which provide the operator with as clear a picture of system conditions as possible.”
• July 2, 1996: “… review the need… to monitor operating conditions on a regional scale...”
• August 10, 1996: “…train operators to make them aware of system conditions and changes…”
• August 14, 2003: “…inadequate situational awareness…”
• September 8, 2011: “This failure stemmed primarily from weaknesses in two broad areas – operations planning and real-time situational awareness…”
Individual SA – A Simpler Construct
• Scan (Attention, Sensation): What am I seeing or can I see?
• Focus (Perception, Cognition): What does it mean?
• Act (Decision making, Action): What am I going to do about it?
Team and Shared Situational Awareness
Team SA Shared SA
Adapted from Endsley & Jones, 1997, 2001
Building Team and Shared Situational Awareness
1. What info do we share?
2. How do we share it?
3. How do we consistently interpret it?
4. How do we make it repeatable?
Adapted from Endsley & Jones, 2001
Organizational Opportunities to Improve SA
• Individual
  - Human factors
  - Training, planning, technical support
• Organizational
  - Determine the requirements
  - Acquire and maintain the devices (tools)
  - Continually refine the mechanisms
  - Institutionalize the processes
Monitoring - EMS Outages
Category 2b - Complete loss of SCADA, control or monitoring functionality for 30 minutes or more
Category 1h - Loss of monitoring or control, at a control center, such that it significantly affects the entity’s ability to make operating decisions for 30 continuous minutes or more. Examples include, but are not limited to, the following:
  - Loss of operator ability to remotely monitor, control Bulk Electric System (BES) elements, or both
  - Loss of communications from SCADA RTUs
  - Unavailability of ICCP links reducing BES visibility
  - Loss of the ability to remotely monitor and control generating units via AGC
  - Unacceptable State Estimator or Contingency Analysis solutions
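The two category definitions on this slide can be read as a simple decision rule. The sketch below is illustrative only: the `EmsOutage` record, its field names, and the `classify` function are hypothetical, and it assumes the duration is measured as one continuous interval. Real categorization is a judgment made within the event analysis process, not a mechanical test.

```python
from dataclasses import dataclass
from typing import Optional

# Both category definitions on the slide use a 30-minute threshold.
THRESHOLD_MINUTES = 30


@dataclass
class EmsOutage:
    """Hypothetical record of one loss-of-monitoring or loss-of-control event."""
    duration_minutes: float            # assumed to be one continuous interval
    complete_loss: bool                # complete loss of SCADA, control or monitoring
    affects_operating_decisions: bool  # significantly affects operating decisions


def classify(outage: EmsOutage) -> Optional[str]:
    """Return "2b", "1h", or None under the simplified reading described above."""
    if outage.duration_minutes < THRESHOLD_MINUTES:
        return None  # shorter than either category's threshold
    if outage.complete_loss:
        return "2b"  # complete loss for 30 minutes or more
    if outage.affects_operating_decisions:
        return "1h"  # partial loss that still impairs operating decisions
    return None


if __name__ == "__main__":
    print(classify(EmsOutage(45, complete_loss=True, affects_operating_decisions=True)))
    print(classify(EmsOutage(30, complete_loss=False, affects_operating_decisions=True)))
```

Checking the complete-loss case first reflects the assumption that a complete loss is reported under the higher category even though it would also satisfy the 1h wording.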
Monitoring - Major Takeaways
• Energy Management Systems (EMS) are extremely reliable
• EMS outages increase the risk to the real-time reliability of the grid
• EMS outages have NOT had an adverse impact to reliability
• 113 complete outages (Category 2b) reported through December 31, 2014; 24 event analyses in progress
• 84 partial outages (Category 1h) reported through December 31, 2014; 57 event analyses in progress
• 109 entities reported either a 1h or a 2b or both
  - 49 experiencing multiple outages
EMS Outages Contributing Causes
[Figure: bar chart of contributing-cause codes (horizontal axis) against a vertical axis running from 0 to 40]
Top Contributing Causes
• Software failure (A2B6C07)
• Design output scope LTA (A1B2C01)
• Inadequate vendor support of change (A4B5C03)
• Undesired operation of coordinated systems (A2B7C04)
• Defective or failed equipment (A2B6C01)
• Testing of Design/Installation LTA (A1B4C02)
• Communication path LTA (A2B7C01)
• System interactions not considered or identified (A4B5C05)
• Post maintenance/post-modification testing LTA (A2B3C03)
• Inadequate Risk Assessment of Change (A4B5C04)
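One way to look at the top contributing causes is to group them by the leading part of their cause code. The sketch below assumes each code is a three-part identifier (an A part, a B part, and a C part), which matches how the codes appear on this slide; the slide does not state what each part means, so the grouping is purely by prefix. The names `TOP_CAUSES`, `split_code`, and `tally_by_a_level` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Top contributing causes exactly as listed on the slide.
TOP_CAUSES = {
    "A2B6C07": "Software failure",
    "A1B2C01": "Design output scope LTA",
    "A4B5C03": "Inadequate vendor support of change",
    "A2B7C04": "Undesired operation of coordinated systems",
    "A2B6C01": "Defective or failed equipment",
    "A1B4C02": "Testing of Design/Installation LTA",
    "A2B7C01": "Communication path LTA",
    "A4B5C05": "System interactions not considered or identified",
    "A2B3C03": "Post maintenance/post-modification testing LTA",
    "A4B5C04": "Inadequate Risk Assessment of Change",
}

# Assumed shape: A + one character, B + one digit, C + two digits.
CODE_PATTERN = re.compile(r"^(A[0-9A-Z])(B\d)(C\d{2})$")


def split_code(code: str) -> Tuple[str, str, str]:
    """Split a full cause code such as 'A2B6C07' into its three parts."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a full three-part cause code: {code!r}")
    return match.groups()


def tally_by_a_level(codes: Iterable[str]) -> Counter:
    """Count how many codes share each leading A part."""
    return Counter(split_code(code)[0] for code in codes)


if __name__ == "__main__":
    for a_level, count in tally_by_a_level(TOP_CAUSES).most_common():
        print(a_level, count)
```

Note that this counts distinct cause codes in the top-ten list, not outage events; the per-code event counts shown in the chart are not available in the extracted text.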
Monitoring - Strengths to Sustain
• Communicating with neighboring operators and Reliability Coordinator
• Decision making processes to man substations during an outage
• Increased vigilance during outages, partial monitoring from neighbors and RC
• Improving causal analysis of EMS outage events
• Improving Event Analysis report quality
• Sharing Lessons Learned from EMS outages, helping everyone get better
Monitoring - Opportunities to Avoid EMS Outages
• Change management, particularly in the context of job scoping
• Use of EMS test systems (aka sandbox, QA, dev)
• Network infrastructure testing
• Routine failover testing
• Vendor relationship
• Communication – internal and external
• Task scoping and job aids
Themes of Associated Requirements
• Assessments
• Tools
• Procedures
• Information sharing
EOP-010-1, Requirement R2
*EOP-010-1 becomes effective on April 1, 2015. Pursuant to the implementation plan, Requirement R2 of EOP-010-1 will become effective on the first day following the retirement of IRO-005-3.1a.
Standards Development Update
• Project 2009-02 Real Time Reliability Monitoring and Analysis Capabilities
  - Recently resumed development activities
  - June 4 SAR Technical Conference at NERC’s Headquarters in Atlanta, GA
• TOP/IRO Reliability Standards Revisions
  - Petition filed at FERC and pending regulatory approval
• Geomagnetic Disturbance Mitigation Reliability Standards
  - FERC NOPR issued May 14, 2015 on TPL-007-1
• For more information on these efforts, contact Senior Standards Developer Mark Olson at [email protected]
Known Risks
• TOP or RC does not have adequate system models
  - 2011 SW blackout
• TOP or RC next-day or current-day plans are not based on studies that represent projected conditions for that day
  - 2011 SW blackout
Good Industry Practices
• RC performs studies focusing on the wide area view
  - Provides assurance that TOP impacts by neighbors are monitored
• TOP performs studies focusing on their footprint
  - Voltages below 100 kV that may be equivalized in the RC model should be modeled by the TOP
• RC and TOP compare next-day and current-day study results
  - Provides a check and balance to test model quality
    o Time frame doesn’t allow for review and testing of every model for each study
    o Allows for a process to improve models
    o Strong input to detect and correct modeling issues
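The practice of comparing RC and TOP study results can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name `compare_studies`, the idea of keying results by element name, the MW values, and the 5 percent tolerance are hypothetical and do not come from any standard or from the presenters.

```python
from typing import Dict, List, Tuple


def compare_studies(
    rc_results: Dict[str, float],
    top_results: Dict[str, float],
    tolerance_pct: float = 5.0,  # hypothetical tolerance, not a NERC value
) -> List[Tuple[str, float, float, float]]:
    """Flag elements present in both studies whose results differ by more than the tolerance.

    Returns (element, rc_value, top_value, percent_difference) tuples.
    """
    flagged = []
    # Only elements modeled in both studies can be compared.
    for element in sorted(rc_results.keys() & top_results.keys()):
        rc_value = rc_results[element]
        top_value = top_results[element]
        base = max(abs(rc_value), abs(top_value))
        if base == 0:
            continue  # both studies report zero, nothing to compare
        pct_diff = abs(rc_value - top_value) / base * 100.0
        if pct_diff > tolerance_pct:
            flagged.append((element, rc_value, top_value, pct_diff))
    return flagged


if __name__ == "__main__":
    # Hypothetical next-day study flows in MW.
    rc = {"Line A-B": 410.0, "Line B-C": 250.0, "XFMR 1": 95.0}
    top = {"Line A-B": 405.0, "Line B-C": 310.0, "Line C-D": 120.0}
    for element, rc_mw, top_mw, pct in compare_studies(rc, top):
        print(f"{element}: RC={rc_mw} MW, TOP={top_mw} MW, diff={pct:.1f}%")
```

A flagged element does not say which model is wrong; it only marks where the two studies disagree enough to justify a closer look at the modeling.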
Resources
• Real-time Tools Survey Analysis and Recommendations
  http://www.nerc.com/comm/OC/Realtime%20Tools%20Best%20Practices%20Task%20Force%20RTBPTF%2020/Real-Time%20Tools%20Survey%20Analysis%20and%20Recommendations.pdf
• Loss of Real-time Reliability Tools Capability/Loss of Equipment Significantly Affecting ICCP Data
  http://www.nerc.com/comm/OC/Reliability%20Guideline%20DL/Loss%20of%20Real-Time%20Reliability%20Tools%20Capability%20Loss%20of%20Equipment%20Signficantly%20Affecting%20ICCP%20Data%2020150219.pdf
Resources
• Monitoring & Situational Awareness Conference Materials and EMS Conference Presentations
  http://www.nerc.com/pa/rrm/Resources/Pages/Conferences-and-Workshops.aspx
• ERO Event Analysis Program
  http://www.nerc.com/pa/rrm/ea/Pages/EA-Program.aspx
• NERC Lessons Learned
  http://www.nerc.com/pa/rrm/ea/Pages/Lessons-Learned.aspx
• NERC Operating Committee Reliability Guidelines
  http://www.nerc.com/comm/OC/Pages/Reliability-Guidelines.aspx
Resources
• Reliability Coordinator Compliance Analysis Report (2013)
  http://www.nerc.com/pa/comp/Compliance%20Analysis%20Reports%20DL/Compliance%20Analysis%20Report%20Reliability%20Coordinator.pdf
• TOP-002 Compliance Analysis Report (2011)
  http://www.nerc.com/pa/comp/Compliance%20Analysis%20Reports%20DL/TOP-002.pdf
Resources – EOP-010-1
• NOAA Space Weather Center
  http://www.swpc.noaa.gov/
• Space Weather Canada
  http://www.spaceweather.gc.ca/
• Project 2013-03 Geomagnetic Disturbance Mitigation
  http://www.nerc.com/pa/Stand/Pages/Project-2013-03-Geomagnetic-Disturbance-Mitigation.aspx
Resources – EOP-010-1
• NERC GMDTF 2013
  http://www.nerc.com/comm/PC/Pages/Geomagnetic-Disturbance-Task-Force-%28GMDTF%29-2013.aspx
• NERC GMDTF 2011 & 2012
  http://www.nerc.com/comm/PC/Pages/Geomagnetic%20Disturbance%20Task%20Force%20%28GMDTF%29/Geomagnetic-Disturbance-Task-Force-GMDTF.aspx