Incident Management System (IMS) - Blackrock 3 Partners

Incident Management System (IMS) © 2014 Blackrock 3 Partners LLC Chris Hawley, Rob Schnepp, and Ron Vidal

Post on 16-May-2020

TRANSCRIPT

Page 1

Incident Management System (IMS)

© 2014 Blackrock 3 Partners LLC

Chris Hawley, Rob Schnepp, and Ron Vidal

Page 2

About Us

• Who We Are
  – Deep global experience in Incident Management and Critical Infrastructure
  – Fire, Special Operations & Law Enforcement
    • Chemicals, Technical Rescue, Anti-Terrorism, Counter Proliferation
  – Critical Infrastructure
    • Fiber Networks, Data Centers, Oil & Gas, Power, Capital Markets
  – Market Leader in IMS for IT
• What We Do
  – Maximize Uptime During High Severity IT Incidents
    • Assess, Train, Evaluate & Exercise Incident Response Teams
  – Engage with Teams Across the Customer’s Organization
    • NOC, Site Reliability, Cybersecurity, Mission Critical Support, Executives
  – Customers: Global Cloud Providers, Fortune 500 Enterprises & Developers
    • Incorporate IMS into ITIL, DevOps, Agile, Lean Practices
    • Publicly traded and privately held companies

Page 3

Blackrock 3 Partners LLC

www.blackrock3.com

Chris Hawley – [email protected]
Rob Schnepp – [email protected]
Ron Vidal – [email protected]

San Francisco & Baltimore

Page 4

Day 1

09:00 – 10:00 Introductions and course overview
10:00 – 10:30 Team Building exercise
10-minute break
10:40 – 11:50 IMS terminology and CAN report exercise
3-minute paper
12:00 – 13:00 Lunch
13:00 – 14:00 Incident response decision making
14:00 – 15:30 Role of the Incident Commander
10-minute break
15:40 – 16:45 Span of control
16:45 – 17:00 Wrap up and questions

Page 5

Day 2

09:00 – 10:30 Problem solving exercise
10-minute break
10:40 – 11:30 Exercise debrief and discussion
11:30 – 12:00 Lunchtime exercise briefing
12:00 – 13:00 Lunchtime exercise
14:00 – 14:30 Lunchtime exercise debrief
10-minute break
14:50 – 16:00 Personalities!
16:00 – 16:45 After Action Reviews (AAR)
16:45 – 17:00 Wrap up and questions

Page 6

TIME and transitions

On Duty
On Call

Life clock
Game clock

Tone
Interaction
Management
Engagement

Page 7

Resolution

Incident resolution is a people-to-people activity

Page 8

IMS Overview

• Incident Management System (IMS)
  – National standard for managing all-hazard incidents for the last 40 years
• IMS specifically designed for Emergency Operations
  – High stakes, life or death situations

Page 9

IMS Overview

The Process – Management: Incident Management
The People – Leadership: Incident Command

Page 10

How Do You Respond?

Predictable
Repeatable
Optimized
Clear
Evaluated
Scalable
Sustainable

Derisking response in advance of the incident

Available
Notified
Responding
Engaged
Assigned (Staged)
Released
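The responder statuses listed above can be modelled as a small state machine. This is a minimal sketch, not something the slides specify: the class name `ResponderStatus`, the function `can_advance`, and the transition rule (advance one step at a time in the listed order, or be released at any point after notification) are all assumptions made for illustration.

```python
from enum import Enum


class ResponderStatus(Enum):
    # Order follows the slide; the numeric values are only used for ordering.
    AVAILABLE = 1
    NOTIFIED = 2
    RESPONDING = 3
    ENGAGED = 4
    ASSIGNED = 5  # "Assigned (Staged)" on the slide
    RELEASED = 6


def can_advance(current, new):
    """Assumed rule: move forward exactly one step in the listed order,
    or be released from any status once notified."""
    if new is ResponderStatus.RELEASED:
        return (
            current is not ResponderStatus.RELEASED
            and current.value >= ResponderStatus.NOTIFIED.value
        )
    return new.value == current.value + 1
```

With this rule, a responder cannot jump from Available straight to Engaged, but a Responding responder can be released if they turn out not to be needed.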

Page 11

Peacetime vs. Wartime

• Process must be in place to accept the rapid change
• Everybody has to be on board with the change
• Emergency Services come to work expecting wartime operations
• Hope is never a viable plan

Do you see problem or solution?

Page 12

The world of Operations

Incident vs. Emergency vs. Event vs. Problem vs. Alert

Respond or React

Bodies and Players

Page 13

Incident Lifecycle

[Diagram: incident lifecycle with response organization]

Lifecycle labels: Issue, Monitoring, Notification, Response, Resolution, AAR
Measures: Severity, MTTA, MTTR, AAR
Roles: Incident Commander; Network; Database (DBA - 1, DBA - 2); SAN / Storage; Customer Liaison; Executive Liaison

Page 14

CAN Report

Conditions: What’s happening?
Actions: What’s being done?
Needs: What are the needs?

Always consider the consumer of the CAN report before giving it!
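The three parts of a CAN report map naturally onto a small record type. The sketch below is illustrative only: the class name `CANReport`, its field names, and the plain-text `render` layout are assumptions, not a format defined by the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CANReport:
    conditions: str                                    # What's happening?
    actions: List[str] = field(default_factory=list)   # What's being done?
    needs: List[str] = field(default_factory=list)     # What are the needs?

    def render(self) -> str:
        """Return the report as three labelled lines, in C-A-N order."""
        return "\n".join([
            f"Conditions: {self.conditions}",
            "Actions: " + ("; ".join(self.actions) or "none"),
            "Needs: " + ("; ".join(self.needs) or "none"),
        ])


# Hypothetical example incident
report = CANReport(
    conditions="Checkout API returning errors",
    actions=["Rolled back latest release"],
)
print(report.render())
```

Keeping the three fields separate makes it easy to shorten or expand each part depending on who will consume the report.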

Page 15

Getting Oriented to the Incident

Developing the "Battle Plan" for the Incident

Size-Up
Triage
Act
Review

Page 16

Size-up

• Size-up is a mental process of evaluation of the incident – 360 degree view
• Gathering as much information as quickly as you can, but realizing the incident is not going to be on hold while you complete this
• Facts, possibilities, and probabilities
• Distilled down to the most important pieces of information

Page 17

Terminology

Dispatch – Notification
Specific – Accurate
Responder – Reactor
Investigating
Size Up
Stand By
Staging
CAN
Span of Control
LNO
IC

Page 18

Making Decisions – Rule #1

Decouple the process of decision making from the outcome you anticipate

Page 19

Making Decisions – Rule #2

Own the Process not the Problem!

Page 20

Making Decisions – Rule #3

In most cases, idea generation isn’t the challenge – it’s idea selection

Page 21

Making Decisions – Rule #4

It’s not about making quick decisions – it’s about making the best decision in the shortest amount of time . . .

Based on what you know at the time!

Page 22

Making Decisions

Pitfalls of group decision making:

Conformity
Group polarization
Obedience to Authority
Wonder becomes Wander

Page 23

Making Decisions

Bigger pool of responders = more differentiation of opinion and perspective

ISTP = widespread responsibility for decision

Support vs. Consensus

Page 24

Making Decisions

• As decision making cycles get tighter, communications must keep up with the pace of the incident.
• Incident response communication is based on TRUST
  – TIME
  – Recognize Fact Pattern
  – Understand the Circumstances
  – See the Linkages
  – Transmit the decision(s), thinking, opinion(s) etc.

Page 25

CAN Report

Conditions: What’s happening?
Actions: What’s being done?
Needs: What are the needs?

Always consider the consumer of the CAN report before giving it!

Page 26

Linking Response to Risk

Low Risk
Moderate Risk
High Risk
Extreme Risk

DISASTER

Page 27

See It – Fix It

Page 28

A basic solution set is available. Finding the right one is the trick.

Page 29

No solution may be readily available or a new solution needs to be determined.

Trial and Error

Creative Thinking

Page 30

It becomes a brave new world

Page 31

Manageable Span of Control

• Span of control:
  – How many individuals or resources should a supervisor lead during an incident?

Page 32

IMS

[Organization chart]

Incident Commander (IC)
  Liaison (LNO)
  Scribe
  Comm’s
  Applications: App 1, App 2
  Database: DBA - 1, DBA - 2
  Continued Operations
  Disaster Recovery

Subject Matter Experts (SME)

Page 33

Span of Control

[Organization chart]

Incident Commander
  DBA
  Network
  Triage & Diagnostics
  Liaison (LNO)
  Scribe/Comms

Page 34

Span of Control

[Organization chart]

Incident Commander
  Database Group Lead
  Applications Group Lead
  Infrastructure Group Lead
  Disaster Recovery Lead
  LNO
  Plans
  Scribe
  Comms

Page 35

Span of Control

[Organization chart]

Incident Commander
  DBA Group Lead: DBA-1, DBA-2, DBA-3
  Network
  Triage & Diagnostics
  Liaison (LNO)
  Communications
  Scribe

Page 36

Span of Control

[Organization chart]

Incident Commander
  DBA Group Lead: DBA-1, DBA-2, DBA-3
  Applications Lead: App-1, App-2
  Site Switch Lead
  LNO
  Plans
  Scribe
  Comms
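The Span of Control charts show responders being regrouped under group leads as the incident grows. A rough way to check a structure like that is to count direct reports per supervisor. The sketch below is an illustration under stated assumptions: the function name `over_span`, the dictionary layout, and the limit of 5 used in the example are hypothetical, since the slides pose the question of how many people a supervisor should lead without giving a number.

```python
def over_span(reports, max_reports):
    """Return the supervisors whose direct-report count exceeds max_reports.

    reports maps a supervisor name to a list of direct reports.
    max_reports is a limit chosen by the organization (an assumption here).
    """
    return sorted(
        supervisor
        for supervisor, direct in reports.items()
        if len(direct) > max_reports
    )


# Everyone reporting straight to the Incident Commander
flat = {
    "IC": ["DBA-1", "DBA-2", "DBA-3", "App-1", "App-2", "LNO", "Scribe"],
}

# The same responders regrouped under leads
grouped = {
    "IC": ["DBA Group Lead", "Applications Lead", "LNO", "Scribe"],
    "DBA Group Lead": ["DBA-1", "DBA-2", "DBA-3"],
    "Applications Lead": ["App-1", "App-2"],
}

print(over_span(flat, max_reports=5))     # IC exceeds the example limit
print(over_span(grouped, max_reports=5))  # nobody exceeds it after regrouping
```

The point of the comparison is that adding group leads reduces the number of people the IC has to track directly without removing anyone from the response.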

Page 37

Unified Command

During complex incidents, group leaders coordinate their own actions and report up to IC.

[Organization chart]

IC
  Operations
    Network Lead, Storage Lead, App 2 Lead, App 1 Lead
      SMEs (eight shown)

Page 38

Unified Command

During complex incidents, group leaders coordinate their own actions and report up to IC.

[Organization chart]

UC
  IC
    Network Lead, Storage Lead, App 2 Lead, App 1 Lead
      SMEs (eight shown)

Page 39

Lost at Sea

• You have chartered a yacht for a trip over the Atlantic with 3 friends. As none of you has any sailing experience, you hire a crew of 3. In the mid-Atlantic there is a fire and all of the crew are lost. The yacht is sinking. You do not know where you are, but you are hundreds of miles away from land. You have saved 15 items that are undamaged. You have a 4-person lifeboat and a box of matches.
• Your individual task is to rank these items from the most important (1) to the least important (15).

Page 40

The job of an IC

Page 41

Why is the IC critical?

• The IC must direct the group to accept the existence of second and third order consequences
• Developing a Plan B, C, …
• Forward thinking
• Taking care of their people
• Making notifications

Page 42

Operational Periods

• The IC should provide regular operational briefings
  – Timing is determined by the anticipated MTTR
  – For incidents under 4 hours, briefings are helpful every 30-45 minutes
  – For incidents over 4 hours, briefings should be every 60-95 minutes
  – The need for information dissemination also drives the timing
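The briefing guidance above reduces to a simple lookup on the anticipated MTTR. The following is a minimal sketch: the function name `briefing_interval_minutes` is hypothetical, and treating an incident of exactly 4 hours as the longer category is an assumption, because the slide only covers "under" and "over" 4 hours.

```python
def briefing_interval_minutes(anticipated_mttr_hours):
    """Return the (min, max) minutes between operational briefings.

    Under 4 hours: every 30-45 minutes.
    Over 4 hours: every 60-95 minutes.
    Exactly 4 hours is treated as the longer category (an assumption).
    """
    if anticipated_mttr_hours < 0:
        raise ValueError("anticipated MTTR cannot be negative")
    if anticipated_mttr_hours < 4:
        return (30, 45)
    return (60, 95)


low, high = briefing_interval_minutes(2)
print(f"Brief every {low}-{high} minutes")
```

The returned range is a starting point only; as the slide notes, the need to disseminate information also drives the timing.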

Page 43

Transfer of Command

• May be passed temporarily
  – Announced to the bridge
  – No formal transfer
• Formal transfer includes:
  – IC to IC discussion (offline)
  – Off-going IC announces change
  – On-coming IC provides situational status and operational period briefing

Page 44

Role of SMEs in Response

• Your role = a Subject Matter Expert to make recommendations
• Arrive in a timely fashion
• Identify yourself when entering the bridge
• Ensure your work environment is quiet
• Speak up and speak clearly
• Be direct and factual
• Respect IC timeline
• If you need more help – ask for it
• Never let the IC fail!

Page 45

WHAT: Description
WHY: Explanation
HOW: Solution

Response by SMEs – “I support the plan”

Page 46

Listen to the Chatter

• “I don’t know, maybe we should . . .”
• “I guess we could . . .”
• “I always knew . . .”
• “This always happens . . .”
• “Listen, I’m not here to . . .”
• “Whatever . . .”
• “That’s impossible . . .”
• “That never happens . . .”
• “We’ve always done it this way before . . .”

Page 47

Personalities

• The Joker
• The gun slinger/savior
• Overbearing
• Over explainer
• Uncertain SME
• The interrupter
• The grenade thrower

Page 48

Personalities

• Quiet one
• Naysayer
• Bridge lurker
• Tunnel rat
• Jumper to conclusions
• 20/20 Hindsight – Monday morning quarterback
• Chicken little

Page 49

Comparison

Situations
• Executive swoop
• Long Conversations
• Bad SME
• Team friction
• Background noise
• Language challenges
• Cultural challenges
• Lack of progress
• Lack of sense of urgency
• Fatigue

Personalities
• The savior
• Over explainer
• Uncertain SME
• The interrupter
• The grenade thrower
• Naysayer
• Bridge lurker
• Tunnel rat
• 20/20 – Monday morning
• Chicken little

Page 50

Three parts of an AAR

Determine the root CAUSE of the problem
Evaluate the Impact to the Business and how to PREVENT future incidents
Evaluate the people RESPONSE

Page 51

Looking for “Some Guy”

• Failure
• Operations
• Process
• Software
• Hardware
• Response
• Responsibilities

Page 52

Evaluate the TALENT

Training
Accountability
Leadership
Empowerment
Notification
Trust/Temptation

Page 53

Putting it All Together – AAR Exercise

Identify a Group Leader (GL)

Timeline for the exercise: 30 minutes, in three 10-minute blocks

(1) Identify a recent incident within the last 12 months and write a brief incident synopsis including incident benchmarks, challenges, barriers to success, response time to SLAs, etc.

(2) Discuss the actions retrospectively as they relate to the principles of IMS.

(3) Determine points on the line where adherence to IMS may have changed the trajectory of the incident for the positive.

Identify the top 3 lessons learned (Q/A and Q/I) as they relate to the incident (TALENT).

GL provides debrief to larger group

Page 54

What does a good SME look like?

• Just the facts (Dragnet)
• Straight shooter, no sugar coating
• Trusted advisor
• Answers quickly and concisely
• Responds quickly
• Anticipates the needs of the IC (Radar O’Reilly)
• Skill and temperament of a surgeon and pilot

Page 55

Establishing a Culture

• Things WILL go wrong
• There should be a process to address them
• Incident Management System (IMS)
  – Handling of an occurrence
  – Handling of an issue
  – Having a really bad day
• AAR review
• Retrospective and prospective
• The key is a commitment to making change

Page 56

Blackrock 3 Partners LLC

www.blackrock3.com

Chris Hawley – [email protected]
Rob Schnepp – [email protected]
Ron Vidal – [email protected]

San Francisco & Baltimore