
Copyright Protection: No part of these notes may be reproduced in any form – electronic or printed – without the written consent of B Wyze Solutions Inc 1

IT Service Management

Problem Solving Workshop: Critical Thinking and Root Cause Analysis

Seminar Workbook

Version 2.1.0

Acknowledgements

B Wyze Solutions Inc
LCS Accredited ITIL® Training Provider
B Wyze copyright © 2011

Author: Graham Furnis
ITIL® v3 Expert Certified
ITIL® v3 Intermediate Lifecycle Certified
ITIL® v2 Manager Certified
LCS Accredited ITIL® Trainer

Works Cited

Ishikawa, Kaoru (1990). Fishbone Diagram.

Kepner & Tregoe (1958). Kepner-Tregoe Problem Solving.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
WORKS CITED
TABLE OF CONTENTS
OVERVIEW: WHAT TO EXPECT
  Who should attend?
  Prerequisites
  Learning Objectives
SECTION 1: PROBLEM SOLVING WITHIN IT SERVICE MANAGEMENT
  IT Service Management (ITSM)
  Key Terms
SECTION 2: PROBLEM SOLVING
  What is Problem Solving?
SECTION 3: PROBLEM-SOLVING PERSPECTIVES
  Adjust your Thinking and Reasoning
SECTION 4: PROBLEM SOLVING AS A STRUCTURED PROCESS
  Problem Solving
  Root Cause Analysis (RCA)
  Kepner-Tregoe Root Cause Analysis Method
  The Problem Solving Plan Using Kepner-Tregoe
SECTION 5: RCA METHODS AND TECHNIQUES
  Comparison of Methods and Techniques
  The Journalism Standard
ACTIVITY: CASE STUDY
ACTIVITY 1: THE JOURNALISM STANDARD
  Facts Table
ACTIVITY 2: PROBLEM SOLVING PLAN AND STEPS
  Pareto Analysis
ACTIVITY 3: PARETO ANALYSIS
  Cause and Effect (Relationship) Analysis
ACTIVITY 4: CAUSE & EFFECT - TECHNICAL RELATIONSHIPS
ACTIVITY 5: CAUSE & EFFECT - PROCESS AND CHRONOLOGICAL RELATIONSHIPS
  Ishikawa Diagram (Cause and Effect Diagram)
ACTIVITY 6: ISHIKAWA DIAGRAM - CREATIVE LEADS
SECTION 6: GETTING TO THE TRUE ROOT CAUSE
  Hypothesis Testing and Validation
ACTIVITY 7: HYPOTHESIS TESTING AND VALIDATION
  The 5 Whys
ACTIVITY 8: BUT ARE WE DONE? THE 5 WHY’S
SECTION 6: PROBLEM OPTIONS AND SOLUTIONS
ACTIVITY 9: PROBLEM OPTIONS AND SOLUTIONS

Overview: What to Expect

Who should attend?

This course will be of interest to:

• IT support individuals who would like to strengthen their problem-solving skills
• IT managers seeking to improve their team's problem-solving skills
• IT process management individuals who are looking to improve their Problem Management process and Root Cause Analysis problem-solving activities

Prerequisites

No prerequisites are required, but an IT background is highly recommended and an understanding of ITIL (IT Infrastructure Library) may be useful.

Learning Objectives

After completing this workshop, participants will:

• Understand the basic elements of Problem Solving skills
• Understand the perspectives of Critical Thinking and Creative Thinking
• Understand a process approach to Problem Solving and Root Cause Analysis
• Understand how to maintain progress in problem solving by adjusting their approach, methods, and techniques
• Understand how to get to the true Root Cause of a Problem
• Understand how to solve the Problem

Section 1: Problem Solving within IT Service Management

IT Service Management (ITSM)

IT Service Management is a discipline for managing IT systems and technology centered on the identification and delivery of IT Services used by the business. These IT Services are defined in business terms and are the final outcome for IT systems and technology.

Within the practice of ITSM, the ITIL (IT Infrastructure Library) framework links Root Cause Analysis to the process of Problem Management.

Together, the ITSM discipline and the ITIL framework support successful IT-related Problem Solving by IT Service Support professionals. As such, it is useful to consider the following definitions:

Key Terms

Problem and Problem Management

A Problem is the unknown cause of one or more related Incidents.

Problem Management manages the investigation into the cause of these related Incidents and ensures an appropriate resolution to the Problem is found.

Incident and Incident Management

An Incident is an unplanned event that is a deviation from normal (as defined by the Service Level Agreement) that affects an IT Service. This deviation could include:

o Disruption to the agreed service
o Reduction in the quality of agreed service
o Something that could lead to a disruption or reduced quality of agreed service

Incident Management manages the quick response to these Incidents and the restoration of service, and may escalate issues to Problem Management for further investigation.

Priority, Impact, Urgency

Priority is a generic ITSM term for the order in which issues, such as Incidents and Problems, are dealt with. The Priority of an issue is a combination of its Impact and Urgency, which are defined as the degree of:

• Impact - the positive or negative business effect of something.
• Urgency - the response time required to address an “impact” event.
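The way Impact and Urgency combine into a Priority can be sketched in a few lines of Python. This is only an illustration: the three-point scale, the `Level` and `priority` names, and the scoring rule are assumptions made for this example, not part of the workbook or of ITIL. In practice each organization agrees its own priority matrix, often in the SLA.

```python
from enum import IntEnum


class Level(IntEnum):
    """Three-point scale, used here for both Impact and Urgency (an assumption)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> int:
    """Combine Impact and Urgency into a Priority from 1 (highest) to 5 (lowest).

    Hypothetical rule: the higher the combined Impact and Urgency,
    the numerically lower (more pressing) the Priority.
    """
    return 7 - (impact + urgency)


# A high-impact, high-urgency issue is handled first...
print(priority(Level.HIGH, Level.HIGH))    # 1
# ...and a low-impact, low-urgency issue last.
print(priority(Level.LOW, Level.LOW))      # 5
print(priority(Level.HIGH, Level.MEDIUM))  # 2
```

A lookup table keyed by (impact, urgency) would work equally well and is easier to adjust when the agreed matrix is not symmetric.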

Service Level Agreement (SLA)

A written or understood agreement between an IT Service Provider and the Business (through identification of the Business Customer) that outlines all arrangements for the use, performance levels, and management of an IT Service. Incident and Problem priority objectives and other details are often documented and agreed in this Agreement.

Section 2: Problem Solving

What is Problem Solving?

Problem Solving is a basic and key life skill that is often considered to be one of the most complex human intellectual functions. Fortunately, it’s also something that can be continually learned, practiced and improved.

In its basic sense, Problem Solving is a mental process for thinking and reasoning. It can be refined and improved when broken into the separate, but related, parts of Problem Finding and Problem Shaping.

Problem Finding

Identifying the Problem is the first step to good Problem Solving. The problem statement becomes the target that is being solved for, and getting this target right or wrong can have serious consequences, good or bad. In many cases, identifying the problem is more complex than actually solving the problem.

A key to good Problem Finding is the use of Creative Thinking.

Problem Shaping

Problem Shaping follows Problem Finding. Once the Problem has been correctly identified, questions need to be asked that shape the direction and findings of the problem investigation. Each question leads to insight into the underlying Problem Cause(s), and thus refines and shapes further questions to be asked.

A key to good Problem Shaping is the use of Critical Thinking.

Section 3: Problem-Solving Perspectives

Adjust your Thinking and Reasoning

Problem Solving as a skill requires that the problem solver adjust their approach and perspective to suit the problem. Failing to adjust, or taking the wrong perspective from the outset, is one of the primary reasons problem solvers fail, or fail to act in a timely manner. Adjusting well starts with understanding the differences between Critical Thinking and Creative Thinking:

Critical Thinking

Critical Thinking in Problem Solving can be considered logical thinking. It is a methodical approach that involves the cognitive skills of goal clarification, observation, interpretation, analysis, categorizing, relating, inference making, evaluation of results, assessment, and explanation of conclusions.

Critical Thinking is associated with deductive reasoning, which is based on a set of propositions and the subsequent investigation and factual discoveries that bring to light the root cause of a Problem.

• Tends to be a top-down approach to Problem Solving. It requires detailed knowledge or experience combined with a logical process that confirms each proposition is NOT the source of the Problem.

Creative Thinking

Creative Thinking in Problem Solving can be thought of as finding options and alternatives, more commonly referred to as “thinking outside the box” of common and tried solutions.

Creative Thinking is associated with inductive reasoning, which can be thought of as assumptions or generalized conclusions drawn from a set of observations. These assumptions are not necessarily valid conclusions, but starting points to be further investigated and validated.

• Tends to be a bottom-up approach to Problem Solving. It is based on both intuition and making guesses based on experience. It must be followed up by verification of these guesses, or assumptions, typically using an experimental approach.

When to Use Critical Thinking and Deductive Reasoning

• A problem is familiar or of a familiar type
• A problem solver has sufficient skill and experience

For example:

A critical marketing application has several different user error messages across the marketing department. With our programming experience we know that each error message is triggered by application error trapping code. Therefore, we deduce that we should investigate the programming code related to the application modules that produced the error messages to confirm the application logic.

When to Use Creative Thinking and Inductive Reasoning

• A problem is unfamiliar
• Deductive reasoning has reached a dead end and more alternatives are needed

For example:

A critical marketing application has several different user error messages across the marketing department. We have no programming skill, but have observed in our past experience that shared applications are run from a central server. The marketing application is a shared application; therefore we induce (assume) that the Problem must be based on a server. Our investigations will now take this path.

Section 4: Problem Solving as a Structured Process

Problem Solving

Effective Problem Solving follows a structured and methodical approach. In general, this structure involves:

(1) Correctly defining the problem,
(2) Finding the root cause(s) of the problem through Root Cause Analysis,
(3) Determining the most effective corrective actions to take, and
(4) Implementing the solution to successfully manage the problem.

Primary Goal

The Problem Solving process seeks to prevent Problems from recurring by taking effective corrective actions.

Root Cause Analysis (RCA)

Root Cause Analysis is a sub-process of the larger Problem Solving process that requires an appropriate application of Problem Solving skills in conjunction with a methodical and systematic approach to identifying the true root cause(s) of Problems.

Each Root Cause Analysis approach shares a common aim: to avoid focusing on and solving the symptoms of the problem, and instead to drill down to identify and solve the true root cause of the problem.

A key assumption of Root Cause Analysis is that there is always one true root cause for any given problem. This leads to a key challenge: having sufficient focus and perseverance to find this one true root cause.

Primary Goal

Root Cause Analysis endeavors to determine the lowest level “root” cause of a Problem that supports taking the most effective corrective actions.

Primary Objective

The objective of Root Cause Analysis is to reveal the correct root cause of the Problem, because without it we cannot determine what effective corrective actions must be taken.

Kepner-Tregoe Root Cause Analysis Method

The Kepner-Tregoe method to analyze problems was developed by researchers Dr. Charles Kepner and Dr. Benjamin Tregoe. This method emphasizes a structured approach to problem solving that relies on setting priorities and making use of technician knowledge and experience. The method includes five steps to problem solving.

The Problem Solving Plan Using Kepner-Tregoe

A structured problem solving plan should be created when solving any Problem. The plan should follow the problem solving steps, such as those defined by Kepner-Tregoe, and should include the business goals and objectives that need to be achieved. Each step can then be managed at an appropriate level based on priorities, time pressures, and availability of information.

The Problem Solving Plan is iterative: new facts and observations shed new and increasingly accurate light on both the Problem definition and the root cause investigation.

Sample Problem Solving Plan

Goal:

Objectives:

Constraints:

Problem Definition:

| Problem Solving Steps | Technique(s) to Use |
| --- | --- |
| 1. Define the Problem | |
| 2. Assess the Problem | |
| 3. Establish Possible Causes | |
| 4. Explore Possible and Probable Causes | |
| 5. Verify Root Cause(s) | |
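For teams that keep the plan in a tool rather than on paper, the sample plan can be sketched as a small Python data structure. The class name, field names, and sample values are assumptions made for illustration, and pairing the Journalism Standard with the first step is only an example choice. The five steps are the ones listed in the plan.

```python
from dataclasses import dataclass, field

# The five problem solving steps listed in the sample plan.
STEPS = (
    "Define the Problem",
    "Assess the Problem",
    "Establish Possible Causes",
    "Explore Possible and Probable Causes",
    "Verify Root Cause(s)",
)


@dataclass
class ProblemSolvingPlan:
    goal: str
    objectives: list[str]
    constraints: list[str]
    problem_definition: str = ""  # refined as new facts arrive
    # Maps each step to the technique(s) chosen for it.
    techniques: dict[str, list[str]] = field(
        default_factory=lambda: {step: [] for step in STEPS}
    )

    def unplanned_steps(self) -> list[str]:
        """Return the steps that have no technique assigned yet."""
        return [step for step in STEPS if not self.techniques[step]]


# Sample values, invented for this sketch.
plan = ProblemSolvingPlan(
    goal="Find the root cause of the Problem",
    objectives=["Use a structured problem solving approach"],
    constraints=["One analyst assigned"],
)
plan.techniques["Define the Problem"].append("Journalism Standard")
print(plan.unplanned_steps())
```

Because the plan is iterative, `problem_definition` and `techniques` are meant to be updated as the investigation proceeds. `unplanned_steps` simply shows which steps still need a technique.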

Section 5: RCA Methods and Techniques

Comparison of Methods and Techniques

There are many Root Cause Analysis related methodologies and techniques. The more common ones, along with their primary characteristics, are listed in the following table. Each has particular strengths that make it suitable for use in specific situations. These are defined more fully in the sections that follow.

| Method / Technique | Approach | Top-down / Bottom-up | Requires Complementary Methods / Techniques |
| --- | --- | --- | --- |
| Journalism Standard | Problem Finding, Problem Shaping | | All methods / techniques |
| Pareto Analysis | Problem Finding, Problem Shaping | Both top-down / bottom-up | ITIL Processes, Ishikawa, The 5 Whys |
| Cause and Effect Analysis | Problem Shaping | Top-down | ITIL Processes, Configuration Relationships, Systems Analysis, The 5 Whys |
| Change Analysis | Problem Shaping | Top-down | Systems Analysis, The 5 Whys |
| Ishikawa Diagram | Problem Finding, Problem Shaping | Both top-down / bottom-up | Brainstorming, The 5 Whys |
| The 5 Whys | Problem Shaping | Top-down | All methods / techniques |

The Journalism Standard

The Journalism Standard is a technique that is focused on factual reporting and analysis, where emotion and assumptions are removed from consideration. It can be thought of as listing and considering “just the facts”. The Journalism Standard reminds the Problem Solver to research and list the basic facts of the situation first, to seek interviews and independent confirmation, and then to evaluate using a neutral approach.

This technique also reminds the Problem Solver to avoid some of the more common Problem Solving mistakes that eliminate or limit possible causes, including:

Avoid Jumping to Assumptions

Avoid eliminating possible causes due to one or more incorrect assumptions. Making assumptions is a necessary part of life. However, when the stakes are high and the risks of failure increase, making assumptions can be dangerous! Many times a Problem has escalated or dragged on due to an assumption being made that eliminated a check point.

For example:

A technician may skip checking the application drivers on the Server PC, “knowing that it can’t be the Server as no one has updated the Server since it was last working…”

Avoid Tunnel Vision

Avoid missing possible causes due to an obsession or narrow focus on one or a small range of assumptions. Jumping to a conclusion may lead you to a quick diagnosis of a Problem, but more often than not it leads to a failure to correctly identify the root cause. In other cases, the narrow focus misses the correct root cause and the Problem escalates as the investigation drags on.

For example:

A technician may limit investigation to a Server PC, “concluding that it must be the Server as the application error is generated from the software residing on the Server…”

The 5 W’s

The 5 W’s (plus How) is a simple and commonly known rule for gathering the facts:

• Who was involved
• What events happened
• When did the sequence of events happen
• Where did the sequence of events happen
• How did the sequence of events happen
• Why did the problem happen (initial inductions and deductions)

Activity: Case Study

The Marketing Client Support team of LITI Corporation has reported two related Incidents this week involving their Sales Management Service. While service was restored after each individual Incident, the issue has been escalated by Marketing staff to the Marketing Manager, who in turn has raised the issue with Service Level Management (SLM). SLM has agreed that this is a “Problem” that needs to be solved by IT in order to maintain marketing staff productivity and keep good relations with the Business.

The issue has been assigned to the Problem Manager, who in turn has opened a Problem record and assigned “you” to resolve this issue. Initial investigation of related Incident records has revealed the following descriptions:

• Bob in Marketing has experienced a series of CRM Application "freezes" when he has tried to save an updated client history. Bob called the Service Desk and was advised to re-boot the application. As a consequence, Bob lost his data updates and had to re-enter the updates. Bob mentioned that several other Sales Representatives have experienced the same issue.

• Mary in Marketing has experienced extreme slowness and failure to save updates several times in the last month when updating client billing and payment information using the CRM Application. Mary did not contact the Service Desk until the event occurred again this week. Mary was dispatched a Level 2 support technician. The technician was able to confirm local network connectivity, and also confirmed that the CRM Application server was up and running. Desktop memory and disk space were confirmed to be sufficient. The technician resolved the Incident by refreshing the client profiles (and consequently losing current data updates), re-entered 3 of the 20 original client updates, and successfully saved to the server without any Incident occurring. Mary was advised to save more frequently.

Activity 1: The Journalism Standard

Facts Table:

Who:

What:

When:

Where:

How:

Why:

Deductions:

Activity 2: Problem Solving Plan and Steps

Goal: The Problem root cause is to be found

Objectives: The Problem is to be solved using a structured problem solving approach

Constraints:

• Problem is categorized as Priority 2
• Root cause must be found within one week of assignment
• One problem solving analyst assigned

Problem Definition:

| Problem Solving Steps | Tactics, Approach, and Technique(s) to Use |
| --- | --- |
| 1. Define the Problem | |
| 2. Assess the Problem | |
| 3. Establish Possible Causes | |
| 4. Explore Possible and Probable Causes | |
| 5. Verify Root Cause(s) | |

Pareto Analysis

Pareto analysis is based on the Pareto principle, which is often quoted as the “80/20 rule”. This principle states that (in general) 80% of the effects of something are a result of 20% of the inputs or causes.

In the world of ITSM problem solving, Pareto analysis states that the problem causes accounting for 80% of problems should be investigated first. Thus, Pareto analysis becomes a statistical technique that relies upon assembling historical problem records.

This technique is both top-down and bottom-up: it identifies possible causes that can serve as starting points to be investigated further using other techniques and methods (such as Cause and Effect analysis).

Some considerations include:

• Pareto Analysis forces the Problem Solver to avoid the temptation to jump directly to problem investigation without a quick check of known or obvious problems.
• A history of problems and causes is required to produce the analysis. In a green-field situation, this information may take too long to assemble. In such situations, Pareto analysis may be based on technician experience and recollection of past similar events.
• It’s important to remember that the 80/20 rule suggests “the most likely causes”, and is not intended to be correct every time. As such, it may result in misdiagnosis by overlooking lower-frequency causes or new causes.

Pareto Analysis Steps

Step 1: Compile a history of unique causes along with specific problems, or types of problems, components or services affected, and/or users affected.

Step 2: Compute the percentage of problems for which each unique cause has been the source.

Step 3: Apply the 80/20 rule to determine the most likely causes to investigate.
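The three steps above can be sketched in Python. This is a minimal illustration, not a prescribed implementation: the cause labels and counts are invented, and the `pareto` function name and its `threshold` parameter are assumptions made for the example.

```python
from collections import Counter

# Step 1: a history of problem records, one cause label per record.
# These labels and counts are invented for illustration only.
history = (
    ["disk full"] * 50
    + ["memory leak"] * 35
    + ["bad cable"] * 10
    + ["expired certificate"] * 5
)


def pareto(causes, threshold=0.80):
    """Return (cause, percentage) pairs, most frequent first, up to the
    point where the cumulative share of problems reaches `threshold`."""
    counts = Counter(causes)
    total = sum(counts.values())
    selected, cumulative = [], 0
    # Step 2: percentage per unique cause, in descending frequency.
    for cause, count in counts.most_common():
        selected.append((cause, round(100 * count / total, 2)))
        cumulative += count
        # Step 3: stop once the selected causes cover the 80% share.
        if cumulative / total >= threshold:
            break
    return selected


for cause, pct in pareto(history):
    print(f"{cause}: {pct}%")
```

With this made-up history, two of the four causes account for 85% of the records, so they would be investigated first. The remaining causes are not ruled out, only deferred.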

Activity 3: Pareto Analysis

The following Problem History Report is produced when searching for the Generic Problem of “Application Failure to Save Data”.

| Last | 80/20 Percentage | Frequency | Category1 | Category2 | Cause | Resolution Assessment |
| --- | --- | --- | --- | --- | --- | --- |
| 18 months ago | 1.64% | 8 | Network | Hardware | Local area network connectivity issues at the local network hub | Workaround - Reboot |
| 1 month ago | 2.46% | 12 | Server | Hardware | Server disk capacity limitations caused by accumulation of data | Workaround - Archive data |
| 2 months ago | 4.11% | 20 | Desktop | Software | Local PC application code error | Workaround - Reboot or Re-enter data |
| 1 month ago | 5.13% | 25 | Desktop | Software | Local PC network connectivity issues at the user's Desktop PC | Workaround - Reboot |
| 5 months ago | 6.16% | 30 | Desktop | Software | Local PC memory limitations caused by too many open applications | Workaround - Close Applications |
| 8 months ago | 24.64% | 120 | Server | Hardware | Server memory limitations caused by application processing routines overload | Workaround - Reboot |
| 2 months ago | 27.10% | 132 | Desktop | Hardware | Local PC disk capacity limitations caused by accumulation of data | Workaround - Archive data |
| 3 weeks ago | 28.75% | 140 | Desktop | Hardware | Local PC invalid data types entered as record field data | Workaround - Translate data to acceptable type |
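The 80/20 Percentage column appears to be each cause's share of the total Frequency. The short Python check below recomputes it from the Frequency column of the report. The variable names are chosen for this sketch only.

```python
# Frequency column from the Problem History Report, in table order.
frequencies = [8, 12, 20, 25, 30, 120, 132, 140]

total = sum(frequencies)  # all recorded occurrences of the generic problem
percentages = [round(100 * f / total, 2) for f in frequencies]

print(total)        # 487
print(percentages)  # [1.64, 2.46, 4.11, 5.13, 6.16, 24.64, 27.1, 28.75]
```

The recomputed values match the report, which confirms that Step 2 of the Pareto Analysis Steps has already been done for this data set.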

Deductions (what should we investigate first, second, etc.?):

Research Information:

Contact the Server Support Group

The Server Support group assures you that the Server PC supporting the CRM Application is fully monitored and showed no processing overload according to Server log files, and currently has 50% available disk capacity after a data clean up last month.

Contact Bob and Mary

Bob and Mary both respond that there’s nothing unusual or incorrect with their data, as they re-entered and attached the same data after re-booting and successfully saved. While they are on the phone, you also find out:

o The Problem first appeared (but was not reported) 4 to 5 weeks ago. The first occurrence was several weeks after a CRM release introducing a Billing module, but just before the Billing bug fix.

o The “others” were almost all other Sales Reps, and they believe the frequency of these failed saves is increasing.

o Sales Reps and the Marketing Manager believe that the application is used more heavily and stores more information as time goes on. In their view, it must be a failure to save the quantity of data. They demand this failure be addressed to allow them to store the critical information required.

Cause and Effect (Relationship) Analysis Cause and affect analysis is a deductive, top-down problem solving method that requires the

problem solver to conduct proper research and information gathering in order to make use

of critical thinking and deductive reasoning skills.

Cause and Effect Analysis can also be thought of as Relationship Analysis as it first

requires identifying relationships between components (including people, process, and

technology) and then to determine a cause and effect path.

The cause and effect tree or diagram is a visual description of the relationship between two

events, where the first event is the cause (the trigger) and the second event is the effect (the

consequence). This is a description of causality.

Understanding Causality Conditions

Causality can better be understood by classifying causes as one of three types of conditions:

Necessary Conditions:

If event “B” is always caused by “A”, then the presence of “B” implies “A” must

have happened.

Sufficient Conditions:

If event “B” always follows “A”, but is only sometimes caused by “A”, then the presence of “B” implies “A” may have happened. This leaves open the possibility of another event resulting in “B”.

Contributory Conditions:

If event “B” is caused by “A” and other factors (i.e. “A” is not sufficient by itself to cause “B”), then the presence of “B” implies “A and other factors” may have happened.

Such a condition is neither necessary nor sufficient, but it must be contributory. This requires other events to be searched for and found.
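The three condition types can be checked against a set of observations. The following is a minimal sketch, assuming each observation simply records whether candidate cause “A” and effect “B” were present; the function name `classify_condition` and the observation format are illustrative assumptions, not part of the workbook.

```python
from typing import Iterable, Tuple


def classify_condition(observations: Iterable[Tuple[bool, bool]]) -> str:
    """Classify candidate cause A against effect B.

    Each observation is a pair (a_present, b_present).
    """
    obs = list(observations)
    # A and B must be seen together at least once to be related at all.
    if not any(a and b for a, b in obs):
        return "no observed relationship"
    # Necessary: B is never observed without A.
    necessary = all(a for a, b in obs if b)
    # Sufficient: whenever A is observed, B is observed too.
    sufficient = all(b for a, b in obs if a)
    if necessary and sufficient:
        return "necessary and sufficient"
    if necessary:
        return "necessary"
    if sufficient:
        return "sufficient"
    # A appears with B, but B also occurs without A and A occurs without B.
    return "contributory"
```

A classification is only as good as the observations collected, so a result of "contributory" is a prompt to search for the other factors rather than a final answer.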

Common Cause and Effect Analysis Methods

Technical Relationships through the formal or informal Configuration Management Database (CMDB)

o Fault Relationships through Fault Tree Analysis (FTA). Fault Tree Analysis is

used to analyze a single fault event. It makes use of a cause and effect tree

structure that analyzes the relationships of complex system faults (Technical

Faults) using logic diagrams displaying the states of each part of a system.

IT Process Relationships through the Configuration Management System (CMS) and

related process activity records of the Service Management System (SMS), such as

Incident, Problem, and Change records.

End-User Interaction Relationships through formal or informal observation of IT

Service modules of functionality.

Chronological Event Relationships (chain of events) through sequencing all known

events for people, process, or technology relationships.
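As an illustration of the Fault Tree Analysis method listed above, a fault tree can be sketched as a small logic structure of AND and OR gates over basic fault events. This is a minimal sketch only: the `Node` class, the `occurs` function, and the example tree are hypothetical and are not findings from the case study.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    kind: str = "BASIC"  # "BASIC", "AND", or "OR"
    children: List["Node"] = field(default_factory=list)


def occurs(node: Node, failed: Set[str]) -> bool:
    """Return True if the event at `node` occurs given the failed basic events."""
    if node.kind == "BASIC":
        return node.name in failed
    results = [occurs(child, failed) for child in node.children]
    if node.kind == "AND":
        return all(results)  # every input fault must be present
    if node.kind == "OR":
        return any(results)  # any single input fault is enough
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree for illustration only; not a finding from the case study.
TOP_EVENT = Node("Save fails", "OR", [
    Node("Network Router down"),
    Node("Server PC database unavailable"),
    Node("Desktop overload", "AND", [
        Node("Desktop PC low on memory"),
        Node("Large attachment being saved"),
    ]),
])
```

In this sketch each input to an OR gate behaves like a sufficient condition for the event above it, while each input to an AND gate is contributory: it cannot trigger the event without the other inputs.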

Activity 4: Cause & Effect - Technical Relationships

The Service Management System (support ticketing system) contains information that

relates IT components as well as Incident and Change records.

1. What Configuration Items (CIs) - based on the Service Configuration Model - should be investigated?

2. If the Cause exists within the components identified, how would you rate the Conditions?

| Configuration Item | Cause-Effect | Necessary Condition | Sufficient Condition | Contributory Condition |
|---|---|---|---|---|
| Server PC | Failure leads to Problem | | | |
| CRM Server Application | Failure leads to Problem | | | |
| Network Router | Failure leads to Problem | | | |
| CRM Desktop Application | Failure leads to Problem | | | |
| Desktop PCs | Failure leads to Problem | | | |
| MS Windows | Failure leads to Problem | | | |
| MS Office | Failure leads to Problem | | | |

Deductions:

Activity 5: Cause & Effect – Process and Chronological Relationships

The following report is produced for all Incidents and Changes over the last 2 months for all Configuration Items identified above:

| Record | Occurrences | Last Time | Component | Description |
|---|---|---|---|---|
| Incident | 2 | 1 day ago | CRM Desktop | CRM Desktop application failure to save |
| Incident | 3 | 2 weeks ago | Desktop PC | Desktop network cable disconnected, desk-side reconnect |
| Maintenance | 2 | 2 weeks ago | Server PC | Server shut down and restart – standard Sunday maintenance window |
| Incident | 5 | 3 weeks ago | Desktop PC | Desktop PC performance degradation, close applications or reboot required |
| Incident | 3 | 3 weeks ago | CRM Desktop | CRM Desktop application performance degradation, close applications and reboot required |
| Incident | 1 | 1 month ago | Server PC | Server PC disk space alarm, historic data archived |
| Change | 1 | 1 month ago | CRM Server | Application Release Bug Fix Updates |
| Maintenance | 4 | 1 month ago | Server PC | Server disk clean up and tuning – standard Sunday maintenance window |
| Change | 1 | 2 months ago | CRM Desktop | Update Desktop CRM Application Drivers |
| Change | 1 | 2 months ago | CRM Desktop | Application Release functionality Update to Client Billing module |
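To support the chronological (chain of events) analysis, the report above can be re-ordered oldest first. The following is a rough sketch: the `Record` type and function names are hypothetical, descriptions are abbreviated, and the day counts are assumptions (a week is taken as 7 days, “1 month ago” as 30 days, “2 months ago” as 60 days).

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    kind: str
    occurrences: int
    days_ago: int  # assumed conversion of the "Last Time" column
    component: str
    description: str


REPORT = [
    Record("Incident", 2, 1, "CRM Desktop", "Application failure to save"),
    Record("Incident", 3, 14, "Desktop PC", "Network cable disconnected"),
    Record("Maintenance", 2, 14, "Server PC", "Shut down and restart"),
    Record("Incident", 5, 21, "Desktop PC", "Performance degradation"),
    Record("Incident", 3, 21, "CRM Desktop", "Performance degradation"),
    Record("Incident", 1, 30, "Server PC", "Disk space alarm"),
    Record("Change", 1, 30, "CRM Server", "Release bug fix updates"),
    Record("Maintenance", 4, 30, "Server PC", "Disk clean up and tuning"),
    Record("Change", 1, 60, "CRM Desktop", "Update application drivers"),
    Record("Change", 1, 60, "CRM Desktop", "Client Billing module release"),
]


def timeline(records: List[Record]) -> List[Record]:
    """Return the records oldest first (largest days_ago first)."""
    return sorted(records, key=lambda r: r.days_ago, reverse=True)


def changes_before(records: List[Record], days_ago: int) -> List[Record]:
    """Return Change records logged at least `days_ago` days back, oldest first."""
    return [r for r in timeline(records)
            if r.kind == "Change" and r.days_ago >= days_ago]
```

For example, `changes_before(REPORT, 35)` lists the Changes that precede an event roughly five weeks old, which is one way to line up the reported first appearance of the Problem against the release history.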

Deductions:

Research Information:

Contact the CRM Application Development and Support Group

The Application Support group indicates the bug fix was released in response to an undersized data size field limit in the new Client Billing module. Since the bug fix there have been no further related Incidents.

The Application Support group believes it must be a User data entry error as the

Application has been fully tested.

The Application Support group further explains that when data is updated by a User, it is held in memory on the User’s PC. When the User saves this data, each record is written to the corresponding record in the Server PC Application database. There are no errors recorded in the database error log.

Ishikawa Diagram (Cause and Effect Diagram)

Kaoru Ishikawa was a pioneer of quality management processes in the 1960s. His Ishikawa diagram, or “fishbone diagram”, is a founding tool for modern management and is considered one of the seven basic tools of quality control.

The tool forces a problem solver to think creatively across several different categories, the

most common of which are shown. Additional value is gained when “causal factors” are

considered in relation to other categories of the fishbone diagram.

Causes and Effects

The tool, or diagram, is divided into two problem components:

Effect

The Problem effect can be thought of as part of Problem Finding. The symptoms need to

be identified and the problem properly defined.

Causes

Problem Causes can be thought of as part of Problem Shaping. The detail work of the

Ishikawa diagram is performed here, where all possible causes are identified and broken

down to sub-causes.

Problem Causes and Categories

The numbers and types of main “causal bones” that are included in the Ishikawa diagram can be quite varied. These can be considered as causal categories that can then be broken down further. Some of the more common categories are:

The Basic Ishikawa Factors

People (individuals, teams, skills and experience)

Process (process and procedure)

Technology (hardware and software)

Materials (raw materials, consumables, and/or information)

Environment (social culture and physical building)

Management (decision makers)

The 4 Ss (used in the service industry)

Surroundings

Suppliers

Systems

Skills

Considerations when using the Ishikawa Technique

The following considerations are useful when using the Ishikawa diagram technique:

The cause-and-effect diagram reveals key relationships between variables and

possible causes that are useful in a complex systems environment.

This method is both a bottom-up and top-down approach that can benefit from

integration with other problem solving methods, such as deriving all possible causes

through brainstorming and then investigating using the 5 Whys technique.

A “general rule” for this diagram is to break down each possible cause to the secondary level of granularity (Primary Cause and Secondary Cause).

A key to successfully using Ishikawa diagrams is to rate each condition on the

fishbone as Necessary or Sufficient. This will help guide the root cause investigation

and the assessment of causal combinations that may have occurred.

o A necessary condition is one where the event can never occur without it

o A sufficient condition is one where the event must occur with it

Activity 6: Ishikawa Diagram - Creative Leads

[Blank fishbone diagram template with six causal bones: People, Management, Process, Environment, Equipment, Materials]

Research Information:

Contact the Marketing Manager and Sales Reps

The Marketing Manager insists the CRM application is not used for any new business activity; nor is it used by any other department as only the Marketing Manager can approve new users. As the software is developed internally, there are no user limits. Also, no environmental factors have changed (e.g. no office moves).

The Marketing Manager further states that the CRM Application is better managed and used since starting a new quality review initiative where the Manager reviews and updates poorly documented or incomplete Client records. It’s critical to the Marketing Manager that these records are accurate as they drive the weekly Sales reports to upper management. This quality effort has been in place now for more than a month.

Sales Reps insist that they are following Marketing procedures when updating records. There are no shortcuts taken. Only the Sales Reps have access to these records, and Sales Reps do not have the admin rights to share and update other Sales Reps’ client records.

Sales Reps do not think updates happen at the same time. Client calls are too random.

However, there are often multiple records left open for periods of time when multiple

Client calls are taken in succession. This practice is the norm, and Sales Reps will complete the Client Updates when call volumes lower and time permits.

Section 6: Getting to the True Root Cause

Hypothesis Testing and Validation

A hypothesis is a proposed Cause for the Problem that is then tested. There are many ways to test and validate, but these tend to fall into two categories: a controlled experiment or an operational observation. Within the context of IT Service Management (ITSM), a range of testing options should be proposed and each option assessed with a Risk Assessment to manage both time pressures and the risk of worsening the Problem. Affected stakeholders should always be considered and involved when making Testing choices.

The following table is a simple and structured way to assess testing options:

| Testing Option | Components Involved | Stakeholders | Experiment or Observation | Risk Assessment |
|---|---|---|---|---|
| | | | | |

Change Analysis (Comparative Analysis)

Change Analysis is a form of testing and validation through comparative analysis. This technique is based on comparing all factors contributing to the situation where a problem does not exist, to the situation where the problem does exist.

This technique is a top-down approach requiring full knowledge of the correct functional

parameters (or design constraints) of a system, as well as the skills and experience of the

problem solver.

This technique may involve re-enactment and observation, where a technician changes one factor at a time in an attempt to re-create the Problem. The Root Cause of the Problem may be found through a process of elimination, validating each key operating parameter in turn.
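The comparison at the heart of Change Analysis can be sketched as a simple difference between two sets of factors. This is an illustrative sketch only: the function name `changed_factors` and the factor names and values are hypothetical, not taken from the case study.

```python
from typing import Any, Dict, Tuple


def changed_factors(working: Dict[str, Any],
                    failing: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {factor: (working_value, failing_value)} for every factor that differs."""
    differences = {}
    for factor in sorted(working.keys() | failing.keys()):
        before, after = working.get(factor), failing.get(factor)
        if before != after:
            differences[factor] = (before, after)
    return differences


# Hypothetical factor values, for illustration only.
WORKING = {"crm_release": "before Billing module", "attachment_size_mb": 2, "disk_free_pct": 50}
FAILING = {"crm_release": "after Billing module", "attachment_size_mb": 2, "disk_free_pct": 50}
```

Here `changed_factors(WORKING, FAILING)` returns only the `crm_release` entry. Each differing factor is a candidate to change one at a time during re-enactment; factors that are identical in both situations can be set aside.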

Activity 7: Hypothesis Testing and Validation

You have decided that the three previous deductions are the most likely and that the only way to confirm your hypothesis is to test these user scenarios. Complete the following table using previous case study material and propose your choice of testing scenario (assume there is no separate testing environment):

| Testing Option | Components Involved | Stakeholders | Experiment / Observation | Risk Assessment |
|---|---|---|---|---|
| Using Sunday maintenance window, apply test cases to operational data. Revert data to original values. | CRM data, CRM App, CRM Server | Marketing Mgr, Sales Reps, App group, Server Group, Problem Mgr | | |
| Using Sunday maintenance window, create test records and apply test cases to test data. Remove test records. | CRM data, CRM App, CRM Server | Marketing Mgr, Sales Reps, App group, Server Group, Problem Mgr | | |
| Engage the Marketing Manager and staff to participate in a limited update test of a single client update being shared. | CRM data, CRM App | Marketing Mgr, Sales Reps, Problem Mgr | | |
| Involve participation and cooperation from the Marketing staff. On next occurrence, list all open client records that failed to save and check with staff and management to determine if the record was being shared. | CRM data, CRM App | Marketing Mgr, Sales Reps, Problem Mgr | | |

What is/are your recommended choices?

Testing Results:

Testing has been arranged within the production CRM Application over the maintenance weekend, using 5 newly created test clients and 2 Desktop PCs. The following is observed:

When a single client record is updated but not saved, and then the same record is

opened and updated in a second PC, the second PC fails to save the record and

appears to be frozen. The resolution is to refresh the Client record or close the CRM

Application.

o This same test for multiple records opened and just one record updated will

also fail to save the block of records.

o This appears to have duplicated the Problem and identified the Cause.

However, it is prudent to test the two other possible contributing scenarios for

their effect.

On creating batch record updates and saving, there were no application errors. This

same scenario was repeated for opening the records in multiple desktop PCs, but only

making updates and saving on one specific PC. No save errors resulted.

o This same test was duplicated with large PDF file attachments to records.

Again, no file save errors resulted.
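One mechanism that would be consistent with these observations is a record that stays claimed by the first PC holding an unsaved update. The following toy model is an assumption made for illustration; it is not the CRM Application's actual implementation, and the `RecordStore` class and its methods are hypothetical.

```python
from typing import Dict


class RecordStore:
    """Toy model: the first session with an unsaved update holds the record."""

    def __init__(self) -> None:
        # record id -> session that holds the pending (unsaved) update
        self._editor: Dict[str, str] = {}

    def update(self, session: str, record_id: str) -> None:
        # The first unsaved update claims the record; later updates by
        # other sessions are only held locally and do not take the claim.
        self._editor.setdefault(record_id, session)

    def save(self, session: str, record_id: str) -> bool:
        holder = self._editor.get(record_id)
        if holder is not None and holder != session:
            return False  # another session has an unsaved update pending
        self._editor.pop(record_id, None)  # saving releases the claim
        return True

    def refresh(self, session: str, record_id: str) -> None:
        # Refreshing discards this session's pending update, if it holds one.
        if self._editor.get(record_id) == session:
            del self._editor[record_id]
```

In this model, a record updated on PC1 and then on PC2 cannot be saved from PC2 until PC1 saves or refreshes, while records that are opened on several PCs but updated on only one save without error, matching both test outcomes described above.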

The 5 Whys

The 5 Whys is a method for persevering in Problem Shaping to find the true root cause, rather than stopping at superficial symptoms and assumptions. This basic cause and effect method is simple in concept: to investigate the possible cause of a problem, ask the question “why did this happen” five times in succession.

This technique was originally used within Toyota Motor Corporation and is a critical

component of problem solving now also used within Kaizen, lean manufacturing, and Six

Sigma.

The 5 Whys is a question-asking method that pushes the problem solver to dig deeper. The method is by no means limited to 5 degrees of detail, but it is generally accepted that five iterations of asking why are sufficient to get to a root cause.

Use the 5 Whys technique for simple problems or in conjunction with other problem solving techniques. This technique depends on the technician’s knowledge and experience to ask the right “why” questions. As a result, some causes may be overlooked in a complex situation, and technicians with different experience may arrive at different cause theories/explanations.

Success with this method can be increased when combined with the following factors:

Verify the answer to each “why” question before proceeding to the next to avoid straying off-track.

Focus on making the final “why” a process question. This is based on the assumption that most problems occur due in some way to a lack of adherence to, failure within, or lack of existence of a process.

Activity 8: But are we Done? The 5 Whys

Using the 5 Whys Technique, determine if there is a further root cause:

1. Why did the save error occur?

Due to a record concurrency lock

2. Why did the concurrency issue get into production?

3. Why?

4. Why?

5. Why?

Root Cause

Conclusions: The true Root Cause

Section 6: Problem Options and Solutions

The final step of the Problem Solving process is to determine an effective range of solution

options to address the root cause(s) of the Problem. Each option should be assessed from a

business justified perspective, should consider an assessment of risk, and should be

implemented according to an appropriate project plan (based on the complexity and scope of

the solution).

This step is made more effective when the Problem Solver also considers effective

workarounds, or temporary techniques, to deal with the Problem should it recur.

The following table is a simplified approach to listing and assessing solution options:

| Solution Option | Business Justification | Risk Assessment | Priority | Approved |
|---|---|---|---|---|
| | | | | |

Activity 9: Problem Options and Solutions

The following table outlines the Problem Options and Solutions. Rank the priority of the

options presented, their business justification, and risk assessment.

| Solution Options | Priority (1-4) | Business Justified Cost (High/Med/Low) | Risk Assessment (High/Med/Low) | Approved |
|---|---|---|---|---|
| Standards for considering and implementing Application concurrency | | | | |
| Procedure ensuring Developed / Purchased Applications meet concurrency standards | | | | |
| Testing of all current Applications for meeting concurrency standards | | | | |
| CRM Application re-programmed according to new concurrency standards | | | | |
| Temporary Workaround A: Communicate the Problem Cause to Marketing and ensure care is taken NOT to open shared records. In addition, encourage Marketing Staff to save frequently. | | | | |
| Temporary Workaround B: The CRM Application is re-programmed with a warning message to the User should a shared record be opened. | | | | |

What is/are your recommended choices?