Copyright Protection: No part of these notes may be reproduced in any form – electronic or printed – without the written consent of B Wyze Solutions Inc
IT Service Management
Problem Solving Workshop: Critical Thinking and Root Cause Analysis
Seminar Workbook
Version 2.1.0
Acknowledgements
B Wyze Solutions Inc
LCS Accredited ITIL® Training Provider
B Wyze copyright © 2011
Author: Graham Furnis
ITIL® v3 Expert Certified
ITIL® v3 Intermediate Lifecycle Certified
ITIL® v2 Manager Certified
LCS Accredited ITIL® Trainer
Works Cited
Ishikawa, Kaoru (1990). Fishbone Diagram.
Kepner & Tregoe (1958). Kepner-Tregoe Problem Solving.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
WORKS CITED
TABLE OF CONTENTS
OVERVIEW: WHAT TO EXPECT
  Who should attend?
  Prerequisites
  Learning Objectives
SECTION 1: PROBLEM SOLVING WITHIN IT SERVICE MANAGEMENT
  IT Service Management (ITSM)
  Key Terms
SECTION 2: PROBLEM SOLVING
  What is Problem Solving?
SECTION 3: PROBLEM-SOLVING PERSPECTIVES
  Adjust your Thinking and Reasoning
SECTION 4: PROBLEM SOLVING AS A STRUCTURED PROCESS
  Problem Solving
  Root Cause Analysis (RCA)
  Kepner-Tregoe Root Cause Analysis Method
  The Problem Solving Plan Using Kepner-Tregoe
SECTION 5: RCA METHODS AND TECHNIQUES
  Comparison of Methods and Techniques
  The Journalism Standard
ACTIVITY: CASE STUDY
ACTIVITY 1: THE JOURNALISM STANDARD
  Facts Table:
ACTIVITY 2: PROBLEM SOLVING PLAN AND STEPS
  Pareto Analysis
ACTIVITY 3: PARETO ANALYSIS
  Cause and Effect (Relationship) Analysis
ACTIVITY 4: CAUSE & EFFECT - TECHNICAL RELATIONSHIPS
ACTIVITY 5: CAUSE & EFFECT – PROCESS AND CHRONOLOGICAL RELATIONSHIPS
  Ishikawa Diagram (Cause and Effect Diagram)
ACTIVITY 6: ISHIKAWA DIAGRAM - CREATIVE LEADS
SECTION 6: GETTING TO THE TRUE ROOT CAUSE
  Hypothesis Testing and Validation
ACTIVITY 7: HYPOTHESIS TESTING AND VALIDATION
  The 5 Whys
ACTIVITY 8: BUT ARE WE DONE? THE 5 WHY’S
SECTION 6: PROBLEM OPTIONS AND SOLUTIONS
ACTIVITY 9: PROBLEM OPTIONS AND SOLUTIONS
Overview: What to Expect
Who should attend?
This course will be of interest to:
• IT support individuals who would like to strengthen their problem-solving skills
• IT managers seeking to improve their team's problem-solving skills
• IT process management individuals who are looking to improve their problem
  process and Root Cause Analysis problem-solving activities
Prerequisites
No prerequisites are required, but an IT background is highly recommended and an
understanding of ITIL (IT Infrastructure Library) may be useful.
Learning Objectives
After this workshop, participants should:
• Understand the basic elements of Problem Solving skills
• Understand the perspectives of Critical Thinking and Creative Thinking
• Understand a process approach to Problem Solving and Root Cause Analysis
• Understand how to maintain progress in problem solving by adjusting their
  approach, methods, and techniques
• Understand how to get to the true Root Cause of a Problem
• Understand how to solve the Problem
Section 1: Problem Solving within IT Service Management
IT Service Management (ITSM)
IT Service Management is a discipline for managing IT systems and technology centered
on the identification and delivery of IT Services used by the business. These IT Services
are defined in business terms and are the final outcome for IT systems and technology.
Within the practice of ITSM, the ITIL (IT Infrastructure Library) framework links Root
Cause Analysis to the process of Problem Management.
The ITSM discipline and the ITIL framework support successful IT-related Problem
Solving by IT Service Support professionals. As such, it is useful to consider the
following definitions:
Key Terms
Problem and Problem Management
A Problem is the unknown cause of one or more related Incidents.
Problem Management manages the investigation into the cause of these related
Incidents and ensures an appropriate resolution to the Problem is found.
Incident and Incident Management
An Incident is an unplanned event that is a deviation from normal (as defined by the
Service Level Agreement) that affects an IT Service. This deviation could include:
o Disruption to the agreed service
o Reduction in the quality of agreed service
o Something that could lead to a disruption or reduced quality of agreed service
Incident Management manages the quick response and restoration of these incidents,
and may further escalate issues to Problem Management for further investigation.
Priority, Impact, Urgency
Priority defines the order in which issues, such as Incidents and Problems, are dealt
with. The Priority of an issue is a combination of its Impact and Urgency, which are
defined as:
• Impact - the degree of positive or negative business effect of something.
• Urgency - the response time required to address an “impact” event.
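To illustrate how Impact and Urgency might combine into a Priority, here is a minimal sketch. The three-level scale, the 1 to 5 priority range, and the additive rule are assumptions for illustration only; the workbook does not prescribe a specific matrix, and real values would be agreed in the Service Level Agreement.

```python
# Hypothetical three-level scale (1 = most severe). Not taken from the workbook.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> int:
    """Combine Impact and Urgency into a Priority from 1 (highest) to 5 (lowest)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


# Print the full matrix so the combination rule can be reviewed at a glance.
for impact in LEVELS:
    row = [priority(impact, urgency) for urgency in LEVELS]
    print(f"impact={impact:<6} priority by urgency (high, medium, low): {row}")
```

Under these assumed values, a high-impact, high-urgency issue is handled first (Priority 1) and a low-impact, low-urgency issue last (Priority 5).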
Service Level Agreement (SLA)
A written or understood agreement between an IT Service Provider and the Business
(through identification of the Business Customer) that outlines all arrangements for the
use, performance levels, and management of an IT Service. Incident and Problem priority
objectives and other details are often documented and agreed in this Agreement.
Section 2: Problem Solving
What is Problem Solving?
Problem Solving is a basic and key life skill that is often considered to be one of the most
complex human intellectual functions. Fortunately, it’s also something that can be
continually learned, practiced and improved.
In its basic sense, Problem Solving is a mental process for thinking and reasoning. It can
be refined and improved when broken into the separate, but related, parts of Problem
Finding and Problem Shaping.
Problem Finding
Identifying the Problem is the first step to good Problem Solving. The problem statement
becomes the target that is being solved for, and getting this target wrong can have
serious consequences. In many cases, identifying the problem is more
complex than actually solving it.
A key to good problem finding involves the use of Creative Thinking.
Problem Shaping
Problem Shaping follows Problem Finding. Once the Problem has been correctly
identified, questions need to be asked that shape the direction and findings of problem
investigation. Each question leads to insight into the underlying Problem Cause(s), and
thus refines and shapes further questions to be asked.
A key to good problem shaping involves the use of Critical Thinking.
Section 3: Problem-Solving Perspectives
Adjust your Thinking and Reasoning
Problem Solving as a skill requires that the problem solver adjust their approach and
perspective to suit the problem. Failing to adjust, or taking the wrong perspective from
the onset, is one of the primary reasons problem solvers fail, or fail to act in a timely
manner. Adjusting well starts with understanding the differences between:
Critical Thinking in Problem Solving can be considered logical thinking. It is a methodical
approach that involves the cognitive skills of goal clarification, observation,
interpretation, analysis, categorizing, relating, inference making, evaluation of results,
assessment, and explanation of conclusions.
Creative Thinking in Problem Solving can be thought of as finding options and alternatives,
more commonly referred to as “thinking outside the box” of common and tried solutions.
Critical Thinking is associated with deductive reasoning, where it is based on a set of
propositions and the subsequent investigation and factual discoveries that bring to light
the root cause of a Problem.
• Critical Thinking tends to be a top-down approach to Problem Solving. It requires
  detailed knowledge or experience combined with a logical process that confirms each
  proposition is NOT the source of the Problem.
Creative Thinking is associated with inductive reasoning, where Inductive Reasoning can be
thought of as assumptions or generalized conclusions drawn from a set of observations.
These assumptions are not necessarily valid conclusions, but start points to be further
investigated and validated.
• Creative Thinking tends to be a bottom-up approach to Problem Solving. It is based on
  both intuition and making guesses based on experience. It must be followed up by
  verification of these guesses, or assumptions, typically using an experimental approach.
When to Use Critical Thinking and Deductive Reasoning
• A problem is familiar or of a familiar type
• A problem solver has sufficient skill and experience
For example:
A critical marketing application is producing several different user error messages across
the marketing department. With our programming experience we know that each error message
is triggered by application error trapping code. Therefore, we deduce that we should
investigate the programming code related to the application modules that produced the
error messages to confirm the application logic.
When to Use Creative Thinking and Inductive Reasoning
• A problem is unfamiliar
• Deductive reasoning has reached a dead end and more alternatives are needed
For example:
A critical marketing application is producing several different user error messages across
the marketing department. We have no programming skill, but have observed in our past
experience that shared applications are run from a central server. The marketing
application is a shared application; and therefore we induce (assume) that the Problem
must be based on a server. Our investigations will now take this path.
Section 4: Problem Solving as a Structured Process
Problem Solving
Problem Solving benefits from a structured and methodical approach. In
general, this structure involves:
(1) Correctly defining the problem,
(2) Finding the root cause(s) of the problem through Root Cause Analysis,
(3) Determining the most effective corrective actions to take, and
(4) Implementing the solution to successfully manage the problem.
Primary Goal
The Problem Solving process seeks to prevent Problems from recurring by taking
effective corrective actions.
Root Cause Analysis (RCA)
Root Cause Analysis is a sub-process of the larger Problem Solving process that requires
an appropriate application of Problem Solving skills in conjunction with a methodical and
systematic approach to identifying the true root cause(s) of Problems.
Each Root Cause Analysis approach shares a common aim to avoid focusing on and
solving the symptoms of the problem, and to instead drill down to identify and solve the
true root cause of the problem.
A key assumption to Root Cause Analysis is that there is always one true root
cause for any given problem. This leads to a key challenge of having sufficient focus and
perseverance to find this one true root cause.
Primary Goal
Root Cause Analysis endeavors to determine the lowest level “root” cause of a Problem
that supports taking the most effective corrective actions.
Primary Objective
The objective of Root Cause Analysis is to reveal the correct root cause of the Problem,
because without it we cannot determine what effective corrective actions must be taken.
Kepner-Tregoe Root Cause Analysis Method
The Kepner-Tregoe method to analyze problems was developed by researchers Dr. Charles
Kepner and Dr. Benjamin Tregoe. This method emphasizes a structured approach to
problem solving that relies on setting priorities and making use of technician knowledge
and experience. The method includes five steps to problem solving.
The Problem Solving Plan Using Kepner-Tregoe
A structured problem solving plan should be created when solving
any Problem. The plan should follow the problem solving steps, such as those defined by
Kepner-Tregoe, and should include the business goals and objectives that need to be
achieved. Each step can then be managed at an appropriate level based on priorities, time
pressures, and availability of information.
The Problem Solving Plan is iterative, where new facts and observations shed new and
increasingly accurate light on both the Problem definition as well as the root cause
investigation.
Sample Problem Solving Plan
Goal:
Objectives:
Constraints:
Problem Definition:
Problem Solving Steps | Technique(s) to Use
1. Define the Problem |
2. Assess the Problem |
3. Establish Possible Causes |
4. Explore Possible and Probable Causes |
5. Verify Root Cause(s) |
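For teams that keep the plan in a tool instead of on paper, the sample plan could be captured as a small data structure. The following is a rough sketch only: the class name, field names, and helper methods are hypothetical, and only the five step names and the plan headings come from the workbook.

```python
from dataclasses import dataclass, field

# The five problem solving steps listed in the sample plan.
STEPS = [
    "Define the Problem",
    "Assess the Problem",
    "Establish Possible Causes",
    "Explore Possible and Probable Causes",
    "Verify Root Cause(s)",
]


@dataclass
class ProblemSolvingPlan:
    """Hypothetical container mirroring the headings of the sample plan."""
    goal: str
    objectives: list
    constraints: list
    problem_definition: str = ""
    # Technique(s) chosen for each step; every step starts with none assigned.
    techniques: dict = field(default_factory=lambda: {step: [] for step in STEPS})

    def assign(self, step: str, technique: str) -> None:
        """Record a technique against one of the five steps."""
        if step not in self.techniques:
            raise KeyError(f"Unknown step: {step}")
        self.techniques[step].append(technique)

    def unplanned_steps(self) -> list:
        """Steps that still have no technique assigned."""
        return [step for step in STEPS if not self.techniques[step]]


plan = ProblemSolvingPlan(
    goal="The Problem root cause is to be found",
    objectives=["Solve the Problem using a structured problem solving approach"],
    constraints=["Priority 2", "Root cause within one week", "One analyst assigned"],
)
plan.assign("Define the Problem", "Journalism Standard")
print(plan.unplanned_steps())
```

Because the plan is iterative, the problem definition and the assigned techniques can be revised as new facts arrive; `unplanned_steps()` is simply a reminder of which steps have no technique yet.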
Section 5: RCA Methods and Techniques
Comparison of Methods and Techniques
There are many Root Cause Analysis related methodologies and techniques. The more
common ones along with their primary characteristics are listed in the following table.
Each has particular strengths that make it suitable for use in specific situations. These are
defined more fully in the subsequent pages.
Method / Technique | Approach | Top-Down / Bottom-Up | Requires Complementary Methods / Techniques
Journalism Standard | Problem Finding, Problem Shaping | | All methods / techniques
Pareto Analysis | Problem Finding, Problem Shaping | Both Top-Down / Bottom-Up | ITIL Processes; Ishikawa; The 5 Whys
Cause and Effect Analysis | Problem Shaping | Top-Down | ITIL Processes; Configuration Relationships; Systems Analysis; The 5 Whys
Change Analysis | Problem Shaping | Top-Down | Systems Analysis; The 5 Whys
Ishikawa Diagram | Problem Finding, Problem Shaping | Both Top-Down / Bottom-Up | Brainstorming; The 5 Whys
The 5 Whys | Problem Shaping | Top-Down | All methods / techniques
The Journalism Standard
The Journalism Standard is a technique that is focused on factual reporting and analysis,
where emotion and assumptions are removed from consideration. It can be thought of as
listing and considering “just the facts”. The Journalism Standard reminds the Problem
Solver to research and list the basic facts of the situation first, to seek interviews and
independent confirmation, and then to evaluate using a neutral approach.
This technique also reminds the Problem Solver to avoid some of the more common
Problem Solving mistakes that eliminate or limit possible causes, including:
Avoid Jumping to Assumptions
Avoid eliminating possible causes due to one or more incorrect assumptions. Making
assumptions is a necessary part of life. However, when the stakes are high and risks of
failure increase, making assumptions can be dangerous! Many times a Problem has
escalated or dragged on due to an assumption being made that eliminated a check point.
For example:
A technician may eliminate checking the application drivers on the Server PC “knowing that
it can’t be the Server as no one has updated the Server since it was last working…”
Avoid Tunnel Vision
Avoid missing possible causes due to an obsession or narrow focus on one or a small range
of assumptions. Jumping to a conclusion may lead you to a quick diagnosis of a Problem,
but more often than not it leads to a failure to correctly identify the root cause. In other
cases, the narrow focus misses the correct root cause and the Problem escalates as the
investigation drags on.
For example:
A technician may limit investigation to a Server PC “concluding that it must be the Server
as the application error is generated from the software residing on the Server…”
The 5 W’s (and How)
The 5 W’s, together with “how”, form a simple and commonly known rule to gather the facts:
• who was involved
• what events happened
• when did the sequence of events happen
• where did the sequence of events happen
• how did the sequence of events happen
• why did the problem happen (initial inductions and deductions)
Activity: Case Study
The Marketing Client Support team of LITI Corporation has reported two related Incidents
this week affecting their Sales Management Service. While each individual Incident has
been restored, the issue has been escalated by Marketing staff to the Marketing Manager,
who in turn has raised the issue with Service Level Management (SLM). SLM has agreed
that this is a “Problem” that needs to be solved by IT in order to maintain marketing staff
productivity and keep good relations with the Business.
The issue has been assigned to the Problem Manager, who in turn has opened a Problem
record and assigned “you” to resolve this issue. Initial investigation of related Incident
records has revealed the following descriptions:
• Bob in Marketing has experienced a series of CRM Application "freezes" where he
  has tried to save an updated client history. Bob called the Service Desk and was
  advised to re-boot the application. As a consequence, Bob lost his data updates and
  had to re-enter the updates. Bob mentioned that several other Sales Representatives
  have experienced the same issue.
• Mary in Marketing has experienced extreme slowness and failure to save updates
  several times in the last month when updating client billing and payment
  information using the CRM Application. Mary did not contact the Service Desk
  until the event occurred again this week. Mary was dispatched a Level 2 support
  technician. The technician was able to confirm local network connectivity, and also
  confirmed that the CRM Application server was up and running. Desktop memory
  and disk space were confirmed to be sufficient. The technician resolved the
  Incident by refreshing the client profiles (and consequently losing current data
  updates), re-entered 3 of the 20 original client updates, and successfully saved to
  the server without any Incident occurring. Mary was advised to save more frequently.
Activity 1: The Journalism Standard
Facts Table:
Who
What
When
Where
How
Why
Deductions
Activity 2: Problem Solving Plan and Steps
Goal: The Problem root cause is to be found
Objective: The Problem is to be solved using a structured problem solving approach
Constraints:
• Problem is categorized as Priority 2
• Root cause must be found within one week of assignment
• One problem solving analyst assigned
Problem Definition:
Problem Solving Steps | Tactics, Approach, and Technique(s) to Use
1. Define the Problem |
2. Assess the Problem |
3. Establish Possible Causes |
4. Explore Possible and Probable Causes |
5. Verify Root Cause(s) |
Pareto Analysis
Pareto analysis is based on the Pareto principle, which is often quoted as the “80/20 rule”.
This principle states that (in general) 80% of the effects of something are a result of 20%
of the inputs or causes.
In the world of ITSM problem solving, Pareto analysis suggests that the problem causes
accounting for 80% of problems should be investigated first. It is therefore a
statistical technique that relies upon assembling historical problem records.
This technique is both top-down and bottom-up: it identifies possible causes that can
serve as start points to be investigated further using other techniques and methods (such as
Cause and Effect analysis).
Some considerations include:
• Pareto Analysis forces the Problem Solver to avoid the temptation to jump directly to
  problem investigation without a quick check of known or obvious problems.
• A history of problems and causes is required to produce the analysis. In a green-field
  situation, this information may take too long to assemble. In such situations, Pareto
  analysis may be based on technician experience and recollection of past similar
  events.
• It’s important to remember that the 80/20 rule suggests “the most likely causes”, and
  is not intended to be correct every time. As such, it may result in misdiagnosis by
  overlooking smaller frequency causes or new causes.
Pareto Analysis Steps
Step 1: Compile a history of unique causes along with specific problems, or types of
problems, components or services affected, and/or users affected.
Step 2: Compute the percentage that each unique cause has been the source of a
problem.
Step 3: Apply the 80/20 rule to determine the most likely causes to investigate.
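The three steps can be sketched in a few lines of code. This is a minimal illustration, not a prescribed tool: the function name, the shortened cause labels, and the choice to stop once the cumulative share reaches 80% are assumptions. The frequencies are the ones shown in the Problem History Report of Activity 3.

```python
def pareto(cause_counts, threshold=0.80):
    """Return (cause, share, cumulative share) for the most frequent causes
    until their combined share of all problems reaches the threshold."""
    total = sum(cause_counts.values())
    # Step 2: rank causes by how often each was the source of a problem.
    ranked = sorted(cause_counts.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0
    # Step 3: keep adding causes until roughly 80% of problems are covered.
    for cause, count in ranked:
        if cumulative / total >= threshold:
            break
        cumulative += count
        selected.append((cause, count / total, cumulative / total))
    return selected


# Step 1: history of unique causes and their frequencies (labels abbreviated).
history = {
    "LAN connectivity at local hub": 8,
    "Server disk capacity": 12,
    "Local PC application code error": 20,
    "Local PC network connectivity": 25,
    "Local PC memory limitations": 30,
    "Server memory limitations": 120,
    "Local PC disk capacity": 132,
    "Local PC invalid data types": 140,
}

top_causes = pareto(history)
for cause, share, cumulative in top_causes:
    print(f"{cause}: {share:.2%} (cumulative {cumulative:.2%})")
```

The output is only a suggested order of investigation. As noted in the considerations, low-frequency or entirely new causes can still be the real source of a Problem.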
Activity 3: Pareto Analysis
The following Problem History Report is produced when searching for the Generic Problem of “Application Failure to Save Data”.
Last | 80/20 Percentage | Frequency | Category1 | Category2 | Cause | Resolution Assessment
18 months ago | 1.64% | 8 | Network | Hardware | Local area network connectivity issues at the local network hub | Workaround - Reboot
1 month ago | 2.46% | 12 | Server | Hardware | Server disk capacity limitations caused by accumulation of data | Workaround - Archive data
2 months ago | 4.11% | 20 | Desktop | Software | Local PC application code error | Workaround - Reboot or Re-enter data
1 month ago | 5.13% | 25 | Desktop | Software | Local PC network connectivity issues at the user's Desktop PC | Workaround - Reboot
5 months ago | 6.16% | 30 | Desktop | Software | Local PC memory limitations caused by too many open applications | Workaround - Close Applications
8 months ago | 24.64% | 120 | Server | Hardware | Server memory limitations caused by application processing routines overload | Workaround - Reboot
2 months ago | 27.10% | 132 | Desktop | Hardware | Local PC disk capacity limitations caused by accumulation of data | Workaround - Archive data
3 weeks ago | 28.75% | 140 | Desktop | Hardware | Local PC invalid data types entered as record field data | Workaround - Translate data to acceptable type
Deductions (what should we investigate first, second, etc.?):
Research Information:
Contact the Server Support Group
The Server Support group assures you that the Server PC supporting the CRM
Application is fully monitored and showed no processing overload according to Server
log files, and currently has 50% available disk capacity after a data clean up last month.
Contact Bob and Mary
Bob and Mary both respond that there’s nothing unusual or incorrect with their data
as they re-entered and attached the same data after re-booting and successfully saved.
While they are on the phone, you also find out:
o The Problem first appeared (but was not reported) 4 to 5 weeks ago. The first
occurrence was several weeks after a CRM release introducing a Billing
module, but just before the Billing bug fix.
o The “others” were almost all other Sales Reps, and they believe the frequency
of these failed saves is increasing.
o Sales Reps and the Marketing Manager believe that the application is used
more heavily and stores more information as time goes on. It must be a failure
to save the quantity of data. They demand this failure be addressed to allow
them to store the critical information required.
Cause and Effect (Relationship) Analysis
Cause and effect analysis is a deductive, top-down problem solving method that requires the
problem solver to conduct proper research and information gathering in order to make use
of critical thinking and deductive reasoning skills.
Cause and Effect Analysis can also be thought of as Relationship Analysis, as it first
requires identifying relationships between components (including people, process, and
technology) and then determining a cause and effect path.
The cause and effect tree or diagram is a visual description of the relationship between two
events, where the first event is the cause (the trigger) and the second event is the effect (the
consequence). This is a description of causality.
Understanding Causality Conditions
Causality can better be understood by classifying causes as one of three types of conditions:
Necessary Conditions:
If event “B” is always caused by “A”, then the presence of “B” implies “A” must
have happened.
Sufficient Conditions:
If event “B” is sometimes caused by “A”, then the presence of “B” implies “A” may
have happened. This leaves open the possibility of another event resulting in “B”.
Contributory Conditions:
If event “B” is caused by “A” and other factors (i.e. “A” is not sufficient by itself to
cause “B”), then the presence of “B” implies “A and other factors” may have
happened.
A contributory condition is neither necessary nor sufficient on its own, so the other
contributing events must be searched for and found.
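One way to make the three conditions concrete is to test a suspected cause against a set of recorded observations. The sketch below is an interpretation under stated assumptions, not the workbook's method: "necessary" is read as "the effect was never observed without the cause", "sufficient" as "the cause was never observed without the effect", and "contributory" as "the cause was present in at least some occurrences of the effect". The function name, event names, and observations are invented for illustration.

```python
def classify_condition(observations, cause):
    """Classify a suspected cause from (events_present, effect_occurred) pairs."""
    # Event sets for every observation in which the effect occurred.
    with_effect = [events for events, effect in observations if effect]
    # Outcomes of every observation in which the suspected cause was present.
    outcomes_with_cause = [effect for events, effect in observations if cause in events]

    necessary = bool(with_effect) and all(cause in events for events in with_effect)
    sufficient = bool(outcomes_with_cause) and all(outcomes_with_cause)
    contributory = any(cause in events for events in with_effect)

    if necessary and sufficient:
        return "necessary and sufficient"
    if necessary:
        return "necessary"
    if sufficient:
        return "sufficient"
    if contributory:
        return "contributory"
    return "no observed relationship"


# Hypothetical observations: which events were present, and whether the
# effect (for example, a failed save) occurred.
observations = [
    ({"user_saving", "server_down"}, True),
    ({"user_saving", "server_down", "network_slow"}, True),
    ({"user_saving", "network_slow"}, False),
    ({"user_saving", "network_slow", "large_upload"}, True),
    ({"large_upload"}, False),
]

for event in ("user_saving", "server_down", "network_slow"):
    print(event, "->", classify_condition(observations, event))
```

A classification like this is only as good as the observations behind it. A small or biased sample can make a contributory cause look sufficient, which is why the candidate root cause still has to be verified.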
Common Cause and Effect Analysis Methods Technical Relationships through the formal or informal Configuration Management
Database (CMDB)
o Fault Relationships through Fault Tree Analysis (FTA). Fault Tree Analysis is
used to analyze a single fault event. It makes use of a cause and effect tree
structure that analyzes the relationships of complex system faults (Technical
Faults) using logic diagrams displaying the states of each part of a system.
IT Process Relationships through the Configuration Management System (CMS) and
related process activity records of the Service Management System (SMS), such as
Incident, Problem, and Change records.
End-User Interaction Relationships through formal or informal observation of IT
Service modules of functionality.
Chronological Event Relationships (chain of events) through sequencing all known
events for people, process, or technology relationships.
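As an illustration of the logic-diagram idea behind Fault Tree Analysis, the sketch below evaluates a small tree of AND/OR gates. The `Gate` class, the `evaluate` function, and the event names are all hypothetical and are not taken from the case study; a real fault tree would be built from the actual Configuration Item relationships.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", str]] = field(default_factory=list)


def evaluate(node: Union[Gate, str], states: Dict[str, bool]) -> bool:
    """Return True if the fault represented by this node occurs."""
    # A string is a basic event; look up whether it has failed.
    if isinstance(node, str):
        return states[node]
    results = [evaluate(child, states) for child in node.inputs]
    if node.kind == "AND":
        return all(results)  # every input fault must be present
    if node.kind == "OR":
        return any(results)  # any single input fault is enough
    raise ValueError(f"Unknown gate kind: {node.kind}")


# Hypothetical tree: the service fails if the server fails,
# or if both the primary and the backup network links are down.
links_down = Gate("both links down", "AND", ["primary_link_down", "backup_link_down"])
top_event = Gate("service unavailable", "OR", ["server_failure", links_down])

one_link_down = evaluate(
    top_event,
    {"server_failure": False, "primary_link_down": True, "backup_link_down": False},
)
both_links_down = evaluate(
    top_event,
    {"server_failure": False, "primary_link_down": True, "backup_link_down": True},
)
```

In this sketch an input under an OR gate behaves like a sufficient condition for the gate's fault, while each input under an AND gate is necessary but not sufficient for it.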
Activity 4: Cause & Effect - Technical Relationships
The Service Management System (support ticketing system) contains information that
relates IT components as well as Incident and Change records.
1. What Configuration Items (CIs) - based on the Service Configuration Model - should be
investigated?
2. If the Cause exists within the components identified, how would you rate the
Conditions?

Configuration Item        Cause-Effect               Necessary   Sufficient   Contributory
                                                     Condition   Condition    Condition
Server PC                 Failure leads to Problem
CRM Server Application    Failure leads to Problem
Network Router            Failure leads to Problem
CRM Desktop Application   Failure leads to Problem
Desktop PCs               Failure leads to Problem
MS Windows                Failure leads to Problem
MS Office                 Failure leads to Problem
Deductions:
Activity 5: Cause & Effect – Process and Chronological Relationships
The following report is produced for all Incidents and Changes over the last 2 months for all
Configuration Items identified above:

Record        Occurrences   Last Time      Component     Description
Incident      2             1 day ago      CRM Desktop   CRM Desktop application failure to save
Incident      3             2 weeks ago    Desktop PC    Desktop network cable disconnected, desk-side reconnect
Maintenance   2             2 weeks ago    Server PC     Server shut down and restart – standard Sunday maintenance window
Incident      5             3 weeks ago    Desktop PC    Desktop PC performance degradation, close applications or reboot required
Incident      3             3 weeks ago    CRM Desktop   CRM Desktop application performance degradation, close applications and reboot required
Incident      1             1 month ago    Server PC     Server PC disk space alarm, historic data archived
Change        1             1 month ago    CRM Server    Application Release Bug Fix Updates
Maintenance   4             1 month ago    Server PC     Server disk clean up and tuning – standard Sunday maintenance window
Change        1             2 months ago   CRM Desktop   Update Desktop CRM Application Drivers
Change        1             2 months ago   CRM Desktop   Application Release functionality Update to Client Billing module
Deductions:
Research Information:
Contact the CRM Application Development and Support Group
The Application Support group indicates the bug fix was released in response to an
undersized data field limit in the new Client Billing module. Since the bug fix
there have been no further related Incidents.
The Application Support group believes it must be a User data entry error as the
Application has been fully tested.
The Application Support group further explains that when data is updated by a User, it is
held in memory on the User’s PC. When the User saves this data, each record is written to
the Application database on the Server PC. There are no errors recorded in the
database error log.
Ishikawa Diagram (Cause and Effect Diagram)
Kaoru Ishikawa was a pioneer of quality management processes in the 1960s. His Ishikawa
diagram, or “fishbone diagram”, is a founding tool for modern management and is
considered one of the seven basic tools of quality control.
The tool forces a problem solver to think creatively across several different categories, the
most common of which are shown. Additional value is gained when “causal factors” are
considered in relation to other categories of the fishbone diagram.
Causes and Effects
The tool, or diagram, is divided into two problem components:
Effect
The Problem effect can be thought of as part of Problem Finding. The symptoms need to
be identified and the problem properly defined.
Causes
Problem Causes can be thought of as part of Problem Shaping. The detail work of the
Ishikawa diagram is performed here, where all possible causes are identified and broken
down to sub-causes.
Problem Causes and Categories
The numbers and types of main “causal bones” that are included in the Ishikawa diagram
can be quite varied. These can be considered as causal categories that can then be broken
down further. Some of the more common categories are:
The Basic Ishikawa Factors
People (individuals, teams, skills and experience)
Process (process and procedure)
Technology (hardware and software)
Materials (raw materials, consumables, and/or information)
Environment (social culture and physical building)
Management (decision makers)
The 4 Ss (used in the service industry)
Surroundings
Suppliers
Systems
Skills
Considerations when using the Ishikawa Technique
The following considerations are useful when using the Ishikawa diagram technique:
The cause-and-effect diagram reveals key relationships between variables and
possible causes that are useful in a complex systems environment.
This method is both a bottom-up and top-down approach that can benefit from
integration with other problem solving methods, such as deriving all possible causes
through brainstorming and then investigating using the 5 Whys technique.
A general rule for this diagram is to break down each possible cause to a
second level of granularity (Primary Cause and Secondary Cause).
A key to successfully using Ishikawa diagrams is to rate each condition on the
fishbone as Necessary or Sufficient. This will help guide the root cause investigation
and the assessment of causal combinations that may have occurred.
o A necessary condition is one where the event can never occur without it
o A sufficient condition is one where the event must occur with it
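The considerations above can be sketched as a small data structure: each primary cause sits under a category bone, carries an optional Necessary/Sufficient rating, and lists its secondary causes. The `Cause` class, the helper names, and the placeholder causes are assumptions made for illustration only. The category list follows the basic Ishikawa factors named earlier in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CATEGORIES = ["People", "Process", "Technology", "Materials", "Environment", "Management"]


@dataclass
class Cause:
    category: str
    description: str
    rating: Optional[str] = None  # "Necessary", "Sufficient", or None if not yet rated
    secondary: List[str] = field(default_factory=list)


def needs_breakdown(causes: List[Cause]) -> List[str]:
    """Primary causes not yet broken down to a secondary level."""
    return [c.description for c in causes if not c.secondary]


def unrated(causes: List[Cause]) -> List[str]:
    """Primary causes that have not been rated Necessary or Sufficient."""
    return [c.description for c in causes if c.rating is None]


def render(effect: str, causes: List[Cause]) -> str:
    """Plain-text outline of the fishbone, one category bone at a time."""
    lines = [f"Effect: {effect}"]
    for category in CATEGORIES:
        lines.append(category)
        for c in causes:
            if c.category == category:
                lines.append(f"  - {c.description} [{c.rating or 'unrated'}]")
                for s in c.secondary:
                    lines.append(f"      - {s}")
    return "\n".join(lines)


# Placeholder causes only; real entries come from brainstorming and research.
causes = [
    Cause("People", "Placeholder primary cause 1", "Necessary", ["Placeholder secondary cause 1a"]),
    Cause("Process", "Placeholder primary cause 2"),
]
outline = render("Placeholder problem statement", causes)
```

Listing the causes that still need a breakdown or a rating gives a simple checklist for the next round of investigation.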
Activity 6: Ishikawa Diagram - Creative Leads
[Blank fishbone diagram for completion, with category bones: People, Management, Process, Environment, Equipment, Materials]
Research Information:
Contact the Marketing Manager and Sales Reps
The Marketing Manager insists the CRM application is not used for any new business
activity; nor is it used by any other department, as only the Marketing Manager can
approve new users. As the software is developed internally, there are no user limits.
Also, no environmental factors have changed (e.g. no office moves).
The Marketing Manager further states that the CRM Application is better managed and
used since starting a new quality review initiative where the Manager reviews and
updates poorly documented or incomplete Client records. It’s critical to the Marketing
Manager that these records are accurate as they drive the weekly Sales reports to upper
management. This quality effort has been in place now for more than a month.
Sales Reps insist that they are following Marketing procedures when updating
records. There are no shortcuts taken. Only the Sales Reps have access to these
records, and Sales Reps do not have the admin rights to share and update other Sales
Reps’ client records.
Sales Reps do not think updates happen at the same time. Client calls are too random.
However, there are often multiple records left open for periods of time when multiple
Client calls are taken in succession. This practice is the norm, and Sales Reps will
complete the Client updates when call volumes drop and time permits.
Section 6: Getting to the True Root Cause
Hypothesis Testing and Validation
A hypothesis is a proposed Cause for the Problem that is then tested. There are many ways
to test and validate, but these tend to fall into two categories: a controlled experiment or an
operational observation. Within the context of IT Service Management (ITSM), a range of
testing options should be proposed and each option assessed with a Risk Assessment to
manage both time pressures and the risk of worsening the Problem. Affected
stakeholders should always be considered and involved when making Testing choices.
The following table is a simple and structured way to assess testing options:
Testing Option   Components Involved   Stakeholders   Experiment or Observation   Risk Assessment
Change Analysis (Comparative Analysis)
Change Analysis is a form of testing and validation through comparative analysis. This
technique is based on comparing all factors contributing to the situation where a problem
does not exist, to the situation where the problem does exist.
This technique is a top-down approach requiring full knowledge of the correct functional
parameters (or design constraints) of a system, as well as the skills and experience of the
problem solver.
This technique may involve re-enactment and observation, where a technician changes one
factor at a time in an attempt to re-create the Problem. The Root Cause of the Problem may
be found through a process of elimination, by validating key operating parameters.
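The comparison at the heart of Change Analysis can be sketched as a simple difference between two sets of recorded factors. The function name `compare_states` and the factor names and values are hypothetical, chosen only to show the shape of the comparison.

```python
def compare_states(working, failing):
    """Return the factors that differ between a working and a failing situation.

    Both arguments are dicts of factor name -> observed value.
    The result maps each differing factor to a (working, failing) pair.
    """
    differences = {}
    for factor in sorted(set(working) | set(failing)):
        before = working.get(factor, "<absent>")
        after = failing.get(factor, "<absent>")
        if before != after:
            differences[factor] = (before, after)
    return differences


# Hypothetical factor lists for the two situations.
working_state = {"app_version": "1.0", "driver_version": "A", "user_count": 10}
failing_state = {"app_version": "1.1", "driver_version": "A", "user_count": 10,
                 "new_module": "enabled"}

candidates = compare_states(working_state, failing_state)
```

Each differing factor is a candidate to change one at a time during re-enactment. Factors that are identical in both situations can be eliminated.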
Activity 7: Hypothesis Testing and Validation
You have decided that the three previous deductions are the most likely, and that the
only way to confirm your hypothesis is to test these user scenarios. Complete the
following table using previous case study material and propose your testing scenario of
choice (assume there is no separate testing environment):
Testing Option                 Components   Stakeholders    Experiment /   Risk
                               Involved                     Observation    Assessment

Using Sunday maintenance       CRM data     Marketing Mgr
window, apply test cases to    CRM App      Sales Reps
operational data. Revert data  CRM Server   App group
to original values.                         Server Group
                                            Problem Mgr

Using Sunday maintenance       CRM data     Marketing Mgr
window, create test records    CRM App      Sales Reps
and apply test cases to test   CRM Server   App group
data. Remove test records.                  Server Group
                                            Problem Mgr

Engage the Marketing           CRM data     Marketing Mgr
Manager and staff to           CRM App      Sales Reps
participate in a limited                    Problem Mgr
update test of a single client
update being shared.

Involve participation and      CRM data     Marketing Mgr
cooperation from the           CRM App      Sales Reps
Marketing staff. On next                    Problem Mgr
occurrence, list all open
client records that failed to
save and check with staff
and management to
determine if the record was
being shared.
What is/are your recommended choices?
Testing Results:
Testing has been arranged within the production CRM Application over the maintenance
weekend, using 5 newly created test clients and 2 Desktop PCs. The following is observed:
When a single client record is updated but not saved, and then the same record is
opened and updated on a second PC, the second PC fails to save the record and
appears to be frozen. The resolution is to refresh the Client record or close the CRM
Application.
o This same test for multiple records opened and just one record updated will
also fail to save the block of records.
o This appears to have duplicated the Problem and identified the Cause.
However, it is prudent to test the two other possible contributing scenarios for
their effect.
On creating batch record updates and saving, there were no application errors. This
same scenario was repeated for opening the records in multiple desktop PCs, but only
making updates and saving on one specific PC. No save errors resulted.
o This same test was duplicated with large PDF file attachments to records.
Again, no file save errors resulted.
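The observed behaviour is consistent with a record-level lock held by the first PC to make an unsaved update. The sketch below is a hypothetical model of that behaviour, not the CRM Application's actual implementation: the `RecordStore` class, its methods, and the record names are all assumptions. It reproduces the three observations: a save blocked by another PC's pending update, a whole block of records failing when one record is blocked, and no failure when only one PC makes updates.

```python
class RecordStore:
    """Toy model: the first PC with a pending update on a record holds its lock."""

    def __init__(self):
        self.saved = {}  # record_id -> last saved value
        self.locks = {}  # record_id -> PC holding an unsaved update

    def update(self, pc, record_id):
        """Register an unsaved update; only the first updater takes the lock."""
        self.locks.setdefault(record_id, pc)

    def save(self, pc, changes):
        """Save a block of records; fail the whole block if any is locked by another PC."""
        blocked = [r for r in changes if self.locks.get(r, pc) != pc]
        if blocked:
            return False
        for record_id, value in changes.items():
            self.saved[record_id] = value
            self.locks.pop(record_id, None)  # saving releases the lock
        return True


store = RecordStore()

# PC1 updates a record but does not save; PC2 then updates the same record.
store.update("PC1", "client-1")
store.update("PC2", "client-1")
single_save = store.save("PC2", {"client-1": "new value"})

# A block containing the locked record fails as a whole.
block_save = store.save("PC2", {"client-1": "new value", "client-2": "other value"})

# Only one PC updates and saves: no conflict.
store.update("PC1", "client-3")
solo_save = store.save("PC1", {"client-3": "value"})
```

In this model, saving or discarding the first PC's pending update clears the lock, which mirrors the observed resolution of refreshing the Client record or closing the Application.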
The 5 Whys
The 5 Whys is a method for persevering in Problem Shaping to find the true root cause,
rather than stopping at superficial symptoms and assumptions. This basic cause and effect
method is simple in concept: to investigate the possible cause of a problem, ask the question
“why did this happen” five times in succession.
This technique was originally used within Toyota Motor Corporation and is a critical
component of problem solving now also used within Kaizen, lean manufacturing, and Six
Sigma.
The 5 Whys is a question-asking method that pushes the problem solver to dig deeper. By
no means is the method limited to 5 degrees of detail, but it has been generally accepted that
five iterations of asking why are sufficient to get to a root cause.
Use the 5 Whys technique for simple problems or in conjunction with other problem
solving techniques. This technique depends on the technician’s knowledge and experience
to ask the right “why” questions. As a result, some causes may be overlooked in a
complex situation, and technicians with different experience may arrive at
different cause theories or explanations.
Success with this method can be increased when combined with the following factors:
Verify each “why” question before proceeding to the next to avoid straying off-track.
Focus on making the final “why” a process question. This is based on the assumption that
most problems occur due in some way to a lack of adherence to, a failure within, or
the absence of a process.
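The advice to verify each “why” before moving on can be sketched as a chain that is only trusted up to its first unverified answer. The `Why` class, the function name, and the placeholder questions and answers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Why:
    question: str
    answer: str
    verified: bool  # has this answer been confirmed by evidence?


def deepest_verified_cause(chain: List[Why]) -> Optional[Why]:
    """Walk the chain in order and stop at the first unverified answer."""
    deepest = None
    for step in chain:
        if not step.verified:
            break  # anything beyond this point is still a theory
        deepest = step
    return deepest


# Placeholder chain; real questions and answers come from the investigation.
chain = [
    Why("Why did the failure occur?", "Placeholder answer 1", True),
    Why("Why did answer 1 happen?", "Placeholder answer 2", True),
    Why("Why did answer 2 happen?", "Placeholder answer 3", False),
]
confirmed = deepest_verified_cause(chain)
```

Here the third answer remains a theory, so the investigation resumes by verifying it rather than by asking a fourth “why”.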
Activity 8: But are we Done? The 5 Whys
Using the 5 Whys Technique, determine if there is a further root cause:
1. Why did the save error occur?
Due to a record concurrency lock
2. Why did the concurrency issue get into production?
3. Why?
4. Why?
5. Why?
Root Cause
Conclusions: The true Root Cause
Section 6: Problem Options and Solutions
The final step of the Problem Solving process is to determine an effective range of solution
options to address the root cause(s) of the Problem. Each option should be assessed from a
business justified perspective, should consider an assessment of risk, and should be
implemented according to an appropriate project plan (based on the complexity and scope of
the solution).
This step is made more effective when the Problem Solver also considers effective
workarounds, or temporary techniques, to deal with the Problem should it recur.
The following table is a simplified approach to listing and assessing solution options:
Solution Option   Business Justification   Risk Assessment   Priority   Approved
Activity 9: Problem Options and Solutions
The following table outlines the Problem Options and Solutions. Rank the priority of the
options presented, their business justification, and risk assessment.
Solution Options                  Priority   Business Justified Cost   Risk Assessment   Approved
                                  (1-4)      (High/Med/Low)            (High/Med/Low)

Standards for considering and
implementing Application
concurrency

Procedure ensuring Developed
/ Purchased Applications meet
concurrency standards

Testing of all current
Applications for meeting
concurrency standards

CRM Application re-programmed
according to new
concurrency standards

Temporary Workaround A:
Communicate the Problem
Cause to Marketing and ensure
care is taken NOT to open
shared records. In addition,
encourage Marketing Staff to
save frequently.

Temporary Workaround B:
The CRM Application is
re-programmed with a warning
message to the User should a
shared record be opened.
What is/are your recommended choices?