Incident Management Process Executive Summary · 2017-09-01
TRANSCRIPT
Process Team Members
• Jon Russell
• Larry Dillard
• Marvin Kirkendoll
• Peggi Polen
• Nate Wagenaar
• Vacilis Kollias
• Lorinda Wisneski
• Vesna Siracevska
• John Crook, Navvia
Met for approximately 25 hours over the last two weeks
Goals
• Align the incident management process within University IT to common ITSM practice
• Prepare for Service Now Technical Design
• Define the goals, roles, objectives, and steps of the Incident Management process
• Identify areas where Stanford deviates from best practice, and limit those deviations where possible
Output
• IT Incident Management Process Document
• Incident Management Executive Summary
• Process Diagram
• Common understanding of Incident Management Process
https://asconfluence.stanford.edu/confluence/display/SM/Incident+General+Definitions
Incident Management Definition
This is the process that deals with all Incidents. Incidents can include failures or degradation of your services reported by users of those services, by your own technical staff, or automatically by monitoring tools. The primary concern of the process is the ability to respond to an Incident and restore the level of service as quickly as possible, or to what was agreed with customers, or at least to alleviate the impact on them.
The scope of Incident Management covers Production Services supported by Stanford University IT and other University support entities.
Goal of Incident Management
Incident Management exists to get the operation of a service back to 'normal' as quickly as possible in order to minimize any adverse effects on the supported academic, business and research processes. This requires continuously monitoring the incident mitigation process and collecting heuristic information in order to improve the time to resolution, communicate effectively, and eliminate incident recurrence.
Incident Management Process – Detect and Record
Process: Incident Management
Activity: 1.0 Detect & Record
Swimlanes: Service Desk Agent, User, Incident Support, Data, Inter/Intra Process, Annotation
Steps
• INC 1.2 Generate ticket ('N' level support generated; output: Incident ticket; continues to INC 2.1 Prioritize & Identify SLA)
• INC 1.3 Open New Incident (sources: New Caller Process, Email system, Self Service Portal, Event Management; output: Open New ticket)
• INC 1.4 Verify User's Information (output: Verified contact information)
• INC 1.5 Capture Incident Details / Categorize
• INC 1.6 Provide Unique Number (outputs: Updated Incident ticket, Draft Service Request)
• Incident? Yes: INC 2.1 Prioritize & Identify SLA. No: Service Request Fulfillment Management
Annotations
• INC 1.2: Incident Support generated tickets will have all details entered at generation time.
• INC 1.3: New Incidents may be opened in a number of ways, e.g., through Event Management, Self-Service portal, email, etc.
• INC 1.5: ITIL® says Categorization occurs in "Initial Support". As the information is available now and categorization may drive Incident Model use, we do it here.
• INC 1.6: At this point we know whether the ticket is an Incident or a Service Request and have either updated the Incident ticket or created a draft Service Request.
Generated on: 15Jul2015 @ 15:24
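The branch at the end of Detect & Record can be illustrated with a short sketch. This is only an illustration under stated assumptions: the `Ticket` record, the `route_ticket` function, the channel names, and the `INC`/`REQ` number prefixes are hypothetical and do not come from the process document or from Service Now.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count


class TicketType(Enum):
    INCIDENT = "incident"
    SERVICE_REQUEST = "service_request"


# Hypothetical channel names, based on the examples in the diagram annotation.
CHANNELS = {"event_management", "self_service_portal", "email"}

# Stand-in for whatever the ticketing tool uses to issue unique numbers (INC 1.6).
_numbers = count(1)


@dataclass
class Ticket:
    channel: str
    caller_verified: bool
    category: str
    ticket_type: TicketType
    number: str = ""


def route_ticket(ticket: Ticket) -> str:
    """Return the next step for a ticket leaving Detect & Record."""
    if ticket.channel not in CHANNELS:
        raise ValueError(f"unknown channel: {ticket.channel}")
    if not ticket.caller_verified:
        # Contact information must be verified before details are captured.
        return "INC 1.4 Verify User's Information"
    if not ticket.number:
        # Assumed numbering scheme, for illustration only.
        prefix = "INC" if ticket.ticket_type is TicketType.INCIDENT else "REQ"
        ticket.number = f"{prefix}{next(_numbers):07d}"
    if ticket.ticket_type is TicketType.INCIDENT:
        return "INC 2.1 Prioritize & Identify SLA"
    return "Service Request Fulfillment Management"
```

An incident opened by email would be routed to INC 2.1, while a request opened through the portal would leave the Incident Management process for fulfillment.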
Incident Management Process – Initial Support
Process: Incident Management
Activity: 2.0 Initial Support
Swimlanes: Predecessors, Service Desk Agent, Service Desk Manager, Data, Inter/Intra Process, Annotation
Predecessors
• INC 1.2 Generate ticket ('N' level support generated)
• INC 1.7 Incident? (Yes)
• INC 3.7 Match Found? (Yes), entering at INC 2.11
Steps
• INC 2.1 Prioritize & Identify SLA (data: CMDB, SLA target)
• Known Alert?
• INC 2.3 Attempt SOP Use (data: SOP, Updated SOP, Knowledge database)
• Resolved? Yes: INC 5.1 Confirm User Acceptance
• Major Incident?
• INC 2.6 Notify Stakeholders / Declare Major Incident (Notification; Major Incident Procedure; Problem Management; once resolved: INC 5.4 Resolution / Recovery Details)
• INC 2.7 Perform Incident Matching (data: Incident Database, Problem Database, Knowledge Database; outcomes: No Match, Incident Match, Problem Match)
• INC 2.9 Escalate (continues to INC 3.1 Accept assignment)
• INC 2.10 Handle Duplicate Incident (Duplicate; output: Updated Incident ticket)
• INC 2.11 Link Incident to Problem (Link; Problem Management)
• Workaround? Yes: INC 4.1 CR Required?
• INC 2.13 Wait for Problem Resolution
Annotations
• INC 2.6: A major incident requires special handling and will have a predefined procedure that must be followed. Once the procedure is completed the process advances to the Close activity.
• INC 2.7: The preferred search order should be: 1) Knowledge Database 2) Problem Database 3) Incident Database
• INC 2.9: There is nothing more Service Desk can do to resolve the incident and the ticket will be escalated to Incident Support based on the Categorization for the affected service.
• INC 2.13: Although not desirable, if the incident matches an existing problem with no workaround already defined at this point then the only option is to wait for a permanent resolution.
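The preferred search order for Incident Matching (Knowledge Database, then Problem Database, then Incident Database) can be sketched as a lookup that stops at the first hit. This is a minimal sketch, not the process team's implementation: `match_incident` and the `Searcher` callables are hypothetical, and real databases would be queried through whatever interface the ticketing tool provides.

```python
from typing import Callable, Optional, Tuple

# A searcher takes a query string and returns a matching record ID, or None.
Searcher = Callable[[str], Optional[str]]


def match_incident(
    query: str,
    knowledge: Searcher,
    problems: Searcher,
    incidents: Searcher,
) -> Optional[Tuple[str, str]]:
    """Search in the preferred order; return (source, record_id) for the first hit."""
    ordered = [
        ("knowledge", knowledge),  # 1) Knowledge Database
        ("problem", problems),     # 2) Problem Database
        ("incident", incidents),   # 3) Incident Database
    ]
    for source, search in ordered:
        hit = search(query)
        if hit is not None:
            return source, hit
    return None  # corresponds to the "No Match" outcome
```

Returning the source alongside the record makes it possible to tell the diagram's outcomes apart: a problem hit, an incident hit, or no match at all.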
Incident Management Process – Investigate & Diagnose
Process: Incident Management
Activity: 3.0 Investigate & Diagnose
Swimlanes: Predecessors, Incident Coordinator, Incident Support, Data, Inter/Intra Process, Annotation
Predecessors
• INC 2.9 Escalate
• INC 5.2 User Confirmation? (No), entering at INC 3.3
Steps
• INC 3.1 Accept assignment
• INC 3.2 Acknowledge Assignment
• INC 3.3 Acquire additional information if required
• INC 3.4 Re-evaluate category/priority
• Categorization change?
• INC 3.6 Additional searches (data: Knowledge Database, Problem Database)
• Match Found? Yes: INC 2.11 Link Incident to Problem
• INC 3.8 Create Problem (output: Problem ticket, passed to Problem Management)
• Workaround possible?
• INC 3.10 Develop Workaround (output: Workaround; continues to INC 4.1 CR Required?)
• INC 3.11 Wait for Problem Resolution (No Workaround; Problem Management)
Annotations
• INC 3.3: Access additional sources of information and additional information from the impacted user.
• INC 3.6: Incident Support will have a greater depth of knowledge to draw on than Service Desk and may be able to formulate better search criteria.
• INC 3.10: The developed workaround will be recorded in both the Incident ticket and the related Problem ticket. Problem Management may optimize the workaround for future incident resolutions.
• INC 3.11: It is not feasible to develop a workaround, work on the Incident is halted, and we must wait for a permanent resolution to be identified and implemented by Problem Management.
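The workaround branch of Investigate & Diagnose, where a developed workaround is recorded in both the Incident ticket and the related Problem ticket, can be sketched as follows. The `ProblemTicket` and `IncidentTicket` records, the status string, and the `apply_workaround_decision` function are hypothetical names chosen for illustration; they are not defined by the process document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProblemTicket:
    problem_id: str
    workarounds: List[str] = field(default_factory=list)


@dataclass
class IncidentTicket:
    incident_id: str
    problem: Optional[ProblemTicket] = None
    workarounds: List[str] = field(default_factory=list)
    status: str = "in_progress"


def apply_workaround_decision(incident: IncidentTicket, workaround: Optional[str]) -> str:
    """Record the outcome of 'Workaround possible?' and return the next step."""
    if incident.problem is None:
        # The diagram links or creates a Problem before this decision.
        raise ValueError("link or create a Problem ticket first (INC 2.11 / INC 3.8)")
    if workaround is None:
        # No feasible workaround: work on the incident is halted.
        incident.status = "waiting_for_problem_resolution"
        return "INC 3.11 Wait for Problem Resolution"
    # Record the workaround in both tickets, as the annotation requires.
    incident.workarounds.append(workaround)
    incident.problem.workarounds.append(workaround)
    return "INC 4.1 CR Required?"
```

Writing to both tickets in one function keeps the two records from drifting apart, which matters because Problem Management may later refine the workaround for reuse.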
Incident Management Process – Resolve & Recover
Process: Incident Management
Activity: 4.0 Resolve & Recover
Swimlanes: Predecessors, Incident Support, Data, Inter/Intra Process, Annotation
Predecessors
• INC 2.12 Workaround? (Yes)
• INC 3.10 Develop Workaround
Steps
• CR Required? Yes: INC 4.2 Create CR. No: INC 4.3 Implement Resolution / Workaround
• INC 4.2 Create CR (output: CR, passed to Change Management)
• INC 4.3 Implement Resolution / Workaround (Change Management: Successful Change)
• INC 4.4 Recover the environment (continues to INC 5.1 Confirm User Acceptance)
Annotations
• CR Required?: Does implementing the workaround require changing something that is under Change Management control? If so a CR is required to trigger actions from Change Management.
• INC 4.2: Change Management will facilitate the implementation of the workaround.
• INC 4.4: Workaround has been implemented, either by Incident Support or by a successful change implementation. The production environment may need additional resetting to be operational.
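The "CR Required?" decision asks whether the workaround touches anything under Change Management control. A minimal sketch of that check, assuming both sides can be expressed as sets of configuration item names (the function name and the item names in the usage are hypothetical):

```python
from typing import Iterable, List


def resolve_and_recover_steps(
    touched_items: Iterable[str],
    change_controlled: Iterable[str],
) -> List[str]:
    """List the Resolve & Recover steps for a workaround.

    A CR is needed when the workaround changes at least one item
    that is under Change Management control.
    """
    needs_cr = bool(set(touched_items) & set(change_controlled))
    steps: List[str] = []
    if needs_cr:
        steps.append("INC 4.2 Create CR")
    steps.append("INC 4.3 Implement Resolution / Workaround")
    steps.append("INC 4.4 Recover the environment")
    steps.append("INC 5.1 Confirm User Acceptance")
    return steps
```

For example, a workaround that only clears a user's local cache would skip INC 4.2, while one that edits a production firewall rule would start with a CR.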
Incident Management Process – Close Incident
Process: Incident Management
Activity: 5.0 Close Incident
Swimlanes: Predecessors, Incident Support, Service Desk Agent, Data, Inter/Intra Process, Annotation
Predecessors
• INC 2.4 Resolved? (Yes)
• INC 4.4 Recover the environment
• INC 2.6 Notify Stakeholders / Declare Major Incident (Resolved), entering at INC 5.4
• Problem Management
Steps
• INC 5.1 Confirm User Acceptance
• User Confirmation? Yes: continue. No: INC 3.3 Acquire additional information if required
• INC 5.3 Capture User Feedback (output: Feedback)
• INC 5.4 Resolution / Recovery Details (output: Updated Incident ticket; input from Change Management)
• INC 5.5 Close Incident (output: Closed Incident Ticket)
Annotations
• INC 5.1: The affected user should confirm that the workaround or permanent resolution (Problem Management) gives them what they need to continue working.
• INC 5.3: This is not a Customer Satisfaction survey. Simply capture any comments the user has about how this incident was handled.
• INC 5.4: If Change Management verified the incident resolution as part of its implementation to resolve the incident, the details need to be recorded in the Incident ticket before closure.
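The closure checks in this activity can be sketched as a small gate function. This is an illustration only: `ClosureState`, its field names, and `next_close_step` are hypothetical, and the rule that missing Change Management details send the ticket back to INC 5.4 is one reading of the diagram annotation.

```python
from dataclasses import dataclass


@dataclass
class ClosureState:
    user_confirmed: bool
    feedback: str = ""  # optional free-text comments (INC 5.3), not a survey
    verified_by_change_management: bool = False
    resolution_details: str = ""


def next_close_step(state: ClosureState) -> str:
    """Return the next step in Close Incident for the given ticket state."""
    if not state.user_confirmed:
        # The user did not accept the resolution: return to investigation.
        return "INC 3.3 Acquire additional information if required"
    if state.verified_by_change_management and not state.resolution_details:
        # Details must be recorded in the Incident ticket before closure.
        return "INC 5.4 Resolution / Recovery Details"
    return "INC 5.5 Close Incident"
```

Feedback is deliberately not a gate in this sketch, since the annotation describes it as a simple capture of comments.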
Incident Management Roles
• Incident Process Owner
• Incident Process Manager
• Incident Coordinator
• Incident Support
• Service Desk Agent
• Service Desk Manager
• User
• Problem Process Manager
• Major Incident Owner (MIO)
Benefits
• Standardization of incident management process within University IT
• Alignment with common ITSM practices
• Role and procedural definitions
• Clarifying process interaction and interdependence
Major Differences to Current Process(es)
• Role of Call feature in Service Now
• Separation between Request and Incident
• Renewed focus on problem management
• Any issue without known root cause goes to problem management
• Role of Process Manager
• Major Incident Process
• Major Incident Owner
• Standardized DOC process
• Categorizations shared between all processes
Departures from ITSM Common Practice
• Created two tiers for Major Incidents (DOC and Non-DOC)
• This separation does not exist in ITSM
• Incident Management Team (DOC) is called for all P-1 Incidents
• Service Desk does not serve as incident owner
• Service Desk owns communication with end-users in most mature ITSM shops
• Current staffing levels, current practice, and organizational complexities make this impossible
• Incident ownership is moved up one level to the Incident Support and Major Incident Owner roles