Change Management Procedures
Document No: ITS-NOC 300
Title: Change Management Procedures
Month Year: May 2011
Doc. Type: MS Word
File name: ITS-NOC 300: Change Management Procedures V4.1_.doc
[Flowchart: The Change Process. Recoverable labels: Start; Pre-RFC Checklist; Classify; Urgency High?; Get Customer Approval; Alert Change Manager to change request; Schedule with Customers; Send to CAB; CAB OK?; Perform Change; Test change; User OK?; Analyze/resubmit.]
[Flowchart: The Change Process Part 2. Recoverable labels: Change Successful?; Back Out Successful?; Complete Documentation updates; Schedule Training (User, Staff); Stop.]
Table of Contents
Summary
Goals of Change Management
Change Classifications
  1) Urgency
  ChangeLog
  2) Impact
  3) Risk
  Example of Change Types
Forward Calendar of Changes
Maintenance Window
  By Service
  Guidelines for Selecting Upgrade Windows
Change Approval Matrix
  Approvers Table (low and medium risk changes only)
Activities and Deliverables
  Approval Procedures
The Request for Change (RFC)
  How to
Lead Time for Approval
RFC Submission Fields
Customer Groups Communication/Service Bulletins
Forward Schedule of Changes (FSC)
Emergency Change Process
Closing the RFC
Change Assessment (Post-Implementation Review)
Glossary
Appendix 1 CAB Contact List
Appendix 2 Pre-RFC Initiation Process
Appendix 3 Critical Incident Reviews (aka “Post-Mortem”) Guidelines
Appendix 4 Examples of changes
Appendix 5 Examples: Communications Strategy
Summary
Change Management applies to all changes to Applications, Databases, Networks, Infrastructure and Documentation.
All changes are recorded and classified by their service impact, risk, and urgency.
Who can initiate changes:
Each change is sponsored by a Manager who is responsible for the planning and implementation of the change, and any recovery.
Certain changes are reviewed and approved by a Change Advisory Board (CAB). Those changes include all High Risk changes and changes that meet the criteria illustrated in the Change Approval Matrix. The CAB meets weekly with the sponsoring Manager(s) and approves changes for the following week, and beyond.
Highly Urgent (Emergency) changes require separate approval.
Unsuccessful changes that are successfully backed out within the change window are followed up with an Analysis of Failed Change by the sponsoring managers.
Unsuccessful changes that are not successfully backed out within the change window are followed up with a Critical Incident Review (CIR, aka Post Mortem) by the Director, IT Operations or designate.
Frequency of CIR Review
Reviews are conducted by the Operations Committee at the bi-weekly Tuesday morning meetings at 09:00 in Rm443. The meeting is chaired by the Director of Operations or his/her designate.
Goals of Change Management
The goals of the Change Management Procedures are to:
Minimize disruption caused by changes to the production environment through effective risk management.
Assign ownership and responsibility for changes to the relevant manager.
Optimize the change process through active management of reliable data on changes.
Manage all changes to the production environment including hardware, software, environmental equipment, networks, and procedures.
Keep users informed of changes to the production environment and manage their expectations. (This is also an opportunity to promote and strengthen client communications.)
Keep UBC IT staff aware of changes to the production environment and increase the number of UBC IT staff reviewing changes.
Develop a more pro-active approach to changes through encouraging better planning of changes, back outs and contingencies.
Measure success and success trends over time.
Key Success Factors
A CM process used by staff who see it as an enabler
Control over urgent changes
Correct scoping and staffing of CM
Knowledge of changes required for good decisions
A CM process aligned with the project lifecycle
Visible management commitment and support to enforce the process.
Change Classifications
Changes are classified according to:
1. Urgency
2. Impact
3. Risk
4. Nature of Application
Changes are classified by the change initiator and checked by the CAB.
1) Urgency
Describes how quickly the change must be implemented.
High (Emergency Changes). Two types of change fall under this category:
- Break/fix: a change required immediately because a system or service has failed and is not available. It cannot wait for the normal change approval process and/or the next scheduled maintenance period.
- Urgent: immediate action is required to avert a system or service outage.
High Urgency changes are considered emergencies and are subject to a separate Emergency Change process. Approval is granted as noted in the Emergency Change Process section.
In Break/fix situations where the Sponsor deems a change is required, an Emergency Change can be made without prior approval, but the Sponsor should submit an RFC as soon as possible after the change.
Medium: A normal change which will be processed by the CAB and installed in the next maintenance window.
Standard (aka Change Log): A standard change is one that is effectively pre-approved and can readily reference pre-defined workflows. These can be entered as a Change Log.
A standard change is a change to the infrastructure that has the following characteristics:
- recurrent, well known and proven
- pre-defined, relatively risk free
- is the accepted solution to a specific requirement or set of requirements
- approval is given in advance
- does not require CAB approval but is subject to Post Implementation Review (PIR) by the CAB
Low: A change that can be deferred and/or grouped with other changes in a subsequent release.
ChangeLog
A “change log” is defined as a standard change and does not require CAB approval.
These standard changes are well understood, recurring, low risk, redundant, accepted responses to a specific requirement or set of circumstances, the approval for which has been delegated to the group making the change.
Examples:
- Installation of SSL certificates / patches
- Scheduled workload to new TWS server
- Cycling of LDAP instances
- Ongoing alumni production db maintenance
- Symposium call pilot integration
2) Impact
The change impact is determined by the number of users who could be affected by a reduction in their normal service level. This takes into account the worst case where a change does not succeed.
High: Change affects the campus.
Medium: Change affects roughly a department or building.
Standard Change: There is no anticipated user impact, for example, a change to a redundant system. A standard change is one that is effectively pre-approved and can be entered as a Change Log. See the examples under ChangeLog.
Low: Change affects a single user or small number of users.
3) Risk
Describes the risk* that the change will not be made successfully.
High: There is a higher than normal risk that the change will have to be backed out, or that a system failure will occur after the change is installed.
Medium: A change where success is the expected outcome, but the change is not routine, and could require a back-out.
Low: A routine, proceduralized change, with little chance of failure.
*Risks are managed through back-out and contingency planning, and all changes with high risk require approval of the CAB. The Change Owner of a high risk change is expected to demonstrate back-out and contingency plans consistent with the level of risk.
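The urgency, impact and risk classifications described above can be captured in a small data structure. The following is a minimal Python sketch only, not part of the procedure or of any UBC IT tool. The names Urgency, Impact, Risk and Classification are hypothetical, and the only rule encoded is the one stated in the note above: every high risk change requires CAB approval.

```python
from dataclasses import dataclass
from enum import Enum


class Urgency(Enum):
    HIGH = "high"          # emergency change
    MEDIUM = "medium"      # normal change, next maintenance window
    STANDARD = "standard"  # pre-approved, entered as a Change Log
    LOW = "low"            # can be deferred or grouped with other changes


class Impact(Enum):
    HIGH = "high"      # campus
    MEDIUM = "medium"  # roughly a department or building
    LOW = "low"        # single user or small number of users
    NONE = "none"      # no anticipated user impact (standard change)


class Risk(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class Classification:
    """Illustrative container for the three classifications of a change."""
    urgency: Urgency
    impact: Impact
    risk: Risk

    def is_emergency(self) -> bool:
        # High urgency changes follow the separate Emergency Change process.
        return self.urgency is Urgency.HIGH

    def high_risk_needs_cab(self) -> bool:
        # All changes with high risk require approval of the CAB.
        return self.risk is Risk.HIGH
```

For example, Classification(Urgency.MEDIUM, Impact.LOW, Risk.HIGH) is not an emergency, but it still needs CAB approval because of its risk.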
Example of Change Types
Admin: FMIS, HR, Alumni, Student systems
Elearning: WebCT
Enterprise Middleware: CWL, LDAP, DNS servers
Enterprise Applications: Email, Mercury, uPortal, UBC main web page
Forward Calendar of Changes
Check the forward calendar of changes for conflicts with your proposed change. This calendar is found in Outlook.
See example below.
Maintenance Window
By Service
The maintenance window is determined from the following table. Note that the maintenance windows in the table are meant to cover backout if required, i.e. in a 6:00 am to 8:00 am window, the change would normally be done from 6:00 am to 7:00 am. If the change requires more than the allotted time, then the change should begin earlier, rather than impinge on the contingency time.
Please note that at this time (November 2009) this process is being revised and a draft document is being proposed.
Service | Upgrade Time | Contingency Time
WebCT (Effective July 1, 2005) | Saturday with an expanded time frame between 08:00 and 12:00 | Saturday 11:00 to 12:00
HRMS | Weekend 06:00 to 09:00 | Weekend 07:30 to 09:00
PeopleSoft | Saturday, 06:00 to 09:00 | Saturday, 06:00 to 09:00
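The rule that a long change should start earlier rather than eat into the contingency time can be worked through as simple date arithmetic. The sketch below is illustrative only. The function name plan_change_start and its parameters are hypothetical, and it assumes the contingency time is the final part of the maintenance window.

```python
from datetime import datetime, timedelta


def plan_change_start(window_start, window_end, contingency, change_duration):
    """Return (change_start, contingency_start) for a maintenance window.

    The contingency period is reserved at the end of the window. If the
    change needs more time than the window leaves before the contingency
    period, the change starts earlier than the window instead.
    """
    contingency_start = window_end - contingency
    change_start = min(window_start, contingency_start - change_duration)
    return change_start, contingency_start
```

With a 06:00 to 08:00 window and one hour of contingency, a 45 minute change starts at 06:00, while a 90 minute change would have to start at 05:30 so that it still finishes by 07:00.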
Guidelines for Selecting Upgrade Windows
If no maintenance window has been established for a service, use the following table to select a potential window. (The default time is 06:00 to 08:00 Monday to Friday when not stated.)
Change Type | Impact | Upgrade Time | Contingency Time
Admin applications | All | Outside business hours and with agreement of customer | Outside business hours and with agreement of customer
Cable system | All | Set by customer | Set by customer
Elearning | All | 06:00 to 08:00 Saturday | 07:00 to 08:00 Saturday
Enterprise middleware | Medium or Low | 06:00 to 07:00 weekdays | 07:00 to 08:00 weekdays
Enterprise middleware | High | 06:00 to 07:00 weekends or holidays | 07:00 to 08:00 weekends or holidays
Enterprise applications | Medium or Low | 06:00 to 07:00 weekdays | 07:00 to 08:00 weekdays
Enterprise applications | High | 06:00 to 07:00 weekends or holidays | 07:00 to 08:00 weekends or holidays
Network | Medium or Low | 06:00 to 07:00 weekdays | 07:00 to 08:00 weekdays
Network | High | 06:00 to 07:00 weekends or holidays | 07:00 to 08:00 weekends or holidays
Security (firewalls) | Medium or Low | 06:00 to 07:00 weekdays | 07:00 to 08:00 weekdays
Security (firewalls) | High | 06:00 to 07:00 weekends or holidays | 07:00 to 08:00 weekends or holidays
Voice | All | 05:00 to 07:00 | 07:00 to 08:00
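The table above amounts to a lookup keyed by change type and impact. The following Python sketch shows one way to express it. It is an illustration under stated assumptions, not an existing tool: select_window and the constant names are hypothetical, the window strings are copied from the table, and change types not listed in the table are rejected rather than guessed.

```python
# (upgrade time, contingency time) pairs copied from the guideline table.
WEEKDAY = ("06:00 to 07:00 weekdays", "07:00 to 08:00 weekdays")
WEEKEND = ("06:00 to 07:00 weekends or holidays",
           "07:00 to 08:00 weekends or holidays")

# Change types whose window does not depend on impact ("All" in the table).
FIXED = {
    "Admin applications": ("Outside business hours and with agreement of customer",
                           "Outside business hours and with agreement of customer"),
    "Cable system": ("Set by customer", "Set by customer"),
    "Elearning": ("06:00 to 08:00 Saturday", "07:00 to 08:00 Saturday"),
    "Voice": ("05:00 to 07:00", "07:00 to 08:00"),
}

# Change types where High impact moves the work to weekends or holidays.
IMPACT_BASED = {"Enterprise middleware", "Enterprise applications",
                "Network", "Security (firewalls)"}


def select_window(change_type, impact):
    """Return (upgrade time, contingency time) for a change type and impact."""
    if change_type in FIXED:
        return FIXED[change_type]
    if change_type in IMPACT_BASED:
        return WEEKEND if impact == "High" else WEEKDAY
    raise ValueError(f"no guideline window listed for {change_type!r}")
```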
Change Approval Matrix
Approvers Table (low and medium risk changes only)
Service Type | Impact: None (No Impact) | Impact: Low (Single Users) | Impact: Medium (Department) | Impact: High (Campus)
Admin applications | Group | Group | Group | CAB
Cable | Group | Group | Group | CAB
eLearning | Manager | CAB | CAB | CAB
Enterprise Operating Systems | CAB | CAB | CAB | CAB
Enterprise middleware | CAB | CAB | CAB | CAB
Enterprise applications | CAB | CAB | CAB | CAB
Network | Manager | Manager | Manager | CAB
Security (firewalls) | CAB | CAB | CAB | CAB
Voice | Manager | Manager | Manager | CAB
Infrastructure | CAB
Firewall Rule Changes | CAB
Note: all High Risk Changes require CAB approval.
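The Approvers Table and the note on high risk changes can be read as a lookup with one override. The sketch below is illustrative only, and the names MATRIX, IMPACTS and approver are hypothetical. Infrastructure and Firewall Rule Changes are left out because the table gives only a single CAB entry for them, not one per impact level.

```python
IMPACTS = ("None", "Low", "Medium", "High")

# Approver per service type, in the order of IMPACTS (from the Approvers Table).
MATRIX = {
    "Admin applications": ("Group", "Group", "Group", "CAB"),
    "Cable": ("Group", "Group", "Group", "CAB"),
    "eLearning": ("Manager", "CAB", "CAB", "CAB"),
    "Enterprise Operating Systems": ("CAB", "CAB", "CAB", "CAB"),
    "Enterprise middleware": ("CAB", "CAB", "CAB", "CAB"),
    "Enterprise applications": ("CAB", "CAB", "CAB", "CAB"),
    "Network": ("Manager", "Manager", "Manager", "CAB"),
    "Security (firewalls)": ("CAB", "CAB", "CAB", "CAB"),
    "Voice": ("Manager", "Manager", "Manager", "CAB"),
}


def approver(service_type, impact, risk):
    """Return who approves a change: "Group", "Manager" or "CAB"."""
    # All High Risk changes require CAB approval, whatever the matrix says.
    if risk == "High":
        return "CAB"
    return MATRIX[service_type][IMPACTS.index(impact)]
```

For example, a low risk Network change affecting a single user is approved at manager level, while the same change classified as high risk goes to the CAB.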
Activities and Deliverables
Approval Procedures
Who can approve:
Determine from the Approvers table whether the change can be approved by your group, manager or the CAB.
Group or manager level?
If group or manager level approval is indicated, then follow approval procedures for your own group.
If CAB approval is required
Complete the RFC web form at http://www.UBC IT.ubc.ca/change.html
This webform will send a message to NOC staff, who will open an RFC in Magic. The newly generated Trouble Ticket number will be sent to you for further reference (for example, to relay the success of the change).
The Request for Change (RFC)
How to
Get approval by CAB
Enter the RFC on the webform at: http://www.UBC IT.ubc.ca/change.html
This webform will send a message to NOC staff, who will open an RFC in the Magic trouble ticketing system.
The newly generated Trouble Ticket number will be sent to you for further reference (for example, to relay the status of the change or other explanations).
CAB meetings schedule
The CAB meets Tuesday afternoon at 1:30 pm in Rm443.
Deadlines
RFCs to be considered by the CAB need to be entered by 23:59 on Monday evening.
Changes to be implemented on the subsequent weekend must be sent to the CAB by Monday (up to 23:59).
Decisions will be sent out by Tuesday afternoon following the CAB meeting.
Change Owner responsibility
All changes must be presented to the CAB by the change owner or designate in the event that there is a need for clarification regarding the proposed change.
RFC change “logs”
You may want to “log” a change without requiring it to be approved by the CAB.
a. Ensure that the criteria for the log have been met.
b. Then “Send RFC to Log” instead of sending “RFC to CAB” when completing the webform.
How to comment on/change RFC
Do a “reply to all” to the original message that the NOC staff sent you notifying you of the RFC reference number (which is in fact a Magic trouble ticket).
Closing the change
All ITS staff have access to Magic and have IDs. Find the RFC number in Magic and add an action item indicating the outcome of the change: Successful, Cancelled, Backed Out, Failed.
If you do not have a Magic ID (for example, you are a new employee or a consultant), send a message to [email protected] containing the above status information for the RFC, and they will enter the completion status on your behalf.
Lead Time for Approval
Lead time: The lead time required to make a change depends on the SLA in place for the service(s) interrupted.
In the absence of an SLA, the service manager or customer can set the lead time for approval. Without an SLA or Service Manager or customer input, the CAB requires the following minimum notice: An RFC submitted on Monday can be scheduled for the subsequent Saturday morning or after.
A 72-hour window is required for customer notification once the change has been approved.
A change that cannot wait for the next CAB meeting can be marked Urgent, and may be approved through the emergency change process.
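The submission deadline, the weekly CAB meeting and the 72 hour notification window combine into a simple date calculation. The sketch below is a worked illustration under stated assumptions, not an official scheduling rule. The function names are hypothetical, approval is assumed to be given at the Tuesday 1:30 pm CAB meeting, and the earliest implementation date is taken to be the first Saturday after the notification period ends.

```python
from datetime import date, datetime, time, timedelta


def next_cab_meeting(submitted: datetime) -> datetime:
    """Return the first Tuesday 13:30 CAB meeting that can consider the RFC.

    RFCs must be entered by 23:59 on Monday, so anything submitted on a
    Tuesday is held over to the following week's meeting.
    """
    days_ahead = (1 - submitted.weekday()) % 7  # weekday(): Monday=0, Tuesday=1
    if days_ahead == 0:
        days_ahead = 7
    return datetime.combine(submitted.date() + timedelta(days=days_ahead),
                            time(13, 30))


def earliest_saturday(approved: datetime,
                      notice: timedelta = timedelta(hours=72)) -> date:
    """Return the first Saturday after the customer notification period."""
    ready = approved + notice
    days_ahead = (5 - ready.weekday()) % 7 or 7  # Saturday=5, strictly later
    return ready.date() + timedelta(days=days_ahead)
```

An RFC entered on Monday 2 May 2011 would be considered on Tuesday 3 May, the 72 hours of notice would end on Friday 6 May, and the change could be scheduled for Saturday 7 May, which matches the minimum notice stated above.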
RFC Submission Fields
Subject: A short description of the change which will appear on the email subject line.
Contact Name: The name of the RFC submitter.
Change Owner: The name of the UBC IT manager responsible for the change.
Contact email: Email of the RFC submitter.
CC: Email addresses to which the change will be sent (in addition to the CAB).
Sponsor Name: The name of the service manager or group manager who accepts responsibility for the change and its outcome. The change sponsor is normally a member of CAB, represents the change to CAB during its discussion, and is the point person if the change fails. Note that the sponsor must not be away at the time of the change or, if absence is unavoidable, must have a designate in place.
Routing: This specifies whether the RFC will be circulated to CAB (the usual case), or just logged in Magic.
Logging straight to Magic is appropriate when the RFC has been approved at group manager level.
Proposed date, start time, end time: The date and time for the change will normally fall into one of the maintenance windows listed under Maintenance Window.
Reason for Change: Overall description of the change. (Is there a need to use specific terminology that should appear on website?)
Urgency: High, medium, or low.
Impact: High, medium, low, or none.
Risk: High, medium, or low.
Effects: The services that could be affected by the change. Give a clear description of the effect, and potential effect, on the customers.
Communications: The person responsible for communications.
Communications and Web Site Updates: Who is responsible for communicating this change to key stakeholders? This may be the same person as the person requesting the change, or the service manager, or other person. Please note that the UBC IT main website is the default.
Web Site Content: Suggested wording for any potential service bulletin, including a concise headline. All websites should be consistent in message content and layout where possible.
Components: From the UBC IT perspective, what platform components will have been modified.
QA approval: Name of the person(s) responsible for assuring the quality of the change. This could be the person who tested the change, or the installer.
Implementers: Who will make the change. (Will change require training or documentation updates?)
Backout personnel: List of people to contact if the change must be backed out.
QA Checklist: A series of tick boxes that provide an overall view of the level of quality assurance.
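The submission fields above can be modelled as a record with a few basic checks. The following Python sketch is hypothetical and does not describe the actual webform or the Magic schema. The class name RFC, the attribute names and the routing values "cab" and "log" are assumptions made for illustration, and the QA Checklist tick boxes are omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# Allowed values as listed in the field descriptions.
URGENCY = {"high", "medium", "low"}
IMPACT = {"high", "medium", "low", "none"}
RISK = {"high", "medium", "low"}
ROUTING = {"cab", "log"}  # circulate to CAB, or just log in Magic


@dataclass
class RFC:
    """Illustrative record of the RFC submission fields."""
    subject: str
    contact_name: str
    change_owner: str
    contact_email: str
    sponsor_name: str
    routing: str
    start: datetime
    end: datetime
    reason: str
    urgency: str
    impact: str
    risk: str
    effects: str
    communications: str
    implementers: List[str]
    backout_personnel: List[str]
    cc: List[str] = field(default_factory=list)
    web_site_content: str = ""
    components: str = ""
    qa_approval: str = ""

    def __post_init__(self):
        # Normalize the classification fields and reject unknown values.
        for name, allowed in (("urgency", URGENCY), ("impact", IMPACT),
                              ("risk", RISK), ("routing", ROUTING)):
            value = getattr(self, name).lower()
            if value not in allowed:
                raise ValueError(f"{name} must be one of {sorted(allowed)}")
            setattr(self, name, value)
        if self.end <= self.start:
            raise ValueError("end time must be after start time")
```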
Customer Groups Communication/Service Bulletins
Customer groups are contacted in sufficient time to amend or reschedule the proposed change.
The benefit of good communication with the customer is that it not only allows a better understanding by the customer of the effort that is required to sustain a service but shows that any downtime has been carefully considered and reviewed to minimize impact on the customer’s business cycle.
The normal means of contact consists of email lists and web postings, although personal contact is also used as appropriate.
Documentation on who to contact and how, together with sample scripts for service outages (see section 4: Guidelines for describing the problem), can be found in the document Service Outage Notifications, located here:
Saturn\users\UBC IT|public|Systeminfo|Service Outage Notification Procedures.doc
(The contact is done by the person designated in Communication and Web Site Updates on the RFC.)
If a customer objects to the change time after the notification goes out, the customer is referred to the change sponsor for resolution.
Forward Schedule of Changes (FSC)
The NOC operators will add new approved changes to the Forward Schedule of Changes (FSC).
The FSC is available in Microsoft Outlook. See example below.
Emergency Change Process
An Emergency Change is a change whose Urgency is high, and where a deliberate decision is made to shortcut the normal change process. Examples:
- Service outages (major)
- HW/Application failure
- Security threat
The steps for getting an Emergency Change approved are:
Requestor:
- Obtain approval of the customer and Service Manager
- Alert the Change Manager to the forthcoming change
- Issue the RFC with Urgency = High
The following conditions must be met for an Emergency Change to be approved:
- An Emergency Change requires 3 approvers from the UBC IT Management (Directors and Managers).
- One of the approvers should be a Director, preferably the assigned Code 3 Director.
- The Change Sponsor cannot be an approver.
- A Director can veto the approvers.
The normal change control process notifies CAB of the RFC through email. CAB members can reply by email or telephone Operations to participate in the approval process.
In the event that Operations needs to escalate and telephone Management to obtain approvers, they will start with the assigned Code 3 team.
Ideally the approvers are familiar with the subject area, as they can best represent the impacts, urgency, and risks of the Change.
With completion of the above, the normal change control process is adapted as appropriate.
*Note that changes that are not genuinely urgent will not be approved through the emergency process, and will have to wait for the next meeting of the CAB.
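The approval conditions for an Emergency Change can be expressed as a short check. The sketch below is illustrative only. Approver and emergency_approval_ok are hypothetical names, and the sketch treats "one of the approvers should be a Director" as a hard requirement, which is an assumption, since the procedure says "should".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Approver:
    name: str
    role: str  # "Director" or "Manager"


def emergency_approval_ok(approvers: Iterable[Approver], sponsor_name: str,
                          vetoed_by_director: bool = False) -> bool:
    """Check the Emergency Change approval conditions.

    Requires 3 distinct approvers from management, at least one Director
    (assumed mandatory here), no approval by the Change Sponsor, and no
    Director veto.
    """
    if vetoed_by_director:
        return False
    # Keep only management approvers, de-duplicated by name.
    management = {a.name: a for a in approvers
                  if a.role in ("Director", "Manager")}
    if sponsor_name in management:
        return False
    if len(management) < 3:
        return False
    return any(a.role == "Director" for a in management.values())
```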
Closing the RFC
Following the change, its implementation is reviewed and closed. The outcome is recorded as a Magic action based on one of the following:
Successful: Change accomplished and system back up on time.
Cancelled/Denied: Change called off before the change start time.
Backed out: Change was removed from service during the maintenance window.
Failed: The change interrupted normal service.
All UBC IT staff have a Magic ID, so the action can be entered directly online. If for some reason you do not have a Magic ID, a message can be sent to the NOC staff at [email protected], quoting the RFC number and the outcome, and the staff member will update the record.
Change Assessment (Post-Implementation Review)
For all completed changes, the following steps are taken:
- The NOC staff produce KPIs which summarize the changes performed in each period by outcome.
- Changes from the previous week are examined at the weekly CAB meeting, for example:
o was the change objective achieved
o feedback (positive/negative) from users and customers
o were there unexpected side effects from implementing the change
o was resource planning effective
o was the implementation executed as planned
o was the change on time
o was the backout successful (if/when applied)
o “lessons learned” incorporated into the “pre-RFC” checklist
o report and follow-up on problems (e.g. vendor, customer related, CAB)
o information is absorbed into the improvement process
- The statistics are reviewed by the change committee.
- For any change which was backed out or failed, a problem report is created by the operators and sent to the change sponsor, or the service manager if different from the change sponsor.
- For any change which failed, a critical incident review is conducted at the next Operations Committee meeting following the change, and the results of the review are published along with the Operations Committee minutes.
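The KPI summary by outcome and the follow-up rules in the steps above can be sketched in a few lines. This is an illustration only. Outcome, kpi_by_outcome and follow_up are hypothetical names, and the outcome values are the four listed under Closing the RFC.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, List


class Outcome(Enum):
    SUCCESSFUL = "Successful"
    CANCELLED = "Cancelled/Denied"
    BACKED_OUT = "Backed out"
    FAILED = "Failed"


def kpi_by_outcome(outcomes: Iterable[Outcome]) -> Counter:
    """Count the changes performed in a period, grouped by outcome."""
    return Counter(outcomes)


def follow_up(outcome: Outcome) -> List[str]:
    """Return the follow-up actions the assessment steps call for."""
    if outcome is Outcome.BACKED_OUT:
        return ["problem report"]
    if outcome is Outcome.FAILED:
        return ["problem report", "critical incident review"]
    return []
```

A period with two successful changes and one failed change gives a count of 2 for Outcome.SUCCESSFUL and 1 for Outcome.FAILED. The failed change triggers both a problem report and a critical incident review.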
Glossary
Analysis of Failed Change: An analysis of a failed change is held after a change fails and is successfully backed out. The analysis is managed by the Change Owner Manager or designate, and the results are reported to the Director, IT Operations, and the Operations Committee. It confirms the sequence of events, establishes the cause(s) of the failure(s) and makes recommendations to improve future performance.
CAB: The Change Advisory Board. The CAB consists of all members of the Operations Committee, all service managers, and others as required to represent the various customer groups.
Change Advisory Board: Changes are the responsibility of the Change Advisory Board.
Change Manager: The Change Committee convener. The Change Manager performs the following tasks:
- Convenes the CAB
- Circulates new changes to CAB
- Issues approvals for changes
There is always a duty Change Manager. If you do not know who it is, you can find out by calling the Operations Centre.
Change Owner: The service manager or group manager who accepts responsibility for the change and its outcome. The Sponsor performs the following tasks:
- Represents the change to CAB during its discussion
- Takes responsibility for recovery if the change fails
- Deals with customer objections to the change timing
Note that the Change Owner must not be away at the time of the change. If absence is unavoidable, please identify a designate prior to the meeting.
Critical Incident Review (aka Post Mortem): A Critical Incident Review (aka “post mortem”) is held after both a change and the back-out plan fail. The Critical Incident Review is managed by the Director IT Operations or designate and is reported to the Operations Committee and UBC IT Senior Management. It confirms the sequence of events, establishes the cause(s) of the failure(s) and makes recommendations to improve future performance. A Critical Incident Review may also be called on other non-change failures in the production environment at the discretion of the Director IT Operations.
Emergency Change: A change whose Urgency is High, which cannot wait for the next CAB meeting. An emergency change is a case where a deliberate decision is made to shortcut the change process.
Enterprise Middleware: Examples of middleware at UBC IT include webservers (Apache, Tomcat, ColdFusion) and LDAP services.
Enterprise Applications: Examples of applications which are extended to the UBC community include UBC Mail (Exchange, SunOne), CWL, MyUBC, UBC Directory.
FSC: Forward Schedule of Changes. This shows when approved changes are scheduled for installation. The FSC is available using Microsoft Outlook (open the ChangeLog calendar).
Normal Changes: Normal changes are changes to the live environment that can be scheduled in advance and installed in a predictable manner.
Problems: Problems are unscheduled failures of the live environment. Problems are handled through a process of problem management and escalation. Changes to the live environment made as part of a problem fix are recorded in the trouble ticket system, not the change management log.
Problem management is not covered by this document.
Standard Changes: A standard change is a well understood, recurring, low risk change that is the accepted response to a specific requirement or set of circumstances, the approval for which has been delegated to the group making the change. These can be entered as a change log where the outcome is expected to be successful with no impact to the user community.
Appendix 1 CAB Contact List
Emergency CAB Members
Listed alphabetically
Aksentsev, Felix (IBA)
Belsito, Mark (Apps PM)
Burns, Jennifer (Passive)
Cooper, Lynda (Parental Leave)
Craven, William (Passive)
Cumming, Lois (Systems)
Fong, Kent (Access)
Frazer, Dave (Passive)
Haeusser, Jens (Passive)
Hay, Marilyn (NMC)
Huang, Amy (Passive)
IT - Systems Operations (Passive)
Johnson, Patrick
Kita, Stan
Lay, Sean
Lee, Jeanne
Lim, Michael
Loewen, Doug
Macdonald, Bob
McKelvie, Evelyn
Miladinovic, Jovan
Ng, Susan
Operations (ITServices)
Quinville, Doug
Rosco, Steve
Sayer, Margaret
Shaw, Wes (Passive)
Smith, Brock
Thompson, Don (Passive)
Thorson, Michael
Twining, Neil
IT Resource: Klinck Rm.443
Bourdon, Eric
Razi, Sam
Appendix 2 Pre-RFC Initiation Process
(Draft: please comment with improvements or additions to the list.)
Note 1: This is not meant to be a comprehensive list, but a starting point in determining what needs to be considered. Your input is greatly appreciated.
1) Reason for change:
- response to a business need
- new service rollout
- end of service/systems lifecycle
- security, audit, legislation
- customer driven
- current service
- identified problem/hardware failure
- technical: software upgrades
2) Perform a benefits/risk analysis for each change prior to submission
What is the risk of failure, what might fail, and what would be the estimated time to restore and recover?
a) Customer Impact on Business
- What services would be affected by an unplanned outage?
- What is the effect of not implementing the change?
- Is the timing right for the cycle of business?
- What are the customer's critical dates (see osmium calendar)?
- What is the impact to an identified SLA or other agreement, if one exists?
Customer Notifications
- Notification of outage: does it meet customer notification timelines? Are alternate avenues (websites, systems status line) provided for the customer? Are customers notified of resolution?
b) Environmental Impact
- What is the impact on other current processes/projects? What is the priority attached to this?
- What other services run on the same infrastructure?
- What other changes are occurring that day (see Outlook calendar)?
- Evaluate the impact on the following:
o System: hardware, OS release, memory, disk space, CPU type, remote access, password issues
o Network: bandwidth, routing, SNMP passwords, interference, firewall, name resolution, security
o Application: detailed list of applications and dependencies, release and patch level, critical applications
o Processes: description of the existing business processes and dependencies, critical processes
o Organization: information on all units, alarm plan, list of phone numbers, logistic issues, etc., and the health and well-being of staff who may be working extra long hours. There may also be physical security issues such as after hours access.
o Security Office: Are existing security measures met? Is the security office familiar with the change?
c) Resource requirements
- Identify internal and external resources/skillsets required for a successful change
- Availability of an experienced architect resident to ensure the infrastructure and changes are planned and vetted before production
- Verify that all equipment, software, hardware, and updates are available
- Verify that backup tapes are available in the event of a back-out or restore.
- Research the requirements/other supporting resources to achieve a successful change (required patches and stability of upgrade, licenses, security issues identified).
- Identify who needs to be aware and/or on standby when changes are planned. A rep from each group in the department change management team to be ‘on duty’.
- Systems and services should be operationalized to the extent that the NOC can turn services on and off
- Must communicate fully to the NOC, who provide updates via status messages/website updates: problem process loop, open/close, ongoing
d) Testing
This should address issues of performance, security, maintainability, supportability, reliability, and functionality:
- Develop a detailed plan of action to reduce the risk to an acceptable level (comprehensive testing plan/signoff on tests carried out, implementation details). This should explain the steps that must be taken to restore access in the event that the change has a negative impact. Provide online link or file location.
- Develop a plan of action to lessen the effects on the customer if the change should cause an outage. What is the workaround? Is there automatic failover in place?
- Always develop a pre-change checklist (a template should be developed):
o When did we last do a shut-down? A stop/start should be tested without changes.
o Checkpoints need to be identified as problems arise: stop or go? E.g. no shutdown script: stop?
o Go live checklist
o Walk through all dependencies first
o Ensure everything is tested with the current version of the operating system
o Test cards prior to installing into production (done in this case).
o Test the drivers supplied under the correct release of the O/S and test every configuration parameter in advance
o Additional or spare hardware requirements?
o Create an outage ‘timetable’ for the planned outage window, e.g. Sunday 5-7am
a. Flag go/no go steps
b. Identify when to back out if necessary
c. Create for every change
- A duplicate environment is required (parallel to try out). This is needed for all applications, and many development environments are not adequate
- Communicate to senior management the importance of appropriate equipment: what is needed is equivalent functionality, not necessarily the exact same boxes (save $$$).
- If vendor resources are required, check availability:
a. Is the support line adequately ‘manned’ during off-hours (weekends)?
b. Pre-arrange contacts to ‘stick handle’ the situation with internal resources 24x7?
c. Notify vendor(s) of any impending changes/outages to ensure we are on their radar if any issues arise.
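The outage ‘timetable’ called for under Testing could be kept as a simple data structure rather than free text. The sketch below is one possible way to do that, not a prescribed tool: the names Step, build_timetable, decision_points and must_back_out are hypothetical, and the step durations and the 30-minute back-out allowance in the usage note are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class Step:
    """One entry in an outage timetable (hypothetical structure)."""
    description: str
    minutes: int            # planned duration of the step
    go_no_go: bool = False  # True if a go/no-go decision is taken before the step


def build_timetable(window_start: datetime,
                    steps: List[Step]) -> List[Tuple[datetime, Step]]:
    """Lay the steps out back to back, starting at the window start."""
    schedule = []
    current = window_start
    for step in steps:
        schedule.append((current, step))
        current += timedelta(minutes=step.minutes)
    return schedule


def decision_points(schedule: List[Tuple[datetime, Step]]) -> List[datetime]:
    """Times at which a go/no-go decision is flagged."""
    return [start for start, step in schedule if step.go_no_go]


def latest_backout_start(window_end: datetime, backout_minutes: int) -> datetime:
    """Latest time a back-out can begin and still finish inside the window."""
    return window_end - timedelta(minutes=backout_minutes)


def must_back_out(now: datetime, window_end: datetime,
                  backout_minutes: int) -> bool:
    """True once there is only enough time left in the window to back out."""
    return now >= latest_backout_start(window_end, backout_minutes)
```

For a 5:00 to 7:00 am window with an assumed 30-minute back-out, latest_backout_start gives 6:30 am, which is the point to record in the timetable as "identify when to back out if necessary".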
e) Escalation process:
- Clarify Technical and Mgmt stakeholders
- Contact lists available: cell phone / wireless phone / pager; all can be automated
- Refer to policies as to who makes the go/no-go decision and how it is made (management?)
f) After a successful change:
Review the following:
- Was the planning correctly carried out?
- Report on completion / update the RFC record or log as to status. Reply to the email that was sent to you by the NOC informing you of the RFC reference number.
- Has the impact been correctly estimated?
- Has the usage of resources been correctly estimated? Was it necessary to deploy the resources of the standby team during the rollout or change?
- Has the right effect of the change been reached?
- Are the users satisfied with the result?
- Number of incidents on the change
- Were / are there unexpected side effects?
- Have detected anomalies been communicated to the CAB for future changes?
- Are there necessary documentation / training (customer, staff) updates required after the change? Identify resources: who will do what.
Finally:
Obtain approval to proceed from your immediate or appropriate responsible manager for requesting the change.
Submit a complete, concise, and descriptive Change Request no later than Monday at 23:59 (midnight).
Appendix 3 Critical Incident Reviews (aka “Post-Mortem”) Guidelines
When to hold
A “CIR” (Critical Incident Review) is held after both a change and the back-out plan fail.
This is managed by the Director IT Operations or designate and is reported to the Operations Committee and UBC IT Senior Management.
It confirms the sequence of events, establishes the cause(s) of the failure(s) and makes recommendations to improve future performance.
A “CIR” (Critical Incident Review) may also be called on other non-change failures in the production environment at the discretion of the Director IT Operations.
Example:
“CIR” (Critical Incident Review)
Description of Incident
Date
Attendees: Absent:
Preparation: Always bring documentation/ticket info to CIR meeting
Agenda:
- Ground rules for critical incident review
- What took place? Chronology of events
- Why did it take place?
- What to do to prevent this in the future (processes, environment, etc.)
1. Action Items
Action required    Area of responsibility    Date due    Status    Review
- Other Lessons Learned
o Things that worked well
o Further recommendations
PROBLEM RESOLUTION PROCESS
Appendix 4 Examples of changes
Applications
- new application rollout
- new systems releases/conversions/functional enhancements/fixes
- maintenance
Databases
- changing the name of a db table, view, or column
- modifying a stored procedure, trigger, or user-defined function
- changing or adding relationships using referential integrity features
- changing or adding db partitioning
- moving a table from one db, dbspace, or table space to another
- changing the uniqueness specification of an index
- clustering the table data by a different index
- changing the order of an index (ascending or descending)
Networks
- installs, upgrades, disconnects, router configs
Telecommunications facilities - communication rooms
Infrastructure
- Hardware (moves, adds, changes, hw relocation, emergency replacements, OS, installations)
- SW releases, enhancements
Documentation
Periodic maintenance, User requests, Hardware and/or software upgrades, Acquisition of new hardware and/or software, Changes or modifications to the infrastructure, Environmental changes, Operations schedule changes, Changes in hours of availability, and Unforeseen events.
Facilities
- Major electrical upgrades, installs
- Air Conditioner upgrades
- Fire suppression systems
- UPS systems
- Security panels
Desktops
- HW configurations
- OS
- Applications, releases, patches
Appendix 5 Examples: Communications Strategy
Purpose
This document serves as a GUIDE FOR COMMUNICATIONS REQUIRED for the Change Management process.
Include all stakeholders in the information flow to manage expectations and invite participation prior to, during and after rolling out new features or service.
Issues:
o What will be communicated
o What media mixes are most effective
o Frequency
o Urgent messages
o Audiences
Example guidelines for describing the problem
Concise Description
Describe concisely and specifically what the planned outage affects in terms of what the customer uses the service for.
Indicate what this outage affects or what error messages a user might see if they try to access the service while it is out.
Examples
“This service or system will be unavailable…. Faculty and staff will be unable to log into..…”
“The main UBC web server will be unavailable…. You will be unable to access any web pages beginning with, …..may return ‘no such domain name’ messages.”
“E-mail services for faculty and staff will be unavailable ….”
Specify the duration of the outage.
Example
“…unavailable on Saturday, August 7, between 5AM and 7AM.”
Indicate if there is an alternative available to users while a service is unavailable.
Examples
“The __________web server will be unavailable… In the meantime you can access your email by using…..”
“The _________server at www._______________will be unavailable … Please use the following ……”
Identify if there is a web page that has useful information for the users.
Example
“For updates and a complete list of services affected by the outage, please visit http://www.itservices.ubc.ca/support/ .”
Include suggestions where the user can go for additional information
Examples
“If you have questions please contact the Help Desk at 822-2008.”
“Please call 822-4115 for recorded updates.”
“Please refer to the "Systems Alerts" section found at http://www.itservices.ubc.ca/support/ for further updates.”
“Please contact the Network Operations Centre at 822-5438 (option 4) when network problems are experienced.”
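The guidelines above amount to a fixed set of fields that every outage notice should carry: what is affected, when, any alternative, where to find updates, and who to contact. As a rough illustration only, they could be captured in a small template so that no field is forgotten. The OutageNotice class and its field names below are hypothetical and not part of any existing UBC IT tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutageNotice:
    """Fields suggested by the guidelines above (hypothetical structure)."""
    service: str                       # what is affected, in the customer's terms
    when: str                          # duration of the outage
    impact: str                        # what the user will be unable to do or will see
    alternative: Optional[str] = None  # workaround while the service is unavailable
    info_url: Optional[str] = None     # web page with updates
    contact: Optional[str] = None      # where to go for additional information

    def render(self) -> str:
        """Assemble the notice text; optional parts are skipped when not supplied."""
        parts = [f"{self.service} will be unavailable {self.when}.", self.impact]
        if self.alternative:
            parts.append(self.alternative)
        if self.info_url:
            parts.append(
                "For updates and a complete list of services affected "
                f"by the outage, please visit {self.info_url} ."
            )
        if self.contact:
            parts.append(self.contact)
        return " ".join(parts)
```

A notice built with service "E-mail services for faculty and staff" and when "on Saturday, August 7, between 5AM and 7AM" renders with the same opening sentence as the examples above. Keeping impact as a required field forces the author to describe the outage in terms of what the customer uses the service for.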
Web page updates, examples
Example 1: Oracle3 outage notices for websites.
“System Alert” section at http://www.itservices.ubc.ca/support/bulletins
Oracle3 Service Outage - A service outage is scheduled on August 27, 200x from 6:00am to 7:30am for maintenance on the oracle3 server. It will affect the following services and applications.
Asbestos Tracking System Campus-Wide Login (CWL) Faculty and Staff Pension Magic TSD Call Tracking System Paradigm Call Tracking System myPress Copyright Management myUBC
ORSIL Tracc-II UBC Faculty and Administrative Directory UTAW (Utility Tool for Administrators of WebCT)
We apologize for any inconvenience. Please contact the Network Operations Centre at 822-5438 (option 4) if problems are experienced.
Users of Windows 2000 and XP - If you encounter access problems following the Oracle3 Service Outage, please try the following:
From the Start menu, select "Run", then type in 'cmd' (without the quotes) and press the "Enter" key. A DOS command window will appear. Enter the following information.
ipconfig /displaydns
Then press the "Enter" key to view the local DNS cache.
ipconfig /flushdns
Then press the "Enter" key to flush the local DNS cache.
Or you can simply reboot your PC to pick up the new DNS for Oracle3. Please contact the ITServices Help Desk at 822-2008 if you continue to experience difficulties.
CWL Site
Service Bulletin
A service outage is scheduled on August 27, 200x from 6:00am to 7:30am for maintenance on the oracle3 server. As a result, myUBC, CWL, WebCT administration tools and other online services may not be available during this time. For a full list of services affected, visit www.itservices.ubc.ca/support/bulletins
NETINFO/ INTERCHANGE
Service Bulletin
A service outage is scheduled on August 27, 200x from 6:00am to 7:30am for maintenance on the oracle3 server. As a result, myUBC, CWL, WebCT administration tools and other online services may not be available during this time. For a full list of services affected, visit www.itservices.ubc.ca/support/bulletins
my.ubc.ca
Service Bulletin
A service outage is scheduled on August 27, 200x from 6:00am to 7:30am for maintenance on the oracle3 server. As a result, myUBC, CWL, WebCT administration tools and other online services may not be available during this time. For a full list of services affected, visit www.itservices.ubc.ca/support/bulletins
Example 2: Enrolment Services Database outage notices for websites.
Student Centre Service web site
Service Bulletin
The Enrolment Services Database will be unavailable Saturday, October 19, 200x (and Sunday Oct 20 if required). The outage is expected to start at 3:00AM Saturday morning and last all day. The purpose of this outage is to perform an Oracle Database upgrade. We apologize for any inconvenience. For a list of all services affected by the outage, please visit www.itservices.ubc.ca/support.
“System Alert” section at http://www.itservices.ubc.ca/support/bulletins
Service Bulletin
The Enrolment Services Database will be unavailable from 3:00 AM, Saturday, October 19, 200x and Sunday, October 20, if required. The purpose of this outage is to perform an Oracle Database upgrade.
All applications/services that require access to the Enrolment Services Database will be affected. The following is a list of known services that will be impacted:
Enrolment Services
- Admissions
- Awards
- Course Scheduling/Catalog (Ad Astra)
- Elections
- Degree Navigator (DAG)
- Faculty Service Centre (FSC)
- Student Authentication Service
- Student Information Service Centre (SISC)
- Student Information System (Old Green Screens/Unikix)
- Student Service Centre (SSC)
Others
- TracII/Netinfo access requiring student number validation
- myUBC authentication using student number
- CWL registration using student number
- Housing authentication using student number
- WebCT administration
We apologize for any inconvenience.
my.ubc.ca
Service Bulletin
Due to upgrades to the Enrolment Services server, students will not be able to log in using student numbers on Saturday, October 19 and Sunday, October 20. We apologize for any inconvenience. For a list of all services affected by the outage, please visit www.itservices.ubc.ca/support/bulletins
CWL Site
Service Bulletin
Student CWL registration will not be available on Saturday, October 19 and Sunday, October 20, if required, due to upgrades to the Enrolment Services database server. For more information, visit www.itservices.ubc.ca/support/bulletins. We apologize for any inconvenience.
NETINFO
Service Bulletin
Netinfo registration and password resets will not be available on Saturday, October 19 and Sunday, October 20, if required, due to upgrades to the Enrolment Services database server. For more information, visit www.itservices.ubc.ca/support/bulletins. We apologize for any inconvenience.
www.webct.ubc.ca
Service Bulletin
Enrolment Services interruption: Saturday, October 19/02 - all day
Due to the downtime of the Enrolment Services database on Saturday, October 19/02, the following WebCT services will be affected:
Users will not be able to log in to WebCT using student number and PIN.
* Please use your interchange/netinfo account to avoid interruption
For more information, visit www.itservices.ubc.ca/support/bulletins. We apologize for the inconvenience.
Sincerely,
ITServices WebCT Admin Team
Example 3: Phase 2 of the admin cluster firewall installation.
“System Alert” section at http://www.itservices.ubc.ca/support/bulletins
Service Bulletin
Notification of ITServices Outage
The following is advance notification of a major outage within ITServices network and the services it provides. This outage has been scheduled in the early morning to minimize the impact on students, faculty and staff services:
From: 06:00am November 13, 200x
To: 08:00am November 13, 200x
Reason: Scheduled network configuration maintenance
Impact: The following services will be unavailable during the above period
Enrolment Services
- Admissions
- Awards
- Course Scheduling/Catalog (Ad Astra)
- Degree Navigator (DAG)
- Elections
- Faculty Service Centre (FSC)
- Student Authentication Service
- Student Information Service Centre (SISC)
- Student Information System (Old Green Screens/Unikix)
- Student Service Centre (SSC)
Others
- TracII/Netinfo access requiring student number validation
- myUBC authentication using student number
- CWL authentication using student number
- Housing authentication using student number
- WebCT Administration
We apologize for this disruption in service and thank you for your patience.
Please refer to the "Systems Alerts" section found at http://www.itservices.ubc.ca/support/bulletins for further updates.