2011 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
Patching Exadata Database Machine - the DBA gets it all!
DOAG Conference 2011
Konrad Häfeli
Senior Technology Manager
Trivadis AG
15.11.2011
1 Patching Exadata Database Machine - the DBA gets it all
AGENDA
1. Patching Overview
2. Patch Application Methods
3. Patch Administrator/Manager?
4. Patching Exadata V2 to X2-2
5. Conclusion
Exadata…
1. Turn Key solution?
2. Appliance?
generally "closed and sealed"
not serviceable by the owner
3. Black Box?
Preconfigured, balanced System
1. In terms of:
Interconnect
Processors
Storage I/O
Memory
Exadata responsibility?!
Who is responsible for maintaining the
supportability?
Who is responsible for patching?
The operator is responsible!
Keep on patching!
Source of Support: MOS Note - 888828.1
The main information source for version support
Compatibility Matrix
Exadata Patch Stacks/Types
Patches apply to 3 stacks:
Database Server (RDBMS, Grid Infrastructure): Bundle Patch
InfiniBand Switches: Infiniband Switch Patches
Exadata Storage Server (Operating System, Firmware): Storage Server Patches
KVM and Cisco: no Oracle patches
Database Server Patches
Delivered in bundle patches created specifically for Exadata
Database and Grid Infrastructure Homes
Released monthly (until Oct 2011, then every 2 months)
Contain a recently released Patch Set Update (PSU), which in turn contains a
recently released Critical Patch Update (CPU)
Bundle patches are cumulative
Installation with OPatch utility
Exadata Storage Server Patches
Updates for OS, Kernel, InfiniBand, ILOM, firmware and new features
Must occur only with an Exadata Storage Server patch provided as a single
downloadable patch from My Oracle Support
Do NOT manually update firmware or software on storage servers.
Includes the “minimal OS pack” for database server
Released quarterly
Installation
using a script supplied with the patch called patchmgr
rolling and non-rolling application
Infiniband Switch Patches
Install only Oracle patches
MOS note 888828.1 is the reference
Prerequisites and instructions for installing a patch are provided in a README
Released once or twice a year
Installation depends on the version, check first
InfiniBand switch software version has no dependency on Exadata Storage
Server software version
Patch Guideline
Exadata Patching Overview and Patch Testing Guidelines
MOS Note 1262380.1
Systems that are in production or are in late testing stages before
production should plan to periodically adopt more current patch
releases.
It is not required or necessary to install every new patch release.
A patch should be installed on a production system only after it has
been validated in a proper test environment, and no less than one
month after release to allow field experience to solidify.
However, if the system requires a fix that is available in a newer version,
then plans to adopt a newer version should be accelerated
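The "no less than one month after release" rule can be checked mechanically before a patch is scheduled. The following is a minimal sketch, not part of the MOS note: it assumes GNU `date`, reads "one month" as 30 days, and uses illustrative dates. The function names are hypothetical.

```shell
#!/bin/sh
# Minimum soak time after release; 30 days is an assumed reading of "one month".
MIN_AGE_DAYS=30

# patch_age_days RELEASE_DATE TODAY  (both YYYY-MM-DD)
# Prints the whole number of days between the two dates (GNU date required).
patch_age_days() {
    release_s=$(date -u -d "$1" +%s) || return 1
    today_s=$(date -u -d "$2" +%s) || return 1
    echo $(( (today_s - release_s) / 86400 ))
}

# patch_old_enough RELEASE_DATE TODAY
# Succeeds when the patch has been out for at least MIN_AGE_DAYS.
patch_old_enough() {
    age=$(patch_age_days "$1" "$2") || return 1
    [ "$age" -ge "$MIN_AGE_DAYS" ]
}

# Example with illustrative dates
if patch_old_enough 2011-05-20 2011-06-25; then
    echo "old enough for production (after validation in a test environment)"
else
    echo "too fresh, keep it in the test environment"
fi
```

The age check only covers the waiting period; validation in a proper test environment remains a separate, manual gate.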
AGENDA
1. Patching Overview
2. Patch Application Methods
3. Patch Administrator/Manager?
4. Patching Exadata V2 to X2-2
5. Conclusion
Reduce Patching Risks
Patching used to be a bit difficult/complicated
Later releases add features that reduce downtime and risk
Feature                               Minimum Version
Cell Rolling Apply                    11.2.1.3.1
Bundle Patch Merge                    11.2.0.2
RAC Rolling Installable               11.2.0.1 GI_BP1, 11.2.0.1 DB_BP9
OPatch Auto Installable               11.2.0.1 GI_BP4, 11.2.0.2 BP2
EM Installable                        11.2.0.1 DB_BP7, 11.2.0.2 BP1
Dataguard Standby-First Installable   11.2.0.1 DB_BP8, 11.2.0.2 BP1
OPlan                                 11.2.0.2 BP2
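To decide whether a feature from the list is available, the installed version has to be compared with the minimum version. A minimal sketch, assuming GNU `sort -V` for dotted version ordering; `version_at_least` is a hypothetical helper, and the bundle patch suffixes (GI_BP1, BP2, ...) are not handled here.

```shell
#!/bin/sh
# version_at_least HAVE NEED
# Succeeds when dotted version HAVE is >= NEED (uses GNU sort -V).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example: is cell rolling apply available on this storage server release?
cell_version="11.2.1.2.1"        # illustrative value
if version_at_least "$cell_version" "11.2.1.3.1"; then
    echo "cell rolling apply available"
else
    echo "cell rolling apply not available"
fi
```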
Rolling Bundle Patch
Installed with OPatch utility
Since 11.2.0.2: “opatch auto”
Before that: “apply” or “auto”, depending on the patch
OPatch auto
One command per node
Database has to be registered
“-oh <ORACLE_HOME>” patches specific home
1. Stop Oracle Home
2. Patch Oracle Home
3. Stop CRS
4. Unlock Grid Home
5. Patch Grid Home
6. Start CRS
7. Lock Grid Home
8. Start Oracle Home
Storage Server Patches
Installed with utility: patchmgr
Includes operating system updates, firmware updates, new functionality
Patchmgr uses dcli (distributed command line interface)
Rolling apply
or non rolling apply
Stage cells -> Cell 1: offline ASM -> Cell 1: patch -> Cell 1: online ASM
Storage Server Patches rolling vs. non-rolling
ROLLING
Pro:
- No downtime
- In case of problems just one cell affected
Contra:
- Long apply time (1.5 to 3 hours per cell)
- No/reduced redundancy during apply
NON-ROLLING
Pro:
- 1.5 to 3 hours overall
- Reduced patch window
Contra:
- System downtime during apply
- In case of problems all cells affected
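The trade-off becomes concrete once the per-cell time is multiplied by the number of cells. A minimal sketch of that arithmetic, using the 1.5 to 3 hours from the comparison and assuming that a rolling apply patches cells strictly one after another; the function names and the 3-cell quarter rack are illustrative assumptions.

```shell
#!/bin/sh
# Per-cell apply time taken from the comparison: 1.5 to 3 hours.
MIN_PER_CELL=90     # minutes
MAX_PER_CELL=180    # minutes

# rolling_window CELLS: prints "min max" total minutes for a rolling apply,
# assuming cells are patched strictly one after another.
rolling_window() {
    echo "$(( $1 * MIN_PER_CELL )) $(( $1 * MAX_PER_CELL ))"
}

# non_rolling_window: all cells are patched in parallel, so the window
# does not grow with the number of cells.
non_rolling_window() {
    echo "$MIN_PER_CELL $MAX_PER_CELL"
}

# Example: a quarter rack, assumed here to have 3 storage cells
echo "rolling:     $(rolling_window 3) minutes"
echo "non-rolling: $(non_rolling_window) minutes"
```

For three cells the rolling window comes out at 270 to 540 minutes, during which redundancy is reduced, against 90 to 180 minutes of full downtime for the non-rolling apply.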
Infiniband Switch Patches
Infiniband switches are running CentOS Linux 5.2
Infiniband patches can be installed in a rolling fashion
Patch the first switch, and wait for reboot to complete, then repeat patch
process on remaining switches
Older versions (1.1.3) were installed by placing update files on a web or
FTP server, and downloading RPM package updates to the switch
Later versions (1.3.3) were installed by placing the update package on
the filesystem of the switch, and updating from the ILOM of the IB switch
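The rolling procedure (patch one switch, wait for its reboot, then move on) can be sketched as a loop that stops at the first failure. This is a dry-run skeleton only: `patch_switch` and `wait_for_reboot` are hypothetical placeholders that just print what would happen, and the switch names are illustrative. The real steps depend on the firmware version and come from the patch README.

```shell
#!/bin/sh
# Hypothetical placeholders: replace the bodies with the README procedure
# for the target firmware version. Here they only print what would happen.
patch_switch()    { echo "patch $1"; }
wait_for_reboot() { echo "wait for $1 to come back"; }

# rolling_switch_patch SWITCH...
# Handles one switch at a time and stops at the first failure, so the
# remaining switches keep running the old, working version.
rolling_switch_patch() {
    for sw in "$@"; do
        patch_switch "$sw"    || { echo "patch failed on $sw" >&2; return 1; }
        wait_for_reboot "$sw" || { echo "$sw did not return" >&2; return 1; }
    done
}

# Example with illustrative switch names
rolling_switch_patch sw-ib2 sw-ib3
```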
AGENDA
1. Patching Overview
2. Patch Application Methods
3. Patch Administrator/Manager?
4. Patching Exadata V2 to X2-2
5. Conclusion
Exadata Administrator
Who is in charge of patching?
Oracle Software Stack allows Separation of roles
Database Home
Grid Home (Cluster and ASM functionality)
Does it really “need” a SysAdmin?
What needs to be done?
- OS Maintenance (not only minimal pack)
- User management
- Crontab setup/enabling
- Sendmail config
- Network configuration (changes)
- …
DBA or not?
To get the most synergy combine the jobs
DBA can do it
No fear to look over the border…
Interdisciplinary work done with checklists
Job enrichment no matter from which side
DBA -> SysAdmin
SysAdmin -> DBA
Teamwork
Patching big systems is time-consuming
Checking dozens of READMEs is error-prone
Double-check your work in a team
AGENDA
1. Patching Overview
2. Patch Application Methods
3. Patch Administrator/Manager?
4. Patching Exadata V2 to X2-2
5. Conclusion
Lifecycle of an old V2 quarter rack
RDBMS 11.2.0.2 needed which requires Storage
Server Version 11.2.2.x
Prepare the V2 for scaling out to a half Rack (with
an X2-2 Quarter)
Reason for patching
Complex Upgrade/Patch of all Components
Patch process: a story with ups and downs
Get Infos from «gurus»:
A lot of tips, but most not really relevant for my actual configuration
RTFM
Read the fine MOS-Note ;-) 888828.1 https://supporthtml.oracle.com/ep/faces/secure/km/DocumentDisplay.jspx?id=888828.1&h=Y
Infiniband Switch Upgrade
Infiniband Switches are supposed to be patched first
The procedure must be carried out in three steps
Users of FW version 1.0.1 will need to upgrade to 1.1.3 (patch 9560930)
before upgrading to 1.3.3
Package load via web server…
Infiniband Switch Patching
# nm2version
NM2-36p version: 1.0.1-1
Build time: Sep 14 2009 12:52:51
ComExpress info:
Manufacturing Date: 2009.02.19
Serial Number: "NCD2T0059"
Hardware Revision: 0x0006
Firmware Revision: 0x0102
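The version line of `nm2version` decides which firmware steps are still needed. A minimal sketch, assuming the output format shown above and covering only the three versions named on this slide (1.0.1, 1.1.3, 1.3.3); `nm2_version` and `ib_upgrade_path` are hypothetical helper names, and any other version is reported as unknown.

```shell
#!/bin/sh
# nm2_version: reads `nm2version` output on stdin and prints the firmware
# version without the build suffix, e.g. "1.0.1" for "1.0.1-1".
nm2_version() {
    sed -n 's/^NM2-36p version: \([0-9.]*\).*/\1/p'
}

# ib_upgrade_path VERSION: prints the firmware steps still to apply,
# covering only the three versions mentioned on this slide.
ib_upgrade_path() {
    case "$1" in
        1.0.1) echo "1.1.3 1.3.3" ;;
        1.1.3) echo "1.3.3" ;;
        1.3.3) echo "" ;;
        *)     echo "unknown version $1, check MOS note 888828.1" >&2; return 1 ;;
    esac
}

# Example using the version line shown above
current=$(printf 'NM2-36p version: 1.0.1-1\n' | nm2_version)
echo "steps from $current: $(ib_upgrade_path "$current")"
```

On a switch, the same pipeline would be fed by the real command: `nm2version | nm2_version`.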
Exadata patch 12373676, Infiniband switch software 1.3.3-2
Infiniband NM2 36p payload patch 11891229
Infiniband Switch Patching (2)
[root@bdm2sw-ib2 tmp]# disablesm
Stopping IB Subnet Manager.. [ OK ]
[root@bdm2sw-ib2 tmp]# /tmp/ibswitchcheck.sh pre
Current version of switch is: 1.1.3-2 and is a leaf
Switch target version is: 1.3.3-2
[SUCCESS] Switch meets minimal patching level [SUCCESS]
.
.
[root@bdm2sw-ib2 tmp]# spsh
Sun(TM) Integrated Lights Out Manager
Version ILOM 3.0 r47111
Copyright 2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
-> load -source /tmp/sundcs_36p_repository_1.3.3_2.pkg
NOTE: Firmware upgrade will upgrade the SUN DCS 36p firmware.
ILOM will enter a special mode to load new firmware. No
other tasks should be performed in ILOM until the firmware
[root@bdm2sw-ib2 tmp]# ./ibswitchcheck.sh post
Current version of switch is: 1.3.3-2 and is a leaf
Switch target version is: 1.3.3-2
[SUCCESS] Switch is at target patching level
.
.
[SUCCESS] /conf/configvalid is 1
Overall status of post check is SUCCESS
Storage Server Upgrade
Storage Server Patching
Download the patches mentioned
Exadata Storage Server Patching consists of:
Patch 12577723 - Exadata Storage Server software 11.2.2.3.2 (Note 1323958.1)
Check: Exadata Critical Issues
MOS Note: 1270094.1
All Issues solved with Storage Server Image 11.2.2.2
Master Note for Oracle Database Machine and Exadata Storage Server
(Doc ID 1187674.1)
Before and after patch application, run HealthCheck
MOS Note: 1070954.1 to verify software, hardware, and firmware versions and
configuration best practices
Switch the Primary-DBs to Standby-Site
In an MAA environment this leaves the standby site free for «offline» patching
Run Checkscripts
to verify software, hardware, and firmware versions and configuration best
practices
- MOS Note 1070954.1
exachk is the current version. HealthCheck is frozen and retained for
backward compatibility with HP hardware
Old HealthChecks:
Upgrade sequence…
# ./run_os_commands_as_root.sh \
-a /home/oracle/HealthCheck \
-b /u01/app/11.2.0/grid \
-c /u01/app/11.2.0/grid \
-d /u01/app/oracle/product/11.2.0/dbhome_1
New exachk
Report
Upgrade sequence… (2)
# oracle@bdm2db01:~/exachk/ [EXATEST1] ./exachk -a
.
=============================================================
Node name - bdm2db02
=============================================================
Collecting - CPU Information
Collecting - CRS active version
Collecting - CRS oifcfg
Collecting - CRS software version
Collecting - Cluster interconnect (clusterware)
Collecting - Compute node PCI bus slot speed for infiniband HCAs
Collecting - Exadata storage cells [DBMV2]
Collecting - Kernel parameters
.
Following README for patch 12577723
Review the support note 1323958.1
Find the model of the cell or database host
Applying the Patch with No Deployment-wide Downtime
also known as a "rolling update" (does not need any database downtime)
in worst-case conditions or when unexpected conditions occur, it can lead to
Oracle ASM repair timeout being reached
- resulting in Oracle ASM dropping the grid disks on the cell
- Re-adding these grid disks is a time consuming manual operation
- It also triggers a rebalance by Oracle ASM.
It is recommended that only light loads or no loads be running on the system
during patching to avoid timeouts or having the grid disks dropped by Oracle
ASM after disk repair times expire
Upgrade sequence… (3)
# dmidecode -s system-product-name
Upgrade sequence… (4)
[root@bdm1cel01 ~]# imageinfo
Kernel version: 2.6.18-128.1.16.0.1.el5 #1 SMP Tue Jun 30 16:48:30
EDT 2009 x86_64
Cell version: OSS_11.2.1.2.1_LINUX.X64_100131
Cell rpm version: cell-11.2.1.2.1_LINUX.X64_100131-1
Active image version: 11.2.1.2.1
Active image activated: 2010-03-01 16:24:56 +0100
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7
In partition rollback to 11.2.1.2.0: Possible
Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.1.2.1
Inactive image version: undefined
Rollback to the inactive partitions: Impossible
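Before and after patching, the interesting `imageinfo` fields are the active image version and its status. A minimal sketch for pulling single fields out of that output, assuming the "Label: value" layout shown above; `imageinfo_field` is a hypothetical helper name.

```shell
#!/bin/sh
# imageinfo_field NAME: reads `imageinfo` output on stdin and prints the
# value of the first line whose label equals NAME.
imageinfo_field() {
    awk -v name="$1" '
        index($0, name ": ") == 1 {
            print substr($0, length(name) + 3)
            exit
        }'
}

# Example with two lines copied from the output above
sample='Active image version: 11.2.1.2.1
Active image status: success'

echo "version: $(printf '%s\n' "$sample" | imageinfo_field "Active image version")"
echo "status:  $(printf '%s\n' "$sample" | imageinfo_field "Active image status")"
```

On a cell the input would come from the tool itself, for example `imageinfo | imageinfo_field "Active image status"`.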
Setup Patchdepot on DB-Server
Configure a cell_group file with all cell IPs
Check prerequisites
Stop all DB- and Cell-Services
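Since both patchmgr and dcli take their targets from the group file, a quick sanity check of that file is cheap insurance. A minimal sketch, assuming one entry per line; `check_group_file` is a hypothetical helper and the cell names are illustrative.

```shell
#!/bin/sh
# check_group_file FILE
# Succeeds when FILE has at least one entry, no blank lines and no
# duplicate entries. Prints the problem to stderr otherwise.
check_group_file() {
    [ -s "$1" ] || { echo "$1: missing or empty" >&2; return 1; }
    if grep -q '^[[:space:]]*$' "$1"; then
        echo "$1: contains blank lines" >&2; return 1
    fi
    dups=$(sort "$1" | uniq -d)
    [ -z "$dups" ] || { echo "$1: duplicate entries: $dups" >&2; return 1; }
}

# Example with illustrative cell names written to a temporary file
tmp=$(mktemp)
printf '%s\n' bdm1cel01 bdm1cel02 bdm1cel03 > "$tmp"
check_group_file "$tmp" && echo "cell_group looks sane"
rm -f "$tmp"
```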
Upgrade sequence… (5)
# ./patchmgr -cells cell_group -patch_check_prereq
dcli -g dbs_group -l root \
"/u01/app/11.2.0/grid/bin/crsctl stop crs -f"
dcli -g dbs_group -l root "ps -ef | grep grid"
dcli -g cell_group -l root \
"cellcli -e alter cell shutdown services all"
Patch all Cells
Takes approx. 1 ¾ hours
Cleanup
Upgrade sequence… (6)
# ./patchmgr -cells cell_group -patch
.
14:33-25-May:2011 :Working: DO: Check cells have ssh
equivalence for root user. Up to 10 seconds per cell ...
.
.
16:14-25-May:2011 5 of 5 :Working: DO: Check the state of patch on
cells. Up to 5 minutes ...
16:14-25-May:2011 5 of 5 :SUCCESS: DONE: Check the state of patch
on cells.
# cat /etc/redhat-release
Enterprise Linux Enterprise Linux Server release 5.5(Carthage)
# ./patchmgr -cells cell_group -cleanup
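The "approx. 1 ¾ hours" can be read straight off the first and last timestamps of the patchmgr output. A minimal sketch of that calculation, assuming both timestamps fall on the same day; `to_minutes` and `elapsed_minutes` are hypothetical helper names.

```shell
#!/bin/sh
# to_minutes HH:MM: prints minutes since midnight.
to_minutes() {
    h=${1%%:*}; m=${1##*:}
    h=${h#0}; m=${m#0}          # drop a leading zero so 08/09 are not read as octal
    h=${h:-0}; m=${m:-0}
    echo $(( h * 60 + m ))
}

# elapsed_minutes START END: both HH:MM on the same day (assumption).
elapsed_minutes() {
    echo $(( $(to_minutes "$2") - $(to_minutes "$1") ))
}

# First and last timestamps from the patchmgr output above
total=$(elapsed_minutes 14:33 16:14)
echo "patchmgr run: ${total} minutes ($(( total / 60 ))h $(( total % 60 ))m)"
```

Recording this figure per run gives a realistic basis for planning the next patch window.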
DB-Server minimal Pack
(OS and Firmware)
Start Console via ILOM
Check imagehistory and stop services
Adapt /etc/security/limits.conf (memory limits to ¾ physical memory)
Unzip db_patch_11.2.2.3.2.110520.zip (included in Storage Patch)
Start the patch
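The "¾ of physical memory" value for /etc/security/limits.conf is easy to get wrong by hand. A minimal sketch of the calculation in kB, assuming a Linux /proc/meminfo; the helper names and the 72 GB figure are illustrative, and which limits.conf entries receive the value is left to the patch README.

```shell
#!/bin/sh
# three_quarters_kb TOTAL_KB: prints 3/4 of TOTAL_KB using integer arithmetic.
three_quarters_kb() {
    echo $(( $1 / 4 * 3 ))
}

# physical_mem_kb: reads MemTotal (in kB) from /proc/meminfo on Linux.
physical_mem_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Example: a database server with 72 GB of RAM (illustrative figure)
echo "limit for 72 GB: $(three_quarters_kb 75497472) kB"

# On the server itself:
# three_quarters_kb "$(physical_mem_kb)"
```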
DB-Server minimal Pack
start /SP/console
./install.sh -force
# /usr/local/bin/imagehistory
# /u01/app/11.2.0/grid/bin/crsctl stop crs -f
# cd /opt/oracle.oswatcher/osw
# ./stopOSW.sh
DB-Server minimal Pack (2)
# /usr/local/bin/imageinfo
Kernel version: 2.6.18-128.1.16.0.1.el5 #1 SMP Tue Jun 30 16:48:30
EDT 2009 x86_64
Image version: 11.2.2.3.2.110520
Image activated: 2011-05-25 20:52:12 +0200
Image status: success
System partition on device: /dev/sda1
# cat /etc/redhat-release
Enterprise Linux Enterprise Linux Server release 5.3 (Carthage)
# uname -r
2.6.18-128.1.16.0.1.el5
# rpm -qa |grep ofa
ofa-2.6.18-128.1.16.0.1.el5-1.4.2-14
# /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -a0
.
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, …
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, …
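A cache policy of WriteThrough, as shown above, is easy to overlook in a long MegaCli listing. A minimal sketch that flags any logical drive not running in WriteBack mode, assuming the "Current Cache Policy:" line format shown above; the helper names are hypothetical.

```shell
#!/bin/sh
# current_cache_policy: reads `MegaCli64 -LDInfo` output on stdin and prints
# the first word of each "Current Cache Policy" line (WriteBack/WriteThrough).
current_cache_policy() {
    sed -n 's/^[[:space:]]*Current Cache Policy: *\([A-Za-z]*\).*/\1/p'
}

# all_write_back: succeeds only if every logical drive reports WriteBack.
all_write_back() {
    policies=$(current_cache_policy)
    [ -n "$policies" ] || return 1
    ! printf '%s\n' "$policies" | grep -qv '^WriteBack$'
}

# Example with the policy shown above
if printf 'Current Cache Policy: WriteThrough, ReadAheadNone, Direct\n' | all_write_back; then
    echo "cache policy OK"
else
    echo "at least one logical drive is not in WriteBack mode"
fi
```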
At this point bdm2db02 failed to reboot.
We had this message in the ILOM console during boot process (hard to
get hold of, all runs through very quickly…):
The controller had to be replaced!
After replacement it had, as was to be expected, a firmware version that was too low:
Troubles with one database node
Adapter at Baseport is not responding
No MegaRAID Adapter Installed
# dcli -l root -g all_group \
"/opt/oracle.SupportTools/CheckHWnFWProfile -c strict" > /tmp/ck.out
[root@bdm2db01 tmp]# more ck.out
bdm2db01: [SUCCESS] The hardware and firmware profile matches one
of the supported profiles
bdm2db02: [WARNING] The hardware and firmware are not supported.
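With many nodes in all_group, scanning ck.out by eye invites mistakes. A minimal sketch that lists only the hosts whose status is not SUCCESS, assuming the "host: [STATUS] message" layout shown above; `nodes_not_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# nodes_not_ok: reads dcli-collected CheckHWnFWProfile output on stdin
# ("host: [STATUS] message") and prints every host whose status is not SUCCESS.
# Wrapped continuation lines without a "[STATUS]" marker are ignored.
nodes_not_ok() {
    awk '$2 ~ /^\[[A-Z]+\]$/ && $2 != "[SUCCESS]" { sub(/:$/, "", $1); print $1 }'
}

# Example with the result lines shown above
printf '%s\n' \
    'bdm2db01: [SUCCESS] The hardware and firmware profile matches one' \
    'of the supported profiles' \
    'bdm2db02: [WARNING] The hardware and firmware are not supported.' \
    | nodes_not_ok
```

Against the real file this would be `nodes_not_ok < /tmp/ck.out`.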
Upgrade Controller firmware
Battery capacity is below the threshold value
So policy Change to WB will not come into effect immediately
Fix disk controller configuration to “WriteBack”
Troubles with disk controller
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NoCachedBadBBU -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NORA -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Direct -Lall -a0
[root@bdm2db02 ~]# /opt/oracle.SupportTools/CheckHWnFWProfile -U \
/opt/oracle.cellos/iso/cellbits/
Now updating the disk controller firmware ...
Now disabling cache of the disk controller ...
Upgrade DB Server
Upgrade sequence…
11.2.0.1 to 11.2.0.2 Database Upgrade on Exadata Database Machine
(Doc ID 1315926.1)
There are four main sections to the upgrade:
Prepare the Existing Environment
- environment must be at certain minimum levels before upgrade to 11.2.0.2
Install and Upgrade Grid Infrastructure to 11.2.0.2
- always performed in a RAC rolling manner
Install Database 11.2.0.2 Software
- into a new ORACLE_HOME directory with no impact to the current env
Upgrade Database to 11.2.0.2
- requires database-wide downtime
- Rolling upgrade with Logical Standby or Golden Gate may be used to
reduce database downtime
Download required files
Files staged on first database server only
Patch 10098816 - Oracle Database 11g, Release 2 (11.2.0.2) Patch Set 1 for
Linux x86-64
- p10098816_112020_Linux-x86-64_1of7.zip - Oracle Database
- p10098816_112020_Linux-x86-64_2of7.zip - Oracle Database
- p10098816_112020_Linux-x86-64_3of7.zip - Oracle Grid Infrastructure
- p10098816_112020_Linux-x86-64_7of7.zip - Deinstall tool
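Before the upgrade starts it is worth confirming that all four archives are actually staged and not empty. A minimal sketch, assuming a staging directory of `/u01/stage` (an illustrative path, not from the note); `missing_files` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_files DIR FILE...: prints each FILE that is absent or empty in DIR
# and returns non-zero if any are missing.
missing_files() {
    dir=$1; shift
    rc=0
    for f in "$@"; do
        [ -s "$dir/$f" ] || { echo "$f"; rc=1; }
    done
    return $rc
}

# Example: the staging directory is an illustrative path
stage=/u01/stage
missing_files "$stage" \
    p10098816_112020_Linux-x86-64_1of7.zip \
    p10098816_112020_Linux-x86-64_2of7.zip \
    p10098816_112020_Linux-x86-64_3of7.zip \
    p10098816_112020_Linux-x86-64_7of7.zip \
    || echo "stage the files listed above before starting"
```

This only checks presence and size; comparing checksums against the download page would be the stricter variant.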
Download required files (2)
Files staged on all database servers.
Patch 6880880 - OPatch latest update
Patch 9329767 - Fix for 11.2.0.1 CRS rolling upgrade bug 9329767
Latest Database 11.2.0.2 bundle patch for Exadata
Data Guard only - Bundle patch overlay fix for unpublished bug 11664046.
See Document 1288640.1 for details.
- The overlay patch required must match the 11.2.0.2 bundle patch installed.
At the time of publication there are two overlay patches available.
If an overlay patch for the 11.2.0.2 bundle patch you will install is not listed
above, either contact Oracle Support to check on availability of an overlay
patch for your bundle patch, or utilize the workaround described in Document
1288640.1. This is described in more detail later in this document.
Patch 12312927 - Overlay fix for unpublished bug 11664046 on top of 11.2.0.2
BP5 is used within this document
OS Upgrade DB Server
For instructions for updating key software components on database
servers from OL 5.3 to OL 5.5, refer to Document 1284070.1
Steps:
Step 1: Obtain RPM bundle, stage needed RPMs, apply workaround(s)
Step 2: Install updated kernel
Step 3: Shutdown Services
Step 4: Update additional RPMs
Step 5: Update additional packages
Step 6: Clean up, restart processes, healthcheck
AGENDA
1. Patching Overview
2. Patch Application Methods
3. Patch Administrator/Manager?
4. Patching Exadata V2 to X2-2
5. Conclusion
Conclusion
Even Exadata as an engineered solution needs software maintenance
Patching should be done regularly to proactively avoid trouble
Integrate it into the Lifecycle-Management with a well-defined patch process
Patching involves nearly countless steps; define and use checklists
Regularly check your system's state with the provided HealthCheck scripts
Evaluate the latest patch features, as there is a fast improvement cycle
Having redundancy in the form of an MAA system reduces patch risks and time
Conclusion (2)
Do not forget Mr. Murphy:
Software is buggy, but hardware breaks
This Patch/Upgrade Scenario was definitely one of the most challenging
ones, your case can only be easier ;-)
Therefore no matter who does the job:
The DBA gets it all!
(If he wants it)
THANK YOU.
VISIT US AT THE
TRIVADIS-STAND:
Floor 3, No. 304
Trivadis AG
Konrad Häfeli
Papiermühlestrasse 73
CH-3014 Bern
Tel. +41-31-928 09 60
Fax +41-31-928 09 64
www.trivadis.com