![Page 1: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/1.jpg)
BELT & SUSPENDERS
HA & DR in one solution
Ray English
Sr. Systems Administrator
Indianapolis Power & Light Company
![Page 2: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/2.jpg)
About Ray English
- Sr. Systems Administrator, Indianapolis Power & Light Company
  - Focusing on UNIX (primarily Solaris) systems
- UNIX geek since 1994
- Sun Certified System Administrator
- VCS administrator since 2000
- Experience in Indianapolis
  - IPL
  - Lilly
  - General Motors Allison Transmission (EDS)
![Page 3: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/3.jpg)
OMS Overview
- OMS = Outage Management System
- Records outage calls from IPL customers
  - IVR (317-261-8111 & 317-261-8222)
  - Phone center agents
  - “Last gasp” from meters
    - “I’m meter 12345 and I just lost power.”
- ~500,000 customers in Indianapolis area
- Predictive analysis of root cause based on call grouping (transformer, pole, etc.)
- Industry-specific software from CGI/M3i
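The "predictive analysis based on call grouping" idea can be illustrated with a small Python sketch. This is not the CGI/M3i algorithm, which the slides do not describe; the meter and device names, the `UPSTREAM` map, and the 50% threshold are all hypothetical. It only shows the general principle: the more calls arrive, the further upstream the likely fault moves.

```python
from collections import Counter

# Hypothetical network model: each meter lists the devices that feed it.
# A real OMS would derive this from the utility's connectivity data.
UPSTREAM = {
    "m1": ["xfmr-A", "pole-7"],
    "m2": ["xfmr-A", "pole-7"],
    "m3": ["xfmr-B", "pole-7"],
    "m4": ["xfmr-C", "pole-9"],
}

def predict_root_cause(calls, upstream=UPSTREAM, threshold=0.5):
    """Guess the failed device from the set of meters that reported an outage.

    A device is a candidate when at least `threshold` of the meters it
    feeds have called in. Among candidates, prefer the one that explains
    the most calls; on a tie, prefer the device feeding fewer meters.
    """
    served = Counter()    # meters fed by each device
    reported = Counter()  # of those, how many have called in
    for meter, devices in upstream.items():
        for device in devices:
            served[device] += 1
            if meter in calls:
                reported[device] += 1
    candidates = [
        d for d in served
        if reported[d] > 0 and reported[d] / served[d] >= threshold
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda d: (reported[d], -served[d]))

# Two calls behind the same transformer point at the transformer;
# a third call from a neighbouring transformer points at the shared pole.
print(predict_root_cause({"m1", "m2"}))        # xfmr-A
print(predict_root_cause({"m1", "m2", "m3"}))  # pole-7
```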
![Page 4: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/4.jpg)
OMS Map View
![Page 5: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/5.jpg)
Zoom-in on outage
![Page 6: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/6.jpg)
Outage summaries
![Page 7: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/7.jpg)
OMS business challenges
- Critical system
  - Customers expect 100% reliability
  - Utilized to dispatch trucks to restore outages
- Data is constantly churning
  - Information from minutes ago could be useless
  - The more data, the better idea we have of what’s wrong
- Utilized most during high stress
  - Evenings (end-of-day for day shift)
  - Storms
  - Customers arriving home from work
  - Poor weather (storms, ice storms, snow)
  - High customer expectations
- Keep it simple
![Page 8: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/8.jpg)
OMS Technical Architecture
- Oracle databases
  - Sun Solaris SPARC systems
- Application Tier
  - Windows systems
- Client Tier
  - Windows workstations
    - Dispatchers
    - Trucks
  - IVR systems
  - Web front-end call center agents (iCall)
![Page 9: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/9.jpg)
High Availability (the belt)
- VERITAS Cluster Server
- Failover within datacenter
  - Human error
  - Power feeds
  - Networking
  - SAN
  - Isolated environmental
  - Application failure
  - Server failure
- Rolling upgrades
![Page 10: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/10.jpg)
Overview of HA setup
![Page 11: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/11.jpg)
VCS service group configuration
![Page 12: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/12.jpg)
VCS service group configuration
![Page 13: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/13.jpg)
DR Challenges
- Loss of a site
- Need up-to-the-minute data
  - Information from minutes ago could be useless
  - No data is better than incorrect data
    - “Know that you don’t know anything.”
- Seamless to users
  - Dispatchers
  - Crews
  - Call center representatives
  - Customers
![Page 14: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/14.jpg)
Disaster Recovery (the suspenders)
- Moderately close proximity
  - ~10 miles +/-
- Robust fiber
  - Public IP subnet & VCS heartbeats span data centers
  - Redundant loop around city
  - Lots of bandwidth
- EMC SRDF
  - Symmetrix Remote Data Facility
  - Other technologies available (VVR, etc.)
- VERITAS Cluster Server (Global Cluster Option)
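Because the public subnet and heartbeats span both data centers, HA and DR can be treated as one ordered failover list: local nodes first, DR nodes last. The Python sketch below models that selection. It is only an illustration of priority-ordered failover, not VCS code; the node names, sites, and priority numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    site: str
    priority: int  # lower number = preferred failover target

# Hypothetical four-node layout: two production nodes, two DR nodes.
NODES = [
    Node("prod1", "production", 0),
    Node("prod2", "production", 1),
    Node("dr1", "dr", 2),
    Node("dr2", "dr", 3),
]

def pick_target(nodes, up):
    """Return the most-preferred node that is currently up, or None."""
    alive = [n for n in nodes if n.name in up]
    return min(alive, key=lambda n: n.priority, default=None)

# Production node crashes: the service stays in the same site (HA).
print(pick_target(NODES, {"prod2", "dr1", "dr2"}).name)  # prod2
# Whole production site is lost: the service moves to the DR site.
print(pick_target(NODES, {"dr1", "dr2"}).name)           # dr1
```

The design point is that the same mechanism covers both cases; only the ordering of the list decides whether a failure is handled as HA or as DR.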
![Page 15: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/15.jpg)
Cluster Terminology
- Stretch cluster
- Stretched cluster
- Campus cluster
- Extended cluster
- Data replication cluster
- Metro cluster
- Metro stretched cluster
![Page 16: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/16.jpg)
Overview of DR setup
![Page 17: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/17.jpg)
VCS service group with SRDF
![Page 18: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/18.jpg)
Overview of DR setup
![Page 19: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/19.jpg)
Production node crashes (Time for HA!)
![Page 20: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/20.jpg)
Loss of production site (Time for DR!)
![Page 21: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/21.jpg)
Running at the DR site
![Page 22: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/22.jpg)
Failback to the production site
![Page 23: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/23.jpg)
The data is there. Now what?
![Page 24: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/24.jpg)
Gotchas
- Mounts should be the same on both sides
- SRDF needs to be “synchronous”
  - Diskgroup, volumes, and filesystem need to be consistent
  - “Adaptive copy” doesn’t cut it: individual devices in the disk group fall behind
- Networking between sites needs to be robust
  - Redundant: Prevent split-brain
  - Fast: VCS heartbeats, data replication
  - Big: Data replication, public network traffic
- Freeze/disable failover to the DR servers
  - Risk vs. Reward
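Why devices that "fall behind" break the disk group can be shown with a toy model in Python. This is a simplification for illustration, not a description of SRDF internals: writes are numbered in the order the host issued them, and the remote copy is usable only if the writes it holds form an unbroken prefix of that order. The device names and sequence numbers are hypothetical.

```python
def remote_is_consistent(writes, remote_applied):
    """Check whether the remote copy preserves host write order.

    writes:         list of (device, seq) in the order the host issued them
    remote_applied: set of seq numbers that have reached the remote site

    The remote image is consistent only if no later write has arrived
    while an earlier one is still missing.
    """
    gap_seen = False
    for _device, seq in writes:
        if seq in remote_applied:
            if gap_seen:
                return False  # a later write landed before an earlier one
        else:
            gap_seen = True
    return True

# A database alternating writes between a data device and a log device.
writes = [("data", 1), ("log", 2), ("data", 3), ("log", 4)]

# Synchronous-style replication: the remote is at worst a clean prefix.
print(remote_is_consistent(writes, {1, 2}))  # True
# One device lagging: log writes arrived, data writes did not.
print(remote_is_consistent(writes, {2, 4}))  # False
```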
![Page 25: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/25.jpg)
Why have idle DR hardware?
- Run Test, Development, Sandbox, Training, etc. environments on DR equipment when it’s not needed.
  - Load on these environments will probably be minimal if you’re in “DR Mode”
- Also add these environments to VCS
  - Easily offline if horsepower is needed for DR
  - Service group dependencies
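The "offline if horsepower is needed" idea amounts to shedding the least important environments until production fits. A minimal Python sketch follows, assuming each environment has a known CPU footprint and a fixed sacrifice order; the group names and CPU counts are hypothetical. In VCS itself this would be expressed through the cluster's own configuration rather than a script like this.

```python
def groups_to_offline(running, needed_cpus, free_cpus):
    """Pick which non-production groups to offline so DR has room.

    running:     list of (group_name, cpus), least important first
    needed_cpus: CPUs the production workload needs on the DR host
    free_cpus:   CPUs currently idle on the DR host
    """
    to_stop = []
    for name, cpus in running:
        if free_cpus >= needed_cpus:
            break
        to_stop.append(name)
        free_cpus += cpus
    if free_cpus < needed_cpus:
        raise RuntimeError("not enough capacity even with everything offline")
    return to_stop

running = [("sandbox", 2), ("training", 2), ("dev", 4), ("test", 4)]
print(groups_to_offline(running, needed_cpus=8, free_cpus=2))
# ['sandbox', 'training', 'dev']
```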
![Page 26: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/26.jpg)
Global cluster service groups?
- Adds complexity that may not be needed
  - Networking (DNS, etc.)
  - Management in VCS (Cluster of clusters)
  - GCO Proxy
- Instead, use parts of Global Cluster
  - Replication agents
![Page 27: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/27.jpg)
Oracle RAC (parallel service groups)
- Oracle RAC between metro sites using data replication requires use of Global Cluster service groups because you’re failing between clusters, not machines.
  - All-or-nothing at each site (because only 1 site can have valid data access at a time) is enforced by GCO
- Machine-based failover for Oracle RAC within each site is primarily handled by Oracle RAC itself.
![Page 28: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/28.jpg)
Remember…
- Don’t underestimate the power of network and storage magic.
- Call 261-8222 to report IPL power outages
- VCS makes “belt & suspenders” easy for metro failover clusters with robust infrastructure.
- A “fall back to an hour ago / yesterday / last week” situation requires other planning besides this (backups).
- Your mileage may vary.
![Page 29: BELT & SUSPENDERS HA & DR in one solution Ray English Sr. Systems Administrator Indianapolis Power & Light Company](https://reader035.vdocuments.mx/reader035/viewer/2022062413/5a4d1b497f8b9ab0599a49e3/html5/thumbnails/29.jpg)
Obligatory slide of logos