Prepared by
Craig Taylor ([email protected])
Christopher Russo ([email protected])

Presentation to 2002 OSIsoft Users Conference

Large PI System Redundancy, Performance and Security Strategies

Agenda

California ISO System Overview

PI-UDS Hardware Cutover

PI-UDS Network Monitoring Tool

Large PI-UDS Primary Design Goals

Tips/Potential Problems for Large Systems

Discussion

California ISO Factoids

Territories covered:
  Pacific Gas and Electric
  Southern California Edison
  Comisión Federal de Electricidad

Covers 124,000 square miles

21,000 circuit miles of transmission

More than 600 generators

45,000-megawatt summer peak load

$23 billion of energy consumed annually

PI System Cutover

Energy Management System cutover: December 2001

From ABB Spider to ABB Ranger

Ranger system improved reliability

Included Universal Data Server hardware upgrades

150,000+ points
  Currently the world’s largest single system

PI System Hardware Changes

Component        Spider System                RANGER System
System           Compaq 6400R                 Compaq DL 580
CPUs             4 x 500 MHz - 512 Kbyte      4 x 700 MHz - 1 Mbyte
Memory           1 Gigabyte                   1 Gigabyte
Controller       Smart Array 221 Controller   Built into motherboard
Outside Network  100 Megabit Full Duplex      100 Megabit Full Duplex
Backup Network   FDDI                         FDDI
Disk Storage     EMC Symmetrix 3000           Compaq HSG80 Controller
                 300 Gigabytes                Brocade Silkworm 2800
                                              300 Gigabytes

PI-UDS Issues Description and Containment

ISO experienced client disconnects:

Potential causes of service denial
  Large DataLink data queries tied up the PI Archiving Subsystem
  Network suspected in some cases
  Disk subsystem suspected

The question was: how do we identify and fix the cause?
  We decided to gather more network usage information and identify PI “hogs”

We wrote a program to organize PICONFIG data into a web page

Monitoring UDS Use with PICONFIG

A Visual Basic program ran the PICONFIG commands every 5 minutes

PICONFIG commands:

@login UDS_SERVER,piadmin,password,5450
@mode list
@table pinetmgrstats
@ostru ID,ConStatus,ConTime,ConType,MsgRecv,MsgSent,Name,PeerAddress,PeerName,PID,RecvErrors,SendErrors,BytesRecv,BytesSent
@select ID=*
@ends
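The polling step above could be sketched in Python instead of Visual Basic. This is a minimal sketch, not the authors' program: `build_script` and `run_piconfig` are hypothetical helper names, the assumption is that a `piconfig` executable is on the search path and reads its commands from standard input, and the login values are the placeholders shown on the slide.

```python
import subprocess

# Output fields, in the order given on the @ostru line of the slide.
FIELDS = [
    "ID", "ConStatus", "ConTime", "ConType", "MsgRecv", "MsgSent", "Name",
    "PeerAddress", "PeerName", "PID", "RecvErrors", "SendErrors",
    "BytesRecv", "BytesSent",
]

POLL_SECONDS = 300  # the slide states the commands ran every 5 minutes


def build_script(server, user, password, port=5450, fields=FIELDS):
    """Return the PICONFIG command text from the slide as one string."""
    lines = [
        f"@login {server},{user},{password},{port}",
        "@mode list",
        "@table pinetmgrstats",
        "@ostru " + ",".join(fields),
        "@select ID=*",
        "@ends",
    ]
    return "\n".join(lines) + "\n"


def run_piconfig(script, exe="piconfig", timeout=60):
    """Feed the script to piconfig on stdin and return its stdout.

    The executable name is an assumption; adjust it for the installation.
    """
    result = subprocess.run(
        [exe], input=script, capture_output=True, text=True,
        timeout=timeout, check=True,
    )
    return result.stdout
```

Calling `run_piconfig(build_script("UDS_SERVER", "piadmin", "password"))` from a scheduler every `POLL_SECONDS` would approximate the 5-minute polling described on the slide.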

Monitoring UDS Use with PICONFIG

PICONFIG Output:

1948,[0] Success,20-Feb-02 07:46:24,PI-API connection,224.,224.,pideE,IP.IP.IP.IP,PeerName,-1,0.,0.,8620.,42437

56,[0] Success,13-Feb-02 10:56:06,PI-API connection,107.,107.,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,4752.,19628

3000,[0] Success,4-Mar-02 05:55:29,PI-API connection,30.,30.,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,1600.,1081.

2727,[0] Success,4-Mar-02 02:07:11,PI-API connection,1282.,1282.,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,1.0027E+005,2.614E+005

2932,[0] Success,26-Feb-02 10:40:58,PI-API connection,1.0431E+005,1.0432E+005,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,4.2131E+005,2.5783E+006

3202,[0] Success,4-Mar-02 08:29:19,PI-API connection,15054,42290,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,7.0155E+005,1.2982E+008

8625,[0] Success,1-Mar-02 13:54:18,PI-API connection,79446,79446,PIPeE, IP.IP.IP.IP, PeerName,-1,0.,0.,4.1275E+007,1.934E+007

The important information is:
  PeerName (the person’s computer)
  Connection Time
  BytesRecv and BytesSent (amount of data transferred)
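To show how output like this can be turned into a ranking, here is a minimal Python sketch that parses the comma-separated records and sorts connections by total bytes transferred. It assumes the field order of the `@ostru` line; `parse_records` and `top_talkers` are hypothetical names, and the sample rows are copied from the output above (with the IP addresses and peer names still redacted).

```python
import csv
import io

# Field order assumed to match the @ostru line of the PICONFIG commands.
FIELDS = [
    "ID", "ConStatus", "ConTime", "ConType", "MsgRecv", "MsgSent", "Name",
    "PeerAddress", "PeerName", "PID", "RecvErrors", "SendErrors",
    "BytesRecv", "BytesSent",
]
NUMERIC = ("MsgRecv", "MsgSent", "RecvErrors", "SendErrors",
           "BytesRecv", "BytesSent")


def parse_records(text):
    """Parse PICONFIG list output into a list of dicts, one per connection."""
    rows = []
    for raw in csv.reader(io.StringIO(text)):
        values = [v.strip() for v in raw]
        if len(values) != len(FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(FIELDS, values))
        for name in NUMERIC:
            # Handles forms such as "224.", "42437" and "1.0027E+005".
            row[name] = float(row[name])
        rows.append(row)
    return rows


def top_talkers(rows, n=3):
    """Return the n connections with the most bytes received plus sent."""
    return sorted(rows, key=lambda r: r["BytesRecv"] + r["BytesSent"],
                  reverse=True)[:n]


SAMPLE = """\
1948,[0] Success,20-Feb-02 07:46:24,PI-API connection,224.,224.,pideE,IP.IP.IP.IP,PeerName,-1,0.,0.,8620.,42437
2727,[0] Success,4-Mar-02 02:07:11,PI-API connection,1282.,1282.,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,1.0027E+005,2.614E+005
3202,[0] Success,4-Mar-02 08:29:19,PI-API connection,15054,42290,pideE, IP.IP.IP.IP, PeerName,-1,0.,0.,7.0155E+005,1.2982E+008
8625,[0] Success,1-Mar-02 13:54:18,PI-API connection,79446,79446,PIPeE, IP.IP.IP.IP, PeerName,-1,0.,0.,4.1275E+007,1.934E+007
"""

if __name__ == "__main__":
    for row in top_talkers(parse_records(SAMPLE)):
        total = row["BytesRecv"] + row["BytesSent"]
        print(row["ID"], row["PeerName"], row["ConTime"], int(total))
```

Sorting on the sum of `BytesRecv` and `BytesSent` is one possible choice; ranking on `BytesSent` alone would emphasize large query results instead.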

Monitoring UDS Use with PICONFIG

PICONFIG Output:

pinetmgrstats  Value
ID             1948
ConStatus      [0] Success
ConTime        4-Mar-02 02:07:11
ConType        PI-API connection
MsgRecv        1282
MsgSent        1282
Name           pideE
PeerAddress    IP.IP.IP.IP
PeerName       PeerName
RecvErrors     -1
SendErrors     0.
BytesRecv      1.00E+05
BytesSent      2.61E+05

Monitoring UDS Use with PICONFIG

Web Pages

[Screenshots of the monitoring web pages, listing each connection by IP address and peer name]

NOTE: IP addresses and peer names removed for security

Monitoring Tool Benefits

Enabled quick PI user “hog” identification

Helped to identify and optimize abusive queries
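One way "hog" identification could work with periodic polling is to compare two successive snapshots. The sketch below is an assumption about how such a check might be written, not the authors' tool: it assumes the byte counters are cumulative per connection, and `interval_usage`, `find_hogs` and the threshold value are hypothetical.

```python
def interval_usage(previous, current):
    """Return bytes transferred per connection ID since the previous poll.

    previous and current map connection ID -> cumulative byte count.
    """
    usage = {}
    for conn_id, total in current.items():
        baseline = previous.get(conn_id, 0)
        # A counter lower than at the last poll suggests the ID now belongs
        # to a new connection, so count its traffic from zero.
        if total < baseline:
            baseline = 0
        usage[conn_id] = total - baseline
    return usage


def find_hogs(usage, threshold_bytes):
    """Return connection IDs at or above the threshold, heaviest first."""
    heavy = [cid for cid, used in usage.items() if used >= threshold_bytes]
    return sorted(heavy, key=lambda cid: usage[cid], reverse=True)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the presentation.
    earlier = {"3202": 100_000_000, "2727": 300_000}
    later = {"3202": 130_000_000, "2727": 360_000, "8625": 60_000_000}
    print(find_hogs(interval_usage(earlier, later), threshold_bytes=10_000_000))
```

Working on per-interval deltas rather than cumulative totals keeps a long-lived but quiet connection from being flagged just because it has been connected for weeks.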


Christopher Russo
Russo & Associates

www.russoandassociates.com

Large PI-UDS Primary Design Goals

Tips/Potential Problems for Large Systems

Designing Large PI Systems: Primary Goals

Reliability & Robustness
  Avoiding single-element failure

Redundancy
  Clustering Solutions
    • Hardware (GeoCluster, Marathon, Legato, EMC)
    • Software (MSCS, Unix)

Performance
  Server tweaking
    • Archive parameters, bottlenecks
  Redundant Solutions
    • IP Load Balancing
    • Different Servers
    • PI to PI Distribution

Peak Performance Tips

Decide what you’re realistically going to need

Consider dedicated systems for specific applications

OS/Hardware Specific
  Network performance and latency
  Disk performance: disk striping, fiber-channel, dedicated hardware
  Processor, network subsystem

PI-Specific Parameters
  Archive Subsystem Tuning
    • Archive cache record tuning
  Update subsystem tuning
    • Tuning for real-time performance versus historical retrieval
    • Considerations for totalizer users

“Pre-digesting” of specific calculations

Large System Potential Problems

Performance monitoring only manages some bottlenecks
  Certain requests can still “nuke” the server

Loading and unloading records into memory is time-consuming

Shutdown times increase linearly with archive ratio size

Memory image cannot exceed 2 GB

Microsoft IP Load-Balancing doesn’t help
  PI connections are “stateful” and are not all equal

Achieving “bumpless” transfer is difficult without hardware solutions

Archive Subsystem Bottlenecks

Archive Cache

[Trend plot of six archive cache counters, 9/14/2001 9:05:00 to 9/14/2001 9:20:00 (15.00 min). Trace labels, some truncated in the original: "Rate at which archive cach...", "Rate at which dirty archiv...", "Archive cache records in m...", "Archive cache disk reads.", "Archive cache disk writes.", "Archive cache memory hits."]

Our PI 4 Wish List

A true multi-threaded archive subsystem

A connection and request logging facility
  Not just who and how much, but what

A way to restrict expensive API queries or users

A point-database change-log feature
  The issue of “meta-data”
  Some current workarounds with scripts
  Implemented in PI 3.3
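The slides do not show the workaround scripts. As an illustration only, a change log could be produced by comparing two exported snapshots of the point database. In this sketch `diff_points` is a hypothetical helper, each snapshot is assumed to be a mapping of tag name to attribute values, and the tag names and attribute values in the example are invented.

```python
def diff_points(old, new):
    """Compare two point-database snapshots and return a list of changes.

    old and new map tag name -> {attribute: value}. Each change is a tuple
    (action, tag, attribute, old_value, new_value); attribute and values
    are None for whole-point additions and removals.
    """
    changes = []
    for tag in sorted(set(old) | set(new)):
        if tag not in old:
            changes.append(("added", tag, None, None, None))
        elif tag not in new:
            changes.append(("removed", tag, None, None, None))
        else:
            before, after = old[tag], new[tag]
            for attr in sorted(set(before) | set(after)):
                if before.get(attr) != after.get(attr):
                    changes.append(
                        ("changed", tag, attr, before.get(attr), after.get(attr))
                    )
    return changes


if __name__ == "__main__":
    # Invented example data for illustration.
    yesterday = {"UNIT1.MW": {"span": "100"}, "UNIT2.MW": {"span": "50"}}
    today = {"UNIT1.MW": {"span": "200"}, "UNIT3.MW": {"span": "10"}}
    for change in diff_points(yesterday, today):
        print(change)
```

Running such a comparison on a schedule and appending the results to a file would give a simple audit trail of who changed what only if the export also records the change author, which this sketch does not attempt.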

Questions & Discussion