AXE 30 Minutes


Upload: kriti1989

Post on 10-Apr-2018


Page 1: AXE 30 Minutes

8/8/2019 AXE 30 Minutes

http://slidepdf.com/reader/full/axe-30-minutes 1/22

2004-10-18 1

Soft Real Time and High Availability, the AXE approach

AXE applications

Control system structure

Hard real time vs. soft real time

Event driven execution, soft real time and parallel processes

Fault tolerance and recovery

Upgrade

Scalability

Operation under overload (separate presentation)

Page 2: AXE 30 Minutes


AXE Applications in Telecom Networks

[Figure: network overview of AXE applications. Recoverable labels: MSC, GMSC, HLR, SCP, IN, MS, BTS, BSC, GSM, TDMA, AXE, TSP, CPP, WPP, AXD, TMOS/CIF, AN, ADSL, CCN, FNR, MSG, ATM backbone, RNC, 3G, UMTS, GPRS, SGSN, GGSN, Internet, MGW, OSS, CSCF, HSS, AS, IPMM/SIP, PCU.]

Page 3: AXE 30 Minutes


AXE Control System Structure

[Figure: block diagram. Recoverable labels: Central Processor (two blocks), Adjunct Processor (I/O) (two blocks), Regional Processor, Application Hardware, DP, HDLC/Ethernet, < 1024.]

Page 4: AXE 30 Minutes


AXE Hard Real Time vs. Soft Real Time

[Figure: the control system block diagram divided into a Hard Real Time part and a Soft Real Time part. Recoverable labels: Central Processor (two blocks), Adjunct Processor (I/O) (two blocks), Regional Processor, Application Hardware, DP, ~1 ms.]

Page 5: AXE 30 Minutes


Soft Real Time: Event (Signal) Driven Execution

[Figure: event flow. Recoverable labels: External Events, Internal Events, Event Buffer, SW Unit (typically 2 us/event), Response at ms level, Subject to load control.]

Page 6: AXE 30 Minutes


Soft Real Time: Event (Signal) Driven Execution

Each event executes until the next event is generated; that is, execution is not interrupted by a time-sharing system

The execution time is limited by design rules (and checks)

The number of internal events is known at system design

The occupancy level of the event buffer is subject to load control

All share the same level
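The event-driven execution model on this slide can be illustrated with a small Python sketch. It is not AXE code: the class `EventBuffer`, the two limits, the handler names and the rule that only external events are shed under load are all assumptions made for illustration. The sketch shows two things from the slide: each event runs to completion without preemption, and admission depends on the occupancy of the event buffer.

```python
from collections import deque


class EventBuffer:
    """FIFO event buffer with a simple occupancy-based load control (hypothetical)."""

    def __init__(self, capacity, external_limit):
        self.capacity = capacity              # total number of buffered events allowed
        self.external_limit = external_limit  # occupancy at which external events are shed
        self.queue = deque()

    def offer_external(self, event):
        # Load control (assumption): reject new external work while occupancy is high.
        if len(self.queue) >= self.external_limit:
            return False
        self.queue.append(event)
        return True

    def offer_internal(self, event):
        # Internal events belong to work already accepted, so they may use
        # the whole buffer.
        if len(self.queue) >= self.capacity:
            raise OverflowError("event buffer full")
        self.queue.append(event)


def run(buffer, handlers):
    """Run each buffered event to completion; no handler is preempted."""
    processed = []
    while buffer.queue:
        name, payload = buffer.queue.popleft()
        # A handler returns the internal events it generates.
        for follow_up in handlers[name](payload):
            buffer.offer_internal(follow_up)
        processed.append(name)
    return processed


def on_seizure(call_id):
    return [("connect", call_id)]


def on_connect(call_id):
    return []


handlers = {"seizure": on_seizure, "connect": on_connect}
buf = EventBuffer(capacity=8, external_limit=2)
accepted = [buf.offer_external(("seizure", n)) for n in range(3)]
order = run(buf, handlers)
```

Here the third external event is rejected because the buffer already holds two events, and the accepted events are processed strictly in arrival order.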

Page 7: AXE 30 Minutes


Page 8: AXE 30 Minutes


AXE HW Redundancy

[Figure: the control system block diagram with redundant units. Recoverable labels: Central Processor (two blocks), Adjunct Processor (I/O) (two blocks), Regional Processor, Application Hardware, DP, HDLC/Ethernet.]

Page 9: AXE 30 Minutes


AXE HW Redundancy

RP: Duplicated with simple fail-over, or pooled. Data loss (only temporary data)

AP (I/O): Duplicated, secure data on RAID disks

CP (classic systems): Duplicated, synchronous mode with transparent fail-over

CP (modern systems): Duplicated, non-synchronous, warm stand-by with possibility for Soft Side Switch for maintenance purposes (repair and upgrade)

Page 10: AXE 30 Minutes


AXE HW Redundancy, Soft Side Switch

[Figure: A-side memory and B-side memory. Recoverable labels: Write, Frequent write.]

Transfer all pages;

LOOP:

    Transfer all modified pages

UNTIL Hot area stable;

HALT execution;

Transfer hot area;

RESUME on B-side;
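The transfer loop on this slide can be simulated in a few lines of Python. This is a sketch under stated assumptions, not the APZ implementation: memory is a dict of pages, `workload` is a hypothetical list giving the writes made on the A-side while each transfer round runs, and the hot area is treated as "stable" when two consecutive rounds modify the same set of pages.

```python
def soft_side_switch(a_side, workload):
    """Copy A-side memory to a new B-side while the A-side keeps executing.

    a_side:   dict mapping page number -> contents (updated in place).
    workload: list of dicts; entry i holds the writes (page -> contents)
              made on the A-side during transfer round i (hypothetical model).
    Returns (b_side, hot_area, copy_rounds).
    """
    rounds = iter(workload)

    def run_while_copying():
        # The A-side continues to execute during a transfer round.
        writes = next(rounds, {})
        a_side.update(writes)
        return set(writes)

    b_side = dict(a_side)                 # Transfer all pages
    modified = run_while_copying()
    previous = None
    copy_rounds = 0
    while modified != previous:           # UNTIL hot area stable (assumed criterion)
        for page in modified:             # Transfer all modified pages
            b_side[page] = a_side[page]
        copy_rounds += 1
        previous = modified
        modified = run_while_copying()

    hot_area = modified                   # HALT execution: no writes from here on
    for page in hot_area:                 # Transfer hot area
        b_side[page] = a_side[page]
    return b_side, hot_area, copy_rounds  # RESUME on B-side


a_memory = {0: "a", 1: "b", 2: "c", 3: "d"}
workload = [
    {1: "b1", 2: "c1", 3: "d1"},
    {2: "c2", 3: "d2"},
    {3: "d3"},
    {3: "d4"},
]
b_memory, hot_area, copy_rounds = soft_side_switch(a_memory, workload)
```

In this run the set of modified pages shrinks from three pages to one frequently written page, so execution only has to be halted while that single hot page is transferred.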

Page 11: AXE 30 Minutes


AXE SW Recovery

SW recovery actions are:

- Selective, depending on severity, possibility to recover and system state (history)

- Coordinated/consistent all over the system

Page 12: AXE 30 Minutes


AXE SW Recovery. Levels and Escalation

No action / register irregularity

Perform low level recovery = single transaction (a call) fails

Suppressed/delayed system restart, raise alarm

Small system restart = transactions in dynamic states are lost (calls not yet established are lost, established calls are checked)

Large system restart = all transactions are lost (all calls)

Large system restart with reload from back-up copy

Large system restart with reload from "old" back-up copy

Escalation to next restart level if a problem recurs
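The escalation rule can be sketched as a small state machine in Python. The level names follow the list on this slide, but the class `RecoveryEscalator`, the time window used to decide that a problem "recurs", and the choice to escalate only from the small restart upward are assumptions for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    NO_ACTION = 0
    LOW_LEVEL_RECOVERY = 1
    DELAYED_RESTART = 2
    SMALL_RESTART = 3
    LARGE_RESTART = 4
    LARGE_RESTART_RELOAD = 5
    LARGE_RESTART_RELOAD_OLD = 6


class RecoveryEscalator:
    """Pick a recovery level, escalating when a problem recurs (hypothetical)."""

    def __init__(self, window):
        self.window = window      # seconds within which a repeat counts as a recurrence
        self.last_level = None
        self.last_time = None

    def handle(self, requested, now):
        level = requested
        recurred = self.last_time is not None and now - self.last_time <= self.window
        # Escalate only between restart levels, and never past the top level.
        if recurred and requested >= Level.SMALL_RESTART and self.last_level >= requested:
            level = Level(min(self.last_level + 1, max(Level)))
        self.last_level, self.last_time = level, now
        return level


escalator = RecoveryEscalator(window=600)
first = escalator.handle(Level.SMALL_RESTART, now=0)
second = escalator.handle(Level.SMALL_RESTART, now=100)
third = escalator.handle(Level.SMALL_RESTART, now=200)
later = escalator.handle(Level.SMALL_RESTART, now=5000)
```

A repeated fault within the window climbs one restart level at a time, while a fault that appears long after the previous one starts again at the requested level.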

Page 13: AXE 30 Minutes


AXE SW Recovery. Low Level Recovery

An identity (ID) is tied to each resource included in a "transaction", typically a call or a command.

The processing platform provides support for creation of the ID and linking to application SW.

In case of an execution error, the platform identifies all SW units concerned and orders release over a standard interface.
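A minimal Python sketch of this mechanism follows. The names `Platform`, `SwUnit`, `link` and `release` are hypothetical and stand in for the platform support and the standard release interface described above; the real interfaces are not given in the slides.

```python
from collections import defaultdict
from itertools import count


class SwUnit:
    """Application SW unit that holds resources per transaction ID."""

    def __init__(self, name):
        self.name = name
        self.resources = {}          # transaction ID -> resource

    def seize(self, tid, resource):
        self.resources[tid] = resource

    def release(self, tid):
        # Standard release interface called by the platform (assumed shape).
        return self.resources.pop(tid, None)


class Platform:
    """Creates transaction IDs and remembers which SW units are linked to each."""

    def __init__(self):
        self._ids = count(1)
        self._links = defaultdict(list)   # transaction ID -> linked SW units

    def new_transaction(self):
        return next(self._ids)

    def link(self, tid, unit, resource):
        unit.seize(tid, resource)
        if unit not in self._links[tid]:
            self._links[tid].append(unit)

    def on_execution_error(self, tid):
        # Identify all SW units concerned and order release in each of them.
        units = self._links.pop(tid, [])
        return [unit.name for unit in units if unit.release(tid) is not None]


platform = Platform()
trunk = SwUnit("trunk")
charging = SwUnit("charging")
call_a = platform.new_transaction()
call_b = platform.new_transaction()
platform.link(call_a, trunk, "circuit 7")
platform.link(call_a, charging, "record 1")
platform.link(call_b, trunk, "circuit 9")
released = platform.on_execution_error(call_a)
```

Only the failing transaction is torn down: the resources tied to `call_a` are released in both units, while `call_b` keeps its circuit.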

Page 14: AXE 30 Minutes


AXE SW Recovery. Low Level Recovery

[Figure: SW units linked to one transaction. Recoverable labels: Ix (in each linked unit), Links, Transaction, Execution Error!, ID=Ix, Release, Low level recovery handler.]

Page 15: AXE 30 Minutes


AXE SW Recovery. The Reality

[Figure: outcome of SW errors. Recoverable labels: SW Error, No Action Filter, Low Level Recovery, System Restart; shares shown: 99,8%, 0,1%, < 0,1%.]

Page 16: AXE 30 Minutes


AXE SW Upgrade

Two different methods are used for SW upgrade:

- Corrections/patches and

- New SW packages

Corrections/patches are local changes of code inserted at assembler level when the CP is idle => no disturbance

New SW packages are introduced when major changes including new data structures are required.

The new versions of the SW units inherit data from the old versions and are switched in with a system restart => at least yearly disturbance of new calls (~1 min. down time)

Page 17: AXE 30 Minutes


AXE SW Upgrade, Data Inheritance

[Figure: data passed from Old unit to New unit under control of Data Change Information, shown for different upgrade cases.]
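Data inheritance between an old and a new SW unit can be illustrated with a short Python sketch. The slides do not describe the format of the data change information, so the `rename`, `add` and `drop` keys and the record fields below are purely hypothetical.

```python
def inherit(old_records, change_info):
    """Build the new unit's records from the old unit's records.

    change_info (hypothetical format):
      "rename": {old_field: new_field}
      "add":    {new_field: default_value}
      "drop":   [old_field, ...]
    """
    renamed = change_info.get("rename", {})
    added = change_info.get("add", {})
    dropped = set(change_info.get("drop", ()))

    new_records = []
    for old in old_records:
        # Carry over every field that is not dropped, under its new name.
        new = {renamed.get(field, field): value
               for field, value in old.items()
               if field not in dropped}
        # Fields introduced by the new version get their default value.
        for field, default in added.items():
            new.setdefault(field, default)
        new_records.append(new)
    return new_records


old_unit_data = [{"subscriber": 1001, "cat": 2, "legacy_flag": 0}]
change_info = {
    "rename": {"cat": "category"},
    "add": {"roaming": False},
    "drop": ["legacy_flag"],
}
new_unit_data = inherit(old_unit_data, change_info)
```

The old records are left untouched, so the old unit stays usable until the new version is switched in.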

Page 18: AXE 30 Minutes


AXE Scalability

The traditional AXE is scalable only in the RP region.

For the CP only a low-end/high-end option exists.

In modern applications the need for HW-related RP scalability is decreasing but the need for CP scalability is increasing. To achieve better scalability AXE uses two approaches:

1) Parallel multi-threaded execution with common memory

2) Clusters of CPs with network interfaces

Page 19: AXE 30 Minutes


AXE Scalability, Multi-threaded Execution

Must be 101% compatible with application SW (includes fault compatibility!)

The problem is not to make it work

The real problem is to make an efficient implementation with limited overhead, including the cost for cache coherency => minimize true concurrent execution => combine concurrency with functional distribution! = (CMX-FD)

Page 20: AXE 30 Minutes


AXE Scalability, Concurrent Multi Execution

[Figure: CMX-FD, simplified view. Four Processor Cores, each with Memory and each marked CMX and FD. Recoverable labels: Cluster of APZ-OS and Platform SW, Cluster of Application Modules (three times).]

FD-mode: Functional Distribution, that is each function is allocated to execute on one Processor Core only.

CMX-mode: Concurrent (Multi) eXecution, that is each SW unit is allowed to execute on all Processor Cores, but only one at a time. Certain sequencing rules must be obeyed in order to make each "call" execute like it would in a single CP system.
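The difference between the two modes can be expressed as a placement rule. The Python sketch below is an illustration only: the function `pick_core`, its arguments and the unit names are hypothetical, and the sequencing rules mentioned above are not modelled.

```python
def pick_core(unit, mode, home_core, idle_cores, running_units):
    """Return the core on which `unit` may start now, or None if it must wait.

    mode:          dict unit -> "FD" or "CMX"
    home_core:     dict FD unit -> the single core it is allocated to
    idle_cores:    set of cores that are free right now
    running_units: set of units currently executing on some core
    """
    if mode[unit] == "FD":
        # FD-mode: the function executes on one Processor Core only.
        core = home_core[unit]
        return core if core in idle_cores else None
    # CMX-mode: any core is allowed, but the unit runs on only one at a time.
    if unit in running_units or not idle_cores:
        return None
    return min(idle_cores)


mode = {"os": "FD", "call_handling": "CMX"}
home_core = {"os": 0}

fd_busy = pick_core("os", mode, home_core, {1, 2}, set())
fd_free = pick_core("os", mode, home_core, {0, 1}, set())
cmx_free = pick_core("call_handling", mode, home_core, {2, 3}, set())
cmx_running = pick_core("call_handling", mode, home_core, {2, 3}, {"call_handling"})
```

An FD unit waits for its own core even when others are idle, whereas a CMX unit takes any idle core unless it is already executing elsewhere.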

Page 21: AXE 30 Minutes


AXE Scalability. Cluster Systems

[Figure: cluster system. Recoverable labels: Call Control, Protocol Term. + Dispatcher, Network protocols, N+1.]

Clusters address

- scalability

- down time at upgrade

- down time at node failure

Page 22: AXE 30 Minutes


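AXE 10 Minutes, CP "Classic" vs. "Modern"

[Figure: two layered stacks side by side. Labels of one stack: HW, MIP, APZ-CP OS, Application SW. Labels of the other stack: HW (Q-processor), OS (Tru64), APZ-VM, ASA-compiler, APZ-CP OS, Application SW. "Same" labels mark the layers common to both stacks.]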