Redundant Datacenter Configuration and Failover Procedure for Real-Time Seismic Networks

Mathias Franke, Stefan Radman, Toufik Alilli
Kinemetrics, Inc., Open Systems & Services Department, www.kmioss.com
Antelope User Group Meeting, San Diego, CA, January 14-16, 2015


Page 1: Title slide

[Diagram: Data Sources; Real Time Data Acquisition System; Real Time Data Processing System (Event processing, ShakeMap processing); Information systems (Event information, ShakeMap information)]

Page 2: Motivation

• Catastrophic events can require moving a seismic data center to a backup location in order to maintain critical operation and information dissemination capabilities.
• Intra-data center redundancy is rapidly progressing to virtual machines moving between two or more servers.
• Such a solution for inter-data center redundancy is currently hampered by
  – Bandwidth limitation of Wide Area Network (WAN) connections between the primary and secondary location
  – Database migration/synchronization
• An alternative solution is the implementation of failover procedures.

Page 3: Outline of Failover Implementation

0. Normal operation
1. Failure detection of network and application
2. Failover procedure
3. Fencing operation
4. Failback procedure
5. Recovery

Page 4: Normal Operation: System Layout

Page 5: Network Failover

Dynamic Re-routing

Page 6: Application Failover: System Entities

• Data Sources
• Real Time Data Acquisition System
• Real Time Data Processing System
  – Event processing
  – ShakeMap processing
• Information systems
  – Event information
  – ShakeMap information

Page 7: Normal Operation: Data Flow - PS

Page 8: Normal Operation: Data Flow - BUS

Page 9: Normal Operation

Backup System (BUS)

System characteristics

• Concurrent data acquisition
• Processing at both data centers
• Two production databases
• Unique IDs via ID server
• Replication of database IDs
• One active/authoritative information system
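
The ID scheme listed above can be illustrated with a minimal sketch. It assumes that a single in-process counter stands in for the ID server and that replication simply copies the highest allocated ID to the other site. The class names `IdServer` and `IdReplica` and their methods are hypothetical; they are not taken from the slides or from Antelope.

```python
from dataclasses import dataclass


@dataclass
class IdServer:
    """Hypothetical ID server: hands out unique, increasing database IDs."""
    next_id: int = 1
    active: bool = True

    def allocate(self) -> int:
        if not self.active:
            raise RuntimeError("ID server is not active")
        allocated = self.next_id
        self.next_id += 1
        return allocated


@dataclass
class IdReplica:
    """Hypothetical passive copy of the ID counter kept at the other site."""
    last_seen_id: int = 0

    def replicate_from(self, server: IdServer) -> None:
        # Record the highest ID handed out so far.
        self.last_seen_id = server.next_id - 1


server = IdServer()
replica = IdReplica()

# Both data centers write to their own production database but draw IDs
# from the single active ID server, so the IDs cannot collide.
ps_event_id = server.allocate()
bus_event_id = server.allocate()
replica.replicate_from(server)
```

With the replicated counter in hand, a standby ID server could resume numbering above `last_seen_id` instead of starting over.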

Page 10: Application Failover: Detection

Backup System (BUS)

Crucial components

• One Database ID Server
• Database ID Replication
• Monitoring of ID Server (Failure Detection)
• Failover time (Failure Detection)
• Authority (Fencing)
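
How monitoring and failover time interact can be sketched as follows. The sketch assumes, hypothetically, that the BUS checks the ID server periodically and declares a failure once no successful check has been seen for a configured failover time. `IdServerMonitor`, its fields, and the 60-second value are illustrative only and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class IdServerMonitor:
    """Hypothetical failure detector for the remote ID server."""
    failover_time_s: float   # how long the server may be silent
    last_ok_s: float = 0.0   # timestamp of the last successful check

    def record_success(self, now_s: float) -> None:
        self.last_ok_s = now_s

    def should_fail_over(self, now_s: float) -> bool:
        # Fail over only after the server has been unreachable for the
        # whole failover time, so short network glitches are ignored.
        return now_s - self.last_ok_s >= self.failover_time_s


monitor = IdServerMonitor(failover_time_s=60.0)
monitor.record_success(now_s=100.0)
glitch = monitor.should_fail_over(now_s=130.0)   # 30 s of silence
outage = monitor.should_fail_over(now_s=160.0)   # 60 s of silence
```

A longer failover time reduces false alarms but delays the takeover; a shorter one does the opposite.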

Page 11: Application Failover

Backup System (BUS)

Failover process (automatic)

• Initiated by BUS
• Synchronize local database IDs
• Activate ID Server
• Synchronize local database with local ID Server (Fencing)
• Restart database writers
• Activate/authorize information systems
• Notify operators (Email, SMS)
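
The automatic sequence above can be sketched as an ordered list of steps that stops at the first failure and notifies the operators either way. This is only an illustration: the `run_failover` function, the no-op step actions, and the `notify` callback are hypothetical stand-ins, not the presenters' implementation.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_failover(steps: List[Step], notify: Callable[[str], None]) -> List[str]:
    """Run the steps in order; notify operators on success or failure."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            notify(f"Failover aborted at '{name}': {exc}")
            raise
        completed.append(name)
    notify("Failover completed: " + ", ".join(completed))
    return completed


FAILOVER_STEPS = [
    "synchronize local database IDs",
    "activate ID server",
    "synchronize local database with local ID server (fencing)",
    "restart database writers",
    "activate/authorize information systems",
]

log: List[str] = []        # records which placeholder actions ran
messages: List[str] = []   # stands in for email/SMS delivery
steps = [(name, lambda name=name: log.append(name)) for name in FAILOVER_STEPS]
completed = run_failover(steps, messages.append)
```

Keeping the order in one list makes the procedure easy to review and to rehearse in a dry run.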

Page 12: Application Failover

Backup System (BUS)

Post failover

• BUS providing database IDs
• BUS has authoritative database
• BUS feeding information to the world
• BUS prevents activation of ID Server at PS (Fencing)
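
The fencing rule in the last bullet can be sketched as a simple authority check. The slides do not say how fencing is implemented, so everything here is an assumption: an `AuthorityRecord` that both sites can read, and an `activate_id_server` function that refuses to start the ID server at a site that does not hold authority.

```python
from dataclasses import dataclass


@dataclass
class AuthorityRecord:
    """Hypothetical shared record naming the site allowed to run the ID server."""
    holder: str


def activate_id_server(site: str, authority: AuthorityRecord) -> str:
    # A site that does not hold authority is fenced off.
    if site != authority.holder:
        raise PermissionError(
            f"{site} is fenced: authority is held by {authority.holder}"
        )
    return f"ID server active at {site}"


authority = AuthorityRecord(holder="PS")   # normal operation
authority.holder = "BUS"                   # set during failover
status = activate_id_server("BUS", authority)
```

After failover, a call such as `activate_id_server("PS", authority)` raises `PermissionError`, which keeps two ID servers from handing out IDs at the same time.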

Page 13: Application Failover

Failback (manual)

PS

• Check db consistency
• Start real-time data acquisition
• Start real-time data processing
• Verify data flow
• Check db consistency (replicated IDs)
• Start ID-Server
• Start information systems

BUS

• Stop db writers
• Stop ID server
• Replicate IDs to PS
• Synchronize local db to remote ID-Server (change db descriptor)
• Stop monitoring/fencing PS
• Start db writers
• Start monitoring PS
• Stop information systems

Page 14: Application Failover: Data Recovery

BUS

• Extract data to temporary database

PS

• Stop database writers
• Merge data from temporary database
• Start database writers
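
The merge step can be illustrated with a small sketch. It uses SQLite from the Python standard library purely as a stand-in for the production database, and the `event` table, its columns, and the `merge_events` function are hypothetical. Because IDs are unique across both sites, rows already present at the PS can be skipped by ID.

```python
import sqlite3


def merge_events(primary: sqlite3.Connection, extracted_rows) -> int:
    """Insert rows whose ID is not yet in the primary database; return the count."""
    count_sql = "SELECT COUNT(*) FROM event"
    before = primary.execute(count_sql).fetchone()[0]
    primary.executemany(
        "INSERT OR IGNORE INTO event (id, description) VALUES (?, ?)",
        extracted_rows,
    )
    primary.commit()
    return primary.execute(count_sql).fetchone()[0] - before


# Primary (PS) database as it looked when the outage began.
primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, description TEXT)")
primary.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [(1, "before outage"), (2, "before outage")],
)
primary.commit()

# Rows extracted at the BUS into a temporary database (here just a list).
extracted = [(2, "before outage"), (3, "during outage"), (4, "during outage")]
merged = merge_events(primary, extracted)
ids = [row[0] for row in primary.execute("SELECT id FROM event ORDER BY id")]
```

In this sketch the database writers are assumed to be stopped before `merge_events` runs and restarted afterwards, as the slide lists.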

Page 15: System Failover: Final Note

Important

• Written procedures for stressful situations
• Drills and dry runs to test application and network failover
• At least once a year
• Revise procedures if needed

Page 16: Redundant Data Center Configuration

THANK YOU