Using Oracle TimesTen to Deploy Low Latency VOIP Applications in Remote Sites
Thomas Lynn
Comcast National Engineering & Technical Operations
October 2009
VOIP Application Background
• Comcast needed to build a VOIP application server to support multiple value-added applications
• Universal Caller ID (UCID) is first in the deployment list
• UCID provides a link between Comcast’s Digital Voice, Digital Cable and High-Speed Internet offerings
• UCID provides caller ID on televisions through Digital Cable and on PCs through a small downloadable client
• UCID is deployed on a general-purpose Communications Application Server infrastructure
Communications Application Server (CAS)
• Some deployed CAS applications, including UCID, need to be part of the Session Initiation Protocol (SIP) call flow
• Applications that are part of the SIP call flow can increase call connect times
• Call connect times relate directly to customer perception of CDV quality
• Low internal application latency was deemed extremely important to maintaining customer satisfaction
Comcast Digital Voice® (CDV)
• Comcast Digital Voice has become the third-largest residential phone service provider in the U.S.
• 7 million customers
• 80+ VOIP switches deployed across the United States
Applications In The SIP Call Flow
Simplified VOIP Call Flow
1. The calling party’s VOIP Phone contacts the Originating VOIP Switch
2. The Originating VOIP Switch determines the destination and contacts the Terminating VOIP Switch
3. The Terminating VOIP Switch contacts the destination party’s VOIP Phone
Simplified VOIP Call Flow with UCID
1. The calling party’s VOIP Phone contacts the Originating VOIP Switch
2. The Originating VOIP Switch determines the destination and contacts the Terminating VOIP Switch
3. The Terminating VOIP Switch determines that the call should be routed to CAS and contacts CAS
4. CAS determines the customer’s services and responds to the Terminating VOIP Switch
5. The Terminating VOIP Switch contacts the destination party’s CDV phone
6. Once the destination phone rings, UCID contacts the destination party’s TV and PC
Application transactions in the SIP call flow must be fast to add value and not detract from quality
Communications Application Server Topology
• Multiple site deployments near customers were deemed necessary to minimize network latency
• Many other steps were taken to reduce application latency; one of them was the database architecture
• Application response time of less than 100 ms is necessary to maintain high-quality connect times
Database Requirements
• To reduce application latency and drive call quality, a local data source at each site makes sense
• Centralized data for ease of administration
• Geo-Redundancy
• High Availability
• Internal Site Redundancy
• Response time of less than 10 ms
Meeting The Requirements
• TimesTen meets the needs of real-time queries
• Cache Connect to Oracle supports the centralized and localized strategies at the same time
• Oracle Data Guard serves the Geo-Redundancy requirements
• Oracle RAC provides high availability
Data In The Application Layer Reduces Latency
• Observed local query time: 0.002174 sec/query
• Observed WAN query time: 0.150009 sec/query
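As a rough worked example, the two observed query times can be checked against the 10 ms database target and the 100 ms application budget stated on this slide. The sketch uses only those four figures; the function names are illustrative, and the budget calculation deliberately ignores all non-database work.

```python
# Observed figures from the slide (seconds per query).
LOCAL_QUERY_S = 0.002174
WAN_QUERY_S = 0.150009

# Targets from the slide.
DB_TARGET_S = 0.010    # database response time under 10 ms
APP_BUDGET_S = 0.100   # application response time under 100 ms


def meets_db_target(query_s: float) -> bool:
    """True if a single query fits inside the database response target."""
    return query_s < DB_TARGET_S


def queries_within_budget(query_s: float) -> int:
    """Number of sequential queries that fit in the application budget,
    assuming (unrealistically) that the application does nothing else."""
    return int(APP_BUDGET_S // query_s)


if __name__ == "__main__":
    print(f"WAN/local ratio: {WAN_QUERY_S / LOCAL_QUERY_S:.1f}x")
    for label, cost in (("local", LOCAL_QUERY_S), ("WAN", WAN_QUERY_S)):
        print(label, meets_db_target(cost), queries_within_budget(cost))
```

On these numbers a single WAN query already exceeds the whole application budget, while a local query leaves room for many lookups per call.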
[Diagram labels: RAC DB EAST, RAC DB WEST]
[Diagram labels: 10g Database, Cache Connect, TimesTen Replication; in-memory databases in the roles Active, Standby, Subscriber, Subscriber]
Meeting Internal Site Redundancy Requirements
• Once data is replicated into the site, a second strategy is required to meet site redundancy requirements
• A TimesTen Active Standby Pair with Subscribers was chosen because it can maintain replicas of the cached data within the site
• As data enters the site from Oracle Cache Connect, it is committed to the Standby Master first. This ensures that the Standby is always up to date and can take over for the Master without a complete cache reload
• The Master is then committed
• Finally, the Subscribers replicate via the Standby, since it contains the most accurate cache in the site
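The commit ordering described on this slide can be sketched as a small simulation. This is only an illustrative model of the ordering, not TimesTen's replication code; the `Store` class and the `apply_refresh` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical stand-in for one in-memory database in the site."""
    name: str
    rows: dict = field(default_factory=dict)


def apply_refresh(change: dict, standby: Store, master: Store,
                  subscribers: list) -> list:
    """Apply one batch of cached changes in the order the slide describes.

    Returns the store names in the order they were updated.
    """
    order = []
    # 1. The Standby commits first, so it is never behind the Master.
    standby.rows.update(change)
    order.append(standby.name)
    # 2. The Master commits next.
    master.rows.update(change)
    order.append(master.name)
    # 3. Subscribers copy from the Standby, the most current copy in the site.
    for sub in subscribers:
        sub.rows.update(standby.rows)
        order.append(sub.name)
    return order


if __name__ == "__main__":
    stores = [Store("standby"), Store("master"), Store("sub1"), Store("sub2")]
    print(apply_refresh({"555-555-5551": "site 1"},
                        stores[0], stores[1], stores[2:]))
```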
Application Data Source Strategy
• To maintain speed in the application layer and add redundancy, applications must have built-in data source failover with priority
• One of the reasons TimesTen is so fast is that the applications reside on the same server as the database
• No network protocol overhead for local queries
• Applications should prioritize data sources and always select local sources over remote ones
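A minimal sketch of data source failover with priority, assuming each data source is exposed to the application as a lookup function that raises `ConnectionError` when its database is unreachable. The names `query_with_failover` and `AllSourcesFailed` are hypothetical and not part of any TimesTen API.

```python
from typing import Any, Callable, Sequence, Tuple

Lookup = Callable[[str], Any]


class AllSourcesFailed(Exception):
    """Raised when no configured data source could answer the query."""


def query_with_failover(sources: Sequence[Tuple[str, Lookup]],
                        key: str) -> Tuple[str, Any]:
    """Try each data source in priority order and return the first answer.

    `sources` is ordered by preference: the local in-memory database first,
    then remote databases on other servers.
    """
    errors = []
    for name, lookup in sources:
        try:
            return name, lookup(key)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next source.
            errors.append(f"{name}: {exc}")
    raise AllSourcesFailed("; ".join(errors))


if __name__ == "__main__":
    def local_lookup(tel: str) -> str:
        raise ConnectionError("local store unavailable")

    def remote_lookup(tel: str) -> str:
        return f"record for {tel}"

    print(query_with_failover(
        [("local", local_lookup), ("remote", remote_lookup)], "555-555-5551"))
```

Keeping the priority in the ordering of the list means the application returns to the local source as soon as it is reachable again, without extra state.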
Table Structure Cache Replication
Cache Connect to Oracle allows for simple and accurate replication strategies
• Cache Connect groups can follow the Primary Key / Foreign Key table design of the Oracle source database tables
• Cache groups can be built with a tiered structure that allows multiple tables to replicate together in single transactions. This allows TimesTen to mimic the constraints in Oracle and replicate data without constraint errors
• If data must differ from site to site, cache groups can apply a filter, so that data for all sites is stored centrally but each site receives only the rows Cache Connect selects for it
• If cache group filters are applied on parent tables in the cache, only data linked to that parent is replicated, based on the PK/FK relationship
Cache Connect Replication With Where Clause allows for data from all sites to be stored in the same central tables but replicate to only desired sites
Oracle Source Tables
SITE
SITEID  SITENAME
1       Philadelphia
2       San Francisco
CUSTOMERS
TEL           ADDRESS     ZIP    SITEID
555-555-5551  11 nowhere  11111  1
555-555-5552  22 nowhere  22222  2
555-555-5553  33 nowhere  11111  1
555-555-5554  44 nowhere  22222  2
555-555-5555  55 nowhere  11111  1
555-555-5556  66 nowhere  22222  2
TimesTen SITE 1
SITE
SITEID  SITENAME
1       Philadelphia
CUSTOMERS
TEL           ADDRESS     ZIP    SITEID
555-555-5551  11 nowhere  11111  1
555-555-5553  33 nowhere  11111  1
555-555-5555  55 nowhere  11111  1
TimesTen SITE 2
SITE
SITEID  SITENAME
2       San Francisco
CUSTOMERS
TEL           ADDRESS     ZIP    SITEID
555-555-5552  22 nowhere  22222  2
555-555-5554  44 nowhere  22222  2
555-555-5556  66 nowhere  22222  2
Cache Tables Creation Syntax
CALLID.SITE (
  SITEID NUMBER(38) NOT NULL,
  SITENAME VARCHAR2(30 BYTE) INLINE NOT NULL,
  primary key (SITEID))
  where (CALLID.SITE.SITEID=1),
CALLID.CUSTOMERS (
  TEL VARCHAR2(10 BYTE) INLINE NOT NULL,
  ADDRESS VARCHAR2(30 BYTE) INLINE,
  ZIP VARCHAR2(10 BYTE) INLINE,
  SITEID NUMBER(38),
  primary key (TEL),
  foreign key (SITEID) references CALLID.SITE (SITEID))
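The effect of the where clause on the parent table CALLID.SITE can be reproduced with a small stand-in. The sketch below uses Python's built-in `sqlite3` module rather than TimesTen or Cache Connect, so it only illustrates the filtering logic: the parent row is chosen by SITEID and child rows follow through the foreign key. Table names, column names, and rows come from the slide; `build_source` and `rows_for_site` are hypothetical helpers.

```python
import sqlite3


def build_source() -> sqlite3.Connection:
    """Create an in-memory stand-in for the central Oracle source tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE site (
            siteid   INTEGER PRIMARY KEY,
            sitename TEXT NOT NULL);
        CREATE TABLE customers (
            tel     TEXT PRIMARY KEY,
            address TEXT,
            zip     TEXT,
            siteid  INTEGER REFERENCES site (siteid));
        INSERT INTO site VALUES (1, 'Philadelphia'), (2, 'San Francisco');
        INSERT INTO customers VALUES
            ('555-555-5551', '11 nowhere', '11111', 1),
            ('555-555-5552', '22 nowhere', '22222', 2),
            ('555-555-5553', '33 nowhere', '11111', 1),
            ('555-555-5554', '44 nowhere', '22222', 2),
            ('555-555-5555', '55 nowhere', '11111', 1),
            ('555-555-5556', '66 nowhere', '22222', 2);
    """)
    return db


def rows_for_site(db: sqlite3.Connection, siteid: int):
    """Return the parent rows and the child TEL values a site would cache."""
    sites = db.execute(
        "SELECT siteid, sitename FROM site WHERE siteid = ?",
        (siteid,)).fetchall()
    # Children are selected only through their link to the filtered parent.
    customers = db.execute(
        "SELECT c.tel FROM customers c "
        "JOIN site s ON s.siteid = c.siteid "
        "WHERE s.siteid = ? ORDER BY c.tel", (siteid,)).fetchall()
    return sites, [tel for (tel,) in customers]


if __name__ == "__main__":
    source = build_source()
    print(rows_for_site(source, 1))
    print(rows_for_site(source, 2))
```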
Putting It All Together In Clusters Enhances Built in Redundancy
• To ease maintenance, CAS has been separated into multiple functional units, or clusters, per site
• Each unit can stand alone to serve SIP requests during maintenance windows
• Running in this mode, the following failures can be overcome with little effect:
  - Loss of Network Connection to Oracle Source DB
  - Loss of Active Master
  - Loss of Standby Master
  - Loss of Subscriber
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2. Each cluster has a Server 1 and a Server 2, each running a SIP application with an in-memory database. Both clusters have Cache Connect links to the 10g Database, which has a Data Guard counterpart. REP marks the replication links.]
Built in Redundancy: Loss of Network Connection to Oracle DB
• During a network outage between the site and the Source Database, the only impact on the site is that it stops receiving updates
• The site can run as a standalone entity for hours or days if necessary
• Triggers created by Cache Connect maintain change records in intermediate tables on the Source Database
• Once the connection is restored, TimesTen receives the incremental changes from Oracle
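The catch-up behavior can be pictured with a simple change-log model. This is a conceptual sketch, not how Cache Connect is implemented: `ChangeLog` stands in for the trigger-maintained intermediate tables and `SiteCache` for a TimesTen site. Both classes and the integer bookmark are assumptions made for illustration.

```python
class ChangeLog:
    """Hypothetical model of an intermediate change table on the source."""

    def __init__(self):
        self.entries = []  # (sequence number, key, value) in commit order

    def record(self, key, value):
        self.entries.append((len(self.entries) + 1, key, value))

    def since(self, bookmark):
        """Changes recorded after the given bookmark."""
        return [entry for entry in self.entries if entry[0] > bookmark]


class SiteCache:
    """Hypothetical model of a site's cached copy of the source data."""

    def __init__(self):
        self.rows = {}
        self.bookmark = 0  # last change already applied at this site

    def catch_up(self, log):
        """Apply only the changes this site has not seen; return the count."""
        pending = log.since(self.bookmark)
        for seq, key, value in pending:
            self.rows[key] = value
            self.bookmark = seq
        return len(pending)


if __name__ == "__main__":
    log, site = ChangeLog(), SiteCache()
    log.record("555-555-5551", "11 nowhere")
    site.catch_up(log)
    # Network outage: the source keeps recording, the site keeps serving.
    log.record("555-555-5551", "12 nowhere")
    log.record("555-555-5553", "33 nowhere")
    # Connection restored: only the two missed changes are transferred.
    print(site.catch_up(log), site.rows)
```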
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2, each with a Server 1 and a Server 2 running a SIP application with an in-memory database. No Cache Connect links are shown between the clusters and the 10g Database (with its Data Guard counterpart). REP marks the replication links.]
Built in Redundancy: Loss of Active Master
• Loss of the Active Master can be overcome by promoting the Standby to Master
• Since the Standby is committed before the Master, it is always the most accurate copy in the site, so only incremental changes are required
• Applications that were locally connected to the Master on Server 1 must gracefully fail over to Server 2
• In this case, a small amount of query latency is introduced on Server 1 due to network protocol overhead and LAN latency
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2. In Cluster 1, the SIP applications on Server 1 and Server 2 share a single remaining in-memory database; in Cluster 2, each server runs a SIP application with its own in-memory database. One Cache Connect link to the 10g Database (with its Data Guard counterpart) is shown. REP marks the replication links.]
Built in Redundancy: Loss of Standby
• Standby failure can be overcome by replicating the Subscribers directly from the Master
• When the Standby is inactive, the Master commits first
• Applications that were locally connected to Server 2 must gracefully fail over to Server 1
• In this case, a small amount of query latency is introduced on Server 2 due to network protocol overhead and LAN latency
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2. In Cluster 1, only one of the two SIP applications still has a local in-memory database; in Cluster 2, each server runs a SIP application with its own in-memory database. One Cache Connect link to the 10g Database (with its Data Guard counterpart) is shown. REP marks the replication links.]
Built in Redundancy: Loss of Subscriber
• Subscriber loss is the simplest case, since these data stores only pull data incrementally from the Master or Standby
• Subscribers have no interaction with Oracle, so there are no concerns about the incremental state of Cache Connect
• Applications that were locally connected to the failed instance must fail over, which introduces a small amount of latency on the affected server
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2. In Cluster 1, each server runs a SIP application with its own in-memory database; in Cluster 2, the two SIP applications share a single remaining in-memory database. Both Cache Connect links to the 10g Database (with its Data Guard counterpart) are shown. REP marks the replication links.]
Built in Redundancy: Ease of Maintenance
• To perform maintenance on a site, load can be redirected to Cluster 2. This creates an application snapshot in Cluster 2, which continues to serve subscribers while Cluster 1 is modified
• Schema changes could occur in Oracle Source Tables
• Complete rebuilds of cache groups could occur
• New applications could be installed
• If source table changes occur in Oracle, full cache table rebuilds must occur so that triggers can be validated
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2, each with a Server 1 and a Server 2 running a SIP application with an in-memory database. Both Cache Connect links to the 10g Database (with its Data Guard counterpart) are shown, with a single REP replication link.]
Built in Redundancy: Ease of Maintenance
• Once initial maintenance is complete, traffic can be re-pointed to Cluster 1 while Cluster 2 is modified
[Diagram: a Load Balancer in front of Cluster 1 and Cluster 2, each with a Server 1 and a Server 2 running a SIP application with an in-memory database. Both Cache Connect links to the 10g Database (with its Data Guard counterpart) are shown, with a single REP replication link.]
Issues Encountered / Fixes
• Cache Groups should always be dropped when shutting down sites for extended periods of time or decommissioning them. If this is not done, the intermediate tables that track incremental changes in Oracle will grow without bound until the tablespace is full
• Tracking down TimesTen sites that are causing intermediate table growth is relatively simple
  - Determine the intermediate table “tt_03_{number}_l” causing DB load
  - “select * from tt_03_agent_status where object_id = {same number as table} order by bookmark;”
  - The lowest bookmark should belong to the site with issues
  - Reconnecting Cache Connect will cause intermediate table cleanup
• Even minor modifications to the Source Table Schema require a complete Cache Group rebuild; otherwise the logs will complain about validity
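Once the rows from tt_03_agent_status have been fetched, picking out the lagging site is a one-line minimum. The sketch below assumes each row is available as a dictionary. Only `object_id` and `bookmark` are named in the slides; the `host` field and the sample values are hypothetical and serve only to identify the rows in the example.

```python
def lagging_agent(rows):
    """Return the row with the lowest bookmark, or None if there are no rows.

    The lowest bookmark marks the site that has consumed the fewest changes,
    which is the one holding back cleanup of the intermediate table.
    """
    return min(rows, key=lambda row: row["bookmark"], default=None)


if __name__ == "__main__":
    # Hypothetical sample rows for one intermediate table.
    sample = [
        {"host": "site-a", "object_id": 12345, "bookmark": 9120},
        {"host": "site-b", "object_id": 12345, "bookmark": 310},
        {"host": "site-c", "object_id": 12345, "bookmark": 9118},
    ]
    print(lagging_agent(sample)["host"])
```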
Conclusion
• TimesTen with Cache Connect to Oracle allows us to meet our strict latency requirements
• Simplifies redundancy
• Provides ease of site-specific data replication
• TimesTen with Cache Connect provides an excellent way to maintain a centralized database with data sources at the edge