© 2010 Ariba, Inc. All rights reserved. The contents of this document are confidential and proprietary information of Ariba, Inc.
Scaling Web Systems
Sathya P
What is scalability?
Some standard definitions
The ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth. (Wikipedia)
The ability to handle increased workload by repeatedly applying a cost-effective strategy for extending a system’s capacity. (SEI)
Consider a simple web application
Load vs. performance of a non-scalable system
Scalability Bottlenecks
Memory: out of memory; disk thrashing; fragmentation
CPU: overload; context switches; I/O waits
Others: disk; network
Load vs. performance of a scalable system (ideal)
Principles
Improve application performance…
Improve application performance
Identify and fix performance bottlenecks: algorithms; DB queries; thread deadlocks; I/O
Why is it important? When you use fewer resources per task (processor time, memory, network round trips, etc.), you can handle a lot more load.
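The link between per-task cost and load can be sketched with simple arithmetic. The numbers below (8 cores, 40 ms and 10 ms of CPU per request) are illustrative assumptions, not measurements from this deck, and the model assumes CPU is the only bottleneck.

```python
def max_requests_per_second(cores: int, cpu_seconds_per_request: float) -> float:
    """Upper bound on throughput when CPU time is the only limiting resource."""
    return cores / cpu_seconds_per_request


# Hypothetical figures: the same 8-core server before and after tuning.
before = max_requests_per_second(cores=8, cpu_seconds_per_request=0.040)
after = max_requests_per_second(cores=8, cpu_seconds_per_request=0.010)

print(f"before tuning: {before:.0f} req/s")
print(f"after tuning:  {after:.0f} req/s")
```

Cutting the per-request cost to a quarter raises the ceiling fourfold on the same hardware, which is why performance work comes before adding machines.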
Loose coupling paradigms - SOA
Loosely coupled interactions; one-to-one communications; consumer-based trigger; synchronous
Loose coupling paradigms - EDA
Decoupled interactions; many-to-many communications; event-based trigger; asynchronous
Distribute work, data
Motivations: can scale independently; failures are isolated
Segment functionality: application pools
Segment data: based on functional areas; horizontal split
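A minimal sketch of the two data splits named above, assuming a hash-based horizontal split. The functional areas and database names are hypothetical; the deck does not say how keys are assigned.

```python
import hashlib

# Hypothetical layout: each functional area owns its own set of databases.
DATABASES = {
    "catalog": ["catalog_db_0", "catalog_db_1"],
    "orders": ["orders_db_0", "orders_db_1", "orders_db_2"],
}


def database_for(area: str, key: str) -> str:
    """Pick a database: first by functional area, then by a hash of the key."""
    shards = DATABASES[area]  # functional split
    # sha256 is stable across processes, unlike Python's built-in hash() for str.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)  # horizontal split
    return shards[index]


print(database_for("orders", "customer-42"))
print(database_for("catalog", "item-7"))
```

Note that plain modulo hashing reassigns most keys when the number of databases changes, so a real system usually needs a lookup table or a consistent-hashing scheme on top of this.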
Asynchronous communication
Motivations: can scale components independently; can decouple availability; can spread peak load over time
Integrate different services asynchronously: point-to-point / publish-subscribe; staged event-driven architecture (SEDA)
Point to point messaging
Publish-Subscribe messaging
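The difference between the two messaging styles can be sketched in a few lines. This is an in-process toy with hypothetical class names, not a message broker: delivery here is immediate and nothing is persisted, whereas a real broker delivers asynchronously.

```python
from collections import deque


class PointToPointQueue:
    """Each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Each message is delivered to every current subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


queue = PointToPointQueue()
queue.send("invoice-1")
first = queue.receive()   # the one consumer that gets the message
second = queue.receive()  # nothing left for anyone else

topic = PubSubTopic()
audit_log, email_outbox = [], []
topic.subscribe(audit_log.append)
topic.subscribe(email_outbox.append)
topic.publish("order-created")  # both subscribers see it
```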
Aggressive Caching
Motivations: save processing cycles; save on network round-trip delays
Content caching on CDNs; caching on clients (browsers/mobile devices); caching at the application layer; distributed caching
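Caching at the application layer might look like the following cache-aside sketch with a time-to-live. The class name, the loader function, and the 60-second TTL are assumptions for illustration.

```python
import time


class TTLCache:
    """Cache-aside helper: return a cached value, or load and remember it."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock       # injectable so expiry can be tested
        self._entries = {}        # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]       # hit: skip the expensive call
        value = loader(key)       # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def load_profile(user_id):
    """Stand-in for a slow database or network call (hypothetical)."""
    calls.append(user_id)
    return {"id": user_id}


cache = TTLCache(ttl_seconds=60)
profile_a = cache.get_or_load("u1", load_profile)
profile_b = cache.get_or_load("u1", load_profile)  # served from the cache
```

The sketch never evicts entries and is not thread-safe; both would need attention before production use.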
Cache Everywhere
[Diagram. Tiers shown: Web Browser, CDN, Proxy, App Server, Distributed In-memory Cache, Database. Caches shown: Browser Cache, Resource Cache, Page Cache, LRU Cache, Result Cache, Query Cache.]
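One of the caches named in the diagram is an LRU cache. A minimal sketch of least-recently-used eviction, assuming a fixed capacity and using only the standard library:

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items, evicting the least recently used one."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)           # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)    # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now more recent than "b"
cache.put("c", 3)   # over capacity, so "b" is evicted
```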
Avoid or distribute state
Motivations: save memory and processing cycles; reduce machine affinity
Strive for statelessness: maintain session data in browsers if possible; store session state in a distributed cache
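Keeping session data in the browser only works if the server can detect tampering. One common approach, sketched here as an assumption rather than the deck's method, is to sign the session payload with an HMAC so any server holding the key can verify it without shared memory. The secret and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: shared by all app servers


def encode_session(data: dict, key: bytes = SECRET_KEY) -> str:
    """Serialize session data and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_session(token: str, key: bytes = SECRET_KEY):
    """Return the session dict, or None if the signature does not match."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = encode_session({"user": "u1", "cart": [3, 5]})
session = decode_session(token)

# Flip the last character of the signature to simulate tampering.
tampered = token[:-1] + ("0" if token[-1] != "0" else "1")
rejected = decode_session(tampered)
```

The payload is signed, not encrypted, so the client can read it; anything secret still belongs on the server side, for example in the distributed cache.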
Database
Simplify entity relationships to aid splitting
Use the right kind of database lock
Avoid distributed transactions
Don’t select everything; read only as much data as you can use
Consider NoSQL storage
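The advice to read only what you can use can be sketched with an in-memory SQLite database. The table and column names are hypothetical; the point is the explicit column list and the LIMIT in place of SELECT *.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, total, notes) VALUES (?, ?, ?)",
    [("acme", 10.0 * i, "x" * 1000) for i in range(1, 101)],
)

# Instead of "SELECT * FROM orders", name the columns the page needs and
# cap the row count, so the large notes column never leaves the database.
rows = conn.execute(
    "SELECT id, total FROM orders WHERE customer = ? ORDER BY id LIMIT ?",
    ("acme", 20),
).fetchall()

print(len(rows), "rows,", len(rows[0]), "columns each")
```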
Scalability @ Ariba
How do we scale…
Individual community
Scaling storage: Multi SID
Goal: To be able to scale out DB storage as required
[Diagram: Application; Persistence Layer with a Realm-Schema map; DB Instance #1 and DB Instance #2, each holding Schema 1, Schema 2, ... Schema 20.]
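The realm-schema map in the diagram suggests an explicit lookup from realm to database location. The sketch below is a guess at the idea, not Ariba's implementation; the class names, realm names, and instance names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaLocation:
    db_instance: str
    schema: str


class RealmSchemaMap:
    """Lookup table the persistence layer could consult on every request."""

    def __init__(self):
        self._locations = {}

    def assign(self, realm, db_instance, schema):
        self._locations[realm] = SchemaLocation(db_instance, schema)

    def locate(self, realm):
        try:
            return self._locations[realm]
        except KeyError:
            raise LookupError(f"realm {realm!r} has no schema assigned") from None


realm_map = RealmSchemaMap()
realm_map.assign("realm-a", "db-instance-1", "schema_1")
realm_map.assign("realm-b", "db-instance-2", "schema_1")

location = realm_map.locate("realm-b")
print(location.db_instance, location.schema)
```

With an explicit map, adding a new database instance only affects realms assigned to it afterwards; existing entries stay where they are.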
Clustering vs. Sharding
Clustering: Oracle RAC, HBase; automatically scales the datastore; rebalances to distribute capacity; nodes communicate with each other; very complicated; risk of cluster manager failure
Sharding: data distributed manually; split database to add capacity; data does not move; nodes are unaware of each other; custom algorithm based on functional / key distribution
Scaling search: Arches
Goal: To be able to scale up search/publish activities linearly
Arches goals…
Elastic architecture with the ability to add capacity on the fly
Sub-second search performance
Improved indexing performance with customer isolation
Eventually to be used to build the global search service across all of Ariba
Arches interfaces…
Pub API: publish endpoint exposed over one-way messaging
Search API: REST-based search endpoint
Pull API: REST-based data pull endpoint to be implemented by applications
Manage API: REST-based management endpoint
Lightweight Metadata: Overcoming memory bottleneck
Problem:
All realms, even realms with no customization and no activity, consume lots of system resources
Realms with no customization have the same footprint as realms with lots of customization
Important for mid-market offering
Goals
Lightweight metadata solution
Shape of a Class is now shared across Variants
Sub Types computed dynamically
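Sharing a class shape across variants resembles the flyweight pattern. The sketch below is one reading of the slide, not the actual Ariba design: the class names and fields are hypothetical, and it only shows a shared base shape with per-variant fields computed on demand.

```python
class ClassShape:
    """Field layout stored once and shared by every variant that uses it."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = tuple(fields)


class VariantMetadata:
    """Per-variant view that holds only its own customizations."""

    def __init__(self, base_shape, extra_fields=()):
        self._base = base_shape            # shared reference, not a copy
        self._extra = tuple(extra_fields)

    @property
    def fields(self):
        # Computed when asked for, instead of stored per variant.
        return self._base.fields + self._extra


base = ClassShape("Requisition", ["id", "title"])
plain = VariantMetadata(base)                       # no customization
custom = VariantMetadata(base, ["cost_center"])     # one extra field

print(plain.fields)
print(custom.fields)
```

A variant with no customization then costs little more than one reference to the shared shape, which matches the stated goal of reducing the footprint of uncustomized realms.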
Dynamic capacity realms project
Goal: Remove downtime when scaling our products.
Dynamic scalability; tolerate changes to cluster topology
Central connection manager
Goal: Improve database connection usage.
Central connection manager to distribute database connections based on usage
Local pools and a global pool
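The local-pool and global-pool idea can be sketched as a budget of connections that nodes lease in small batches and hand back when idle. This is an illustration under assumptions, not the Ariba connection manager: it tracks counts rather than real connections, is not thread-safe, and the class names and batch size are hypothetical.

```python
class GlobalPool:
    """Central budget of database connections shared by all nodes."""

    def __init__(self, total_connections):
        self.available = total_connections

    def lease(self, wanted):
        granted = min(wanted, self.available)
        self.available -= granted
        return granted

    def give_back(self, count):
        self.available += count


class LocalPool:
    """Per-node pool that borrows from the global budget as usage grows."""

    def __init__(self, global_pool, batch_size=2):
        self._global = global_pool
        self._batch = batch_size
        self._idle = 0
        self._in_use = 0

    def acquire(self):
        if self._idle == 0:
            self._idle += self._global.lease(self._batch)
        if self._idle == 0:
            return False                 # global budget exhausted
        self._idle -= 1
        self._in_use += 1
        return True

    def release(self):
        if self._in_use == 0:
            raise RuntimeError("release without matching acquire")
        self._in_use -= 1
        self._idle += 1

    def shrink(self):
        """Return idle connections so busier nodes can use them."""
        self._global.give_back(self._idle)
        self._idle = 0


shared = GlobalPool(total_connections=4)
busy, quiet = LocalPool(shared), LocalPool(shared)

quiet.acquire()
quiet.release()
quiet.shrink()                            # the quiet node frees its lease

got = [busy.acquire() for _ in range(5)]  # the busy node takes what it needs
```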
ADE (Ariba data enrichment) scalability
[Diagram components: ADE Client; Product Engines; WebLogic Administrative Server; WebLogic Managed Servers (Instance 1, Instance 2, Instance 3) with JMS; load balancing (EJBs and RMI) with weight-based sticky sessions; JDBC clustering; ADE DB; SDB (Ops).]