adapted from: trends and attributes of horizontal and vertical computing...

35
Adapted from: TRENDS AND ATTRIBUTES OF HORIZONTAL AND VERTICAL COMPUTING ARCHITECTURES Tom Atwood Business Development Manager Sun Microsystems, Inc.

Upload: others

Post on 24-Apr-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Adapted from:TRENDS AND ATTRIBUTES OF HORIZONTALAND VERTICAL COMPUTING ARCHITECTURES

Tom Atwood

Business Development Manager

Sun Microsystems, Inc.

Session 1038 2

Takeaways

● Understand the technical differences between horizontal and vertical architectures

● Understand the difference between cluster for performance and clusterfor availability

● Understand the financial impact of cluster databases (e.g., Oracle 9i RAC)

Session 1038 3

Agenda

● Industry trends● Architectural definitions● Performance issues● TCO issues for database clusters● Availability issues

Session 1038 4

N-tier Architecture

DatabasesOLTP

CRM/SCM/ERP

App ServersDirectory Servers

Email ServersWeb ServicesFirewall/VPNProxy/Cache

Tier 3 Tier 2 Tier 1 Tier 0

Session 1038 5

Customer Investment in Vertical and Horizontal Servers

Source: IDC, 2003

Session 1038 6

Server Usage by CPU Capacity

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

1-way 2-way 4-way 8-way 16-way 32-way 64-way+

Edge Application Database

Source: IDC, 2003

Session 1038 7

SMP Definition

● Symmetric multi-processing– Or: shared memory multi-processing

● Shared pool of CPUs● Single-large memory space● One copy of the OS

Session 1038 8

Vertical and Horizontal Attributes

Proc

Memory Switch

Proc

Mem

I/O

Mem I/O

Proc

Network Switch

Proc

Mem

I/O

Mem

I/O

Proc

Mem

I/O

Cache-coherent shared-memory multi-processors (SMP)● Tightly-coupled: high

bandwidth, low latency● Large, workloads: ad-hoc

trans proc, data warehousing● Shared pool processors● Single large memory

Cluster multi-processor● Loosely coupled

● Standard H/W & S/W

● Highly parallel (web, some HPTC)

Horizontal Scaling

Ver

tica

l Sca

ling

Single OS Instance

Multiple

OS Instances

Session 1038 9

Vertical vs. Horizontal System Types

● Vertical– Large SMPs– Clusters of large

SMPs

● Horizontal– Blades– Clusters– Large MPPs

Session 1038 10

Vertical vs. Horizontal Characteristics

● Vertical– Large shared memory space– Many dependent threads– Tightly-coupled internal interconnect– High single-system RAS– Many standard CPUs– Single OS with many CPUs – Single-box packaging– Many CPUs/floor tile– Commodity and proprietary h/w– Single-box headroom/growth– “In-box” enhancements/upgrades– 64-bit

● Horizontal– Small non-shared memory space– Many independent threads– Loosely-coupled external interconnect– High RAS via replication– Many standard CPUs– Many OS’s with 1–4 CPUs/OS– “Rack and stack” packaging– Many cpus/floor tile– Commodity hardware– Multi-box headroom/growth– “New-node” enhancements– 32-bit and 64-bit

Session 1038 11

Vertical vs. Horizontal Applications

● Vertical– Large databases– Transactional databases– Datawarehouses– Data mining– Application servers– HPTC applications

(non-partitionable)

● Horizontal– Web servers– Firewalls– Proxy servers– Media streaming– Directories– XML processing– JSP applications– SSL– VPN– Application servers– HPTC applications (partitionable)

Session 1038 12

Critical Factors in System Performance● Processor– Capacity and throughput

● System Interconnect– Low latency– High bandwidth

● Input and output– Network and storage, sequential and transactional

● Operating system– Required for H/W to scale

● Optimized applications● System availability

Session 1038 13

Interconnect Specifications

● Gigabit ethernet– 125MB/sec bandwidth– 200,000ns latency

● SCI– 200MB/sec bandwidth– 4,000ns latency

● Infiniband– 250MB/sec-3GB/sec b/w– ??

● Sun Fire SMPs● 9.6GB/sec to 172GB/sec

bandwidth

● 200ns to 450ns latency

M

M

M

M

P

P

P

P

I

I

PCI

PCI

M

M

M

M

P

P

P

P

I

I

PCI

PCI

Network vs. Centerplane/Backplane

Session 1038 14

Distributed Memory

Mem

p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p

RAM

CPUs

Disk

Workload

Mem Mem Mem Mem Mem Mem Mem Mem Mem Mem

Large Workloads Not Handled Effectively

Session 1038 15

Shared Memory

Shared Memory

EThernet

p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p

Up to 576GB of Shared Memory

Up to 100 CPUs

Workload

RAM

CPUs

Disk

Large Workloads Handled Effectively

Session 1038 16

Database Layer Performance

● Horizontal needs to cluster for performance

● Oracle 9i RAC is most commoncluster database– RAC is “Real Application Clusters”

● Vertical does not need to clusterfor performance

Session 1038 17

What “Scaling” Means

● SMP speedup = x faster than one CPU● Cluster speedup = x faster than one node● SMP scaling = speedup/(no. CPUs)● Cluster scaling = speedup/(no of nodes)● “Scaling” depends on number of

CPUs/nodes used to calculate● 90% scaling on 2 nodes not the same

as 90% scaling on 4 nodes

Session 1038 18

Scaling Examples

● 1 node to 2 nodes:– Scaling factor is .8 or 80%– Speedup is 1.8 (1.0, 1.8)– Cluster scaling is: 1.8/2 = .9 or 90%

● 4 nodes: – Scaling factor is .8 or 80%– Speedup is 3.4 (1.0, 1.8, 2.6, 3.4)– Cluster scaling is: 3.4/4 = .85 or 85%

Session 1038 19

Scaling of Clusters

0 8 16 24 32 40 48 56 64 720

8

16

24

32

40

48

56

64

72

● 90% Scaling on 2-way Cluster of 4-way servers gives 23x Speedup at 52 CPUs (Oracle 91 RAC)*

● 95% Scaling gives 23x speedup on 24 CPU SMP (Oracle)**

● Speedup is how much faster than one CPULinear scaling

95% SMP scaling (.95=23/24CPUs)

90% cluster scaling (.90=1.8/2-servers)

* 90% scaling f rom two benchmarks:

www.sap.com/benchmark/others.htm#Round%Robin

www.oracle.com/ip/deploy/database/theme_pages/index.

html?rac_01242003.html

** 95% scaling from Internal Sun benchmarks

CPUs

4-way nodes

2 4 6 8 10 12 14 16 180

18

16

14

12

10

8

6

4

2

0

Spee

d-up

Session 1038 20

Application Layer Performance

● Application layer is normally stateless– Data resides in database layer– Consolidates connections to the database

● More processors needed than DB layer– SAP R/3 uses about 10 app CPUs per each DB CPU– Oracle Apps uses about 5 app CPUs per each DB CPU

● Scaling is not an issue– Replication of instances meets performance requirements

● Acquisition costs not affected by different software licenses

Session 1038 21

SAP App Layer Example

● Customer Evaluation: SAP Application Servers● 50% Less HW 24-way vs. 2-ways● Load Balancing on SAP Sessions at Login—Static● Larger servers had more headroom for peak workloads● Batch Jobs (Payroll, MRP...) can't span several app servers

Overload

Overload

Overload

Overload

15%

40%

10%

5%

5%

Many Little App Servers (%Use)

5%

46% HeadroomFor Peak

Fewer Large App Servers (%Use)

Lots

Net

wor

k Co

nnec

tion

s!

Server Consolidation:● Higher initial costs● Less Complexity● Fewer Boxes● Possible app contention● Possibly lower admin costs

Possibly fewer licenses,...

Server Replication:● Lower initial costs● “Hot spots”● Less predictable● Possibly higher

management costs

DBServer

DBServer

DBServer

DBServer

Session 1038 22

Application Layer Management

● If database requires 20 processors– Oracle Apps application layer: 100 processors– SAP R3 application layer: 200 processors

● 50–100 2-way servers or 5–10 20-way servers?● More OS instances may increase costs● More application instances may increase costs● Horizontal solution may have lower

TCA but higher TCO

Session 1038 23

Consolidated Solutions

Test and Develop8 CPUs

SD app server 4 CPUs

HR app server12 CPUs

Database20 CPUs

Database20 CPUs

Session 1038 24

Presentation Layer Performance

● Applications rarely scale– Applications cannot use multiple CPUs

● Applications are not data intensive– Commonly stateless

● Internode communication minimal● Cluster costs are not an issue● Horizontal has lower acquisition cost● Horizontal has sufficient performance

Session 1038 25

How to Lower TCO

● Lower acquisition costs– Use commodity

components

● Lower complexity– Fewer OS “flavors”– Fewer OS instances

● Better utilize resources– Virtualize resources– Automate resource

migration

● Centralize– Data centers– Disaster recovery– Backup

Session 1038 26

Application Lifecycle TCO

● TCO must be managed overentire application lifecycle

● Less than 20%is hardware acquisition

Session 1038 27

TCO Implications of Cluster Scalability

● Oracle 9i RAC needs more CPUsthan Oracle 9i

● Oracle 9i RAC per CPU pricing higherthan Oracle 9i

● Acquisition costs (h/w and s/w) maybe higher for horizontal scaling

● E.g., Need 13 x 4-CPU servers to equal24-way SMP server– 52 CPUs vs. 24 CPUs

Session 1038 28

Availability Continuum

98.0%

98.5%

99.0%

99.5%

100.0%

Serv

ers

w/

On-

line

Repa

ir/R

econ

fig

Act

ive-

Act

ive

Clus

ters

Serv

ers

wit

h A

uto-

rebo

ot

Failo

ver

Clus

tere

d sy

stem

s

88 h

rs

44 h

rs

8.6

hrs

2.2

hrs

99.95% 99.975%

4.4

hrs

Cost & Complexity

99.999%

5.2

min

s

Session 1038 29

Availability Strategies

● Large SMPs: High single system RAS– Full hardware redundancy– Online serviceability– Error checking and correction– HA failover (standby server)– Cluster for highest availability requirements

● Horizontal nodes: Clusters and Replication– Cluster for availability for databases– Replicate workload for non-DB applications

Different for Horizontal and Vertical

Session 1038 30

TCO Implications of Availability

● 99.95% single SMP may be sufficient● Greater than 99.95% requires cluster– Standby failover (HA) may be sufficient

● Database active on only one node, other standby● Database needs to start up● For Oracle does not require RAC licenses

– Highest avail. level requires active cluster● 2 or more nodes active● Fast failover● For Oracle RAC licenses required

Level of Availability Affects TCO

Session 1038 31

Large vs. Small Clusters

● 1 x 6800 20-way server: $826,360– Oracle 9i licenses

● 2 x 6800 12-way servers: $1,461,360– Same performance as 1 x 20-way 6800– With Oracle 9i RAC licenses

● 8 x 480 4-way servers: $1,465,600– Same performance as 1 x 20-way 6800– With Oracle 9i RAC licenses

RAC Calculator Parameters: 90% scaling, 10% decay, 50% Oracle discount, 40% 6800 discount, 20% 480 discount

Hardware and Software Acquisition Costs

Session 1038 32

Clustering for Scalability vs. Clustering for Availability

“Even customers most familiar with clustering software and clustered systems continue to be very cautious regarding the replacement of large scale-up systems with clusters of scale-out systems.”

—Matthew Eastwood, IDC

IDC Survey Data:

Source: IDC, 2003

74% of all clusters are deployed for

Availability

Session 1038 33

Availability Summary

● For highest availability: (<99.975)– Horizontal or vertical clustering– Two large-node clusters similar cost as

multiple small node clusters

● For lower availability– Vertical with HA failover will cost less– Single vertical node may be sufficient

Session 1038 34

Vertical and Horizontal Summary

Data Facing

Tier 3

Tier 2

Tier 1 Tier 0

● Vertical scaling

● Big memory● Big disk I/O

bandwidth● Big RAS● Protect state● SAN Network Facing

● Horizontal scaling

● Med memory● Big network

I/O bandwidth● Replication● Stateless● NAS

Session 1038 35

Horizontal vs. Vertical Summary

● Horizontal ideal for web-tier– Performance and acquisition cost

● Vertical well-suited for database tier– Performance and acquisition cost

● Both can be used for application tier● Clustering good for availability

but not for performance