scaling java and postgresql to great heights

21
Scaling Java and PostgreSQL to Great Heights Charles Lee Hyperic

Upload: gordon

Post on 12-Jan-2016

57 views

Category:

Documents


0 download

DESCRIPTION

Scaling Java and PostgreSQL to Great Heights. Charles Lee Hyperic. What is Hyperic HQ?. Hyperic HQ is the industry's only comprehensive product that provides cross-stack visibility for software in production, whether it's open source, commercial, or a hybrid. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Scaling Java and PostgreSQL to Great Heights

Scaling Java and PostgreSQL to Great Heights

Scaling Java and PostgreSQL to Great Heights

Charles Lee

Hyperic

Page 2: Scaling Java and PostgreSQL to Great Heights

What is Hyperic HQ?What is Hyperic HQ?

Hyperic HQ is the industry's only comprehensive product that provides cross-stack visibility for software in production, whether it's open source, commercial, or a hybrid.

Translation: HQ collects and transactionally read and write a lot of data

Page 3: Scaling Java and PostgreSQL to Great Heights

Just How Much Data?Just How Much Data?

100 Platforms

700 Servers

7000 Services

150,000 metrics enabled (20 metrics per resource)

15,000 metric data points per minute (average)

21,600,000 metric data rows per day

Scenario: IT infrastructure of 100 servers (medium size deployment)

Page 4: Scaling Java and PostgreSQL to Great Heights

Distribution of HQDistribution of HQ

All inclusive installer package (JRE, JBoss, HQ, embedded database)

PostgreSQL embedded - easy license, cross-platform support, enterprise performance

Works with PostgreSQL, Oracle, or MySQL backends

Page 5: Scaling Java and PostgreSQL to Great Heights

Application Performance BottleneckApplication Performance Bottleneck

It’s the Database

Dependent on hardware performance (disk, CPU, memory, etc)

I/O

Network latency (remote database)

Slow queries

Page 6: Scaling Java and PostgreSQL to Great Heights

Breaking Through the BottleneckBreaking Through the Bottleneck

Upgrade I/O (DAS/NAS/SAN)

Upgrade H/W (64-bit multi-processors, increased RAM, etc)

Upgrade to PostgreSQL 8.2.4

Upgrade to 64-bit JRE 6

Anything else we can do?

Case Study: Hi5.com (top 15 visited website)

Using Hyperic 2.6

Page 7: Scaling Java and PostgreSQL to Great Heights

Data Access Object (or DAO)Data Access Object (or DAO)

The Data Access Object (or DAO) pattern definition from Sun:

separates a data resource's client interface from its data access mechanisms

adapts a specific data resource's access API to a generic client interface

The DAO pattern allows data access mechanisms to change independently of the code that uses the data.

Page 8: Scaling Java and PostgreSQL to Great Heights

Postgres, OracleRDBMS

Entity EJBData Access

Session EJBBusiness Logic

User InterfacePresentation

HQ 2.7 HQ 3.0

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

Postgres, OracleRDBMS

HibernateData Access

Session EJBBusiness Logic

User InterfacePresentation

The Hyperic HQ StackThe Hyperic HQ Stack

Page 9: Scaling Java and PostgreSQL to Great Heights

What’s the Problem (with EJB2)?What’s the Problem (with EJB2)?

Bad transaction handling, pessimistic locking

N+1 database problem

EJBQL - a poor query language

Home grown caches

Page 10: Scaling Java and PostgreSQL to Great Heights

Why Migrate to Hibernate?Why Migrate to Hibernate?

Straightforward transaction demarcation

Lazy fetching

HQL and Criteria based queries - more fully featured

Secondary cache integration

Popular framework

Page 11: Scaling Java and PostgreSQL to Great Heights

EJB2 Entity Beans and TransactionsEJB2 Entity Beans and Transactions

Entity Beans are proxy objects

Use Value objects to travel through transaction boundaries

Often lock pessimistically causing transaction deadlocks

MANAGER

ENTITYBEAN

ENTITYBEAN

VALUE OBJECT

BOSSGUICLI

TX TXTX

TX

Page 12: Scaling Java and PostgreSQL to Great Heights

MANAGER

Hibernate POJOs and TransactionsHibernate POJOs and Transactions

POJOs set up to lazy load collections

POJOs travel through transaction boundaries

Hibernate Sessions use optimistic locking because of session cachePOJO POJO

BOSSGUICLI

TX

Page 13: Scaling Java and PostgreSQL to Great Heights

N + 1 Database ProblemN + 1 Database Problem

Database lookups to retrieve object data

1. Issue query to retrieve collection of primary keys (PKs)

2. Issue individual query to retrieve object data per PK in collection

Total number of queries performed:N (rows) + 1 (for PKs)

Page 14: Scaling Java and PostgreSQL to Great Heights

N + 1 Database RelieveN + 1 Database Relieve

EJB2 - none Problematic in HQ 2.7

Hibernate has several solutions:1. Lazy fetching

2. Explicit outer-join declaration for associations

3. HQL supports explicit outer join fetch (“left join fetch”)

4. Multi-level caching support

Page 15: Scaling Java and PostgreSQL to Great Heights

Secondary Level CacheSecondary Level Cache

Caching reduces unnecessary roundtrips to the database

EJB2 has no support for second-level cache

Hibernate supports multi-level cache with pluggable architecture Database state can stay in memory Caches both POJOs and queries Use fast and proven caches: EHCache, JBoss TreeCache,

etc

Page 16: Scaling Java and PostgreSQL to Great Heights

The Query LanguagesThe Query Languages

EJBQL Lacks trivial functions like “ORDER BY”

JBossQL Some additional functions like “ORDER BY”, “LCASE”, etc.

Declared SQL Direct SQL-like queries declared as EJB finder methods, but

not database-specific

HQL (and Criteria API) Close to SQL Object oriented Hibernate allows query results to be paged

Page 17: Scaling Java and PostgreSQL to Great Heights

Data ConsolidationData Consolidation

Data compression runs hourly

Table storing all collected data points (most activity) capped at 2 days worth

Lower resolution tables track min, avg, and max

Inspired by RRDtool - an open source Round Robin Database to store and display time series data

Page 18: Scaling Java and PostgreSQL to Great Heights

Limited Table GrowthLimited Table Growth

MEASUREMENT_DATA

MEASUREMENT_IDTIMESTAMPVALUE

Size Limit 2 Days

MEASUREMENT_DATA_1H

MEASUREMENT_IDTIMESTAMPVALUEMINMAX

Size Limit 14 Days

MEASUREMENT_DATA_6H

MEASUREMENT_IDTIMESTAMPVALUEMINMAX

Size Limit 31 DaysMEASUREMENT_DATA_1D limit 1 year

Page 19: Scaling Java and PostgreSQL to Great Heights

Performance ComparisonPerformance Comparison

HQ 2.6 HQ 3.0

Application Server

Load 6 1

Total JVM Memory

4.5 GB 1.5 GB

Database Server

Load 5 3

TCP Inbound

268 124

H5.com - using external database server

Page 20: Scaling Java and PostgreSQL to Great Heights

As A ResultAs A Result

HQ’s performance improved dramatically

Hi5.com can downgrade hardware

Continue to rely on PostgreSQL

Adding hundreds more boxes to production environment under HQ’s management

Page 21: Scaling Java and PostgreSQL to Great Heights

Questions and CommentsQuestions and Comments

Charles Lee

[email protected]