Revolutionizing the Data Abstraction Layer with IBM Optim pureQuery and DB2
DESCRIPTION
Vladimir Bacvanski and Dan Galvin
Looking for a more flexible and efficient way for Java programs to access the database? Join us as we explore how you can bridge the gap between Java and relational databases. Enhance your Java environment with access layer generation, data access best practices, traceability between Java packages and SQL statements, improved impact analysis and more. And most importantly, see how new technology can improve not only new development, but existing applications as well. Be prepared to see designs and code samples!
TRANSCRIPT
October 25–29, 2009 • Mandalay Bay • Las Vegas, Nevada
Revolutionizing the Data Abstraction Layer with IBM Optim pureQuery
Dr. Vladimir Bacvanski, Vice President, InferData, [email protected]
Daniel Galvin, Consultant, Galvin Consulting, [email protected] Number 2171
What is this revolution about?
NO SLOW APPS!
NO BAD SQL!
GET CONTROL BACK!
Show of Hands: What Data Access Technology Have You Used?
What’s most important to you?
– Productivity
– Performance
– Security
– Portability
[Word cloud: JDBC, iBatis, SQLJ, Hibernate, EJB Entity Beans, JPA, mashup, http, Stored Procedures, JSON, QoS goals, JSP, XML, Runstats, Response Time!, SQL, Spring, REORG, Partition strategy]
Java Data Access – Two Views of the World
Application Developer (SQLJ, JDBC, JPA, iBatis, . . ., Spring)
Why does this query take so long?
I can’t believe I got called out last week. I wish I could see how these queries will run in production.
Writing Java code is so easy with this Eclipse environment. I wish it was that easy to get the SQL right.
This ORM doesn’t allow me to leverage all my database’s SQL.
Static SQL? Sounds like another delay to getting my program deployed.
Sometimes I need POJOs, sometimes JSON, sometimes XML. What should I use?
Database Developer & Administrator
Another runaway query! Where are these coming from? JDBC? Hmmm…
Inconsistent response time? How long will it take me to find the offending application sending bad SQL this time?
These ad-hoc queries are dangerous. We need a library of tested SQL interfaces.
Can I examine the SQL “before” the application is deployed?
Another GRANT request? This security administration is out of control.
Data Mapping Approaches
Application-Centric
– Top-Down
– Start with Object Domain Model
– ORM Mapping
– Well supported in dynamic languages and frameworks
Hybrid
– Meet in the middle
– Can be challenging without compromising
Data-Centric
– Bottom-Up
– Start with Relational Data Model
– Not well supported in dynamic languages and frameworks
[Diagram: the Persistence Layer approached Top Down, Bottom Up, or Meet in the Middle]
EJB, JPA, and Hibernate vs. The Database
DBA and SQL developer chasm
– Where is the SQL coming from?
– What is it? Where is it?
– How do we tune it?
– How do we manage it?
Performance Concerns:
– Some app server vendors claim (unsurprisingly) that managed objects perform fine.
– There are many user claims on the web that managed object performance is bad.
– As always, the truth is in the middle, and will depend on your app server, application, database, etc.
“Our top story: Large Customer moves from COBOL to Java to become more agile. In other news, DBAs develop amnesia.”
Introducing pureQuery
A high-performance data access platform to simplify developing, managing, securing, and optimizing data access.
pureQuery Components:
Simple and intuitive API
– Enables SQL access to databases or in-memory Java objects
– Facilitates best practices
Optim Development Studio (integrates with RAD/RSA)
– Integrated development environment with Java and SQL support
– Improve problem isolation and impact analysis
Optim pureQuery Runtime
– Flexible static SQL deployment for DB2
pureQuery Balances Productivity and Control
[Spectrum: Full SQL control, Object-relational mapping, Managed objects]
JDBC / SQLJ: Code all your SQL
iBATIS, Spring templates: Use SQL templates, inline only
pureQuery: Add basic OR mapping and annotated-method style
Hibernate: Complex OR mapping and persistence management, but loss of controls
OpenJPA (EJB3): Adds container management option
Design Phase pureQuery close-up
Code Development Productivity
• Code Generation, Content Assist
• Database aware, Java SQL Editor
SQL Performance Metrics
• Find and sort query elapsed time from Java
Java to SQL Integration
• Categorize by Java, SQL, Database, Packages; track back to line of code
SQL Injection Prevention
• Lock down SQL for Dynamic
Static SQL
• Lock in Access plans, Improve Security, Consistent Performance
Problem Determination
• Monitor WebSphere Connection Pool, JDBC Driver, Network
• Track back to SQL and line of code in the application
SQL Replacement
• Replace Query w/o changing source
Existing JDBC to Static
• Reroute Dynamic Queries to Static
Jump Start Application Design
• Generate SQL and Code from Database Objects
• Setup basic DAO Pattern
Oracle Support
• Replace Query w/o changing source
Code Example: JDBC

Table   Column      Type
EMP     NAME        CHAR(64)
EMP     ADDRESS     CHAR(128)
EMP     PHONE_NUM   CHAR(10)

    class Employee {
        String name;
        String homeAddress;
        String homePhone;
        …
    }

    java.sql.PreparedStatement ps = con.prepareStatement(
        "SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?");
    ps.setString(1, name);
    java.sql.ResultSet rs = ps.executeQuery();
    rs.next();  // advance to the first row
    Employee myEmp = new Employee();
    // copy each column into the bean by position
    myEmp.setName(rs.getString(1));
    myEmp.setHomeAddress(rs.getString(2));
    myEmp.setHomePhone(rs.getString(3));
    rs.close();
Code Example: pureQuery
Employee myEmp = db.queryFirst( "SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?", Employee.class, name);
Even simpler, if we have a method getEmployee with a Java annotation or XML file with SQL for the query:
Employee myEmp = getEmployee(name);
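To make concrete what a call such as queryFirst saves the developer from writing, here is a minimal sketch of row-to-bean mapping. It is not the pureQuery implementation or API: the class RowMappingSketch, the annotation Col, and the method mapRow are hypothetical names invented for illustration, and a plain Map stands in for a database row so the example runs without a database.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Conceptual sketch only: hypothetical names, NOT the pureQuery implementation.
final class RowMappingSketch {

    /** Hypothetical annotation that ties a bean field to a column name. */
    @Retention(RetentionPolicy.RUNTIME)
    @interface Col {
        String value();
    }

    /** Bean from the slide, with field names that differ from the column names. */
    static final class Employee {
        @Col("NAME") String name;
        @Col("ADDRESS") String homeAddress;
        @Col("PHONE_NUM") String homePhone;
    }

    /** Creates a bean of the given type and copies matching column values into it. */
    static <T> T mapRow(Map<String, Object> row, Class<T> type) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                Col col = field.getAnnotation(Col.class);
                if (col != null && row.containsKey(col.value())) {
                    field.setAccessible(true);
                    field.set(bean, row.get(col.value()));
                }
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map row to " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // A Map stands in for one row of the EMP table.
        Map<String, Object> row = new HashMap<>();
        row.put("NAME", "Ada");
        row.put("ADDRESS", "1 Main St");
        row.put("PHONE_NUM", "5550100");

        Employee emp = mapRow(row, Employee.class);
        System.out.println(emp.name + " / " + emp.homeAddress + " / " + emp.homePhone);
    }
}
```

The point of the sketch is only that the column-to-field copying in the JDBC example is mechanical, so a library can do it once instead of every query repeating it by hand.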
WHY SHOULD DATA SPECIALISTS BE INTERESTED IN PUREQUERY?
Motivations of the Data Specialists
SQL Performance Tuning
– Ease of Tuning
– Autonomy of Developers
– Predictability of an Optimized Data Access Path
Reduction of Costs to satisfy SQL statements
– Optimized Access Paths
– Reduction of CPU intensive components of SQL Execution
– Utilization of Specialty Processors
Capacity Planning
– Hardware Utilization
Problem Determination capabilities
pureQuery Capabilities
Static SQL for Runtime with Dynamic SQL Execution in Development
pureQuery can utilize SELECT INTO from a Java application
With Client Optimization, Static SQL from existing JDBC with no changes to the Application
Homogeneous and Heterogeneous Batching of Statements
Statically bound packages are easy to EXPLAIN and monitor for changes in access path
pureQuery coupled with IBM Optim Performance Monitoring provides E2E Performance Monitoring and Problem Determination
Impact analysis is greatly improved by the static packages and the ability to tie each statement to a method in the application code
STATIC VS. DYNAMIC SQL EXECUTION
Static vs. Dynamic SQL
Dynamic SQL – Full Prepare (~ 300,000 lines of code)
– Check Plan/Package Authorization and look for stmt in Cache
– Parse SQL Statement
– Validate SQL against DB2 Catalog
– Check Table/View authorization
– Compute Access Path
– Create Skeleton in EDM Pool
– Execute Statement
Dynamic SQL – Short Prepare (~ 15,000 lines of code)
– Check Plan/Package Authorization and look for stmt in Cache
– Copy skeleton from cache to local DB2 thread storage
– Execute Statement
Static SQL
– Check Plan/Package Authorization
– Load package into EDM if not previously loaded
– Execute Statement
Cost of Prepare
CPU cost of Short Prepare on DB2 9 for z/OS – between 400µs and 1ms
CPU cost of Full Prepare on DB2 9 for z/OS – approximately 30 to 50ms. The cost can be much higher and generally increases with complexity.
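The two prepare costs can be combined into a rough per-statement estimate. The sketch below is illustrative arithmetic only, assuming that every dynamic statement pays either a short prepare (statement cache hit) or a full prepare (cache miss); the class and method names are hypothetical, and the hit ratios of 70% and 85% are borrowed from the range quoted for the IRWW test later in this deck.

```java
// Illustrative arithmetic only: class and method names are hypothetical.
final class PrepareCostEstimate {

    /**
     * Expected prepare CPU per dynamic statement, in milliseconds:
     * hitRatio * shortPrepareMs + (1 - hitRatio) * fullPrepareMs.
     */
    static double expectedPrepareMs(double hitRatio, double shortPrepareMs, double fullPrepareMs) {
        if (hitRatio < 0.0 || hitRatio > 1.0) {
            throw new IllegalArgumentException("hitRatio must be between 0 and 1");
        }
        return hitRatio * shortPrepareMs + (1.0 - hitRatio) * fullPrepareMs;
    }

    public static void main(String[] args) {
        // Slide figures: short prepare 0.4 ms to 1 ms, full prepare 30 ms to 50 ms.
        // Assumed cache hit ratios: 85% (favorable case) and 70% (unfavorable case).
        double favorable = expectedPrepareMs(0.85, 0.4, 30.0);
        double unfavorable = expectedPrepareMs(0.70, 1.0, 50.0);
        System.out.printf("Expected prepare CPU per statement: %.2f ms to %.2f ms%n",
                favorable, unfavorable);
    }
}
```

Under these assumptions the estimate is dominated by the cache misses: even at an 85% hit ratio, the 15% of statements that need a full prepare account for most of the expected cost.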
Trial Results: Estimated CPU Savings (in DB2 CPU)
[Chart: Avg CPU Time (ms), JDBC vs. pureQuery]
How well does it work? – Java applications
In-house testing shows significant performance improvements
IRWW – an OLTP workload, Type 4 driver
Cache hit ratio between 70 and 85%
23% improvement in throughput using pureQuery over dynamic JDBC
15% - 25% reduction in CPU per transaction over dynamic JDBC
[Chart: Normalized Throughput (ITR) by API for JDBC Type 4 Driver. Bar values: 274, 360, 420, 446, 485, 524]
[Chart: % increase/reduction in CPU per transaction compared to JDBC using Type 4 driver. Bar values: -35%, -14%, 6%, 15%, 25%]
How well does it work? – .NET applications
Throughput during static execution increased by 159% over dynamic SQL execution assuming a 79% statement cache hit ratio
*Any performance data contained in this document were determined in various controlled laboratory environments and are for reference purposes only. Customers should not adapt these performance numbers to their own environments as system performance standards. The results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.
IRWW – OLTP application
Application accesses DB2 for z/OS
BATCHING SQL STATEMENTS
Homogeneous & Heterogeneous Batch
Homogeneous Batch – all instances in the batch are the same statement and require only 1 line turn
Heterogeneous Batch – allows different SQL statements to be included in batch.
Both Utilize Multi-Row Insert
Heterogeneous Batches may contain 0 to many Homogeneous Batches
Heterogeneous Batch Example

    public int[][] insertAgentAndPolicies(AgentTO agentTO, PolicyTO[] policyTO) {
        Data data = DataFactory.getData(ds, pdqProps);
        try {
            // Both data access interfaces share the same Data instance
            AgentData agentData = DataFactory.getData(AgentData.class, data);
            PolicyData policyData = DataFactory.getData(PolicyData.class, data);
            // Statements issued between startBatch and endBatch form one batch
            data.startBatch(HeterogeneousBatchKind.heterogeneousModify__);
            agentData.insertAgent(agentTO);
            policyData.insertPolicies(policyTO);
            return data.endBatch();
        } finally {
            data.close();
        }
    }
Client Optimization
Allows you to bind static SQL packages from existing JDBC code
Avoids the cost of rewriting the application to code to the pureQuery API
Allows Heterogeneous batch with minor changes to the code
None of the productivity advantages are realized. Code is still maintained in JDBC.
End-to-End monitoring lacks some introspective capability into the coding
Creation of the static packages requires that you run the code.
Some overhead at runtime related to resolution of statements to static packages
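The runtime resolution step mentioned above can be pictured with a small model. This is a conceptual sketch only, not the pureQuery runtime or its configuration: the class ClientOptimizationSketch, the enum Mode, and the methods captureAndBind and resolve are hypothetical names, and matching on exact SQL text is an assumption made to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

// Conceptual model only. Class, enum, and method names are hypothetical;
// this is not the pureQuery runtime or its configuration.
final class ClientOptimizationSketch {

    enum Mode { STATIC, DYNAMIC, REJECTED }

    // SQL text recorded while the application was exercised (assumed exact match).
    private final Set<String> boundStatements = new HashSet<>();
    private final boolean allowDynamicFallback;

    ClientOptimizationSketch(boolean allowDynamicFallback) {
        this.allowDynamicFallback = allowDynamicFallback;
    }

    /** Capture and bind: remember a statement the running application issued. */
    void captureAndBind(String sql) {
        boundStatements.add(sql);
    }

    /** Execute: decide how a statement coming from unchanged JDBC code would run. */
    Mode resolve(String sql) {
        if (boundStatements.contains(sql)) {
            return Mode.STATIC;
        }
        return allowDynamicFallback ? Mode.DYNAMIC : Mode.REJECTED;
    }

    public static void main(String[] args) {
        ClientOptimizationSketch runtime = new ClientOptimizationSketch(false);
        runtime.captureAndBind("SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?");

        // A captured statement resolves to static execution.
        System.out.println(runtime.resolve("SELECT NAME, ADDRESS, PHONE_NUM FROM EMP WHERE NAME=?"));
        // SQL that was never captured, e.g. text altered by string concatenation.
        System.out.println(runtime.resolve("SELECT NAME FROM EMP WHERE NAME='x' OR '1'='1'"));
    }
}
```

The model also shows why the static packages can only be created by running the code: a statement that is never exercised during capture is never bound, so it either runs dynamically or is refused, depending on how strict the deployment is.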
Optimize Existing JDBC Applications
pureQuery client optimization enables static execution for JDBC applications (custom-developed, framework-based, or packaged)
[Diagram: Existing JDBC Application, JDBC Driver w/ pureQuery, DB2 Data Servers; dynamic SQL execution or static SQL execution; captured SQL-related metadata]
Process: Capture, Configure, Bind, Execute
"The ability to use static SQL with pureQuery is huge. Recently, I worked with a client who could reduce CPU usage by 7 percent thanks to this one feature."
— David Beulke, Pragmatic Solutions Inc.
Improve performance for DB2 – without changing a line of code
Why should DBAs care?
DBAs have little to no visibility of application SQL before deployment, and no opportunity for review and optimization
Problem isolation takes days with contemporary environments such as Java, PHP, and .NET, due to the inability to trace SQL to the Java application and source code
Constantly increasing Java application workload taxes existing systems – need to fit more work into existing systems
SQL injection represents an increasing risk to data security
Why should Developers care?
Get data access right the first time!
Get it done faster - Improved productivity
Single environment that spans Java application and database development
Improved problem isolation and resolution
Optim pureQuery Runtime
Control performance
– Decide at deployment time how the SQL is executed
– Understand and lock down the access plan for SQL
– Replace suboptimal SQL without changing the application
Control security
– Prevent SQL injection
– Prevent execution of unauthorized SQL
– Better manage database security
See inside applications that are driving your database
– Understand where SQL comes from
– Understand when frameworks and ORMs are getting in the way
Simplify problem determination and troubleshooting
– Correlate problem SQL with applications, ORMs and frameworks
How do I start with pureQuery?
Existing applications
– Optimize existing JDBC (and .NET!) applications
– No code changes needed
– Have to go through the client optimization process to get to static SQL
New applications
– Use the pureQuery API
– Developers code against one API regardless of whether it is deployed dynamically or statically
– DBA deploys statically
– No need to go through client optimization process
Other
– JPA, iBatis, Hibernate
pureQuery Facilitates Best Practices
Supports both inline SQL and Java annotations (method)
Intuitive interfaces for common data retrieval and manipulation scenarios hide JDBC complexity
– Query First
– Homogeneous Batch
Reduce network trips to the database
– Query Over Java Collections
– Heterogeneous Batch
Use custom result handlers to map results to POJOs, XML, JSON, …
Write high performance Java data access applications, Part 3: Data Studio pureQuery API best practices
-- Vitor Rodrigues
http://www.ibm.com/developerworks/db2/library/techarticle/dm-808rodrigues/?S_TACT=105AGX01&S_CMP=LP
A Typical Application Architecture with pureQuery
Presentation Layer
– Implements the U/I or network protocols using the business services
Business Service Layer
– Gets data from the Data Access Layer
– Never use the pureQuery API directly
Data Access Layer
– Uses the pureQuery API to access the database
– Provides a technology-neutral API to the data used by the business services
– pureQuery makes this layer easy, fast, consistent and traceable
pureQuery
Database
Sometimes additional layers are required
– Workflow
– Data federation
– Workspace
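The layering rule above (business services never touch the data access technology directly) can be sketched in a few lines. Everything here is illustrative: the names LayeringSketch, EmployeeDao, InMemoryEmployeeDao, and DirectoryService are hypothetical, and an in-memory map stands in for the pureQuery-backed implementation so the example runs without a database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative layering only. All type and method names are hypothetical.
final class LayeringSketch {

    static final class Employee {
        final String name;
        final String homePhone;

        Employee(String name, String homePhone) {
            this.name = name;
            this.homePhone = homePhone;
        }
    }

    /** Data Access Layer contract: technology-neutral, exposes no SQL or driver types. */
    interface EmployeeDao {
        Optional<Employee> findByName(String name);
    }

    /** Stand-in implementation; a database-backed class would implement the same interface. */
    static final class InMemoryEmployeeDao implements EmployeeDao {
        private final Map<String, Employee> rows = new HashMap<>();

        void add(Employee employee) {
            rows.put(employee.name, employee);
        }

        @Override
        public Optional<Employee> findByName(String name) {
            return Optional.ofNullable(rows.get(name));
        }
    }

    /** Business Service Layer: depends only on the DAO interface. */
    static final class DirectoryService {
        private final EmployeeDao dao;

        DirectoryService(EmployeeDao dao) {
            this.dao = dao;
        }

        String phoneFor(String name) {
            return dao.findByName(name).map(e -> e.homePhone).orElse("unknown");
        }
    }

    public static void main(String[] args) {
        InMemoryEmployeeDao dao = new InMemoryEmployeeDao();
        dao.add(new Employee("Ada", "5550100"));

        DirectoryService service = new DirectoryService(dao);
        System.out.println(service.phoneFor("Ada"));
        System.out.println(service.phoneFor("Bob"));
    }
}
```

Because DirectoryService sees only the interface, the data access implementation can be swapped or retuned without touching the business or presentation code.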
RAD or RSA / Optim Development Studio Data Centric Development Scenario
Presentation
•Dojo, JSF, …
Application
•Business Logic
Objects
•Access to data
Tables
•Data
Write in Java using RAD + WAS Feature Pack for Web 2.0
Write in Java with pureQuery using Optim Dev. Studio in RAD
Access generated Java data objects from code developed in RAD
Demo!
Conclusion: pureQuery Revolutionary Advantages
Excellent performance
– Static and dynamic SQL is captured during test and optimized before deployment
– Enables lock-in of access path
Great productivity
– Excellent tool support through Optim Development Studio
  • Shell share with Rational tools
– Mapping from SQL to Java captured and traceable
– Facilitates collaboration between DBAs and developers
  • Performance tuning, impact analysis
Better security
– Limits SQL injection
– Controlled database access
Where to go Next? Resources and more…
Optim Development Studio – http://www.ibm.com/software/data/optim/development-studio/
IBM pureQuery – http://www.ibm.com/software/data/optim/purequery-platform/faq.html
pureQuery Custom Training – InferData, IBM Business Partner http://www.inferdata.com
– Course: Developing Database Applications with Optim Development Studio and pureQuery
– http://www.inferdata.com/training/data/optim_purequery_training.html
Web, Blogs
Integrated Data Management (Optim and Data Studio) – http://www.ibm.com/developerworks/spaces/optim
Vladimir’s Blog: On Building Software – http://www.OnBuildingSoftware.com
Twitter: – http://twitter.com/OnSoftware
Thank You! Your Feedback is Important to Us
Please complete the survey for this session by:
– Accessing the SmartSite on your smart phone or computer at: iodsmartsite.com • Surveys / My Session Evaluations
– Visiting any onsite event kiosk • Surveys / My Session Evaluations
– Each completed survey increases your chance to win an Apple iPod Touch with daily drawing sponsored by Alliance Tech