Java in the database–is it really useful? Solving impossible Big Data challenges
Wendy Hou, Product Manager
Mark Sweeney, Sales Engineer
© 2015 Rogue Wave Software, Inc. All Rights Reserved.
Why embed analytics?
• Faster and more efficient
– Data extraction can take a large percentage of the analysis time
– Business users can get results by changing a few variables and rerunning the models, without depending on others to implement changes and rerun
– Real time, on demand, without synchronization delay
• Simpler and greater volume
– Simpler user experience
– Able to analyze larger data sets
• Lower cost
– Opportunity cost
– Cost of maintaining the analytic infrastructure (HW, SW, staff, maintenance, platforms)
Why embed analytics in DB
• Accuracy and accessibility
– Data and formula in one place avoids potential user errors
– Invoke data and analytics from any programming language or application that can connect to the database
• Higher security – data used as input to the analytics never leaves the database
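As a rough illustration of invoking in-database analytics from a client, the sketch below uses standard JDBC to call a stored function. The function name `FORECAST_SALES`, its two integer parameters, and the class `InDbAnalyticsClient` are hypothetical and do not come from the slides; only the JDBC calls themselves are standard.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical client-side caller; not part of JMSL or the original slides.
final class InDbAnalyticsClient {

    // Builds the JDBC escape syntax for a stored function call,
    // e.g. "{? = call FORECAST_SALES(?, ?)}" for two arguments.
    static String functionCall(String functionName, int argCount) {
        StringBuilder sb = new StringBuilder("{? = call ").append(functionName).append('(');
        for (int i = 0; i < argCount; i++) {
            sb.append(i == 0 ? "?" : ", ?");
        }
        return sb.append(")}").toString();
    }

    // Calls a stored function that takes two ints and returns a double.
    // Only the two small arguments and the single result cross the network;
    // the input data stays inside the database.
    static double callAnalytic(Connection conn, String functionName, int a, int b)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(functionCall(functionName, 2))) {
            cs.registerOutParameter(1, Types.DOUBLE);
            cs.setInt(2, a);
            cs.setInt(3, b);
            cs.execute();
            return cs.getDouble(1);
        }
    }

    public static void main(String[] args) {
        // No database is needed to see the call string that would be prepared.
        System.out.println(functionCall("FORECAST_SALES", 2));
    }
}
```

Any language with a database driver could issue the equivalent call; JDBC is used here only because the deck is about Java.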
What can you use?
JMSL is the pure Java member of the IMSL family
Diverse data management world
• SQL
• NoSQL
• Hadoop
• MapReduce
• Spark
• Java
• JavaScript
• In-memory
• On-disk
Taxonomy of DB analytics
[Diagram: placement of the analytics executable relative to the platform and the data for each category: Proprietary Platform, Multitier, Distributed, Database]
• Analytics invoked externally but run in-server or in-database. Includes in-memory DBs
• Stored data and analytics are physically separated. Architecture could vary.
In-database JMSL
• Analytics run on DB’s internal JVM
• JMSL classes stored as DB objects
• Highly portable, identical code runs cross-platform
Approaches: Proprietary Platform, Multitier, Distributed, Database, In-database JMSL
Technologies: SAS, MATLAB, others; Windows, Linux; Hadoop, Cassandra-Spark; SAP HANA, Oracle Advanced Analytics; Database
Comparison criteria:
• Non-proprietary language
• Efficient data transfer
• Distributed/Scalable
• Secure
• Portable/Reusable
• Algorithm coverage
• Performance
• Low cost (with setup)
[Table: per-approach ratings and execution diagrams not recoverable from the transcript]
The solution
Challenge: Meet all requirements
In-database JMSL is uniquely positioned to solve the technical and practical challenges for DB analytics.
• Pure Java
• Minimizes network traffic
• Distributed/Scalable
• Highly Secure
• Portable/Reusable
• Algorithm Coverage
• High Performance
• Low Cost
Benefits of in-database JMSL
• Faster results
• Higher accuracy
• Better quality of data
• Higher security
• Greater accessibility
Additionally:
• Trusted technology – JMSL is a known and proven product
• Minimal risk – works with many platforms without modification
Data quality and accuracy
• JMSL has numerous data cleaning routines for numerical data
– Eliminate data staging before loading
• Reducing network traffic reduces risk of data corruption
• Data and formula in one place - avoids potential user errors
Security
• Java implementation
• Analytics run in DB process space – not an external procedure
• Core data never on network for analytics
• DB privileges can be fine-tuned: access to run analytics but not to underlying data
Ease of use, accessibility
• JMSL installation to the DB is extremely easy
• Developers only need to write SQL/Java interfaces to JMSL routines
• Analytics invoked from any language that can connect to the DB
Trusted technology
• In-database JMSL leverages known stable technologies
– Java
– SQL
• Does not require learning the latest, greatest programming language
• Does not require learning a new ecosystem
however …
• Only requirement is a JVM
– Integrates with the new ecosystems
– Callable by Scala, Groovy, Clojure, etc.
– Supported in many JavaScript engines
The details
JMSL under the hood
• Pure Java
• 100s of classes
• Part of IMSL family
• Extensive documentation
• Well supported
[Figure: JMSL architecture]
Architecture
[Diagram: within the server, the database consists of database storage (SQL subprogram, Java class, JMSL, data) and DB process execution (SQL interpreter, Java Virtual Machine, SQL engine). JMSL routines run in the DB process; some analytics packages run in server external processes.]
JMSL and SQL: not a paradigm shift
Targeting respective strengths
• Java introduced as RDBs grew into their modern form.
• JDBC was introduced in JDK 1.1 (1997)
• Direct mappings of fundamental SQL data types in Java
• Internal DB JVM allows seamless integration between Java and SQL
• Leverages stable, familiar technologies
In the database use SQL and JMSL for their respective strengths.
• SQL: queries, DDL, DML
• JMSL: advanced analytics
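The "direct mappings" point can be made concrete with a small lookup of a few standard JDBC type mappings. This is only a sketch: the class name `SqlJavaTypes` is illustrative, the list is deliberately incomplete, and individual databases may map their own types slightly differently.

```java
import java.math.BigDecimal;
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative subset of the standard JDBC SQL-to-Java object mappings.
final class SqlJavaTypes {
    static final Map<Integer, Class<?>> MAPPING = new LinkedHashMap<>();
    static {
        MAPPING.put(Types.INTEGER, Integer.class);
        MAPPING.put(Types.BIGINT, Long.class);
        MAPPING.put(Types.DOUBLE, Double.class);
        MAPPING.put(Types.NUMERIC, BigDecimal.class);
        MAPPING.put(Types.VARCHAR, String.class);
        MAPPING.put(Types.DATE, java.sql.Date.class);
        MAPPING.put(Types.TIMESTAMP, java.sql.Timestamp.class);
        MAPPING.put(Types.ARRAY, java.sql.Array.class);
    }

    // Returns the Java class sketched for a java.sql.Types code,
    // or throws if the code is not in this illustrative subset.
    static Class<?> javaTypeFor(int sqlType) {
        Class<?> javaType = MAPPING.get(sqlType);
        if (javaType == null) {
            throw new IllegalArgumentException("No mapping sketched for SQL type code " + sqlType);
        }
        return javaType;
    }

    public static void main(String[] args) {
        System.out.println(javaTypeFor(Types.DOUBLE).getSimpleName()); // prints "Double"
    }
}
```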
It’s so easy even I could do it
First step, install JMSL to the DB
… that’s it
JMSL classes as DB objects
[Screenshot: installed JMSL classes]
All dependencies resolved
Nearly 200 JMSL classes
UDF steps
1. Write UDF as Java static method
a) Compile to byte code
b) Load class file to DB
2. Write SQL call specification for UDF
a) Not a wrapper (no extra execution layer)
b) Maps Java and SQL types
c) Saved as SQL stored procedure
3. Use stored procedure for in-DB analytics
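The three steps might look roughly like this for a trivial UDF. The class `RWStatsUdf`, the method `sampleMean`, and the SQL names in the comments are hypothetical, and the plain-Java arithmetic merely stands in for a JMSL routine. The call-specification syntax in the comment assumes an Oracle-style database; other databases use different syntax.

```java
// Step 1: the UDF is a static method, compiled to a .class file and loaded
// into the database (on Oracle, for example, with the loadjava utility).
// Note: a real deployment would typically declare this class public; it is
// package-private here only so the sketch fits in any single source file.
final class RWStatsUdf {

    // Parameter and return types are chosen so they map directly to SQL types.
    public static double sampleMean(double a, double b, double c) {
        return (a + b + c) / 3.0;
    }

    // Step 2 (SQL, Oracle-style, shown as a comment; names are hypothetical):
    //   CREATE OR REPLACE FUNCTION sample_mean(a NUMBER, b NUMBER, c NUMBER)
    //     RETURN NUMBER
    //     AS LANGUAGE JAVA
    //     NAME 'RWStatsUdf.sampleMean(double, double, double) return double';
    //
    // Step 3 (SQL): SELECT sample_mean(1, 2, 6) FROM dual;

    public static void main(String[] args) {
        // Outside the database the method is ordinary Java and can be tested directly.
        System.out.println(sampleMean(1.0, 2.0, 6.0)); // prints 3.0
    }
}
```

Because the call specification only maps names and types, the same static method can be exercised in a normal unit test before it is loaded into the database.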
Java UDFs stored with SQL alias
[Screenshot callouts: (1) Java UDF as DB object, (2) SQL call spec., (3) AutoARIMA output]
Code snippet
public static java.sql.Array AA1(ResultSet rs, int nrows, int nforecast) throws SQLException {
    java.sql.Array array = null;
    // … skipped lines of data prep

    // 2D array to hold AutoARIMA output
    double[][] darr = new double[7][n+1];
    // instantiate JMSL object
    AutoARIMA autoArima = new AutoARIMA(t, x);
    // … skipped lines of data processing

    // create a varray of varrays with the double[][] data
    array = RWArrayOut.varrVarrOut(darr);
    return array;
} // from RWAutoArima.java
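The skipped data-prep lines are not shown in the slides. One plausible shape for them, assuming the query returns an integer time index in column 1 and a numeric observation in column 2 (an assumption the snippet does not confirm), is to copy the rows into the two arrays later passed to the `AutoARIMA` constructor as `t` and `x`. The class `SeriesPrep` and its methods are hypothetical and use only standard JDBC and collections.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical data-prep helper; not taken from RWAutoArima.java.
final class SeriesPrep {
    final int[] t;     // time index of each observation
    final double[] x;  // observed values

    SeriesPrep(int[] t, double[] x) {
        this.t = t;
        this.x = x;
    }

    // Reads up to nrows (time, value) pairs, starting at the row after the
    // current cursor position. Column positions are an assumption.
    static SeriesPrep fromResultSet(ResultSet rs, int nrows) throws SQLException {
        List<double[]> rows = new ArrayList<>();
        while (rows.size() < nrows && rs.next()) {
            rows.add(new double[] { rs.getInt(1), rs.getDouble(2) });
        }
        return fromRows(rows);
    }

    // Splits (time, value) pairs into the two parallel arrays.
    static SeriesPrep fromRows(List<double[]> rows) {
        int[] t = new int[rows.size()];
        double[] x = new double[rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            t[i] = (int) rows.get(i)[0];
            x[i] = rows.get(i)[1];
        }
        return new SeriesPrep(t, x);
    }

    public static void main(String[] args) {
        List<double[]> rows = new ArrayList<>();
        rows.add(new double[] { 1, 10.5 });
        rows.add(new double[] { 2, 11.0 });
        SeriesPrep prep = fromRows(rows);
        System.out.println(prep.t.length + " observations loaded");
    }
}
```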
Summary
Java in the DB is more than useful … when combined with JMSL
• Non-proprietary language
• Efficient data transfer
• Distributed/scalable
• Secure
• Portable/reusable
• Extensive collections of algorithms
• Performance
• Low cost and easy to implement
Additional resources
• White papers available at roguewave.com
– Tech tutorial: Embedding analytics into a database using JMSL
– Using JMSL in Hadoop MapReduce applications
– Time series analysis Auto Arima
– and many others
• JMSL Manual and API available at roguewave.com
• Rogue Wave Professional Services
– Development of high performance applications
– Migration services
– Assistance with Rogue Wave products