welcome [tc18.tableau.com] · server cert. how to turn on jdbc? ... • sap hana –sap hana...
TRANSCRIPT
Welcome
JDBC Connectors
Eric Wang
Senior Software Engineer
Tableau / Connectivity
# T C 1 8
Wei Su
Senior Software Engineer
Tableau / Connectivity
Liyun Bao
Senior Software Engineer
Tableau / Connectivity
Agenda
• JDBC Connectors in Tableau
• Generic JDBC
• Performance
• Q&A
JDBC Connectors in Tableau
Eric Wang
JDBC Connectors
• Which Connector supports JDBC?
• How does JDBC work in Tableau?
• Why do we need JDBC Connector?
• How to turn on(Desktop, Server)?
Which connector supports JDBC?
JDBC Only Connector(all platforms)
Amazon Athena
JDBC on Mac, both JDBC and ODBC on Win:
SAP Hana
Both exist on all platforms(default is ODBC):
Microsoft SQL Server, Oracle, Presto
Generic JDBC
JDBC vs ODBC
• Java Database Connectivity
• Introduced by Sun MicroSystem at 1997
• Must use Language Java
• Platform independent because of Java
• Code is easy to read
• Open Database Connectivity
• Introduced by Microsoft at eraly 1990s
• We can use any language, mostly C, C++
• Mostly used in Windows platform
• Fast performance for data import/export
How does JDBC work in Tableau?
Database
JDBC DriverTableau C++ code Java Client codeTabjdbc.jarJava Native
Interface(JNI)
JDBC APIProtocol Buffer
tabprotosrv
Java VirtualMachine(JVM)
Compile
Why do we need JDBC
1. Some DB only have JDBC drivers: Athena
2. Some DB only have JDBC driver on Mac: SAP Hana
3. Limitation in ODBC connector: Oracle SSL (discuss later)
4. performance (discuss later)
Oracle JDBC Benefits - SSL
• Using SSL feature with Oracle OCI: Add server certificate to your Oracle Wallet, add many configurations for both Desktop and Server
• Using SSL feature with Oracle JDBC: all you need do is open Oracle connection dialog, click "require SSL", and upload server cert.
How to turn on JDBC?
• Attention: The following content is only intended for advanced users
• Only do this if you know what you are doing
• As this is not officially supported by Tableau, you should use this feature at your own risk
How to turn on JDBC – Tableau Desktop
• SAP Hana, Microsoft SQL Server, Oracle, Presto
• Install a 64bit JRE(you may need to point your JAVA_HOME to JRE).
• Setup the drivers.Windows: C:/Program Files/Tableau/Drivers.
Mac: ~/Library/Tableau/Drivers.
• Launch Tableau.
Windows: tableau.exe -DForceJdbc
Mac: Tableau.app --args -DForceJdbc
Advanced usage:
Windows: tableau.exe -DUseJdbc=saphana, sqlserver, oracle, presto
Mac: Tableau.app --args -DUseJdbc=saphana, sqlserver...
** The JRE version installed need to match the java version your driver is using.
How to turn on JDBC – Tableau Server
• Make Sure you have 64bit JRE.
• Stop Server.
• Setup the drivers.Windows: C:/Program Files/Tableau/Drivers.
Linux: /opt/tableau/tableau_driver/jdbc
• Edit server configure file
tsm configuration set –k native_api.force_jdbc –v true
tsm pending-changes apply
tsm configuration set –k native_api.use_jdbc –v saphana,sqlserver...
tsm pending-changes apply
• Start Server
Publishing and Cross-Platform
ODBC
JDBC
ODBC
JDBC
JDBC/ODBC
Sharing
Publishing
Drivers
Don't modify this!!
Demo of using JDBC
Future work
JDBC Server:
JDBC Server hosts JDBC connections for use by C++ processes. It hosts multiple connections, instead of 1-JVM-per-connection pattern we currently follow in protocol server
Benefits:
• We only need to start the process once, and it can be pre-emptively started before we need it for turbo boost.
• JVM warm up/JIT only occurs once, and not for each connection.
• Reduced memory footprint. JDBC in protocol server uses 256MB heap for each connection, this can add up when we have 100’s of connections. For JDBC Server we will use a single process with a large (512M or 1G) heap which should work just as well. When running on desktop we would be more conservative.
Generic JDBC
Wei Su
Generic JDBC
• Ship in 2018.3
• Available on all platforms: Win, Mac, Linux(tableau server)
• Driver limitation:
You must have read permissions on the .jar file.
Tableau requires a JDBC 4.0 or later driver.
Tableau requires a Type 4 JDBC driver.
Generic JDBC
• Jdbc url: jdbc:mysql://hostname:3306/dbname
• Dialect selection: MySQL, PostgreSQL, SQL92
• Basic Username/Password.
• Properties file for each connection.
• Also support TDC file.
Generic JDBC properties file
Properties file is common in jdbc world.
Properties are configuration values managed as key/value pairs. In each pair, the key and value are both String values.
Mysql.properties Example:
useSSL= true
Generic JDBC TDC file
TDC file is common in tableau world.
Example file for using generic JDBC connect to mysql database.
<connection-customization class='genericjdbc' enabled='true'><vendor name='genericjdbc' /><driver name='mysql' /><customizations><customization name='CAP_JDBC_EXAMPLE' value='yes' /></customizations>
</connection-customization>
Url: jdbc:mysql://example.
Publishing and cross-platform portability
ODBC
JDBC
ODBC
JDBC
Generic JDBC
Sharing
Publishing
Drivers
Demo of Generic JDBC
Generic JDBC + MySQL + Neo4J
Performance
Liyun Bao
Agenda
• JDBC vs. ODBC, which one is faster
• Tune JDBC Performance with Fetch Size
• Setting JDBC Fetch Size in Tableau
• Demo
Experiment Settings
• Target DB: Athena, Redshift, Oracle, Presto and SAP HANA• Athena - athena.us-east-1.amazonaws.com• Redshift - [tableau server].us-east-1.redshift.amazonaws.com (multi-node)• Oracle - Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production• Presto – 0.212• SAP HANA – SAP HANA version 1.00.101
• Scenario• Compare extracting performance with JDBC or ODBC connection• 1000≤ Row Number ≤5M
• Environment• Dataset: TPC-H 10GB• Table Name: LINEITEM• Row Count: 60M• Row Size: 300b• Machine Setting: Windows 10 Enterprise 64-bit, Intel Xeon CPU E5-2630 2.40 GHz, 32.0 GB
RAM
JDBC vs. ODBC, which one is faster?
• JDBC is faster – Athena, Redshift
JDBC30%faster
JDBC3x
faster
JDBC vs. ODBC, which one is faster?
• ODBC is faster – Oracle, Presto
ODBC10x
faster
ODBC2.5x
faster
JDBC vs. ODBC, which one is faster?
• ODBC is faster for small result set, JDBC is faster for large result set – SAP HANA
JDBC30%faster
• JDBC Fetch Size
• Determine how many rows each fetch from DB
• With large fetch size
• Fewer round trips to DB, hence better performance
• Test Scenario
• Compare extracting time of different JDBC Fetch Size
Tune JDBC Performance with Fetch Size
• JDBC performance gets better with larger JDBC Fetch Size• Oracle ~ 4x ~ 10x faster
Tune JDBC Performance with Fetch Size
• Should we set the JDBC Fetch Size as large as possible?
Tune JDBC Performance with Fetch Size
• Not all JDBC drivers support setting JDBC Fetch Size
Setting JDBC Fetch Size in Tableau
Database Set JDBC Fetch Size by connecton property?
JDBC Fetch Size Property Default Value
Athena Yes RowsToFetchPerBlock 10000 for result set streaming, 1000 for pagination
Redshift No
Presto No
Oracle Yes defaultRowPrefetch 10
SAP HANA No
• Named Connector• Set JDBC Fetch Size in properties file
Setting JDBC Fetch Size in Tableau
• Generic JDBC• Set JDBC Fetch Size in URL
• Set JDBC Fetch Size in properties file
Setting JDBC Fetch Size in Tableau
Connection URL Properties File
Demo – Tune Oracle JDBC Performance
Please complete the
session survey from the
Session Details screen
in your TC18 app
Thank you!
#TC18
Contact or CTA info goes here