Getting Started With Apache Ignite SQL
Denis Magda, GridGain Developer Relations
Igor Seliverstov, GridGain Architecture Group
Topics
• Ignite SQL Basics: DML, DDL, connectivity, configuration
• Affinity Co-Location and Distributed JOINs
• Beyond Memory Capacity: Disk Tier Usage and Memory Quotas
• Ignite SQL Evolution With Apache Calcite
Ignite SQL Basics
Ignite SQL = ANSI SQL at Scale
• ANSI-99 DML and DDL syntax
  – SELECT, UPDATE, CREATE…
• Distributed joins, grouping, sorting
• Schema changes at runtime
  – ALTER TABLE, CREATE/DROP INDEX
• Works with in-memory and disk-only records
  – If Ignite Persistence is used as a disk tier
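To make the statement types above concrete, here is a minimal sketch that runs the same kinds of statements (CREATE TABLE, INSERT, UPDATE, SELECT with ORDER BY, ALTER TABLE, CREATE/DROP INDEX). It uses Python's built-in sqlite3 module purely as a stand-in engine so that the example runs without a cluster; it is not Ignite, the table and column names are made up, and Ignite-specific clauses are not shown.

```python
import sqlite3


def run_basic_sql():
    """Run basic DDL and DML statements against an in-memory stand-in database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # DDL: create a table.
    cur.execute(
        "CREATE TABLE City (id INTEGER PRIMARY KEY, name VARCHAR, population INTEGER)"
    )

    # DML: insert and update rows.
    cur.executemany(
        "INSERT INTO City (id, name, population) VALUES (?, ?, ?)",
        [(1, "Toronto", 2_700_000), (2, "Paris", 2_100_000), (3, "Calgary", 1_200_000)],
    )
    cur.execute("UPDATE City SET population = population + 1 WHERE name = ?", ("Calgary",))

    # Schema changes at runtime: add a column, create an index, drop the index.
    cur.execute("ALTER TABLE City ADD COLUMN country_code CHAR(3)")
    cur.execute("CREATE INDEX idx_city_population ON City (population)")
    cur.execute("DROP INDEX idx_city_population")

    # Query with sorting.
    rows = cur.execute(
        "SELECT name, population FROM City ORDER BY population DESC"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, population in run_basic_sql():
        print(name, population)
```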
Connectivity Options
• Thick Client APIs
  – Java, C#/.NET, C++
• JDBC and ODBC drivers
• Thin Client APIs
  – Multi-language support
Configuration Option #1: Programmatically With Annotations
Usage Scenario:
• Spring-style development by annotating POJOs
• DDL can be used to apply changes at runtime.
Configuration Option #2: Spring XML With Query Entities
Usage Scenario:
• Ignite as a cache that writes through changes to an external database.
• DDL can be used to apply changes at runtime.
Configuration Option #3: In Pure SQL With DDL
Usage Scenario:
• SQL-driven applications
• Green-field applications using Ignite as a database with its native persistence
Demo Time
Cluster Startup and Database Creation
Affinity Co-Location and Distributed JOINs
Ignite SQL Engine Internals
[Diagram: three cluster nodes; each node holds its own data and indexes and runs the Ignite SQL layer together with an H2 engine.]
Query Execution Phases
[Diagram: a thick client queries the City table; the server nodes run the map phase, and the partial results are merged in the reduce phase.]
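The map and reduce phases can be sketched as follows. This is a simplified, single-process illustration with made-up data: each inner list stands for the City rows stored on one server node, `map_phase` plays the part executed on every server, and `reduce_phase` the part executed on the node that initiated the query. The function names are hypothetical, not Ignite APIs.

```python
from collections import Counter

# Hypothetical City rows (name, country_code) spread over three server nodes.
NODES = [
    [("Toronto", "CAN"), ("Paris", "FRA")],
    [("Calgary", "CAN"), ("Marseille", "FRA")],
    [("Montreal", "CAN"), ("Ottawa", "CAN")],
]


def map_phase(local_rows):
    """Runs on each server: count the locally stored cities per country."""
    return Counter(code for _, code in local_rows)


def reduce_phase(partials):
    """Runs on the query initiator: merge the per-node partial counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


def count_cities_per_country(nodes):
    """Equivalent of SELECT country_code, COUNT(*) FROM City GROUP BY country_code."""
    return reduce_phase(map_phase(rows) for rows in nodes)


if __name__ == "__main__":
    print(count_cities_per_country(NODES))
```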
Default Data Distribution
[Diagram: the Country table (Canada, France) and the City table (Toronto, Calgary, Montreal, Ottawa, Paris, Marseille) distributed across nodes independently of each other.]
SQL JOIN With Data Shuffling
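[Diagram: a thick client and two server nodes. One node stores Canada, Toronto, Calgary, and Paris; the other stores France, Marseille, Ottawa, and Montreal. Paris, Ottawa, and Montreal are shuffled between the nodes. The client is labeled with steps 1 and 4, the servers with step 2, and the shuffling with step 3.]
1. Initiating Execution
2. Execution on Servers (map phase)
3. Data Shuffling
4. Reduce Phase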
Co-Located Distribution (aka. Affinity Co-Location)
[Diagram: the Country and City tables co-located by country. Canada is stored together with Toronto, Calgary, Montreal, and Ottawa; France is stored together with Marseille and Paris.]
All You Need is to Configure Affinity Key
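A minimal sketch of what configuring an affinity key can look like in DDL. Ignite's CREATE TABLE accepts an `affinity_key` parameter in its WITH clause, and the affinity column has to be part of the primary key. The helper `create_table_ddl`, the City columns, and the choice of CountryCode as the affinity key are assumptions made for illustration; this is not necessarily the statement shown on the slide.

```python
def create_table_ddl(table, columns, primary_key, affinity_key):
    """Build an Ignite-style CREATE TABLE statement with an affinity key."""
    if affinity_key not in primary_key:
        # The affinity key column must be part of the primary key.
        raise ValueError(f"affinity key {affinity_key!r} must be part of the primary key")
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    pk = ", ".join(primary_key)
    return (
        f"CREATE TABLE {table} ({cols}, PRIMARY KEY ({pk})) "
        f'WITH "affinity_key={affinity_key}"'
    )


# City rows are co-located with their country by using CountryCode as the affinity key.
CITY_DDL = create_table_ddl(
    "City",
    [("ID", "INT"), ("Name", "VARCHAR"), ("CountryCode", "CHAR(3)"), ("Population", "INT")],
    primary_key=["ID", "CountryCode"],
    affinity_key="CountryCode",
)

if __name__ == "__main__":
    print(CITY_DDL)
```

The resulting string would be sent to the cluster through any of the connectivity options listed earlier (JDBC, ODBC, thin or thick client).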
Affinity Key to Node Mapping Process
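[Diagram: inside the application process, the affinity key of a City record is mapped to a partition and the partition is mapped to a node; the record is then sent to that node with a network call.]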
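The two mapping steps can be sketched as below. This is a simplified illustration, not Ignite's actual affinity function: the partition count, the SHA-256-based hash, and the node names are assumptions, and the partition-to-node step uses a plain rendezvous (highest-random-weight) scheme without backups.

```python
import hashlib

PARTITIONS = 1024  # assumed partition count for this sketch


def _hash(value: str) -> int:
    """Stable hash (Python's built-in hash() of strings varies between processes)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


def partition_for(affinity_key: str, partitions: int = PARTITIONS) -> int:
    """Step 1: affinity key -> partition."""
    return _hash(affinity_key) % partitions


def node_for(partition: int, nodes: list) -> str:
    """Step 2: partition -> node, picking the node with the highest hash weight."""
    return max(nodes, key=lambda node: _hash(f"{node}:{partition}"))


def locate(affinity_key: str, nodes: list) -> str:
    """Resolve the node that owns all records sharing this affinity key."""
    return node_for(partition_for(affinity_key), nodes)


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c"]
    for code in ("CAN", "FRA"):
        print(code, "-> partition", partition_for(code), "->", locate(code, cluster))
```

Because the mapping depends only on the affinity key, a Country row and every City row carrying the same country code resolve to the same node.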
High-Performance SQL JOIN
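[Diagram: a thick client and two server nodes holding co-located data (Canada with its cities on one node, France with its cities on the other). The client is labeled with steps 1 and 3, the servers with step 2; no data is shuffled between the servers.]
1. Initiating Execution
2. Execution on Servers (map phase)
3. Reduce Phase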
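Why co-location removes the shuffling step can be shown with a small simulation. In this sketch each node joins only the rows it stores locally and the reduce step concatenates the per-node results. The data layout mirrors the two distributions from the slides; the function names are hypothetical.

```python
COUNTRIES = [("CAN", "Canada"), ("FRA", "France")]
CITIES = [("Toronto", "CAN"), ("Calgary", "CAN"), ("Montreal", "CAN"),
          ("Ottawa", "CAN"), ("Paris", "FRA"), ("Marseille", "FRA")]


def local_join(countries, cities):
    """Map phase on one node: join only the rows stored on that node."""
    names = dict(countries)
    return [(names[code], city) for city, code in cities if code in names]


def run_query(nodes):
    """Reduce phase: concatenate the per-node join results."""
    result = []
    for countries, cities in nodes:
        result.extend(local_join(countries, cities))
    return sorted(result)


# Default distribution: cities are placed without regard to their country.
default_nodes = [
    ([COUNTRIES[0]], [CITIES[0], CITIES[1], CITIES[4]]),  # Canada + Toronto, Calgary, Paris
    ([COUNTRIES[1]], [CITIES[5], CITIES[3], CITIES[2]]),  # France + Marseille, Ottawa, Montreal
]

# Co-located distribution: every city is stored with its country.
colocated_nodes = [
    ([COUNTRIES[0]], CITIES[:4]),
    ([COUNTRIES[1]], CITIES[4:]),
]

if __name__ == "__main__":
    print(len(run_query(default_nodes)), "rows with local joins on the default distribution")
    print(len(run_query(colocated_nodes)), "rows with local joins on co-located data")
```

With the default distribution, purely local joins miss Paris, Ottawa, and Montreal, which are exactly the rows that have to be shuffled; with co-located data the local joins already produce the complete result.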
Demo Time
Queries With JOINs
Beyond Memory Capacity: Disk-Tier and Memory Quotas
Multi-Tier Storage Architecture
1. In-Memory - General in-memory caching, high-performance computing
2. In-Memory + Native Persistence - Ignite as an in-memory database
3. In-Memory + External Database - Acceleration of services and APIs with write-through and write-behind capability
Multi-Tier Storage Architecture
[Diagram: a memory segment divided into pages: metadata pages, index pages (root, inner, leaf), and data pages. The index pages form a tree whose leaf pages point to key-value entries stored in data pages.]
Multi-Tier Storage Architecture
[Diagram: a partition file on disk with data pages #0 to #6, and a memory segment holding a subset of the pages (for example data pages #2 and #5 alongside inner, leaf, and metadata pages). A pages map resolves a PageId either to a pointer in the memory segment (for read/write operations) or to a position in the file (for page loading and checkpoints).]
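The role of the pages map can be sketched with a toy page store. This is an illustration of the idea only, under simplifying assumptions: a fixed page size, a file position computed directly from the page id, and write-back on replacement instead of checkpoints. The class and method names are hypothetical.

```python
import io

PAGE_SIZE = 4096  # assumed page size in bytes


class PageStore:
    """Toy page store: a partition file on disk plus a bounded memory segment."""

    def __init__(self, partition_file, max_pages_in_memory):
        self.file = partition_file        # file-like object opened in binary mode
        self.capacity = max_pages_in_memory
        self.pages_map = {}               # page_id -> bytearray held in memory

    def _file_position(self, page_id):
        # Position in the file is derived from the page id.
        return page_id * PAGE_SIZE

    def read(self, page_id):
        """Return the in-memory page, loading it from the file on a miss."""
        page = self.pages_map.get(page_id)
        if page is None:
            if len(self.pages_map) >= self.capacity:
                self._replace_oldest()
            self.file.seek(self._file_position(page_id))
            page = bytearray(self.file.read(PAGE_SIZE))
            self.pages_map[page_id] = page
        return page

    def _replace_oldest(self):
        # Write the oldest loaded page back to its file position and drop it.
        page_id, page = next(iter(self.pages_map.items()))
        self.file.seek(self._file_position(page_id))
        self.file.write(page)
        del self.pages_map[page_id]


def demo():
    """Modify page #5, force it out of memory, then load it back from the file."""
    partition_file = io.BytesIO(bytes(PAGE_SIZE * 7))  # data pages #0..#6
    store = PageStore(partition_file, max_pages_in_memory=2)
    store.read(5)[0] = 42      # write through the in-memory pointer
    store.read(2)
    store.read(0)              # memory is full, so page #5 is written back and dropped
    return store.read(5)[0]    # reloaded from its position in the file


if __name__ == "__main__":
    print(demo())
```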
Java off-heap vs Java heap
SQL query processing stages and the memory they use:
• Parsing: heap
• Planning: heap
• Computing (filters, joins, expressions): more heap
• Scanning (index or table scan): off-heap
Java off-heap vs Java heap
Query plan, listed from the leaves up:
• Scanning: COUNTRY, CITY
• Filtering: σ code IN ('CAN', 'FRA')
• Join: ⋈ country.code = city.countrycode
• Aggregation: country.name, city.name ℱ MAX(city.population)
• Sorting: τ max_pop
• Projection: π country.name, city.name, city.population
• Renaming: ρ name, name0, max_pop
Two of the operators are annotated "Here we need the full set in heap".
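The difference between operators that stream rows and operators that need the full set can be sketched with Python generators. This is a conceptual illustration with hypothetical function names, not engine code: scanning and filtering hold one row at a time, while sorting has to buffer its whole input before it can emit the first row.

```python
def scan(rows, stats):
    """Streaming: yields one row at a time and counts how many were read."""
    for row in rows:
        stats["scanned"] += 1
        yield row


def filter_rows(rows, predicate):
    """Streaming: passes matching rows through without buffering."""
    return (row for row in rows if predicate(row))


def sort_rows(rows, key, stats):
    """Blocking: materializes the whole input before emitting the first row."""
    buffered = list(rows)
    stats["buffered"] = len(buffered)
    return iter(sorted(buffered, key=key))


if __name__ == "__main__":
    cities = [("Toronto", 2_700_000), ("Ottawa", 1_000_000), ("Paris", 2_100_000)]

    stats = {"scanned": 0, "buffered": 0}
    first = next(filter_rows(scan(cities, stats), lambda r: r[1] > 500_000))
    print(first, stats)   # only one row has been scanned so far

    stats = {"scanned": 0, "buffered": 0}
    first = next(sort_rows(scan(cities, stats), key=lambda r: r[1], stats=stats))
    print(first, stats)   # every row was scanned and buffered first
```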
Query memory quotas
How to configure: [configuration snippet not captured in the transcript]
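The actual settings are described in the Memory Quotas documentation linked under Learn More. As a conceptual illustration only, the sketch below shows what a per-query quota enforces: every buffered row is charged against a limit, and the query fails once the limit would be exceeded. The class names, the fixed per-row size, and the error type are assumptions, not GridGain APIs.

```python
class QuotaExceeded(MemoryError):
    """Raised when a query tries to reserve more memory than its quota allows."""


class QueryMemoryTracker:
    """Tracks bytes reserved by one query against a fixed quota."""

    def __init__(self, quota_bytes):
        self.quota = quota_bytes
        self.reserved = 0

    def reserve(self, size):
        if self.reserved + size > self.quota:
            raise QuotaExceeded(
                f"query needs {self.reserved + size} bytes, quota is {self.quota}"
            )
        self.reserved += size

    def release(self, size):
        self.reserved = max(0, self.reserved - size)


def materialize(rows, tracker, row_size=64):
    """Buffer rows in memory, charging an assumed fixed size per row."""
    buffered = []
    for row in rows:
        tracker.reserve(row_size)
        buffered.append(row)
    return buffered


if __name__ == "__main__":
    tracker = QueryMemoryTracker(quota_bytes=64 * 10)
    materialize(range(10), tracker)          # fits exactly into the quota
    try:
        materialize(range(1), tracker)       # one more row exceeds it
    except QuotaExceeded as err:
        print("query aborted:", err)
```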
Interim results offloading
[Diagram: the same query plan as on the "Java off-heap vs Java heap" slide (scanning, filtering, join, aggregation, sorting, projection, renaming), annotated with the question "Why don't you flush result sets to disk?"]
Intermediate results offloading
How to configure: [configuration snippet not captured in the transcript]
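The idea behind offloading can be illustrated with an external sort, shown here as a minimal sketch rather than GridGain's implementation or configuration. Rows are buffered up to a limit, each full buffer is sorted and spilled to a temporary file, and the sorted runs are merged lazily at the end. The function names and the row limit are assumptions.

```python
import heapq
import pickle
import tempfile


def _spill(sorted_chunk):
    """Write one sorted run to a temporary file and return the open file."""
    run = tempfile.TemporaryFile()
    for row in sorted_chunk:
        pickle.dump(row, run)
    run.seek(0)
    return run


def _read_run(run):
    """Stream rows back from a spilled run, closing the file at the end."""
    try:
        while True:
            yield pickle.load(run)
    except EOFError:
        run.close()


def external_sort(rows, max_rows_in_memory):
    """Sort an iterable while buffering at most max_rows_in_memory rows at once."""
    runs, chunk = [], []
    for row in rows:
        chunk.append(row)
        if len(chunk) >= max_rows_in_memory:
            runs.append(_spill(sorted(chunk)))
            chunk = []
    # Merge the spilled runs with whatever is still in memory.
    return heapq.merge(*(_read_run(run) for run in runs), iter(sorted(chunk)))


if __name__ == "__main__":
    populations = [2_700_000, 1_200_000, 2_100_000, 870_000, 1_000_000, 1_700_000, 930_000]
    print(list(external_sort(populations, max_rows_in_memory=3)))
```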
When you need quotas/offloading enabled
● Sorting (ORDER BY)
● Grouping (DISTINCT, GROUP BY)
● Complex subqueries
Demo Time
Running SQL Over Disk-Only Records
Apache Ignite SQL Evolution With Apache Calcite
Why do we need it?
[Diagram: a query plan in which two operators are each annotated "Here we need a Map-Reduce phase".]
Typical execution flow
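[Diagram: the user submits a SQL query. The parser turns it into a query AST; the validator checks the AST against the schema held in the dictionary; depending on the optimizer mode, the rule-based optimizer (RBO) or the cost-based optimizer (CBO, which uses statistics) produces a query plan; the row source generator and the execution stage turn the plan into the result returned to the user.]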
Apache Calcite
[Diagram: Apache Calcite components. Core: SQL parser/validator and query optimizer. Pluggable: metadata SPI, pluggable rules, third-party operators, and third-party data. Optional: JDBC client and JDBC server.]
Need to implement:
● Splitter
● Runtime
● Index support
● DML support
● DDL support
Query Parser and Transformer
Query AST:
Select
  expr = p.id, d.name
  from = person, dep
  cond = p.depId = d.id AND (p.id > 10 OR p.id < 10000)
  order = p.name DESC
  offset = 10
Relational operators tree (query plan), listed from the leaves up:
• Scan: person (p), dep (d)
• σ p.id > 10 OR p.id < 10000
• ⋈ p.depId = d.id
• τ p.name DESC
• π p.id, p.name
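The step from AST to operator tree can be sketched as a direct, unoptimized translation. The `Op` class, the dictionary layout of the AST, and the operator names are assumptions for illustration; like the tree on the slide, the sketch ignores the offset. Note that this naive version keeps the whole condition in one filter above a cross join, whereas the plan on the slide already has the condition split into a filter on person and a join condition, which is the kind of rewrite an optimizer performs.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """One relational operator with its argument and input operators."""
    name: str
    arg: str
    inputs: tuple = ()


def to_plan(ast):
    """Translate a SELECT AST into a naive operator tree (no optimization)."""
    scans = [Op("Scan", table) for table in ast["from"]]
    # Combine the FROM tables with a cross join; the condition is applied above it.
    plan = Op("Join", "true", tuple(scans)) if len(scans) > 1 else scans[0]
    if ast.get("cond"):
        plan = Op("Filter", ast["cond"], (plan,))
    if ast.get("order"):
        plan = Op("Sort", ast["order"], (plan,))
    return Op("Project", ", ".join(ast["expr"]), (plan,))


def render(op, depth=0):
    """Return the tree as indented lines of text, root first."""
    lines = ["  " * depth + f"{op.name}({op.arg})"]
    for child in op.inputs:
        lines.extend(render(child, depth + 1))
    return lines


AST = {
    "expr": ["p.id", "d.name"],
    "from": ["person p", "dep d"],
    "cond": "p.depId = d.id AND (p.id > 10 OR p.id < 10000)",
    "order": "p.name DESC",
    "offset": 10,
}

if __name__ == "__main__":
    print("\n".join(render(to_plan(AST))))
```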
Cost-Based Optimizer
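[Diagram: the query transformer converts the query AST (from the parser) into a relational tree and applies rules to derive equivalent relational trees; the estimator uses statistics from the dictionary to attach costs to each tree; the plan generator selects the query plan that is passed to the row source generator.]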
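A cost-based choice between two equivalent trees can be sketched with a deliberately crude cost model. Everything here is assumed for illustration: the table sizes, the selectivities, a nested-loop join, and a cost that simply counts rows touched. Real optimizers, Calcite included, use far richer models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: object
    selectivity: float   # assumed fraction of rows that pass the filter


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    selectivity: float   # assumed fraction of row pairs that match


def estimate(plan, stats):
    """Return (output_rows, cost); cost counts the rows touched by every operator."""
    if isinstance(plan, Scan):
        rows = stats[plan.table]
        return rows, rows
    if isinstance(plan, Filter):
        rows, cost = estimate(plan.child, stats)
        return rows * plan.selectivity, cost + rows
    if isinstance(plan, Join):
        l_rows, l_cost = estimate(plan.left, stats)
        r_rows, r_cost = estimate(plan.right, stats)
        # Nested-loop join: every left row is compared with every right row.
        return l_rows * r_rows * plan.selectivity, l_cost + r_cost + l_rows * r_rows
    raise TypeError(f"unknown operator: {plan!r}")


def cheapest(plans, stats):
    """Pick the equivalent plan with the lowest estimated cost."""
    return min(plans, key=lambda plan: estimate(plan, stats)[1])


STATS = {"person": 100_000, "dep": 100}  # assumed table sizes
JOIN_FIRST = Filter(Join(Scan("person"), Scan("dep"), 0.01), 0.1)
FILTER_FIRST = Join(Filter(Scan("person"), 0.1), Scan("dep"), 0.01)

if __name__ == "__main__":
    for name, plan in (("join first", JOIN_FIRST), ("filter first", FILTER_FIRST)):
        print(name, estimate(plan, STATS))
```

Both trees return the same number of rows, but filtering before the join touches far fewer rows, so `cheapest` selects `FILTER_FIRST`.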
Cost-Based Splitter
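[Diagram: the relational tree from the "Query Parser and Transformer" slide (scans of person (p) and dep (d), σ p.id > 10 OR p.id < 10000, ⋈ p.depId = d.id, τ p.name DESC, π p.id, p.name) with a Root node on top.]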
Reactive Execution Flow
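[Diagram: a pipeline of Scan, Filter, Sender, Receiver, and Client cursor, each stage with its own node buffer. Data flows downstream (push, push, send, push); requests flow upstream (request, request, acknowledge, request) and provide backpressure. The hop between Sender and Receiver is network communication.]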
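The request/push handshake can be sketched in a few lines. This is a single-process illustration of demand-driven flow with a bounded buffer, not the engine's implementation: the class names are hypothetical, there is no network hop, and only one producer and one consumer are modeled.

```python
from collections import deque


class Source:
    """Produces rows only when downstream has requested them."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.done = False

    def request(self, n, downstream):
        """Push at most n rows downstream, or fewer if the input is exhausted."""
        for _ in range(n):
            try:
                row = next(self._rows)
            except StopIteration:
                self.done = True
                return
            downstream.push(row)


class BufferedConsumer:
    """Holds a bounded buffer and requests only as many rows as it has room for."""

    def __init__(self, source, capacity):
        self.source = source
        self.capacity = capacity
        self.buffer = deque()
        self.max_buffered = 0

    def push(self, row):
        self.buffer.append(row)
        self.max_buffered = max(self.max_buffered, len(self.buffer))

    def __iter__(self):
        while True:
            if not self.buffer:
                if self.source.done:
                    return
                # Backpressure: demand is limited to the free buffer space.
                self.source.request(self.capacity - len(self.buffer), self)
                if not self.buffer:
                    return
            yield self.buffer.popleft()


if __name__ == "__main__":
    cursor = BufferedConsumer(Source(range(10)), capacity=3)
    print(list(cursor), "max buffered:", cursor.max_buffered)
```

The producer can never run ahead of the consumer by more than the buffer capacity, which is what keeps memory use bounded along the pipeline.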
Demo Time
Calcite Prototype Demo With Sub-Queries
Learn More
• Apache Ignite SQL
  – https://apacheignite-sql.readme.io/docs
• Memory Quotas (available in GridGain Community Edition)
  – https://www.gridgain.com/docs/latest/developers-guide/memory-configuration/memory-quotas
• Demos shown in this webinar
  – https://github.com/GridGain-Demos/ignite-sql-intro-samples
• New Apache Calcite-based engine
  – https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine