A look at the CQL changes in 3.x (Benjamin Lerer, DataStax) | Cassandra Summit 2016

Post on 16-Apr-2017

Benjamin Lerer

A look at the CQL changes in 3.x

© DataStax, All Rights Reserved.

• Updates and Deletions
• Filtering
• Grouping

Updates and Deletions (3.0)

Updates and Deletions

CREATE TABLE toys (
    brand text,
    category text,
    id int,
    name text,
    price decimal,
    PRIMARY KEY (brand, category, id)
)

Clustering columns: category and id (brand is the partition key)

Simple updates

INSERT INTO toys (brand, category, id, name, price)
VALUES ('Lego', 'Star Wars', 75060, 'Slave I', 219.99)

UPDATE toys SET name = 'Tie Fighter', price = 219.99
WHERE brand = 'Lego' AND category = 'Star Wars' AND id = 75095

Memtable
    Row 'Star Wars'-75060 (ts: t1): name: 'Slave I' (ts: t1) | price: 219.99 (ts: t1)
    Row 'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2) | price: 219.99 (ts: t2)

Multi-updates

UPDATE toys SET price = 229.99
WHERE brand = 'Lego' AND category = 'Star Wars' AND id IN (75059, 75060, 75095)

Memtable
    Row 'Star Wars'-75060 (ts: t1): name: 'Slave I' (ts: t1) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75059 (ts: Long.MIN): price: 229.99 (ts: t3)
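The memtable states on the last two slides can be mimicked with a toy model. This is only a sketch under assumptions: `Row`, `insert`, `update` and `LONG_MIN` are hypothetical names, integers stand in for t1, t2 and t3, and a plain dict stands in for Cassandra's storage engine. It shows that an INSERT stamps both the row and its cells, while an UPDATE stamps only the cells, so a row created by an UPDATE keeps the Long.MIN row timestamp. An IN restriction simply applies the same write to every listed clustering key.

```python
from dataclasses import dataclass, field

LONG_MIN = -2**63  # stand-in for Java's Long.MIN_VALUE ("no row timestamp")


@dataclass
class Row:
    row_ts: int = LONG_MIN                     # row-level timestamp, set by INSERT only
    cells: dict = field(default_factory=dict)  # column -> (value, timestamp)


def _write_cells(row, values, ts):
    # Last write wins: a cell is replaced only by an equal or newer timestamp.
    for col, val in values.items():
        if col not in row.cells or ts >= row.cells[col][1]:
            row.cells[col] = (val, ts)


def insert(memtable, key, values, ts):
    row = memtable.setdefault(key, Row())
    row.row_ts = max(row.row_ts, ts)  # INSERT also stamps the row itself
    _write_cells(row, values, ts)


def update(memtable, keys, values, ts):
    for key in keys:                  # one write per clustering key in the IN list
        row = memtable.setdefault(key, Row())
        _write_cells(row, values, ts)  # UPDATE leaves row_ts untouched


memtable = {}
insert(memtable, ("Star Wars", 75060), {"name": "Slave I", "price": 219.99}, ts=1)
update(memtable, [("Star Wars", 75095)], {"name": "Tie Fighter", "price": 219.99}, ts=2)
update(memtable, [("Star Wars", i) for i in (75059, 75060, 75095)], {"price": 229.99}, ts=3)
```

After these three writes, row 75059 exists with only a price cell and no row timestamp, as on the slide.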

Column deletion

DELETE name FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id IN (75059, 75060)

Memtable
    Row 'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75059 (ts: Long.MIN): name: <tombstone> (ts: t4) | price: 229.99 (ts: t3)

Column deletion on empty Memtable

DELETE name FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id IN (75059, 75060)

Memtable
    Row 'Star Wars'-75060 (ts: Long.MIN): name: <tombstone> (ts: t4)
    Row 'Star Wars'-75059 (ts: Long.MIN): name: <tombstone> (ts: t4)

Row deletion

DELETE FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id = 75059

Memtable
    Row 'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75059 (ts: Long.MIN, deletedAt: t5)

Range deletion (3.0)

DELETE FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id <= 75060

Memtable
    Row 'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75059 (ts: Long.MIN, deletedAt: t5)
    DeletionInfo: deletedAt: Long.MIN, ranges: ('Star Wars' … 'Star Wars'-75060]
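One way to picture what the range entry in DeletionInfo does at read time is sketched below. This is a simplified illustration, not Cassandra's read path: `RangeTombstone`, `covers` and `live_rows` are hypothetical names, each row carries a single write timestamp instead of per-cell timestamps, and the deletion timestamp (6) is an assumed value because the slide does not show one for the range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeTombstone:
    category: str
    max_id: int      # inclusive upper bound, as in "id <= 75060"
    deleted_at: int  # deletion timestamp (assumed value in this sketch)

    def covers(self, category, toy_id, write_ts):
        # A write is shadowed if it falls inside the range and is not newer
        # than the deletion.
        return (category == self.category
                and toy_id <= self.max_id
                and write_ts <= self.deleted_at)


def live_rows(rows, tombstones):
    """rows maps (category, id) -> (value, write_ts); returns the unshadowed rows."""
    return {key: (value, ts)
            for key, (value, ts) in rows.items()
            if not any(t.covers(key[0], key[1], ts) for t in tombstones)}


rows = {
    ("Star Wars", 75060): (229.99, 3),
    ("Star Wars", 75095): (229.99, 3),
}
tombstones = [RangeTombstone("Star Wars", 75060, deleted_at=6)]
visible = live_rows(rows, tombstones)
```

Only row 75095 stays visible. A row written into the range after the deletion (a later timestamp) would not be shadowed.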

Partition deletion

DELETE FROM toys WHERE brand = 'Lego'

Memtable
    Row 'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2) | price: 229.99 (ts: t3)
    Row 'Star Wars'-75059 (ts: Long.MIN, deletedAt: t5)
    DeletionInfo: deletedAt: t6, ranges: ('Star Wars' … 'Star Wars'-75060]

Filtering (3.0, 3.6, 3.10)

Filtering

CREATE TABLE scores (
    user text,
    game text,
    year int,
    month int,
    day int,
    score int,
    PRIMARY KEY (user, game, year, month, day)
)

Filtering

In 2.2:

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score >= 1000

InvalidRequest: Error from server: code=2200 [Invalid query] message="Predicates on non-primary-key columns (score) are not yet supported for non secondary index queries"

Filtering

In 3.0:

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score >= 1000

InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

Filtering = Brute Force approach

Filtering

String partitionKey = "Aleksey";
String[] clusteringPrefix = new String[]{"coup"};

List<Row> rows = loadRows(partitionKey, clusteringPrefix);
List<Row> filteredRows = new ArrayList<>();

for (Row row : rows) {
    if (row.getInt("score") >= 1000) {
        filteredRows.add(row);
    }
}
return filteredRows;

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score >= 1000
ALLOW FILTERING

Clustering column filtering (3.6)

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND month = 9
ALLOW FILTERING

Here month = 9 is evaluated by filtering.

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year >= 2014 AND month = 9
ALLOW FILTERING

Here year >= 2014 is a clustering slice and month = 9 is evaluated by filtering.
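The difference between the clustering slice and the filter can be sketched as follows. This is an illustration only: the sample rows are made up, `select` is a hypothetical helper, and a sorted Python list stands in for the on-disk clustering order. The slice narrows the range that is read, while the filter is applied to every row inside that range.

```python
from bisect import bisect_left

# Made-up rows for one (user, game) prefix, sorted by clustering order
# (year, month, day), each paired with a score.
rows = [
    ((2013, 9, 2), 700),
    ((2014, 3, 14), 950),
    ((2014, 9, 30), 1100),
    ((2015, 9, 1), 1250),
    ((2016, 1, 12), 1200),
]


def select(rows, min_year, month):
    keys = [key for key, _ in rows]
    # Clustering slice: jump straight to the first row with year >= min_year.
    start = bisect_left(keys, (min_year,))
    scanned = rows[start:]
    # Filtering: every row of the slice is read and tested against month.
    matched = [row for row in scanned if row[0][1] == month]
    return matched, len(scanned)


matched, scanned = select(rows, min_year=2014, month=9)
```

In this sample, four rows are read to return two, which is the cost that ALLOW FILTERING asks the user to acknowledge.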

Filtering

Filtering is performed on the replica side

Filtering can return stale data (CASSANDRA-8273)

Filtering

3 replicas: A, B, C

INSERT INTO scores (user, game, year, month, day, score)
VALUES ('Aleksey', 'coup', 2016, 1, 12, 1100);

At QUORUM:

UPDATE scores SET score = 1200
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016 AND month = 1 AND day = 12;

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score = 1100
ALLOW FILTERING
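A small simulation may help to see why this sequence can return stale data. It is a sketch under assumptions: the INSERT is taken to have reached all three replicas, the QUORUM UPDATE is taken to have reached only A and B, and `replica_filtered_read` and `coordinator_read` are hypothetical helpers rather than Cassandra internals.

```python
KEY = ("Aleksey", "coup", 2016, 1, 12)

# Each replica maps primary key -> (score, write timestamp).
# Assumption: the INSERT reached A, B and C.
replicas = {name: {KEY: (1100, 1)} for name in "ABC"}

# Assumption: the QUORUM UPDATE was acknowledged by A and B; C missed it.
for name in "AB":
    replicas[name][KEY] = (1200, 2)


def replica_filtered_read(replica, score):
    """Replica-side filtering: only rows matching the predicate leave the replica."""
    return {key: cell for key, cell in replica.items() if cell[0] == score}


def coordinator_read(contacted, score):
    """Merge the filtered responses, keeping the newest version of each row."""
    merged = {}
    for name in contacted:
        for key, (value, ts) in replica_filtered_read(replicas[name], score).items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return merged


stale = coordinator_read(["B", "C"], score=1100)  # QUORUM read hitting the stale replica
fresh = coordinator_read(["A", "B"], score=1100)  # QUORUM read hitting up-to-date replicas
```

When C is contacted, B returns nothing because its row no longer matches, so the coordinator has no newer version to compare against and returns the outdated score of 1100.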

Filtering

In 3.0 filtering is supported on:
• Non primary key columns
• Static columns

In 3.6 filtering is also supported on clustering columns

In 3.10 filtering will be supported on the partition key

When using filtering, be aware of:
• Its performance unpredictability
• The fact that it can return stale data

Grouping (3.10)

Grouping

SELECT year, month, max(score), min(score), count(score) FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY month LIMIT 2

Year  Month  Day  Score
2016  1      12   1200
2016  1      31   800
2016  2      8    1050
2016  3      1    1400
[…]
2016  6      24   800
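What this query computes can be sketched over the rows visible on the slide. The sketch assumes that LIMIT counts groups rather than rows, leaves out the rows elided by "[…]", and uses a hypothetical helper `group_by_month`. Because rows arrive in clustering order, consecutive rows with the same month form one group.

```python
from itertools import groupby

# Rows shown on the slide, in clustering order: (year, month, day, score).
rows = [
    (2016, 1, 12, 1200),
    (2016, 1, 31, 800),
    (2016, 2, 8, 1050),
    (2016, 3, 1, 1400),
    (2016, 6, 24, 800),
]


def group_by_month(rows, limit):
    result = []
    # Rows are already sorted by (year, month), so groupby sees each group once.
    for (year, month), group in groupby(rows, key=lambda r: (r[0], r[1])):
        scores = [r[3] for r in group]
        result.append((year, month, max(scores), min(scores), len(scores)))
        if len(result) == limit:  # LIMIT counts groups here (assumption)
            break
    return result


groups = group_by_month(rows, limit=2)
```

With these rows the result holds one line for January (max 1200, min 800, count 2) and one for February (max 1050, min 1050, count 1).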

Grouping

SELECT score, count(*) FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY score LIMIT 2

Year  Month  Day  Score
2016  1      12   1200
2016  1      31   800
2016  2      8    1050
2016  3      1    1400
[…]
2016  6      24   800

Grouping

SELECT score, count(*) FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY score LIMIT 2

InvalidRequest: Error from server: code=2200 [Invalid query] message="Group by is currently only supported on the columns of the PRIMARY KEY, got score"

Grouping

CREATE MATERIALIZED VIEW yearlyHigh AS
SELECT user, game, year, score, month, day FROM scores
WHERE user IS NOT NULL AND game IS NOT NULL AND year IS NOT NULL
  AND score IS NOT NULL AND month IS NOT NULL AND day IS NOT NULL
PRIMARY KEY (user, game, year, score, month, day)
WITH CLUSTERING ORDER BY (game ASC, year DESC, score DESC)

Grouping

SELECT score, count(*) FROM yearlyHigh
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY score LIMIT 2

Year  Score  Month  Day
2016  1400   3      1
2016  1400   6      12
2016  1050   2      8
2016  1020   5      23
[…]
2016  800    6      24

Grouping

CREATE TABLE gameScores (
    user text,
    game text,
    year int,
    month int,
    day int,
    score int,
    PRIMARY KEY ((user, game, year), month, day)
)

Partition key: (user, game, year)

Grouping

SELECT year, max(score), min(score), count(score) FROM gameScores
GROUP BY user, game

InvalidRequest: Error from server: code=2200 [Invalid query] message="Group by is not supported on only a part of the partition key"

Grouping

SELECT user, game, max(score), min(score), count(score) FROM scores
GROUP BY user, game

[Diagram: Driver and nodes A, B, C, D; label: Computes aggregates]

Grouping

SELECT user, game, max(score), min(score), count(score) FROM scores
GROUP BY user, game

[Diagram: Driver and nodes A, B, C, D; labels: Computes aggregates, Page size in # of groups, Sub-page size in # of rows]

Grouping

SELECT user, game, max(score), min(score), count(score) FROM scores
WHERE user = 'Aleksey'
GROUP BY user, game

[Diagram: Driver and nodes A, B, C, D; label: Computes aggregates … with TokenAwarePolicy]

Per Partition Limit (3.6)

SELECT user, score FROM yearlyHigh
WHERE game = 'coup' AND year = 2016
PER PARTITION LIMIT 1
ALLOW FILTERING

SELECT user, score, count(*) FROM yearlyHigh
WHERE game = 'coup' AND year = 2016
GROUP BY user, game, year, score
PER PARTITION LIMIT 1
ALLOW FILTERING
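The effect of PER PARTITION LIMIT on the first query can be sketched as below. The data is illustrative: the rows for Aleksey are taken from the yearlyHigh listing shown earlier, the second user is made up, and `per_partition_limit` is a hypothetical helper. Since the view orders each partition by score descending, keeping one row per partition yields each user's best score.

```python
from itertools import islice


def per_partition_limit(partitions, limit):
    """Yield at most `limit` rows from each partition, in clustering order."""
    for partition_key, rows in partitions.items():
        for row in islice(rows, limit):
            yield (partition_key, *row)


# user -> rows as (score, month, day), already sorted by score descending.
partitions = {
    "Aleksey": [(1400, 3, 1), (1400, 6, 12), (1050, 2, 8)],
    "Bob": [(900, 4, 2), (650, 1, 9)],  # hypothetical second user
}

top = list(per_partition_limit(partitions, limit=1))
```

Each user contributes a single row, the first one in its partition.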

Grouping by time range (CASSANDRA-11871)

CREATE TABLE temperature (
    deviceId text,
    time timestamp,
    value double,
    PRIMARY KEY (deviceId, time)
)

SELECT deviceId, floor(time, 2h), min(value), max(value), count(value) FROM temperature
WHERE deviceId = 'AT-AT'
GROUP BY floor(time, 2h)
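The idea behind grouping by floor(time, 2h) can be sketched as follows. This is an illustration, not the semantics proposed in CASSANDRA-11871: the sample readings are made up, `floor_time` and `group_by_time` are hypothetical helpers, and buckets are assumed to be aligned on the Unix epoch in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def floor_time(ts, width):
    """Round ts down to a multiple of width counted from the Unix epoch."""
    return ts - (ts - EPOCH) % width


def group_by_time(readings, width):
    """readings: (timestamp, value) pairs sorted by time -> one tuple per bucket."""
    groups = {}
    for ts, value in readings:
        groups.setdefault(floor_time(ts, width), []).append(value)
    return [(start, min(vals), max(vals), len(vals)) for start, vals in groups.items()]


# Made-up readings for one device.
readings = [
    (datetime(2016, 9, 7, 10, 15, tzinfo=timezone.utc), 21.5),
    (datetime(2016, 9, 7, 11, 59, tzinfo=timezone.utc), 22.0),
    (datetime(2016, 9, 7, 12, 0, tzinfo=timezone.utc), 23.5),
]

buckets = group_by_time(readings, timedelta(hours=2))
```

The first two readings fall into the bucket starting at 10:00 and the third opens the bucket starting at 12:00.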

Grouping

• It is only possible to group rows at the partition key level or at a clustering column level

• The GROUP BY clause only accepts primary key column names as arguments, in the primary key order

• Aggregates are built on the coordinator to ensure consistency

• Queries might be paged internally

• If a primary key column is restricted by an equality restriction, it is not required to be present in the GROUP BY clause
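The rules above can be condensed into a toy validator. This is a sketch of the rules as listed here, not Cassandra's implementation: `check_group_by` is a hypothetical function, the first and last error messages are copied from the earlier slides, and the message for out-of-order columns is invented for the example.

```python
def check_group_by(partition_key, clustering, group_by, eq_restricted=()):
    """Return None if the GROUP BY columns are acceptable, else an error message."""
    primary_key = list(partition_key) + list(clustering)
    for col in group_by:
        if col not in primary_key:
            return ("Group by is currently only supported on the columns "
                    f"of the PRIMARY KEY, got {col}")
    pos = 0
    for col in group_by:
        # Columns pinned by an equality restriction may be skipped.
        while pos < len(primary_key) and primary_key[pos] != col:
            if primary_key[pos] not in eq_restricted:
                return "GROUP BY columns must follow the primary key order"
            pos += 1
        if pos == len(primary_key):
            return "GROUP BY columns must follow the primary key order"
        pos += 1
    # Trailing equality-restricted columns count as covered.
    while pos < len(primary_key) and primary_key[pos] in eq_restricted:
        pos += 1
    if 0 < pos < len(partition_key):
        return "Group by is not supported on only a part of the partition key"
    return None


# Partition key and clustering columns of the two tables used in the slides.
SCORES = (["user"], ["game", "year", "month", "day"])
GAME_SCORES = (["user", "game", "year"], ["month", "day"])

ok = check_group_by(*SCORES, ["month"], eq_restricted={"user", "game", "year"})
partial = check_group_by(*GAME_SCORES, ["user", "game"])
```

Here `ok` is None because user, game and year are fixed by equality restrictions, while `partial` carries the error about grouping on only a part of the partition key.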

Questions?

Thank you
