A look at the CQL changes in 3.x (Benjamin Lerer, DataStax) | Cassandra Summit 2016

Upload: datastax

Post on 16-Apr-2017

TRANSCRIPT

Page 1: A look at the CQL changes in 3.x (Benjamin Lerer, Datastax) | Cassandra Summit 2016

Benjamin Lerer

A look at the CQL changes in 3.x

© DataStax, All Rights Reserved.

• Updates and Deletions
• Filtering
• Grouping

Updates and Deletions (3.0)

Updates and Deletions

CREATE TABLE toys (
    brand text,
    category text,
    id int,
    name text,
    price decimal,
    PRIMARY KEY (brand, category, id)
);

Clustering columns: category, id

Simple updates

INSERT INTO toys (brand, category, id, name, price)
VALUES ('Lego', 'Star Wars', 75060, 'Slave I', 219.99);

UPDATE toys SET name = 'Tie Fighter', price = 219.99
WHERE brand = 'Lego' AND category = 'Star Wars' AND id = 75095;

Memtable:

'Star Wars'-75060 (ts: t1): name: 'Slave I' (ts: t1), price: 219.99 (ts: t1)
'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2), price: 219.99 (ts: t2)

Multi-updates

UPDATE toys SET price = 229.99
WHERE brand = 'Lego' AND category = 'Star Wars' AND id IN (75059, 75060, 75095);

Memtable:

'Star Wars'-75059 (ts: Long.MIN): price: 229.99 (ts: t3)
'Star Wars'-75060 (ts: t1): name: 'Slave I' (ts: t1), price: 229.99 (ts: t3)
'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2), price: 229.99 (ts: t3)

Column deletion

DELETE name FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id IN (75059, 75060);

Memtable:

'Star Wars'-75059 (ts: Long.MIN): name: <tombstone> (ts: t4), price: 229.99 (ts: t3)
'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4), price: 229.99 (ts: t3)
'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2), price: 229.99 (ts: t3)

Column deletion on empty Memtable

DELETE name FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id IN (75059, 75060);

Memtable:

'Star Wars'-75059 (ts: Long.MIN): name: <tombstone> (ts: t4)
'Star Wars'-75060 (ts: Long.MIN): name: <tombstone> (ts: t4)

Row deletion

DELETE FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id = 75059;

Memtable:

'Star Wars'-75059 (ts: Long.MIN, deletedAt: t5)
'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4), price: 229.99 (ts: t3)
'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2), price: 229.99 (ts: t3)

Range deletion (3.0)

DELETE FROM toys
WHERE brand = 'Lego' AND category = 'Star Wars' AND id <= 75060;

Memtable:

DeletionInfo: deletedAt: Long.MIN, ranges: ('Star Wars' … 'Star Wars'-75060]

'Star Wars'-75059 (ts: Long.MIN, deletedAt: t5)
'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4), price: 229.99 (ts: t3)
'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2), price: 229.99 (ts: t3)

Partition deletion

DELETE FROM toys WHERE brand = 'Lego';

Memtable:

DeletionInfo: deletedAt: t6, ranges: ('Star Wars' … 'Star Wars'-75060]

'Star Wars'-75059 (ts: Long.MIN, deletedAt: t5)
'Star Wars'-75060 (ts: t1): name: <tombstone> (ts: t4), price: 229.99 (ts: t3)
'Star Wars'-75095 (ts: Long.MIN): name: 'Tie Fighter' (ts: t2), price: 229.99 (ts: t3)
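
The deletion slides can be read with one rule: a cell stays visible only if its write timestamp is newer than every deletion that covers it (row deletion, range tombstone, partition deletion). The sketch below is a simplified model of that rule, not Cassandra's storage-engine code. The names `TombstoneSketch`, `RangeTombstone` and `isLive` are hypothetical, the timestamps t1..t6 are modelled as the longs 1..6, and the range tombstone's own timestamp (not shown on the slide) is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical model of deletion shadowing; not Cassandra's real classes.
class TombstoneSketch {
    // Stands for "Long.MIN" on the slides: nothing has been deleted at this level.
    static final long NO_DELETION = Long.MIN_VALUE;

    // Range tombstone over clustering ids in (startExclusive, endInclusive].
    static final class RangeTombstone {
        final int startExclusive;
        final int endInclusive;
        final long deletedAt;

        RangeTombstone(int startExclusive, int endInclusive, long deletedAt) {
            this.startExclusive = startExclusive;
            this.endInclusive = endInclusive;
            this.deletedAt = deletedAt;
        }

        boolean covers(int id) {
            return id > startExclusive && id <= endInclusive;
        }
    }

    // A cell stays visible only if it is strictly newer than every deletion covering it.
    static boolean isLive(long cellTs, int id, long rowDeletedAt,
                          List<RangeTombstone> ranges, long partitionDeletedAt) {
        long newestDeletion = Math.max(rowDeletedAt, partitionDeletedAt);
        for (RangeTombstone range : ranges) {
            if (range.covers(id)) {
                newestDeletion = Math.max(newestDeletion, range.deletedAt);
            }
        }
        return cellTs > newestDeletion;
    }

    public static void main(String[] args) {
        // Range deletion "id <= 75060"; its timestamp (5) is an assumption.
        List<RangeTombstone> ranges =
                Arrays.asList(new RangeTombstone(Integer.MIN_VALUE, 75060, 5L));

        // price of 75060 was written at t3 and falls inside the range: shadowed.
        System.out.println(isLive(3L, 75060, NO_DELETION, ranges, NO_DELETION)); // false
        // price of 75095 was written at t3 and is outside the range: still visible.
        System.out.println(isLive(3L, 75095, NO_DELETION, ranges, NO_DELETION)); // true
        // After the partition deletion at t6, the same cell is shadowed as well.
        System.out.println(isLive(3L, 75095, NO_DELETION,
                Collections.<RangeTombstone>emptyList(), 6L));                   // false
    }
}
```

A write issued after the partition deletion (timestamp above t6) would be visible again, which is why the deletion markers carry timestamps rather than acting as flags.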

Filtering (3.0, 3.6, 3.10)

Filtering

CREATE TABLE scores (
    user text,
    game text,
    year int,
    month int,
    day int,
    score int,
    PRIMARY KEY (user, game, year, month, day)
);

Filtering

In 2.2:

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score >= 1000;

InvalidRequest: Error from server: code=2200 [Invalid query] message="Predicates on non-primary-key columns (score) are not yet supported for non secondary index queries"

Filtering

In 3.0:

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score >= 1000;

InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

Filtering = Brute Force approach

Filtering

String partitionKey = "Aleksey";
String[] clusteringPrefix = new String[] {"coup"};

// loadRows is a placeholder for reading every row under the clustering prefix
List<Row> rows = loadRows(partitionKey, clusteringPrefix);
List<Row> filteredRows = new ArrayList<>();

for (Row row : rows) {
    if (row.getInt("score") >= 1000) {
        filteredRows.add(row);
    }
}
return filteredRows;

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score >= 1000
ALLOW FILTERING;

Clustering column filtering (3.6)

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND month = 9
ALLOW FILTERING;

    month = 9: filtering

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year >= 2014 AND month = 9
ALLOW FILTERING;

    year >= 2014: clustering slice
    month = 9: filtering

Filtering

Filtering is performed on the replica side

Filtering can return stale data (CASSANDRA-8273)

Filtering

3 replicas: A, B, C

INSERT INTO scores (user, game, year, month, day, score)
VALUES ('Aleksey', 'coup', 2016, 1, 12, 1100);

At QUORUM:

UPDATE scores SET score = 1200
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016 AND month = 1 AND day = 12;

SELECT * FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND score = 1100
ALLOW FILTERING;
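
To make the stale-read scenario concrete, here is a toy model of it. It assumes (the slide does not say) that the QUORUM update was acknowledged by replicas A and B while C still holds the old value, and that the QUORUM read contacts B and C. Because each replica filters its own copy before answering, B returns nothing, C returns the old row, and the coordinator has no newer version to reconcile against. All class and method names are hypothetical; this is not driver or server code.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of replica-side filtering (the scenario behind CASSANDRA-8273).
class StaleFilteringSketch {
    // One replica's copy of the single row: its score and the write timestamp.
    static final class Replica {
        final String name;
        final int score;
        final long writeTs;

        Replica(String name, int score, long writeTs) {
            this.name = name;
            this.score = score;
            this.writeTs = writeTs;
        }
    }

    // Replica-side filtering: a replica only returns its copy if it matches,
    // then the coordinator keeps the newest copy among those returned.
    static Integer filterOnReplicas(List<Replica> contacted, int wantedScore) {
        Replica newest = null;
        for (Replica r : contacted) {
            if (r.score != wantedScore) {
                continue; // filtered out on the replica, never reaches the coordinator
            }
            if (newest == null || r.writeTs > newest.writeTs) {
                newest = r;
            }
        }
        return newest == null ? null : newest.score;
    }

    // For contrast: reconcile first (newest write wins), then filter.
    static Integer filterAfterReconcile(List<Replica> contacted, int wantedScore) {
        Replica newest = null;
        for (Replica r : contacted) {
            if (newest == null || r.writeTs > newest.writeTs) {
                newest = r;
            }
        }
        return (newest != null && newest.score == wantedScore) ? newest.score : null;
    }

    public static void main(String[] args) {
        Replica b = new Replica("B", 1200, 2L); // received the update
        Replica c = new Replica("C", 1100, 1L); // assumed to have missed it
        List<Replica> quorum = Arrays.asList(b, c);

        System.out.println(filterOnReplicas(quorum, 1100));     // 1100 (stale row)
        System.out.println(filterAfterReconcile(quorum, 1100)); // null (no match)
    }
}
```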

Filtering

In 3.0 filtering is supported on:
• Non primary key columns
• Static columns

In 3.6 filtering is also supported on clustering columns

In 3.10 filtering will be supported on partition key

When using filtering, be aware of:
• Its performance unpredictability
• The fact that it can return stale data

Grouping (3.10)

Grouping

SELECT year, month, max(score), min(score), count(score)
FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY month
LIMIT 2;

Year  Month  Day  Score
2016  1      12   1200
2016  1      31   800
2016  2      8    1050
2016  3      1    1400
[…]
2016  6      24   800
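
Because rows within a partition come back in clustering order, all rows of one group are adjacent, so the aggregates can be computed in a single pass and LIMIT counts groups rather than rows. The sketch below illustrates that idea on the four rows listed explicitly above (the elided rows are left out). It is an assumption-laden illustration, not the server's implementation, and `GroupBySketch` and `groupByMonth` are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical single-pass GROUP BY over rows that are already sorted.
class GroupBySketch {
    // Input rows are {year, month, day, score} in clustering order.
    // Output rows are {year, month, max(score), min(score), count(score)}.
    static List<int[]> groupByMonth(List<int[]> sortedRows, int limit) {
        List<int[]> groups = new ArrayList<>();
        int[] current = null;
        for (int[] row : sortedRows) {
            boolean sameGroup = current != null
                    && current[0] == row[0] && current[1] == row[1];
            if (!sameGroup) {
                if (groups.size() == limit) {
                    break; // LIMIT counts groups, not rows
                }
                current = new int[] {row[0], row[1], row[3], row[3], 0};
                groups.add(current);
            }
            current[2] = Math.max(current[2], row[3]);
            current[3] = Math.min(current[3], row[3]);
            current[4]++;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
                new int[] {2016, 1, 12, 1200},
                new int[] {2016, 1, 31, 800},
                new int[] {2016, 2, 8, 1050},
                new int[] {2016, 3, 1, 1400});
        for (int[] group : groupByMonth(rows, 2)) {
            System.out.println(Arrays.toString(group));
        }
        // prints [2016, 1, 1200, 800, 2] then [2016, 2, 1050, 1050, 1]
    }
}
```

The single pass only works because the grouping column follows the clustering order; grouping on an unsorted column would require buffering every group.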

Grouping

SELECT score, count(*)
FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY score
LIMIT 2;

Year  Month  Day  Score
2016  1      12   1200
2016  1      31   800
2016  2      8    1050
2016  3      1    1400
[…]
2016  6      24   800

Grouping

SELECT score, count(*)
FROM scores
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY score
LIMIT 2;

InvalidRequest: Error from server: code=2200 [Invalid query] message="Group by is currently only supported on the columns of the PRIMARY KEY, got score"

Grouping

CREATE MATERIALIZED VIEW yearlyHigh AS
    SELECT user, game, year, score, month, day
    FROM scores
    WHERE user IS NOT NULL AND game IS NOT NULL AND year IS NOT NULL
      AND score IS NOT NULL AND month IS NOT NULL AND day IS NOT NULL
    PRIMARY KEY (user, game, year, score, month, day)
    WITH CLUSTERING ORDER BY (game ASC, year DESC, score DESC);

Grouping

SELECT score, count(*)
FROM yearlyHigh
WHERE user = 'Aleksey' AND game = 'coup' AND year = 2016
GROUP BY score
LIMIT 2;

Year  Score  Month  Day
2016  1400   3      1
2016  1400   6      12
2016  1050   2      8
2016  1020   5      23
[…]
2016  800    6      24

Grouping

CREATE TABLE gameScores (
    user text,
    game text,
    year int,
    month int,
    day int,
    score int,
    PRIMARY KEY ((user, game, year), month, day)
);

Partition key: (user, game, year)

Grouping

SELECT year, max(score), min(score), count(score)
FROM gameScores
GROUP BY user, game;

InvalidRequest: Error from server: code=2200 [Invalid query] message="Group by is not supported on only a part of the partition key"

Grouping

SELECT user, game, max(score), min(score), count(score)
FROM scores
GROUP BY user, game;

[Diagram: nodes A, B, C, D and the driver, with the callout "Computes aggregates"]

Grouping

SELECT user, game, max(score), min(score), count(score)
FROM scores
GROUP BY user, game;

[Diagram: nodes A, B, C, D and the driver, with the callouts "Computes aggregates", "Page size in # of groups" and "Sub-page size in # of rows"]

Grouping

SELECT user, game, max(score), min(score), count(score)
FROM scores
WHERE user = 'Aleksey'
GROUP BY user, game;

[Diagram: nodes A, B, C, D and the driver, with the callout "Computes aggregates … with TokenAwarePolicy"]

Per Partition Limit (3.6)

SELECT user, score
FROM yearlyHigh
WHERE game = 'coup' AND year = 2016
PER PARTITION LIMIT 1
ALLOW FILTERING;

SELECT user, score, count(*)
FROM yearlyHigh
WHERE game = 'coup' AND year = 2016
GROUP BY user, game, year, score
PER PARTITION LIMIT 1
ALLOW FILTERING;
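
As an illustration of what PER PARTITION LIMIT does, the sketch below keeps at most `limit` rows from each partition of an already ordered result. With the yearlyHigh view ordered by score descending, a limit of 1 leaves each user's highest score. The class and method names are hypothetical, and the second user ("Sam") and its scores are made-up sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of PER PARTITION LIMIT on rows grouped by partition key.
class PerPartitionLimitSketch {
    static final class Row {
        final String user; // partition key of the view
        final int score;

        Row(String user, int score) {
            this.user = user;
            this.score = score;
        }

        @Override
        public String toString() {
            return user + ":" + score;
        }
    }

    // Rows of the same partition are assumed to be adjacent, as in a real result set.
    static List<Row> perPartitionLimit(List<Row> rows, int limit) {
        List<Row> out = new ArrayList<>();
        String currentUser = null;
        int taken = 0;
        for (Row row : rows) {
            if (!row.user.equals(currentUser)) {
                currentUser = row.user; // a new partition starts: reset the counter
                taken = 0;
            }
            if (taken < limit) {
                out.add(row);
                taken++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("Aleksey", 1400), new Row("Aleksey", 1400),
                new Row("Aleksey", 1050),
                new Row("Sam", 1300), new Row("Sam", 900)); // made-up user
        System.out.println(perPartitionLimit(rows, 1)); // [Aleksey:1400, Sam:1300]
    }
}
```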

Grouping by time range (CASSANDRA-11871)

CREATE TABLE temperature (
    deviceId text,
    time timestamp,
    value double,
    PRIMARY KEY (deviceId, time)
);

SELECT deviceId, floor(time, 2h), min(value), max(value), count(value)
FROM temperature
WHERE deviceId = 'AT-AT'
GROUP BY floor(time, 2h);
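
Grouping by `floor(time, 2h)` amounts to mapping every timestamp to the start of its two-hour bucket. The sketch below shows one way to compute that, assuming buckets are aligned on the Unix epoch in UTC; the slide does not say how buckets are anchored, so that alignment and the `TimeBucketSketch` name are assumptions.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper that truncates a timestamp to the start of its bucket.
class TimeBucketSketch {
    // Assumes buckets start at the Unix epoch; floorDiv keeps pre-1970 values correct.
    static Instant floor(Instant time, Duration bucket) {
        long size = bucket.toMillis();
        long millis = time.toEpochMilli();
        return Instant.ofEpochMilli(Math.floorDiv(millis, size) * size);
    }

    public static void main(String[] args) {
        Instant reading = Instant.parse("2016-09-08T13:45:10Z");
        System.out.println(floor(reading, Duration.ofHours(2))); // 2016-09-08T12:00:00Z
    }
}
```

Since timestamps are stored in clustering order, rows sharing a bucket stay adjacent, which is what allows this kind of grouping to be evaluated in a single pass.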

Grouping

• It is only possible to group rows at the partition key level or at a clustering column level

• The GROUP BY clause only accepts primary key column names as arguments, in the primary key order

• Aggregates are built on the coordinator to ensure consistency

• Queries might be paged internally

• If a primary key column is restricted by an equality restriction it is not required to be present in the GROUP BY clause

Questions?

Thank you