MEETUP - Unboxing Apache Cassandra 3.10


Upload: erick-ramirez

Post on 21-Jan-2018


TRANSCRIPT

Page 1: MEETUP - Unboxing Apache Cassandra 3.10

© DataStax, All Rights Reserved.

Apache Cassandra What’s new in 3.10


Erick Ramirez DataStax Engineering

@flightc

Page 2: MEETUP - Unboxing Apache Cassandra 3.10

Welcome

• Support for arithmetic operations in CQL (CASSANDRA-11935)
• ALLOW FILTERING on PK columns without secondary indexes (CASSANDRA-11031)
• Dynamically change compaction thread count with nodetool (CASSANDRA-12248)
• Log slow queries at DEBUG level (CASSANDRA-12403)
• Support for GROUP BY queries (CASSANDRA-10707)
• Garbage-collection compaction to evict deleted data (CASSANDRA-7019)
• Prepared statements persisted to system table (CASSANDRA-8831)
• Snapshots prefixed with dropped or truncated (CASSANDRA-12178)
• New TimeWindowCompactionStrategy to replace DTCS (CASSANDRA-9666)

Page 3: MEETUP - Unboxing Apache Cassandra 3.10

https://academy.datastax.com

Page 4: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-11935

• Support for arithmetic operations in CQL
• Works for numeric and counter types
• Higher precedence: *, /, %
• Next precedence: +, -
• Also unary negation: -
• Operators of the same precedence are evaluated left to right
• Coming soon: operations on dates and strings

SELECT a + b AS x FROM …

SELECT cost / length AS unit_cost FROM …
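The precedence rules on this slide can be checked with a small sketch. It uses Python's own operators as a stand-in for CQL's, on the assumption that both follow the rules listed above. The function name `precedence_examples` is hypothetical, and the sketch does not model CQL's type handling or division semantics.

```python
def precedence_examples():
    """Pair each expression's value with its explicitly parenthesized form.

    Python's operators stand in for CQL's here; only positive integers are
    used so that differences in % semantics do not come into play.
    """
    return {
        # *, / and % bind tighter than + and -
        "2 + 3 * 4": (2 + 3 * 4, 2 + (3 * 4)),
        "20 - 7 % 4": (20 - 7 % 4, 20 - (7 % 4)),
        # operators of equal precedence are evaluated left to right
        "10 - 4 - 3": (10 - 4 - 3, (10 - 4) - 3),
        "8 * 3 % 5": (8 * 3 % 5, (8 * 3) % 5),
        # unary negation applies to its operand before the addition
        "-2 + 5": (-2 + 5, (-2) + 5),
    }


for expression, (actual, explicit) in precedence_examples().items():
    print(f"{expression} = {actual} (explicit grouping gives {explicit})")
```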

Page 5: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-11031

• ALLOW FILTERING support on PK columns
• Previously, filtering on these columns required a secondary index
• Same caveats apply as with any ALLOW FILTERING query
• Use with caution

CREATE TABLE tracks_by_album (
    album text,
    year int,
    track text,
    PRIMARY KEY ((album, year), track)
);

SELECT * FROM tracks_by_album
WHERE year = 1980 ALLOW FILTERING;
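To see why the caution applies, here is a minimal in-memory model of the table above. It is a plain-Python sketch, not Cassandra code. The names `partitions`, `lookup` and `filter_by_year` and the sample rows are illustrative assumptions. With the full partition key a read is a single lookup, while filtering on `year` alone has to inspect every partition.

```python
# Hypothetical stand-in for tracks_by_album: each partition is keyed by the
# full partition key (album, year) and holds its clustering rows (tracks).
partitions = {
    ("Back in Black", 1980): ["Hells Bells", "Shoot to Thrill"],
    ("Remain in Light", 1980): ["Once in a Lifetime"],
    ("Thriller", 1982): ["Billie Jean"],
}


def lookup(album, year):
    """Full partition key known: one direct lookup."""
    return partitions.get((album, year), [])


def filter_by_year(year):
    """Only part of the partition key known: every partition is inspected."""
    scanned = 0
    matches = []
    for (album, partition_year), tracks in partitions.items():
        scanned += 1
        if partition_year == year:
            matches.extend((album, track) for track in tracks)
    return matches, scanned


matches, scanned = filter_by_year(1980)
print(f"{len(matches)} matching rows after scanning {scanned} partitions")
```

In a real cluster the scan is spread across nodes, which is where the cost of ALLOW FILTERING comes from.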

Page 6: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-12248

• Dynamically change the number of compaction threads
• Done via nodetool
• Still need to set concurrent_compactors in cassandra.yaml for the change to persist across restarts
• Useful for situations like bootstrapping a node
• Same caveats apply

nodetool setconcurrentcompactors 4

nodetool getconcurrentcompactors

Page 7: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-12403

• Log slow queries at DEBUG level; logging is done on replicas rather than coordinator nodes
• A query is considered slow if it exceeds slow_query_log_timeout_in_ms in cassandra.yaml
• WARNING: watch for false positives, e.g. overloaded nodes will report lots of queries as slow

INFO [ScheduledTasks:1] … NoSpamLogger.java:91 - Some operations were slow, details available at debug level (debug.log)

DEBUG [ScheduledTasks:1] … MonitoringTask.java:173 - 1 operations were slow in the last 4998 msecs:

<SELECT * FROM ks.test2 LIMIT 5000>, time 3026 msec - slow timeout 500 msec
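When sifting through debug.log for slow queries, a small parser can help separate real offenders from noise. The sketch below is an assumption-laden example: the pattern is derived only from the sample entry above, and the names `SLOW_ENTRY` and `parse_slow_entry` are hypothetical. Other Cassandra versions may format the entry differently.

```python
import re

# Pattern derived only from the sample DEBUG entry on this slide (assumption).
SLOW_ENTRY = re.compile(
    r"<(?P<query>.*)>, time (?P<time_ms>\d+) msec"
    r" - slow timeout (?P<timeout_ms>\d+) msec"
)


def parse_slow_entry(line):
    """Return the query, elapsed time and timeout from one entry, or None."""
    match = SLOW_ENTRY.search(line)
    if match is None:
        return None
    return {
        "query": match.group("query"),
        "time_ms": int(match.group("time_ms")),
        "timeout_ms": int(match.group("timeout_ms")),
    }


sample = "<SELECT * FROM ks.test2 LIMIT 5000>, time 3026 msec - slow timeout 500 msec"
print(parse_slow_entry(sample))
```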

Page 8: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-10707

• Can now GROUP BY when reading data
• Works on partition key and clustering columns only
• Lots of sharp edges, use with caution
• Similar caveats to COUNT(), etc.

SELECT pk, max(v) FROM table
GROUP BY pk;

SELECT pk, clust0, max(v) FROM table
GROUP BY pk, clust0;
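The two queries above differ only in how many leading primary key columns form the group. A minimal Python model of that behaviour is sketched below. The `rows` data and the `group_max` helper are illustrative assumptions, not part of Cassandra.

```python
# Hypothetical rows as (pk, clust0, v), already in primary key order.
rows = [
    ("a", 1, 10),
    ("a", 1, 30),
    ("a", 2, 20),
    ("b", 1, 5),
]


def group_max(rows, key_len):
    """Group on the first key_len primary key columns; return max(v) per group."""
    result = {}
    for row in rows:
        key = row[:key_len]
        value = row[-1]
        result[key] = max(result.get(key, value), value)
    return result


print(group_max(rows, 1))  # models GROUP BY pk
print(group_max(rows, 2))  # models GROUP BY pk, clust0
```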

Page 9: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-7019

• Manually trigger GC compaction to evict deleted data across overlapping sstables
• Useful workaround for high-delete workloads or a non-optimal data model
• Set the granularity to evict whole rows/partitions (ROW) or also individual cells (CELL)
• Watch out for elevated IO

nodetool garbagecollect -- ks table
nodetool garbagecollect -g ROW -- ks table
nodetool garbagecollect -g CELL -- ks table
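The idea of evicting deleted data across overlapping sstables can be pictured with a toy model. This is a simplified sketch, not how Cassandra implements it. The dict-based sstables and the `garbage_collect` function are hypothetical, only row-level deletes are modelled, and tombstone expiry (gc_grace_seconds) is ignored.

```python
def garbage_collect(sstable, overlapping):
    """Drop rows shadowed by a newer tombstone in any overlapping sstable.

    Each hypothetical sstable maps a row key to (write_timestamp, value);
    a value of None marks a tombstone. Tombstones themselves are kept.
    """
    kept = {}
    for key, (timestamp, value) in sstable.items():
        shadowed = any(
            key in other
            and other[key][1] is None
            and other[key][0] > timestamp
            for other in overlapping
        )
        if not shadowed:
            kept[key] = (timestamp, value)
    return kept


older = {"row1": (1, "x"), "row2": (2, "y")}
newer = {"row1": (5, None)}  # row1 was deleted after it was written

print(garbage_collect(older, [newer]))
```

Here `row1` is dropped from the older sstable because the newer sstable holds a later tombstone for it, while `row2` survives.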

Page 10: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-8831

• Previously, clients (drivers) had to re-prepare statements after a node was restarted
• Prepared statements are now persisted to system.prepared_statements
• On startup, this table is read to pre-load all previously cached statements
• The efficiency gain is magnified in large clusters

Page 11: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-12178

• Not so exciting unless you’re a Cassandra admin
• Snapshots are now prefixed with dropped or truncated
• Easier to identify unwanted snapshots for filesystem cleanup

Page 12: MEETUP - Unboxing Apache Cassandra 3.10

CASSANDRA-9666

• New TimeWindowCompactionStrategy (from C* 3.8)
• Recommended for time series and expiring TTL workloads
• Similar to DTCS (deprecated) but much simpler
• Groups sstables into a series of “time windows”
• At the end of a time window, sstables are compacted into one sstable using STCS based on a max timestamp
• Once major compaction is done, no further compactions are performed
• Not recommended for non-TTL data
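The grouping of sstables into time windows can be sketched in a few lines. This is a simplified model under stated assumptions, not the actual TWCS code: the names `window_start` and `bucket_sstables` are hypothetical, timestamps are plain seconds, and each sstable is assigned to a window by its max timestamp.

```python
from collections import defaultdict


def window_start(max_timestamp, window_seconds):
    """Lower bound of the time window that contains max_timestamp."""
    return max_timestamp - (max_timestamp % window_seconds)


def bucket_sstables(sstables, window_seconds):
    """Group (name, max_timestamp) pairs by the window their timestamp falls in."""
    buckets = defaultdict(list)
    for name, max_timestamp in sstables:
        buckets[window_start(max_timestamp, window_seconds)].append(name)
    return dict(buckets)


# Hypothetical sstables with max timestamps in seconds, one-hour windows.
sstables = [("s1", 10), ("s2", 3599), ("s3", 3600), ("s4", 7300)]
print(bucket_sstables(sstables, 3600))
```

In this model, each bucket whose window has ended would be compacted into a single sstable and then left alone.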

Page 13: MEETUP - Unboxing Apache Cassandra 3.10

https://datastaxacademy.slack.com

Page 14: MEETUP - Unboxing Apache Cassandra 3.10

Thank you