CQL, and the road to redemption
TRANSCRIPT
• Query language for Apache Cassandra
• SQL for the most part
• An alternative query interface
• Available since Cassandra 0.8.0
Cassandra Query Language (aka CQL)
• RPC-based query interface
• Implemented in Thrift
• Compact binary serialization
• Loads of supported languages
• Generated language code
• Low level; very little abstraction
Status Quo
This cannot be unseen!
// Your column
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());
// Don't ask
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
// Hang on, here we go...
Mutation mutation = new Mutation();
mutation.setColumnOrSuperColumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);
// Row key -> (column family name -> mutations)
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);
cassandra.batch_mutate(mutations_map, consistency_level);
SQL
Pros
• Ubiquitous
• Widely known
• Excellent mental fit
• Client uniformity
Cons
• People whinging
• Security(?)
Hello...
-- Create or update
INSERT INTO users (id, given, surname) VALUES (jericevans, Eric, Evans);
-- Create or update
UPDATE users SET given = Eric, surname = Evans WHERE id = jericevans;
SELECT surname, given FROM users WHERE id = jericevans;
-- Adding an index
CREATE INDEX surnameidx ON users (surname);
SELECT id, given FROM users WHERE surname = Evans;
-- Limiting the number of rows
SELECT id, given FROM users WHERE surname = Evans LIMIT 1000;
...is it me you’re looking for?
-- From column, to column
SELECT '2012-01-01'..'2012-03-28' FROM News WHERE topic = cassandra
-- Last N columns
SELECT FIRST 10 REVERSED * FROM News WHERE topic = cassandra
Querying column ranges
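The two range queries above can be pictured against a toy model of a wide row. The sketch below is only an illustration, not Cassandra code: it assumes column names are date strings held in sorted order, treats both slice bounds as inclusive, and uses hypothetical names (`ColumnSliceSketch`, `slice`, `lastN`, `sampleRow`) and made-up sample columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of one wide row: column name -> value, sorted by column name.
// This is an in-memory illustration only, not a Cassandra API.
class ColumnSliceSketch {

    // Models SELECT 'start'..'end': column names between start and end, both inclusive.
    static List<String> slice(NavigableMap<String, String> row, String start, String end) {
        return new ArrayList<>(row.subMap(start, true, end, true).keySet());
    }

    // Models SELECT FIRST n REVERSED *: the last n column names, highest first.
    static List<String> lastN(NavigableMap<String, String> row, int n) {
        List<String> names = new ArrayList<>();
        for (String name : row.descendingKeySet()) {
            if (names.size() == n) {
                break;
            }
            names.add(name);
        }
        return names;
    }

    // Hypothetical sample data standing in for the News row keyed by topic.
    static NavigableMap<String, String> sampleRow() {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("2011-12-30", "older story");
        row.put("2012-01-15", "january story");
        row.put("2012-03-01", "march story");
        row.put("2012-04-02", "april story");
        return row;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = sampleRow();
        System.out.println(slice(row, "2012-01-01", "2012-03-28"));
        System.out.println(lastN(row, 2));
    }
}
```

The point of the model is that both queries are cheap because the columns are already sorted: a slice is a contiguous run, and "last N" is a walk backwards from the end.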
-- Get your count on
UPDATE inventory SET apples = apples + 1 WHERE id = fruit;
UPDATE inventory SET carrots = carrots - 1 WHERE id = vegetable;
Counting
BEGIN BATCH
  INSERT INTO msgs (owner, subject, body) VALUES (jericevans, 'Hi', 'Howdy');
  UPDATE subjects SET subject = now WHERE owner = jericevans
APPLY BATCH
Batching writes
Drivers
• Not (necessarily) a replacement for high-level, idiomatic libraries
• Avoids duplicating effort (error handling, pooling, etc.)
• Consistently scoped, JDBC, etc.
• Consistently hosted, licensed
• Discoverable
• More work needed...
CQL 3.0
-- A materialized timeline of tweets
CREATE COLUMNFAMILY timeline (
  username text,
  posted_at timestamp,
  body text,
  posted_by text,
  PRIMARY KEY (username, posted_at)
);
CQL 3.0
INSERT INTO timeline (username, posted_at, body, posted_by)
VALUES (scotty, '2012-03-23 14:36', 'stupid klingons...', jtkirk);
INSERT INTO timeline (username, posted_at, body, posted_by)
VALUES (scotty, '2012-03-23 16:12', '@jtkirk green?', spock);
INSERT INTO timeline (username, posted_at, body, posted_by)
VALUES (scotty, '2012-03-23 17:42', '@spock yes, green', jtkirk);
INSERT INTO timeline (username, posted_at, body, posted_by)
VALUES (scotty, '2012-03-25 08:14', 'get off my lawn!', bones);
In Cassandra’s eyes
scotty
  (23/03 14:36, body): stupid klingons...
  (23/03 14:36, posted_by): jtkirk
  (23/03 16:12, body): @jtkirk green?
  ...
-- Tweets in Scotty’s timeline, by date
SELECT * FROM timeline WHERE username = scotty AND posted_at > '2012-03-22';
Is it a row, or a table? Yes.
username  posted_at    body                posted_by
scotty    23/03 14:36  stupid klingons...  jtkirk
scotty    23/03 16:12  @jtkirk green?      spock
scotty    23/03 17:42  @spock yes, green   jtkirk
scotty    25/03 08:14  get off my lawn!    bones
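The "row or table" duality can be sketched in a few lines of Java. This is a rough in-memory model under stated assumptions, not how Cassandra is implemented: the composite column name is faked as a `posted_at|column` string (real composites are binary-encoded), and the names `TimelineSketch`, `insert`, `physicalRow`, and `logicalRows` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of the timeline column family: one physical row per
// username, whose sorted column names combine posted_at and the CQL column.
class TimelineSketch {

    // username -> (composite column name -> value), columns kept sorted.
    private final Map<String, TreeMap<String, String>> storage = new TreeMap<>();

    // One CQL INSERT becomes two cells in the user's single physical row.
    void insert(String username, String postedAt, String body, String postedBy) {
        TreeMap<String, String> row = storage.computeIfAbsent(username, k -> new TreeMap<>());
        row.put(postedAt + "|body", body);
        row.put(postedAt + "|posted_by", postedBy);
    }

    // The storage view: one wide row of composite-named columns.
    TreeMap<String, String> physicalRow(String username) {
        return storage.getOrDefault(username, new TreeMap<>());
    }

    // The CQL view: regroup cells by posted_at into logical rows.
    Map<String, Map<String, String>> logicalRows(String username) {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : physicalRow(username).entrySet()) {
            String[] parts = cell.getKey().split("\\|", 2);
            rows.computeIfAbsent(parts[0], k -> new LinkedHashMap<>())
                .put(parts[1], cell.getValue());
        }
        return rows;
    }

    public static void main(String[] args) {
        TimelineSketch timeline = new TimelineSketch();
        timeline.insert("scotty", "2012-03-23 14:36", "stupid klingons...", "jtkirk");
        timeline.insert("scotty", "2012-03-23 16:12", "@jtkirk green?", "spock");
        System.out.println(timeline.physicalRow("scotty"));
        System.out.println(timeline.logicalRows("scotty"));
    }
}
```

Because the first component of each column name is `posted_at`, the cells of one tweet sit next to each other and tweets come back in date order, which is what makes the `posted_at >` query above a simple column slice.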
Also...
• Column names are strictly UTF-8
• Column names are case-insensitive (unless quoted)
• Old slice notation is gone (<start>..<end>)
• Static column families are actually static (schema-enforced)
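The case-sensitivity rule in the list above can be illustrated with a small helper. This is a simplified sketch, not the parser's actual logic: it assumes unquoted names fold to lower case and double-quoted names are kept verbatim, it ignores escaped quotes inside a quoted name, and `IdentifierSketch.normalize` is a hypothetical name.

```java
import java.util.Locale;

// Simplified model of how a column name written in a statement is resolved.
class IdentifierSketch {

    // Unquoted names are case-insensitive, modelled here by lower-casing;
    // names wrapped in double quotes keep their exact case.
    static String normalize(String identifier) {
        boolean quoted = identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"");
        if (quoted) {
            return identifier.substring(1, identifier.length() - 1);
        }
        return identifier.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Posted_At"));
        System.out.println(normalize("\"Posted_At\""));
    }
}
```

Under this model `Posted_At` and `posted_at` name the same column, while `"Posted_At"` names a different one.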