Cassandra 1.1


Post on 15-Jan-2015








<ul><li> 1. Apache Cassandra 1.1. Jonathan Ellis / @spyced. 2012 DataStax</li></ul> <p>
2. New features in 1.1
  CQL3
  Global row + key caches
  Fine-grained data storage control
  Row level isolation
  Concurrent schema changes
  Off-heap cache works on Windows
  "Write survey mode"
  Hadoop improvements
  Stress tool
3. Modern Cassandra, briefly
  0.7: CREATE COLUMN FAMILY; TTL; Secondary (column) indexes
  0.8: Counters; Automatic memtable tuning
  1.0: Compression; Leveled compaction
4. Global row + key caches
  cassandra.yaml:
    key_cache_size_in_mb (default 2)
    row_cache_size_in_mb (default 0)
    Also save periods
  Per-CF: caching=ALL|KEYS_ONLY*|ROWS_ONLY|NONE
  Old CF-level options are ignored: row_cache_size, key_cache_size, save periods
5. Data storage
  Old: /var/lib/cassandra/data/Keyspace1/Standard1-hc-1-Data.db
  New: /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-1-Data.db
  (Includes KS in filename for easier bulk loading)
6. Row-level isolation
  Never see partial updates to a row
  We now have AID from ACID
  C in ACID != C in CAP
7. Concurrent schema changes
  Fixes
  Can still have temporary disagreements if you use a new CF before all nodes have it
  Also speeds up adding new nodes
8. Off-heap cache on Windows
  SerializingCacheProvider no longer requires JNA
  SCP is the default starting with 1.0, but falls back to CLHCP if JNA is not present in &lt; 1.1
9. Write survey mode
  bin/cassandra -Dcassandra.write_survey=true
  Allows experimenting with compaction, compression, new versions*
  *Isolate node to test reads
10. Abortable compactions
  nodetool stop
11. CQL3 (CQL2 is still default)
  Composite PK support
  .. slice syntax removed
  ORDER BY syntax conforms to SQL
12. A simple example
  CREATE TABLE tweets (
    tweet_id uuid PRIMARY KEY,
    author varchar,
    body varchar
  );
13. Tweets
  tweet_id | author | body
  1790 | gwashington | To be prepared for war is one of the most effectual means of preserving peace
  1787 | jmadison | All men having power ought to be distrusted to a certain degree
  1778 | gmason | Those gentlemen, who will be elected senators, will fix themselves in the federal town, and become citizens of that town more than of your state
14. With clustering
  CREATE TABLE timeline (
    user_id varchar,
    tweet_id uuid,
    author varchar,
    body varchar,
    PRIMARY KEY (user_id, tweet_id)
  );
  user_id: partition key; tweet_id: clustered
15. Timeline
  user_id | tweet_id | author | body
  jadams | 1787 | jmadison | All men ...
  jadams | 1790 | gwashington | To be prepared ...
  ahamilton | 1778 | gmason | Those gentlemen ...
  ahamilton | 1790 | gwashington | To be prepared ...
  user_id: not clustered; tweet_id: clustered (within partition key)
16. Timeline, physical layout
  jadams: (1787, author): jmadison | (1787, body): All men ... | (1790, author): gwashington | (1790, body): To be prepared ...
  ahamilton: (1778, author): gmason | (1778, body): Those gentlemen ... | (1790, author): gwashington | (1790, body): To be prepared ...
  Non-PK columns contain string literal of column name
17. WITH COMPACT STORAGE
  CREATE TABLE timeline (
    user_id varchar,
    tweet_id uuid,
    author varchar,
    body varchar,
    PRIMARY KEY (user_id, tweet_id, author)
  ) WITH COMPACT STORAGE;
  Primary key: all but one column
  For backwards compatibility
18. jadams: (1787, jmadison): All men ... | (1790, gwashington): To be prepared ...
  ahamilton: (1778, gmason): Those gentlemen ... | (1790, gwashington): To be prepared ...
  body literal
19. Earlier changes
  (1.0.6) Allow CF names to be qualified by keyspace for INSERT, ALTER, DELETE, TRUNCATE
    INSERT INTO (...) VALUES (...)
    (SELECT was done in 1.0.1)
  (1.0.4) ALTER CF attributes
20. cqlsh
  SOURCE and CAPTURE commands
  (1.0.8) DESCRIBE COLUMNFAMILIES
21. The future is CQL (based)
  cqlsh
  performance
  prepared statements
  netty-based transport (CASSANDRA-2478)
  What does this mean for pycassa, Hector, et al?
22. Hadoop Integration
  2I (secondary index) support*
  Wide row support*
  BulkOutputFormat
  (*Covered in updated WordCount)
23. Secondary Index support
  IndexExpression expr = new IndexExpression(
      ByteBufferUtil.bytes("int4"),
      IndexOperator.EQ,
      ByteBufferUtil.bytes(0));
  ConfigHelper.setInputRange(
      job.getConfiguration(),
24. Wide row support
  ConfigHelper.setInputColumnFamily(
      job.getConfiguration(),
      KEYSPACE,
      COLUMN_FAMILY,
      true);
  Also: PIG_WIDEROW_INPUT
25. BulkOutputFormat
  job.setOutputFormatClass(BulkOutputFormat.class);
  Compatible with CFOF + extra options:
    OUTPUT_LOCATION, BUFFER_SIZE_IN_MB, STREAM_THROTTLE_MBITS (system default, 64, unlimited)
  Limitation: can't stream to dead nodes (fix in 1.1.1?)
26. Stress tool
  tools/bin/stress*
  Insert, read, seq scan, indexed scan, multiget, counter add/get
  CQL
27. Bonus: What's new in C* 1.1.1
  Incremental repair by token range
  Support for commitlog archiving and PITR
  Identify and blacklist corrupted SSTables from future compactions
  Open 1 sstableScanner per level for leveled compaction
  More CQL3 improvements (e.g. reversed clustering)
  Fix re-creating Keyspaces/ColumnFamilies with the same name as dropped ones
28. DataStax Community, with OpsCenter </p>
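<p>The mapping from CQL3 rows to the physical layout in slides 14 to 18 can be sketched as a small model. This is only an illustration in Python, not Cassandra's storage engine: the function names <code>sparse_layout</code> and <code>compact_layout</code> are hypothetical, the years stand in for tweet_id uuids as they do in the slides, and Python tuple ordering is assumed as a stand-in for Cassandra's composite comparator.</p>

```python
# Illustrative model only: plain dicts stand in for partitions and cells.
# sparse_layout / compact_layout are hypothetical names, not Cassandra APIs.

TIMELINE = [
    {"user_id": "jadams", "tweet_id": 1790, "author": "gwashington", "body": "To be prepared ..."},
    {"user_id": "jadams", "tweet_id": 1787, "author": "jmadison", "body": "All men ..."},
    {"user_id": "ahamilton", "tweet_id": 1778, "author": "gmason", "body": "Those gentlemen ..."},
    {"user_id": "ahamilton", "tweet_id": 1790, "author": "gwashington", "body": "To be prepared ..."},
]


def sparse_layout(rows, partition_key, clustering, value_columns):
    """Default CQL3 layout: cell name = (clustering values..., column name)."""
    storage = {}
    for row in rows:
        cells = storage.setdefault(row[partition_key], {})
        prefix = tuple(row[c] for c in clustering)
        for col in value_columns:
            # Each non-PK column becomes its own cell; the column name
            # literal is the last component of the composite cell name.
            cells[prefix + (col,)] = row[col]
    # Within a partition, cells are kept sorted by their composite name.
    return {pk: dict(sorted(cells.items())) for pk, cells in storage.items()}


def compact_layout(rows, partition_key, clustering, value_column):
    """COMPACT STORAGE layout: cell name = clustering values, one value per cell."""
    storage = {}
    for row in rows:
        cells = storage.setdefault(row[partition_key], {})
        # No column-name literal: the single non-PK column is the cell value.
        cells[tuple(row[c] for c in clustering)] = row[value_column]
    return {pk: dict(sorted(cells.items())) for pk, cells in storage.items()}


# PRIMARY KEY (user_id, tweet_id)
sparse = sparse_layout(TIMELINE, "user_id", ["tweet_id"], ["author", "body"])
# PRIMARY KEY (user_id, tweet_id, author) WITH COMPACT STORAGE
compact = compact_layout(TIMELINE, "user_id", ["tweet_id", "author"], "body")

for name, layout in (("sparse", sparse), ("compact", compact)):
    for partition, cells in layout.items():
        print(name, partition, cells)
```

<p>In this model the partition key selects the physical row and the clustering columns decide the order of cells inside it, which is why the compact form needs all but one column in the primary key: only one value per cell is left to store.</p>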