Bulk Exporting from Cassandra - Carlo Cabanilla

Upload: datadogslides

Post on 13-Jul-2015

TRANSCRIPT

Page 1: Bulk Exporting from Cassandra - Carlo Cabanilla

Bulk exporting data from Cassandra

Carlo Cabanilla @clofresh

Page 2:

Why export?

Page 3:

snapshot

Page 4:

sstable2json

Page 5:

Killing IO on live cluster

Page 6:

sstable2json sstable2csv, with filters
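
A rough sketch of what filtering while exporting to CSV could look like. This is not the sstable2csv tool from the talk; it is a hypothetical Python stand-in that works on rows already parsed from sstable2json-style output. The row shape (a `key` plus a list of `[name, value, timestamp]` columns), the `key_prefix` filter, and the sample keys are all assumptions, and the real JSON layout varies by Cassandra version.

```python
import csv
import io

def rows_to_csv(rows, out, key_prefix=None):
    """Write one CSV line per column, skipping rows whose key does not match."""
    writer = csv.writer(out)
    written = 0
    for row in rows:
        key = row["key"]
        # Filtering here keeps unwanted rows out of the export entirely.
        if key_prefix is not None and not key.startswith(key_prefix):
            continue
        for name, value, timestamp in row["columns"]:
            writer.writerow([key, name, value, timestamp])
            written += 1
    return written

# Hypothetical sample rows; real keys and columns will look different.
sample = [
    {"key": "org-1018:m-48778", "columns": [["t1", "0.5", 1280188800]]},
    {"key": "org-2000:m-1", "columns": [["t1", "0.7", 1280188800]]},
]
buf = io.StringIO()
count = rows_to_csv(sample, buf, key_prefix="org-1018")
```

Writing flat CSV lines instead of nested JSON also makes the output easier to split and feed to line-oriented tools later.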

Page 7:

ionice -c 3
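
`ionice -c 3` runs a command in the idle I/O scheduling class, so it only gets disk time when nothing else is asking for it. A minimal sketch of wrapping an export command that way from Python; the helper name and the sstable path are hypothetical.

```python
import shlex

def idle_io_command(command):
    """Prefix a command with `ionice -c 3` (the idle I/O scheduling class)."""
    return ["ionice", "-c", "3"] + list(command)

# Hypothetical path; point this at a real sstable data file.
export_cmd = idle_io_command(["sstable2json", "/path/to/Keyspace-Table-Data.db"])
print(shlex.join(export_cmd))
# To actually run it: subprocess.run(export_cmd, check=True)
```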

Page 8:

Need a place to put it

Page 9:

EBS to the rescue

Page 10:

gzipped
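
Gzipping the export as it is written saves space on the staging volume. A small sketch using only the Python standard library; the function names are made up for illustration.

```python
import csv
import gzip

def write_gzipped_csv(path, rows):
    """Write rows as CSV straight into a gzip file, with no uncompressed copy."""
    with gzip.open(path, "wt", newline="") as handle:
        csv.writer(handle).writerows(rows)

def read_gzipped_csv(path):
    """Read the rows back, mainly to check that the round trip works."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.reader(handle))
```

Note that values come back as strings, since CSV carries no type information.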

Page 11:

S3cmd

Page 12:

Need to dedupe
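
Exporting sstables node by node will likely produce the same cell more than once, because data is replicated across nodes and can appear in several sstables. One plausible dedupe rule, sketched in memory here, keeps the copy with the newest timestamp for each (key, column) pair. The record layout is an assumption, and at real scale this would be the reduce step of a distributed job, not a single dict.

```python
def dedupe_latest(records):
    """Keep one record per (key, column): the one with the highest timestamp."""
    latest = {}
    for key, column, value, timestamp in records:
        cell = (key, column)
        if cell not in latest or timestamp > latest[cell][1]:
            latest[cell] = (value, timestamp)
    # Sort by (key, column) so the output order is stable.
    return [(k, c, v, ts) for (k, c), (v, ts) in sorted(latest.items())]
```

For example, two copies of the same cell with timestamps 10 and 12 collapse into the single record carrying timestamp 12.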

Page 13:

Hadoop

Page 14:

numpy pickles

Page 15:

Haderp Mortar Data

Page 16:

numpy pickles msgpack lz4

Page 17:

gzipped lzo'd

Page 18:

Haderp file naming!
2010-07-27~org-1018~m-48778.csv-1,316.gz
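
A name like this can still be taken apart with a regular expression. The sketch below is based only on the single example filename from the slide. The slide does not say what the `org`, `m`, and trailing `1,316` parts mean, so the group names are just labels and the suffix is kept as an opaque string.

```python
import re

# Pattern inferred from one example name; other files may not match it.
FILENAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"~org-(?P<org>\d+)"
    r"~m-(?P<m>\d+)"
    r"\.csv-(?P<suffix>[^.]+)\.gz$"
)

def parse_export_name(name):
    """Return the named parts of an export filename, or None if it does not match."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None

parts = parse_export_name("2010-07-27~org-1018~m-48778.csv-1,316.gz")
```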

Page 19:

S3 copy

Page 20:

Bulk exporting data from Cassandra

Carlo Cabanilla @clofresh