San Francisco Cassandra Meetup - March 2014: I/O Performance Tuning on AWS for Cassandra with...

Eddie Garcia, VP of InfoSec and Services; Sam Heywood, VP of Products and Marketing. I/O Performance tuning for Cassandra running on AWS with Gazzang

Upload: planet-cassandra

Post on 04-Jul-2015


Category:

Technology



DESCRIPTION

What You'll Learn at this Meetup

• Tips and tricks to achieve high performance when running Cassandra on AWS
• Configuration tuning for Cassandra
• Tools to benchmark raw filesystem I/O
• AWS available AMIs to boost performance
• Stress testing on AWS i2 HVM instances
• Configuring AWS EC2 instances with SSDs and EBS storage with PIOPS

TRANSCRIPT

Page 1: San Francisco Cassandra Meetup - March 2014: I/O Performance tuning on AWS for Cassandra with Gazzang

Eddie Garcia, VP of InfoSec and Services
Sam Heywood, VP of Products and Marketing

I/O  Performance  tuning  for  Cassandra  running  on  AWS  with  Gazzang  

Page 2

Which  is  faster?  

3/26/14 © Gazzang, Inc. -- CONFIDENTIAL -- 2

• 2012 Kawasaki Ninja ZX-14 / ZZR1400
• 2013 McLaren P1
• 2011 Boeing 747-8

Page 3

DEFINE  FASTER?  0  TO  60  MPH?  Engineering  approach  

Page 4

Which  is  faster?  

• 2012 Kawasaki Ninja ZX-14 / ZZR1400 – 0 to 60 in 2.5 seconds
• 2013 McLaren P1 – 0 to 60 in 2.8 seconds
• 2011 Boeing 747-8 – 0 to 60 in 10-20 seconds

Page 5

DEFINE  FASTER?  TOP  SPEED?  Engineering  approach  

Page 6

Which  is  faster?  

• 2012 Kawasaki Ninja ZX-14 / ZZR1400 – Top speed 186 mph
• 2013 McLaren P1 – Top speed 217.5 mph
• 2011 Boeing 747-8 – Top speed 614 mph

Page 7

Today’s  Agenda  

• Tips and tricks to achieve high performance when running Cassandra on AWS
• Configuration tuning for Cassandra
• Tools to benchmark raw file system I/O
• AWS available AMIs to boost performance
• Stress testing on AWS i2 HVM instances
• Configuring AWS EC2 instances with SSDs and EBS storage with PIOPS

Page 8

Performance  tuning  

• Tuning at every layer
  – Tune the AWS layer
  – Tune the Cassandra layer
  – Tune the file system / security layer

Page 9

 Tune  the  AWS  layer  

   

Page 10

Tune  the  AWS  layer  

• i2 HVM instances provide better I/O than other instance types
• i2 instances support SSD TRIM for better SSD health and performance over time
• Use the Amazon Linux distribution AMI, or kernel version 3.8 or greater, for higher I/O performance
• Use the Amazon Linux distribution AMI for built-in SR-IOV (single root I/O virtualization) drivers, which enable higher-performance AWS Enhanced Networking when running in a VPC
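As a rough illustration of the kernel recommendation above, the sketch below checks whether a kernel release string (the format printed by `uname -r`) is at least 3.8. The function names are illustrative and not from the talk; the sample release string is the one listed in this deck's test environment specifications.

```python
import re
from typing import Tuple


def kernel_version_tuple(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a kernel release string such as `uname -r` output."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum_kernel(release: str, minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Return True when the release is at or above the minimum (major, minor)."""
    return kernel_version_tuple(release) >= minimum


# Release string taken from the test environment slide of this deck.
print(meets_minimum_kernel("3.4.73-64.112.amzn1.x86_64"))
```

Comparing (major, minor) tuples rather than strings avoids the trap where "3.10" sorts before "3.8" lexically.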

Page 11

Amazon  Linux  AMI  Instance  Types  and  Sizes  


http://aws.amazon.com/amazon-linux-ami/

Page 12

Amazon Linux AMI Instance Types and Cost on-demand in US East

http://aws.amazon.com/ec2/pricing/

Page 13

 Tune  the  Cassandra  layer  

   

Page 14

Tune  the  Cassandra  layer  

• Follow DataStax's published Cassandra best practices: http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html
• Put the data directory on the mounted ephemeral instance storage; avoid EBS storage for maximum I/O performance
• IMPORTANT: You must have a backup strategy when using ephemeral storage, for example using S3 for backups
• RAID-0 (stripe) of SSDs is supported, but Cassandra also does a great job of using all mounted drives without RAID
• Scale by adding smaller instances rather than increasing instance size
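To make the "all mounted drives without RAID" point concrete, here is a minimal sketch that renders a `data_file_directories` fragment for cassandra.yaml with one entry per mounted SSD. The helper name, the `/cassandra/data` subdirectory, and the second mount point are assumptions for illustration; only `/mount/ssd1` appears in this deck.

```python
def render_data_dirs(mount_points):
    """Render a cassandra.yaml `data_file_directories` fragment, one entry per drive."""
    if not mount_points:
        raise ValueError("at least one mount point is required")
    lines = ["data_file_directories:"]
    for mount in mount_points:
        # The subdirectory layout is an illustrative choice, not a Cassandra requirement.
        lines.append(f"    - {mount.rstrip('/')}/cassandra/data")
    return "\n".join(lines) + "\n"


# /mount/ssd1 is from the test environment slide; /mount/ssd2 is hypothetical.
print(render_data_dirs(["/mount/ssd1", "/mount/ssd2"]))
```

Listing each ephemeral drive separately lets Cassandra spread SSTables across them, which is the alternative to striping the drives with RAID-0 first.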

Page 15

Tune  the  Cassandra  layer  

• Cassandra writes immutable SSTable files to disk. It then compacts multiple SSTables into one larger SSTable, with some cleanup occurring along the way, which also helps TRIM
• The more OS memory the better: on read, the SSTables are cached as normal memory-mapped files loaded into OS memory
• Increasing the JVM heap size can cause performance issues for Cassandra during garbage collection ("Death by Garbage Collection")
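For a sense of how small the heap can stay relative to system memory, the sketch below reproduces the default sizing heuristic that, as far as I recall, ships in Cassandra's `cassandra-env.sh`: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). Treat the formula as an assumption to verify against your Cassandra version; the function name is illustrative, and the 61 GiB input is the published memory size of an i2.2xlarge.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Heap size heuristic: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))."""
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


# An i2.2xlarge has 61 GiB of RAM; the heuristic caps the heap well below that,
# leaving the remainder to the OS page cache that serves memory-mapped SSTables.
print(default_max_heap_mb(61 * 1024))
```

The cap is the point: memory beyond it is better spent on the OS cache than on a larger heap that lengthens garbage collection pauses.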

Page 16

Tune  the  Cassandra  layer  

• Additional DataStax recommendations
  – For EC2: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/architecture/architecturePlanningEC2_c.html
  – Anti-patterns: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/architecture/architecturePlanningAntiPatterns_c.html
  – Hardware: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/architecture/architecturePlanningHardware_c.html

Page 17

 Tune  the  file  system  /  security  layer  

   

Page 18

Tune  the  file  system  layer  

• Format the file system with ext4 vs ext3 or xfs if supported by your chosen Linux distribution
• Use the most current Linux version for your distribution; many performance fixes are supported only in newer kernels
• Use IOZone or other file system tests before and after configuration changes to benchmark raw file I/O before loading your Cassandra data

Page 19

Tune  the  file  security  layer  

• Use block-level encryption, dedicating the entire SSD volume
• Encrypt the cluster before loading data whenever possible
• Use systems that support hardware encryption acceleration like Intel AES-NI: http://aws.amazon.com/ec2/instance-types
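One way to confirm that an instance exposes AES-NI is to look for the `aes` CPU flag that Linux reports in `/proc/cpuinfo`. The sketch below parses cpuinfo-style text; the function name and the sample text are illustrative and not from the talk.

```python
def cpu_has_aes(cpuinfo_text: str) -> bool:
    """Return True if the first `flags` line of cpuinfo-style text lists `aes`."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return "aes" in value.split()
    return False


# Hypothetical, shortened sample; on a real Linux host, read /proc/cpuinfo instead:
#   with open("/proc/cpuinfo") as handle: print(cpu_has_aes(handle.read()))
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme sse2 aes avx\n"
print(cpu_has_aes(SAMPLE))
```

Splitting the flags on whitespace matters: a substring test would also match unrelated flags that merely contain the letters "aes".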

Page 20

     Test  and  measure  

   

Page 21

Performance Testing

• When testing performance, reduce the number of variables that can affect the test
  – Stopping and starting a server can switch your instance to a different host with different performance
  – The time of day when you run tests can affect performance
  – Eliminate cached in-memory data from prior tests, which may contaminate your results
  – Avoid testing on systems with unknown state and size of data
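For the cached-data point, one common approach on Linux (an assumption here; the talk does not say how the caches were cleared) is to flush dirty pages and then write `3` to `/proc/sys/vm/drop_caches`, which asks the kernel to drop the page cache plus dentries and inodes. A minimal sketch follows; the function name is illustrative, and the real control file requires root.

```python
import os


def drop_page_cache(control_file: str = "/proc/sys/vm/drop_caches") -> None:
    """Flush dirty pages, then ask the kernel to drop clean caches (Linux, root only)."""
    os.sync()  # write dirty pages out first so they become droppable
    with open(control_file, "w") as handle:
        handle.write("3\n")  # 3 = page cache + dentries and inodes


# Call drop_page_cache() between benchmark runs, as root, before starting the next test.
```

This only clears operating system caches; any caches held inside the Cassandra process are a separate concern.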

Page 22

Cassandra  Test  Environment  

[Diagram: a Cassandra Stress Client and Cassandra Nodes 1 through 6. Storage under test: EBS clear text, EBS 4K PIOPS, SSD clear text, SSD encrypted. Tests: IOZone tests and Cassandra stress tests. Backups: S3.]

Page 23

Test Environment Specifications

Instance: i2.2xlarge
AZ: us-east-1a
AMI Information: amzn-ami-hvm-2013.09.2.x86_64-ebs (ami-e9a18d80)
Linux Distribution: Amazon Linux AMI release 2013.09
Kernel Version: 3.4.73-64.112.amzn1.x86_64
Drive Layout:
    Filesystem             Size  Used Avail Use% Mounted on
    /dev/xvda1             7.9G  1.8G  6.1G  23% /             (EBS backed for tests, ephemeral is better)
    tmpfs                   30G     0   30G   0% /dev/shm
    /dev/xvdb              734G  197M  697G   1% /mount/ssd1   (Cleartext test SSD)
    /dev/mapper/encrypted  734G   36G  662G   6% /encrypted    (Encrypted test SSD)

Cassandra Stress Client: m1.medium
Cassandra Cluster: 6 Nodes
DataStax Enterprise: dse-libcassandra-3.2.2-1.noarch
Cassandra: version 1.2.12.2
Java HotSpot(TM) 64-Bit Server VM/1.6.0_45

Page 24

IOZone  SSD  vs.  Non-­‐SSD  

IOZone test configuration

time iozone -ORa -s 163840 -r 16384

    Iozone: Performance Test of File I/O
            Version $Revision: 3.420 $
            Compiled for 64 bit mode.
            Build: linux-AMD64

    OPS Mode. Output is in operations per second.
    Excel chart generation enabled
    Auto Mode
    File size set to 163840 KB
    Record Size 16384 KB
    Command line used: iozone -ORa -s 163840 -r 16384
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
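The configuration above can be sketched as a small helper that assembles the same IOZone argument list and works out how many records fit in the test file. The flags and sizes are copied from the slide; the function names are illustrative.

```python
def iozone_command(file_size_kb: int, record_size_kb: int) -> list:
    """Build the argument list used on the slide: iozone -ORa -s <KB> -r <KB>."""
    return ["iozone", "-ORa", "-s", str(file_size_kb), "-r", str(record_size_kb)]


def records_per_file(file_size_kb: int, record_size_kb: int) -> int:
    """Number of whole records of the given size that fit in the test file."""
    return file_size_kb // record_size_kb


command = iozone_command(163840, 16384)
print(" ".join(command))
print(records_per_file(163840, 16384))
```

With a 163840 KB file and 16384 KB records, each pass touches only ten records, which helps explain why the operations-per-second figures on the following slides are small numbers.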


http://www.iozone.org/

Page 25

IOZone  SSD  vs.  Non-­‐SSD  

Clear text test on default EBS root partition (control/baseline)

                                                         random   random     bkwd   record   stride
       KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
   163840    16384      100      164      278      281      278      176      270      195      273      135      140      265      263

real    1m6.360s
user    0m0.084s
sys     0m0.911s

Page 26

IOZone  SSD  vs.  Non-­‐SSD  

Encryption test on EBS w/PIOPS 4,000 and EBS optimized

                                                         random   random     bkwd   record   stride
       KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
   163840    16384       98      168      293      296      295      178      290      196      291      138      144      279      285

real    0m15.223s
user    0m0.115s
sys     0m1.391s

Page 27

IOZone  SSD  vs.  Non-­‐SSD  

Encryption test on SSD

                                                         random   random     bkwd   record   stride
       KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
   163840    16384       99      167      291      296      298      178      292      195      292      138      144      291      297

real    0m9.951s
user    0m0.291s
sys     0m3.595s
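To read the encrypted SSD row against the clear text EBS baseline from the earlier result slide, a per-column percent change is a convenient summary. The sketch below uses the operations-per-second values exactly as shown on the two slides; the function and variable names are illustrative.

```python
COLUMNS = [
    "write", "rewrite", "read", "reread", "random read", "random write",
    "bkwd read", "record rewrite", "stride read",
    "fwrite", "frewrite", "fread", "freread",
]

# Operations per second, copied from the result slides.
EBS_CLEAR_TEXT = [100, 164, 278, 281, 278, 176, 270, 195, 273, 135, 140, 265, 263]
SSD_ENCRYPTED = [99, 167, 291, 296, 298, 178, 292, 195, 292, 138, 144, 291, 297]


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def compare(baseline_row, candidate_row):
    """Map each IOZone column to its percent change, rounded to one decimal."""
    return {
        name: round(percent_change(base, cand), 1)
        for name, base, cand in zip(COLUMNS, baseline_row, candidate_row)
    }


for name, change in compare(EBS_CLEAR_TEXT, SSD_ENCRYPTED).items():
    print(f"{name:>15}: {change:+.1f}%")
```

At this file and record size the per-operation rates differ by only a few percent, so the elapsed times reported by `time` are the more telling comparison.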

Page 28

Cassandra  Test  Environment  

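[Diagram: a single Cassandra node with four storage configurations: EBS clear text, EBS 4K PIOPS encrypted, SSD, and SSD encrypted.]

IOZone tests, elapsed times as reported on the preceding result slides:

EBS clear text:           real 1m6.360s    user 0m0.084s    sys 0m0.911s
EBS 4K PIOPS encrypted:   real 0m15.223s   user 0m0.115s    sys 0m1.391s
SSD encrypted:            real 0m9.951s    user 0m0.291s    sys 0m3.595s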
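The elapsed times are easier to compare once converted to seconds. This sketch parses the `real` values in the `XmY.YYYs` form printed by the shell's `time` and reports each run relative to the clear text EBS baseline. The pairing of times with storage configurations follows the three IOZone result slides; the names are illustrative.

```python
import re


def parse_elapsed(value: str) -> float:
    """Convert a shell `time` value such as '1m6.360s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized elapsed time: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# `real` times copied from the IOZone result slides.
REAL_TIMES = {
    "EBS clear text": "1m6.360s",
    "EBS 4K PIOPS encrypted": "0m15.223s",
    "SSD encrypted": "0m9.951s",
}

baseline = parse_elapsed(REAL_TIMES["EBS clear text"])
for name, raw in REAL_TIMES.items():
    seconds = parse_elapsed(raw)
    print(f"{name:<24} {seconds:8.3f} s  ({baseline / seconds:.1f}x vs baseline)")
```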

Page 29

I/O  gap  reduced  with  larger  file  and  record  sizes  

[Chart: "Test Write clear text vs zNcrypt-block vs zNcrypt-file 8 MB file". Y axis: KB/s, 0 to 250. X axis: record size, 256 to 16384 (axis labeled "MB (record size)"). Series: rootclear8192 (clear text), data2dm 8192 (block encryption), data1ec 8192 (file encryption).]

Page 30

Cassandra  stress  

The  cassandra-­‐stress  tool  

• A Java-based stress testing utility for benchmarking and load testing a Cassandra cluster.
• The binary installation of the tool also includes a daemon, which in larger-scale testing can prevent potential skews in the test results by keeping the JVM warm.
• Modes of operation:
  – Inserting: Loads test data.
  – Reading: Reads test data.
  – Indexed range slicing: Works with RandomPartitioner on indexed tables.


http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCStress_t.html

Page 31

Current Cassandra stress test configuration

• Cassandra stress test command
  – <cassandra home>/tools/bin/cassandra-stress -l 3 -o insert -n 100000000 -i 1 -e ONE -c 10 -d <Cassandra Node IPs> -t 150 -f T1.csv &
• In the stress test, stress clients 1 to 3 each target two separate Cassandra nodes. Client 4 targets all Cassandra nodes.
  – Client#1 -> CAS 1, 2
  – Client#2 -> CAS 3, 4
  – Client#3 -> CAS 5, 6
  – Client#4 -> CAS 1, 2, 3, 4, 5, 6
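The client-to-node assignment above can be sketched as a lookup table that produces one cassandra-stress argument list per client. The flags are copied verbatim from the slide; the node IP addresses, the install path, and the function name are hypothetical placeholders.

```python
# Which Cassandra nodes each stress client targets, as listed on the slide.
CLIENT_TARGETS = {
    1: [1, 2],
    2: [3, 4],
    3: [5, 6],
    4: [1, 2, 3, 4, 5, 6],
}


def stress_command(client: int, node_ips: dict, cassandra_home: str = "/opt/cassandra") -> list:
    """Build the slide's cassandra-stress invocation for one client."""
    targets = ",".join(node_ips[node] for node in CLIENT_TARGETS[client])
    return [
        f"{cassandra_home}/tools/bin/cassandra-stress",
        "-l", "3", "-o", "insert", "-n", "100000000", "-i", "1",
        "-e", "ONE", "-c", "10", "-d", targets, "-t", "150", "-f", "T1.csv",
    ]


# Hypothetical private addresses for Cassandra nodes 1 through 6.
NODE_IPS = {node: f"10.0.0.{10 + node}" for node in range(1, 7)}

for client in sorted(CLIENT_TARGETS):
    print(" ".join(stress_command(client, NODE_IPS)))
```

Keeping the mapping in one table makes it easy to confirm that every node is targeted by exactly two clients: its dedicated client and client 4.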

Page 32

Cassandra  Test  Environment  

[Diagram: Stress Clients 1 through 4 running Cassandra stress tests against Cassandra Nodes 1 through 6. Storage under test: SSD clear text and SSD encrypted.]

Page 33

Benchmark  clear  text  vs  encrypted  inserts  (write)  

Page 34

Summary  

• Test in your environment with your data; results will vary greatly with OS, hardware, and application configuration
  – Baseline before you tune
  – Tune
  – Test after tuning
  – Measure
  – Rinse and repeat twice
• Security and performance are not mutually exclusive; encryption can coexist with high I/O performance
• Do your homework: configure and run tests that map to your use case

Page 35

About Gazzang

• Headquartered in Austin, Texas
• Focus on securing sensitive data in cloud and big data environments
• Enable customers to meet compliance requirements like HIPAA, PCI, FIPS and FERPA
• Satisfy internal security mandates
• Protect valuable client information

Page 36

Gazzang is focused on data at-rest encryption

Security in the cloud is a layered approach

• Data in process (in application)
• Data at rest (storage)
• Data in transit (SSL)

Page 37

and key management

Security in the cloud is a layered approach

• Data in process (in application)
• Data at rest (storage)
• Data in transit (SSL)

Page 38

Thank  you!  

Gazzang, Inc
www.gazzang.com

Eddie Garcia
VP of InfoSec and Services
[email protected]

Sam Heywood
VP of Products and Marketing
[email protected]


Airport Race Porsche 911 GT3 Cup vs Boeing 747

http://www.youtube.com/watch?v=duOlJa5Vjdo