Scaling and Managing Cassandra with Docker, CoreOS and Presto

Post on 18-Jul-2015

TRANSCRIPT

  • JavaScript addict, experienced as a technical leader and focused on building high-quality, distributed, scalable applications.

    VALI MALINOIU @0X4139

  • HOW TO - WITH CASSANDRA

    Scaling with DOCKER

    Clustering with COREOS

    Querying with PRESTO

  • SCALING - DOCKER

    What is Docker?

    Why should I use it?

    How does it help me?

  • Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications.

    WHAT IS DOCKER???

    DOCKER

    Why developers like it?

    Why sysadmins like it?

    Docker vs VM (VBox, VMWARE)

  • FOR DEVELOPERS

    Any language using any toolchain

    Dockerized apps are completely portable and can run anywhere

    No more apt-get install, yum install

    Get started quickly using one of the 13000+ apps on Docker Hub

  • FOR SYSADMINS

    Provides standardized environments for QA, DEV, PROD

    Reduces "works on my machine" finger-pointing

    Deploy and run any app on any infrastructure, quickly and reliably

  • Each virtualized application includes not only the application (which may be tens of MB) and the necessary binaries and libraries, but also the entire guest operating system (which may weigh tens of GB).

    VIRTUAL MACHINES

    The Docker Engine container comprises just the application and its dependencies. It runs as an isolated process in userspace on the host operating system, sharing the kernel with other containers.

    DOCKER

  • DOCKER DEMO

  • CLUSTER - COREOS

    What is CoreOS?

    Why should I use it?

    How does it help me?

  • CoreOS is Linux for Massive Server Deployments

    WHAT IS COREOS???

    COREOS

    ETCD

    FLEET

    FLANNEL

  • ETCD

    etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines. It gracefully handles master elections during network partitions and tolerates machine failure, including failure of the master. Your applications can read and write data into etcd.
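    How many machine failures a cluster can tolerate follows from simple majority arithmetic. A minimal sketch, assuming etcd's majority-quorum (Raft) consensus model; the function names are hypothetical and not part of any etcd API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members may fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print cluster size, quorum size and tolerated failures side by side.
    for size in (1, 3, 4, 5):
        print(size, quorum(size), tolerated_failures(size))
```

    Under this model a 4-node cluster tolerates no more failures than a 3-node one, which is why odd cluster sizes are the usual choice.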

  • FLEET

    Easy warehouse-scale computing

    Treat your CoreOS cluster as if it shared a single init system

    Automatic migration of units when a machine fails

  • FLEET - ADVANTAGES

    Deploy Docker containers on arbitrary hosts in a cluster

    Distribute services across a cluster using machine-level anti-affinity

    Maintain N instances of a service, rescheduling on machine failure

    Discover machines running in the cluster

    Automatically SSH into the machine running a job
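    Anti-affinity and rescheduling on failure can be illustrated with a toy model. This is a sketch under stated assumptions, not fleet's actual scheduling algorithm: the functions `schedule` and `reschedule_on_failure`, the machine names and the unit names are all hypothetical.

```python
from typing import Dict, List, Optional

Placement = Dict[str, Optional[str]]


def schedule(units: List[str], machines: List[str]) -> Placement:
    """Place every unit on its own machine (anti-affinity).

    Units that cannot get a machine of their own map to None.
    """
    free = list(machines)
    return {unit: (free.pop(0) if free else None) for unit in units}


def reschedule_on_failure(placement: Placement, machines: List[str],
                          failed: str) -> Placement:
    """Move units off the failed machine onto unused healthy machines."""
    used = {m for m in placement.values() if m is not None and m != failed}
    spare = [m for m in machines if m != failed and m not in used]
    result: Placement = {}
    for unit, machine in placement.items():
        if machine == failed:
            # Keep anti-affinity: only a machine without a unit qualifies.
            result[unit] = spare.pop(0) if spare else None
        else:
            result[unit] = machine
    return result


if __name__ == "__main__":
    machines = ["core-1", "core-2", "core-3", "core-4"]
    units = ["cassandra@1", "cassandra@2", "cassandra@3"]
    placement = schedule(units, machines)
    print(placement)
    print(reschedule_on_failure(placement, machines, failed="core-2"))
```

    In the toy model a unit stays unplaced (None) when no spare machine exists, rather than doubling up on a host that already runs one.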

  • Listing machines in a cluster

    Listing units in a cluster

    Listing unit files in a cluster

  • Creating a unit
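    The unit shown on this slide is not preserved in the transcript. As a stand-in, here is a minimal sketch that renders the text of a fleet template unit for a Cassandra container. The unit contents, the container name and the default image name `cassandra` are assumptions for illustration, not the speaker's file; a real cluster would also need seeds, ports and volumes.

```python
def cassandra_unit(image: str = "cassandra") -> str:
    """Return the text of a hypothetical template unit, e.g. cassandra@.service.

    %i is the systemd instance specifier (the part after '@' in the unit name).
    """
    lines = [
        "[Unit]",
        "Description=Cassandra node %i",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        # The leading '-' tells systemd to ignore a failure of this command.
        "ExecStartPre=-/usr/bin/docker rm -f cassandra-%i",
        f"ExecStart=/usr/bin/docker run --name cassandra-%i {image}",
        "ExecStop=/usr/bin/docker stop cassandra-%i",
        "",
        "[X-Fleet]",
        # Anti-affinity: never place two Cassandra instances on one machine.
        "Conflicts=cassandra@*.service",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(cassandra_unit())
```

    A file like this would typically be saved as cassandra@.service and started per instance with fleetctl start, for example as cassandra@1.service.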

  • FLANNEL

    Flannel is an overlay network that gives a subnet to each machine for use with Docker/Kubernetes. It allows Docker containers to communicate even though the containers are located on different machines (magic).
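    The "one subnet per machine" idea can be sketched with the standard ipaddress module. This is only an illustration: the overlay range 10.1.0.0/16, the /24 subnet size, the machine names and the sequential assignment are assumptions, and flannel itself coordinates subnet leases through etcd.

```python
import ipaddress
from typing import Dict, List


def assign_subnets(machines: List[str], network: str = "10.1.0.0/16",
                   prefix: int = 24) -> Dict[str, ipaddress.IPv4Network]:
    """Carve one non-overlapping subnet per machine out of the overlay network."""
    overlay = ipaddress.ip_network(network)
    subnets = list(overlay.subnets(new_prefix=prefix))
    if len(machines) > len(subnets):
        raise ValueError("overlay network is too small for this many machines")
    return dict(zip(machines, subnets))


if __name__ == "__main__":
    for machine, subnet in assign_subnets(["core-1", "core-2", "core-3"]).items():
        print(machine, subnet)
```

    Because the subnets never overlap, a container address identifies the machine it lives on, which is what lets traffic be routed between hosts.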

  • Facebook uses Presto for interactive queries against several internal data stores,

    including their 300PB data warehouse. Over 1,000 Facebook employees use Presto daily

    to run more than 30,000 queries that in total scan over a petabyte each per day.

    DISTRIBUTED SQL ENGINE FOR BIG DATA

    PRESTO by Facebook

  • HOW DOES IT WORK?

    Coordinator-worker architecture

    Works with various connectors

    Hadoop/Hive

    Cassandra

    TPC-H (mostly for testing)

  • WHY??

    Combine data from multiple data sources

    JOINS!!!!

    Scalable

    Maintained!!!
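    What a join across data sources means can be shown with a toy in-memory hash join. This is a sketch, not Presto's implementation: the two row lists merely stand in for tables exposed by a Cassandra connector and a Hive connector, and all table and column names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def hash_join(left: Iterable[Row], right: Iterable[Row], key: str) -> Iterator[Row]:
    """Inner-join two row streams on `key` using an in-memory hash table."""
    index: Dict[object, List[Row]] = defaultdict(list)
    for row in right:  # build phase: index the right-hand side by key
        index[row[key]].append(row)
    for row in left:  # probe phase: look up every left-hand row
        for match in index.get(row[key], []):
            yield {**match, **row}


if __name__ == "__main__":
    # Stand-in for a table served by a Cassandra connector.
    events = [
        {"user_id": 1, "event": "login"},
        {"user_id": 2, "event": "purchase"},
        {"user_id": 3, "event": "login"},
    ]
    # Stand-in for a table served by a Hive connector.
    users = [
        {"user_id": 1, "country": "RO"},
        {"user_id": 2, "country": "US"},
    ]
    for row in hash_join(events, users, "user_id"):
        print(row)
```

    In Presto the same idea is written as a single SQL query, with each table addressed as catalog.schema.table so that one JOIN can span both stores.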

  • QUESTIONS?