couchbase meetup jan 2016

Upload: michael-kehoe

Post on 10-Feb-2017

Category: Technology


TRANSCRIPT

Page 1: Couchbase Meetup Jan 2016
Page 2: Couchbase Meetup Jan 2016

Michael Kehoe, Senior Site Reliability Engineer, LinkedIn

LinkedIn’s Big Data Pipeline with Kafka, Hadoop and Couchbase

Page 3: Couchbase Meetup Jan 2016

$ whoami

Michael Kehoe

• Sr Site Reliability Engineer (SRE)

• Member of CBVT

• B.E. (Electrical Engineering) from the University of Queensland, Australia

Page 4: Couchbase Meetup Jan 2016

Kafka @ LinkedIn

• Kafka was created by LinkedIn

• Kafka is a publish-subscribe messaging system built as a distributed commit log

• Processes 500+ TB/day (~500 billion messages)
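The "publish-subscribe as a commit log" idea can be pictured with a small in-memory sketch. This is not Kafka's API: `CommitLog` and `Consumer` are hypothetical names invented for illustration, and a single Python list stands in for one partition. The point it tries to show is that messages are appended once and each subscriber keeps its own read offset.

```python
class CommitLog:
    """Append-only log for one partition (hypothetical in-memory stand-in)."""

    def __init__(self):
        self._records = []

    def append(self, message):
        """Append a message and return the offset it was written at."""
        self._records.append(message)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records messages starting at the given offset."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Subscriber that tracks its own position in the log."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self, max_records=10):
        """Read the next batch and advance this consumer's offset only."""
        batch = self._log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = CommitLog()
for event in ["page_view:/feed", "click:/jobs/42", "page_view:/profile"]:
    log.append(event)

# Two independent subscribers read the same log at their own pace.
monitoring = Consumer(log)
analytics = Consumer(log)

first_batch = monitoring.poll(max_records=2)
analytics_batch = analytics.poll()
```

Because reading does not remove anything from the log, adding another subscriber (for example a Hadoop loader) does not affect the existing ones.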

Page 5: Couchbase Meetup Jan 2016

LinkedIn’s use of Kafka

• Monitoring

• Pub-Sub Messaging

• Analytics

• Building block for (log) distributed applications

• Samza

• Espresso

• Pinot

Page 6: Couchbase Meetup Jan 2016

Kafka to Hadoop (Analytics) Use Case

• LinkedIn tracks data to better understand how members use our products

• Information such as which page got viewed and which content got clicked on is sent into a Kafka cluster in each data center

• Some of these events are centrally collected and pushed onto our Hadoop grid for analysis and daily report generation
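As a rough illustration of the daily report step, the sketch below counts page views per day and page from a list of tracking events. The event fields (`type`, `page`, `ts`) and the function name `daily_page_views` are assumptions made for this example; the slides do not describe LinkedIn's actual event schema or Hadoop jobs.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_page_views(events):
    """Count page_view events per (UTC date, page); other event types are skipped."""
    counts = Counter()
    for event in events:
        if event["type"] != "page_view":
            continue
        day = datetime.fromtimestamp(event["ts"], tz=timezone.utc).date().isoformat()
        counts[(day, event["page"])] += 1
    return counts


# Hypothetical tracking events; "ts" is seconds since the Unix epoch.
events = [
    {"type": "page_view", "page": "/feed", "ts": 1452556800},  # 2016-01-12 00:00 UTC
    {"type": "click", "page": "/feed", "ts": 1452556860},
    {"type": "page_view", "page": "/feed", "ts": 1452560400},  # 2016-01-12 01:00 UTC
    {"type": "page_view", "page": "/jobs", "ts": 1452643200},  # 2016-01-13 00:00 UTC
]

report = daily_page_views(events)
```

On a real grid the same grouping would run as a distributed job over the collected events rather than over an in-memory list.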

Page 7: Couchbase Meetup Jan 2016

Couchbase @ LinkedIn

• About 80 separate services with one or more clusters in multiple data centers

• Up to ~70 servers in a cluster

• Single & Multi-tenant clusters

Page 8: Couchbase Meetup Jan 2016

Hadoop to Couchbase

• Our primary use case for Hadoop to Couchbase is building (warming) and restoring Couchbase buckets

• LinkedIn built its own in-house solution to work with our ETL processes, etc.
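The in-house loader is not described in the slides, so the following is only a sketch of the general shape such a job could take: group records by a partition id, then write each group into the bucket. Everything here is an assumption for illustration: a plain `dict` stands in for a Couchbase bucket, `NUM_PARTITIONS` is an arbitrary value, and the CRC32-modulo mapping in `partition_for` is a simplified stand-in, not Couchbase's actual key-to-partition mapping.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # illustrative value only, not a real cluster setting


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition id with CRC32 (simplified, hypothetical mapping)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def prebuild_by_partition(records, num_partitions=NUM_PARTITIONS):
    """Group (key, value) records by the partition they map to."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[partition_for(key, num_partitions)].append((key, value))
    return grouped


def warm_bucket(bucket, grouped):
    """Write each pre-built partition into the bucket; return documents written."""
    written = 0
    for partition_id in sorted(grouped):
        for key, value in grouped[partition_id]:
            bucket[key] = value
            written += 1
    return written


# Hypothetical records, as a Hadoop job might emit them.
records = [(f"member::{i}", {"id": i}) for i in range(100)]
grouped = prebuild_by_partition(records)

bucket = {}  # dict standing in for a Couchbase bucket
written = warm_bucket(bucket, grouped)
```

Grouping before loading means each partition's data can be built and written independently, which is one plausible reason to pre-build by partition.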

Page 9: Couchbase Meetup Jan 2016

Jobs Cluster: Clusters & Numbers

• Used for read-scaling, >150k QPS, 27 node clusters

• We use Hadoop to pre-build data by partition

• Couchbase average latency is 2-3 ms

• 99th percentile is ~8-12 ms
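For readers less familiar with how an average and a 99th percentile differ, here is a minimal sketch that computes both from a list of latency samples using the nearest-rank method. The sample values are made up for the example and are not LinkedIn measurements; `percentile` and `summarize` are hypothetical helper names.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Return the mean and 99th percentile of the given latencies (milliseconds)."""
    return {
        "mean_ms": sum(latencies_ms) / len(latencies_ms),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Made-up samples: most requests are fast, a few are slow.
latencies_ms = [2.0] * 95 + [3.0] * 3 + [9.0, 12.0]
summary = summarize(latencies_ms)
```

A handful of slow requests barely moves the mean but dominates the 99th percentile, which is why the two figures are reported separately.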

Page 10: Couchbase Meetup Jan 2016

Questions?

Thank You

Page 11: Couchbase Meetup Jan 2016

©2014 LinkedIn Corporation. All Rights Reserved.