hadoop infrastructure @uber past , present and future

Post on 31-Dec-2016

227 Views

Category:

Documents

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

U B E R | Data

HadoopInfrastructure@UberPast,PresentandFutureMayankBansal

U B E R | Data

“Transporta=onasreliableasrunningwater,everywhere,foreveryone”

Uber’sMission

75+Countries 500+Ci=es

Andgrowing…

U B E R | Data

HowUberworks

U B E R | Data

HowUberworks

U B E R | Data

HowUberworks

U B E R | Data

DataDrivenDecisions

U B E R | Data

DataInfraOnceUpona8me..(2014)

Kafka Logs

Key-Val DB

RDBMS DBs

S3

Applica=ons

ETL

BusinessOps

A/BExperiments

Adhoc Analytics

CityOps

Vertica DataWarehouse

Data

Science

EMR

U B E R | Data

DataInfrastructureToday

Kafka8 Logs

Schemaless DB

SOA DBs

Service Accounts

ETL MachineLearning

Experimenta=on

Data Science

Adhoc Analytics Ops/DataScience

HDFS

CityOpsDataScience

Spark|PrestoHive

FewTakeaways…

●  StrictSchemaManagement○  BecauseourlargestdataaudienceareSQL

Savvy!(1000sofUberOps!)○  SQL=StrictSchema

●  BigDataProcessingToolsUnlocked-Hive,PrestoandSpark○  MigrateSQLsavvyusersfromVer=catoHive

&Presto(1000sofOps&100sofdatascien=sts&analysts)

○  Sparkformoreadvancedusers-100sofdatascien=sts

U B E R | Data

HadoopEvolu8on@ebay

2014

1XNodes1XPB

2015 10X Nodes 4X PB Data 3000+ node 30,000+ cores 50+ PB

2016 90X Nodes

40X PB Data

HadoopEvolu8on@Uber

U B E R | Data

HadoopClusterU=liza=on

•  Overprovisioningforthepeakloads.

•  Overcapacityforan=cipa=onoffuturegrowth

U B E R | Data

HadoopEvolu8on@ebay

20140Nodes

2015 X Nodes

2016

300XNodes

MesosEvolu8on@Uber

U B E R | Data

MesosClusterU=liza=on

•  Overprovisioningforthepeakloads

•  Overcapacityforan=cipa=onoffuturegrowth

U B E R | Data

EndGoal

Online

Presto

U B E R | Data

Whatweneed?

GLOBALVIEWOFRESOURCES

U B E R | Data

AvailableResourceManagers

U B E R | Data

MesosvsYARN

YARN MESOSSingleLevelScheduler TwoLevelSchedulerUseCgroupsforisola=on UseCgroupsforIsola=onCPU,Memoryasaresource CPU,MemoryandDiskasa

resource

WorkswellwithHadoopworkloads Workswellwithlongerrunningservices

YARNsupport=mebasedreserva=ons

Mesosdoesnothavesupportofreserva=ons

Dominantresourcescheduling Schedulingisdonebyframeworksanddependsoncasetocasebasis

ScalesBegerSimilarIsola=on

Diskisbeger

ThisisImportant

ImpforbatchSLA’sBegerforbatch

U B E R | Data

Let’s8edthemtogether

YARNisgoodforHadoopMesosisgoodforLongerRunningServices

InaNutshell

U B E R | Data

U B E R | Data

•  MyriadisMesosFrameworkforApacheYARN

•  MesosmanagesDataCenterresources•  YARNmanagesHadoopworkloads

•  Myriad•  GetsresourcesfromMesos•  LaunchesNodeManagers

U B E R | Data

•  YARNwillhandleresourceshanded

overtoit.•  Mesoswillworkonrestoftheresources

Myriad’sLimita8onsSta=cResourcePar==oning

U B E R | Data

•  YARNwillneverbeabletodooversubscrip=on.•  NodeManagerwillgoaway•  Fragmenta=onofresources

•  Mesosoversubscrip=oncankillYARNtoo

Myriad’sLimita8onsResourceOverSubscrip=on

U B E R | Data

•  NoGlobalQuotaEnforcement

•  NoGlobalPriori=es

Myriad’sLimita8ons

U B E R | Data

•  Elas=cResourceManagement

•  BinPacking•  Stability

•  LongList…

Myriad’sLimita8ons

U B E R | Data

UnifiedScheduler

U B E R | Data

HighLevelCharacteris8cs

•  GlobalQuotaManagement

•  CentralSchedulingpolicies

•  Oversubscrip=onforbothOnlineandBatch

•  Isola=onandbinpacking

•  SLAguaranteesatGlobalLevel

U B E R | Data

UnifiedScheduler

U B E R | Data

FewTakeaways…

•  Weneedoneschedulinglayeracrossallworkloads

•  Par==oningresourcesarenotgood •  Atleastcansave30%resources

•  StabilityandsimplicitywinsinProduc=on•  Mul=LevelofresourceManagementandschedulingwillnotbescalable

U B E R | Data

U B E R | Data

Ques=ons?

mabansal@uber.commayank@apache.org

U B E R | Data

ThankYou!!!

top related