A Bird's-Eye View of Pig and Scalding Jobs with hRaven
A Bird’s-Eye View of Pig and Scalding with hRaven: a tale by @gario and @joep. Hadoop Summit 2013, v1.2

Upload: hadoop-summit

Post on 12-May-2015


DESCRIPTION

As Twitter's use of MapReduce rapidly expands, tracking usage on our clusters grows correspondingly more difficult. With an ever-increasing job load and a reliance on higher-level abstractions such as Pig and Scalding, the utility of existing tools for viewing job history decreases rapidly, and extracting insights becomes a challenge. At Twitter, we created hRaven to fill this gap. hRaven archives the full history and metrics from all MapReduce jobs on our clusters, and strings together each job from a Pig or Scalding script execution into a combined flow. From this archive, we can easily derive aggregate resource utilization by user, pool, or application, while the historical trending of an individual application allows us to perform runtime optimization of resource scheduling. We will cover how hRaven provides a rich historical archive of MapReduce job execution, and how the data is structured into higher-level flows representing the job sequence for frameworks such as Pig, Scalding, and Hive. We will then explore how we mine hRaven data to account for Hadoop resource utilization, to optimize runtime scheduling, and to identify common anti-patterns in user jobs. Finally, we will look at the end-user experience, including Ambrose integration for flow visualization.

TRANSCRIPT

Page 1: A Birds-Eye View of Pig and Scalding Jobs with hRaven

A Bird’s-Eye View of Pig and Scalding with hRaven

a tale by @gario and @joep

Hadoop Summit 2013

v1.2

Page 2: A Birds-Eye View of Pig and Scalding Jobs with hRaven

@Twitter#HadoopSummit2013 2

About the authors

Apache HBase PMC member and Committer
Software Engineer @ Twitter
Core Storage Team - Hadoop/HBase

Software Engineer @ Twitter
Engineering Manager, Hadoop/HBase team @ Twitter

Page 3: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Table of Contents

• Chapter 1: The Problem
• Chapter 2: Why hRaven?
• Chapter 3: How Does it Work?
    3a: Loading
    3b: Table structure / querying
• Chapter 4: Current Uses
• Appendix: Future Work

Page 4: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 1: The Problem

Illustration by Sirxlem (CC BY-NC-ND 3.0)

Page 5: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 1: Mismatched Abstractions

• Most users run Pig and Scalding scripts, not straight MapReduce
• JobTracker UI shows jobs, not DAGs of jobs generated by Pig and Scalding

Page 6: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 1: A Problem of Scale

Page 7: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 1: Questions

• How many Pig versus Scalding jobs do we run?
• What cluster capacity do jobs in my pool take?
• How many jobs do we run each day?
• What % of jobs have > 30k tasks?
• Why do I need to hand-tune these (hundreds of) jobs? Can’t the cluster learn?

Page 8: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 1: Questions

#Nevermore

Page 9: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Why hRaven?

Photo by DAVID ILIFF. License: CC-BY-SA 3.0

Page 10: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Why hRaven?

• Stores stats, configuration and timing for every MapReduce job on every cluster
• Structured around the full DAG of jobs from a Pig or Scalding application
• Easily queryable for historical trending
• Allows for Pig reducer optimization based on historical run stats
• Keeps data online forever (12.6M jobs, 4.5B tasks + attempts)

Page 11: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Key Concepts

• cluster - each cluster has a unique name mapping to the Job Tracker
• user - MapReduce jobs are run as a given user
• application - a Pig or Scalding script (or plain MapReduce job)
• flow - the combined DAG of jobs executed from a single run of an application
• version - changes impacting the DAG are recorded as a new version of the same application

Page 12: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Application Flows

[Figure: application flows, labeled "Edgar"]

Page 13: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Application Flows

[Figure: application flows, labeled "Edgar"]

Page 14: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Flow Storage

• All jobs in a flow are ordered together

Page 15: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Flow Storage

• Most recent flow is ordered first

Page 16: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 2: Key Features

• All jobs in a flow are ordered together
• Per-job metrics stored:
    Total map and reduce tasks
    HDFS bytes read / written
    File bytes read / written
    Total map and reduce slot milliseconds
• Easy to aggregate stats for an entire flow
• Easy to scan the time series of each application’s flows
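Because each job row carries its own metrics and all jobs of a flow sit next to each other, a flow-level total is a plain sum over the job rows. The following is a minimal sketch of that idea only; the `JobMetrics` type, its field names, and `aggregate_flow` are hypothetical and are not taken from hRaven's code.

```python
from dataclasses import astuple, dataclass, fields
from typing import Iterable


@dataclass(frozen=True)
class JobMetrics:
    """Per-job counters mirroring the metrics listed on the slide.

    Field names are illustrative assumptions, not hRaven column names.
    """
    total_maps: int
    total_reduces: int
    hdfs_bytes_read: int
    hdfs_bytes_written: int
    file_bytes_read: int
    file_bytes_written: int
    map_slot_millis: int
    reduce_slot_millis: int


def aggregate_flow(jobs: Iterable[JobMetrics]) -> JobMetrics:
    """Sum every counter across all jobs that belong to one flow."""
    totals = [0] * len(fields(JobMetrics))
    for job in jobs:
        for index, value in enumerate(astuple(job)):
            totals[index] += value
    return JobMetrics(*totals)


# Two jobs from the same (made-up) flow.
flow_total = aggregate_flow([
    JobMetrics(10, 2, 100, 50, 5, 5, 1000, 200),
    JobMetrics(5, 1, 200, 25, 0, 10, 500, 100),
])
```

An empty flow yields all-zero totals, which keeps the function safe to call on applications with no recorded jobs.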

Page 17: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: How Does it Work?

Page 18: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: ETL - Step 1: JobFilePreprocessor

Page 19: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: ETL - Step 2: JobFileRawLoader

Page 20: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: ETL - Step 3: JobFileProcessor

Page 21: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: ETL - Step 3: JobFileProcessor

Jobs finish out of order with respect to job_id

Page 22: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: Tables

• job_history_raw
• job_history
• job_history_task
• job_history_app_version

Page 23: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: job_history_raw

Row key: cluster!jobID

Columns:
• jobconf - stores the serialized raw job_*_conf.xml file
• jobhistory - stores the serialized raw job history log file
• job_processed_success - indicates whether the job has been processed

Page 24: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: job_history

Row key: cluster!user!application!timestamp!jobID

• cluster - unique cluster name (e.g. “cluster1@dc1”)
• user - user running the application (“edgar”)
• application - application ID derived from job configuration:
    uses “batch.desc” property if set
    otherwise parses a consistent ID from “mapred.job.name”
• timestamp - inverted (Long.MAX_VALUE - value) value of submission time
• jobID - stored as Job Tracker start time (long), concatenated with job sequence number
    job_201306271100_0001 -> [1372352073732L][1L]
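The row key layout above can be sketched in a few lines. This is a hedged illustration, not hRaven's implementation: the function names are hypothetical, the "!" separator and the `Long.MAX_VALUE - value` inversion come from the slide, and the job ID encoding simply parses the digits of the tracker identifier as a long (the slide's example shows an epoch-millisecond start time instead).

```python
import struct

LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE
SEP = b"!"            # separator shown in the slide's row keys


def invert(value: int) -> int:
    """Invert a timestamp so that newer values sort first in byte order."""
    return LONG_MAX - value


def encode_job_id(job_id: str) -> bytes:
    """Pack 'job_<tracker>_<sequence>' as two big-endian 8-byte longs.

    Assumption: the tracker digits are parsed directly as a long.
    """
    prefix, tracker, sequence = job_id.split("_")
    if prefix != "job":
        raise ValueError(f"not a job ID: {job_id!r}")
    return struct.pack(">qq", int(tracker), int(sequence))


def job_history_key(cluster: str, user: str, application: str,
                    submit_millis: int, job_id: str) -> bytes:
    """Build cluster!user!application!timestamp!jobID as raw bytes.

    The packed longs may themselves contain the separator byte, so a real
    parser would rely on the fixed widths rather than splitting on '!'.
    """
    return SEP.join([
        cluster.encode("utf-8"),
        user.encode("utf-8"),
        application.encode("utf-8"),
        struct.pack(">q", invert(submit_millis)),
        encode_job_id(job_id),
    ])


older = job_history_key("cluster1@dc1", "edgar", "wordcount",
                        1369585634000, "job_201306271100_0001")
newer = job_history_key("cluster1@dc1", "edgar", "wordcount",
                        1372263813000, "job_201306271100_0002")
# The more recent submission compares lower, so a scan returns it first.
newest_first = sorted([older, newer])
```

Inverting the timestamp is what makes "most recent flow is ordered first" fall out of an ordinary ascending scan over the application's key prefix.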

Page 25: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: job_history_task

Row key: cluster!user!application!timestamp!jobID!taskID

• same components as job_history key (same ordering)
• taskID - (e.g. “m_00001”) uniquely identifies individual task/attempt in job

Two row types:
• Task - “meta” row
    cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001
• Task Attempt - individual execution on a Task Tracker
    cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001_1
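One consequence of this key design is that a task's "meta" row sorts immediately before its attempt rows, because the task ID is a prefix of each attempt ID. The sketch below illustrates this with the taskID suffix only; `row_type` and `parent_task` are hypothetical helpers, and the IDs follow the shape of the slide's examples.

```python
def row_type(task_id: str) -> str:
    """Classify a taskID suffix: 'm_00001' is a task, 'm_00001_1' an attempt."""
    parts = task_id.split("_")
    if len(parts) == 2:
        return "task"
    if len(parts) == 3:
        return "attempt"
    raise ValueError(f"unexpected task ID shape: {task_id!r}")


def parent_task(task_id: str) -> str:
    """Return the task a row belongs to (a task row is its own parent)."""
    return "_".join(task_id.split("_")[:2])


# Unordered suffixes as they might arrive from a loader.
suffixes = ["m_00002", "m_00001_1", "m_00001", "m_00001_0", "m_00002_0"]

# Plain lexical ordering groups each task with its attempts.
ordered = sorted(suffixes)
```

After sorting, each task row is followed directly by its own attempts, so a single prefix scan retrieves a task together with every execution of it.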

Page 26: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: job_history_app_version

Row key: cluster!user!application

Example: cluster1@dc1!edgar!wordcount

Columns:
    v1=1369585634000
    v2=1372263813000

Page 27: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 3: Querying hRaven

• Using Pig’s HBaseStorage (or direct HBase APIs)
• Through Client API
• Through REST API

Page 28: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 4: Current Uses

Page 29: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 4: Current Uses

• Pig reducer optimizations
• Cluster utilization / capacity planning
• Application performance trending over time
• Identifying common job anti-patterns
• Ad-hoc analysis for troubleshooting cluster problems

Page 30: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 4: Cluster reads-writes

Page 31: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 4: Pool / Application reads/writes

Pool view
• Spike in File size read
• Indicates jobs spilling

Application view
• Spike in HDFS size read
• Indicates spiking input

Page 32: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 4: Pool usage: Used vs. Allocated

Page 33: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Chapter 4: Compute cost

Page 34: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Appendix: Future Work

Page 35: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Appendix: Future Work

• Real-time data loading from Job Tracker / Application Master
• Full flow-centric UI (Job Tracker UI replacement)
• Hadoop 2.0 compatibility (in-progress)
• Ambrose integration

Page 36: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Additional Resources

• hRaven on GitHub: https://github.com/twitter/hraven
• hRaven Mailing [email protected]
    [email protected]

Page 37: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Afterword

Now will thou drop your job data on the floor?
Quoth the hRaven, 'Nevermore.'

Page 38: A Birds-Eye View of Pig and Scalding Jobs with hRaven

#TheEnd

@gario and @joep

Come visit us at booth #26 to continue the story

Page 39: A Birds-Eye View of Pig and Scalding Jobs with hRaven

Sort order: variable-length job_id

Desired order:
job_201306271100_9999
job_201306271100_10000
...
job_201306271100_99999
job_201306271100_100000
...
job_201306271100_999999
job_201306271100_1000000

Lexical order:
job_201306271100_10000
job_201306271100_100000
job_201306271100_1000000
job_201306271100_9999
job_201306271100_99999
job_201306271100_999999
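The two orderings on this slide are easy to reproduce. The sketch below sorts the slide's job IDs once as plain strings and once by a fixed-width binary encoding of the tracker identifier and sequence number; the encoding function is a hypothetical stand-in for the job ID storage described earlier in the deck, not hRaven's own code.

```python
import struct

job_ids = [
    "job_201306271100_9999",
    "job_201306271100_10000",
    "job_201306271100_99999",
    "job_201306271100_100000",
    "job_201306271100_999999",
    "job_201306271100_1000000",
]


def encode_job_id(job_id: str) -> bytes:
    """Pack the tracker digits and sequence number as two big-endian longs.

    Fixed-width big-endian encoding of non-negative values makes byte order
    agree with numeric order, whatever the number of decimal digits.
    """
    _, tracker, sequence = job_id.split("_")
    return struct.pack(">qq", int(tracker), int(sequence))


# Sorting the raw strings compares character by character, so
# "..._10000" lands before "..._9999".
lexical_order = sorted(job_ids)

# Sorting by the encoded bytes restores the numeric sequence.
encoded_order = sorted(job_ids, key=encode_job_id)
```

Zero-padding the sequence number would also work for string keys, but only up to a width chosen in advance; a fixed 8-byte long avoids having to pick that width.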