A Bird's-Eye View of Pig and Scalding Jobs with hRaven
DESCRIPTION
As Twitter's use of MapReduce rapidly expands, tracking usage on our clusters grows correspondingly more difficult. With an ever-increasing job load and a reliance on higher-level abstractions such as Pig and Scalding, the utility of existing tools for viewing job history decreases rapidly, and extracting insights becomes a challenge. At Twitter, we created hRaven to fill this gap. hRaven archives the full history and metrics from all MapReduce jobs on our clusters, and strings together each job from a Pig or Scalding script execution into a combined flow. From this archive, we can easily derive aggregate resource utilization by user, pool, or application, while the historical trending of an individual application allows us to perform runtime optimization of resource scheduling. We will cover how hRaven provides a rich historical archive of MapReduce job execution, and how the data is structured into higher-level flows representing the job sequence for frameworks such as Pig, Scalding, and Hive. We will then explore how we mine hRaven data to account for Hadoop resource utilization, to optimize runtime scheduling, and to identify common anti-patterns in user jobs. Finally, we will look at the end-user experience, including Ambrose integration for flow visualization.
TRANSCRIPT
A Bird’s-Eye View of Pig and Scalding with hRaven
a tale by @gario and @joep
Hadoop Summit 2013
v1.2
@Twitter#HadoopSummit2013 2
About the authors
• Apache HBase PMC member and Committer
• Software Engineer @ Twitter
• Core Storage Team - Hadoop/HBase

• Software Engineer @ Twitter
• Engineering Manager, Hadoop/HBase team @ Twitter
Table of Contents
• Chapter 1: The Problem
• Chapter 2: Why hRaven?
• Chapter 3: How Does it Work?
  • 3a: Loading
  • 3b: Table structure / querying
• Chapter 4: Current Uses
• Appendix: Future Work
Chapter 1: The Problem
Illustration by Sirxlem (CC BY-NC-ND 3.0)
Chapter 1: Mismatched Abstractions
• Most users run Pig and Scalding scripts, not straight MapReduce
• JobTracker UI shows jobs, not DAGs of jobs generated by Pig and Scalding
Chapter 1: A Problem of Scale
Chapter 1: Questions
• How many Pig versus Scalding jobs do we run?
• What cluster capacity do jobs in my pool take?
• How many jobs do we run each day?
• What % of jobs have > 30k tasks?
• Why do I need to hand-tune these (hundreds of) jobs? Can’t the cluster learn?
#Nevermore
Chapter 2: Why hRaven?
Photo by DAVID ILIFF. License: CC-BY-SA 3.0
Chapter 2: Why hRaven?
• Stores stats, configuration and timing for every MapReduce job on every cluster
• Structured around the full DAG of jobs from a Pig or Scalding application
• Easily queryable for historical trending
• Allows for Pig reducer optimization based on historical run stats
• Keeps data online forever (12.6M jobs, 4.5B tasks + attempts)
Chapter 2: Key Concepts
• cluster - each cluster has a unique name mapping to the Job Tracker
• user - MapReduce jobs are run as a given user
• application - a Pig or Scalding script (or plain MapReduce job)
• flow - the combined DAG of jobs executed from a single run of an application
• version - changes impacting the DAG are recorded as a new version of the same application
Chapter 2: Application Flows
Edgar
Chapter 2: Flow Storage
• All jobs in a flow are ordered together
• Most recent flow is ordered first
Chapter 2: Key Features
• All jobs in a flow are ordered together
• Per-job metrics stored
  • Total map and reduce tasks
  • HDFS bytes read / written
  • File bytes read / written
  • Total map and reduce slot milliseconds
• Easy to aggregate stats for an entire flow
• Easy to scan the timeseries of each application’s flows
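Because every job in a flow carries the same set of counters, a flow-level total is just a field-by-field sum. The following is a minimal sketch of that idea, not hRaven's own code: the `JobMetrics` class, its field names, and the sample numbers are all hypothetical, chosen only to mirror the per-job metrics listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class JobMetrics:
    """Per-job counters mirroring the metrics listed above (hypothetical field names)."""
    map_tasks: int = 0
    reduce_tasks: int = 0
    hdfs_bytes_read: int = 0
    hdfs_bytes_written: int = 0
    file_bytes_read: int = 0
    file_bytes_written: int = 0
    map_slot_millis: int = 0
    reduce_slot_millis: int = 0


def aggregate_flow(jobs):
    """Sum every counter across the jobs that make up one flow."""
    total = JobMetrics()
    for job in jobs:
        for f in fields(JobMetrics):
            setattr(total, f.name, getattr(total, f.name) + getattr(job, f.name))
    return total


# Two made-up jobs belonging to the same flow.
flow = [
    JobMetrics(map_tasks=120, reduce_tasks=10, hdfs_bytes_read=5_000_000, map_slot_millis=900_000),
    JobMetrics(map_tasks=30, reduce_tasks=5, hdfs_bytes_read=1_000_000, map_slot_millis=200_000),
]
flow_total = aggregate_flow(flow)
```

Since the jobs of a flow are stored next to each other, one contiguous scan would be enough to collect the inputs for a sum like this.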
Chapter 3: How Does it Work?
Chapter 3: ETL - Step 1: JobFilePreprocessor
Chapter 3: ETL - Step 2: JobFileRawLoader
Chapter 3: ETL - Step 3: JobFileProcessor
Jobs finish out of order with respect to job_id
Chapter 3: Tables
• job_history_raw
• job_history
• job_history_task
• job_history_app_version
Chapter 3: job_history_raw
Row key: cluster!jobID
Columns:
• jobconf - stores the serialized raw job_*_conf.xml file
• jobhistory - stores the serialized raw job history log file
• job_processed_success - indicates whether the job has been processed
Chapter 3: job_history
Row key: cluster!user!application!timestamp!jobID
• cluster - unique cluster name (e.g. “cluster1@dc1”)
• user - user running the application (“edgar”)
• application - application ID derived from job configuration:
  • uses “batch.desc” property if set
  • otherwise parses a consistent ID from “mapred.job.name”
• timestamp - submission time, stored inverted (Long.MAX_VALUE - value)
• jobID - stored as Job Tracker start time (long), concatenated with job sequence number
  • job_201306271100_0001 -> [1372352073732L][1L]
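The effect of the inverted timestamp is easiest to see by building two keys and sorting them the way HBase orders rows, by ascending bytes. This is only a sketch under stated assumptions: `job_history_key` and its helpers are hypothetical names, joining the parts with a literal `!` byte is an illustrative layout rather than hRaven's actual serialization, and the submission times are made up.

```python
import struct

LONG_MAX = 2**63 - 1  # value of Java's Long.MAX_VALUE


def invert(timestamp_ms):
    """Inverted timestamp: later submissions produce smaller values."""
    return LONG_MAX - timestamp_ms


def encode_long(value):
    """8-byte big-endian encoding; for non-negative values byte order equals numeric order."""
    return struct.pack(">q", value)


def job_history_key(cluster, user, app, submit_ms, jt_start_ms, job_seq):
    """Hypothetical key builder: text parts and binary longs joined with '!'."""
    return b"!".join([
        cluster.encode(),
        user.encode(),
        app.encode(),
        encode_long(invert(submit_ms)),
        encode_long(jt_start_ms) + encode_long(job_seq),
    ])


# Two runs of the same application, one hour apart (made-up submission times).
older = job_history_key("cluster1@dc1", "edgar", "wordcount", 1372352100000, 1372352073732, 1)
newer = job_history_key("cluster1@dc1", "edgar", "wordcount", 1372355700000, 1372352073732, 2)

# Rows come back in ascending byte order, so the newer run is first.
scan_order = sorted([older, newer])
```

With this layout, a scan starting at the `cluster!user!application` prefix meets the most recent flow first, which is the ordering the key design is after.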
Chapter 3: job_history_task
Row key: cluster!user!application!timestamp!jobID!taskID
• same components as job_history key (same ordering)
• taskID - (e.g. “m_00001”) uniquely identifies individual task/attempt in job
Two row types:
• Task - “meta” row
  cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001
• Task Attempt - individual execution on a Task Tracker
  cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001_1
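The two row types can be told apart from the trailing taskID alone. The sketch below assumes only the two shapes shown in the example keys ("m_00001" for a task, "m_00001_1" for an attempt) and treats the key as plain `!`-delimited text; both function names are hypothetical, and the sample keys are the abbreviated strings from the slide, elisions included.

```python
def split_task_key(row_key):
    """Split a '!'-delimited key into (job-level prefix, taskID)."""
    prefix, _, task_id = row_key.rpartition("!")
    return prefix, task_id


def task_row_type(task_id):
    """Classify a taskID: 'task' for the meta row, 'attempt' for one execution.

    Assumes an attempt ID is the task ID plus an extra "_<attempt number>" suffix.
    """
    parts = task_id.split("_")
    if len(parts) == 2:
        return "task"
    if len(parts) == 3:
        return "attempt"
    raise ValueError(f"unexpected taskID: {task_id!r}")


# Abbreviated keys copied from the slide; the "..." parts are left as written.
task_key = "cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001"
attempt_key = "cluster1@dc1!edgar!wordcount!9654...!...[00001]!m_00001_1"

task_prefix, task_id = split_task_key(task_key)
attempt_prefix, attempt_id = split_task_key(attempt_key)
```

Because the task key is a strict prefix of its attempt keys, the meta row sorts immediately before the attempts it describes.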
Chapter 3: job_history_app_version
Row key: cluster!user!application
Example: cluster1@dc1!edgar!wordcount
Columns:
v1=1369585634000
v2=1372263813000
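One use for a row like this is working out which version of an application a given run belongs to. The sketch below rests on an assumption the slide does not spell out: that each column value is the time the version was first seen. `version_at` is a hypothetical helper, not part of any hRaven API.

```python
from bisect import bisect_right


def version_at(versions, run_ms):
    """Return the version in effect at run_ms, or None if the run predates all versions.

    `versions` maps version name -> timestamp, as in the columns above.
    Assumption: each timestamp marks when that version was first seen.
    """
    ordered = sorted(versions.items(), key=lambda item: item[1])
    timestamps = [ts for _, ts in ordered]
    index = bisect_right(timestamps, run_ms)
    return ordered[index - 1][0] if index else None


# Column values from the example row above.
app_versions = {"v1": 1369585634000, "v2": 1372263813000}

between = version_at(app_versions, 1370000000000)   # after v1, before v2
at_v2 = version_at(app_versions, 1372263813000)     # exactly when v2 appears
too_early = version_at(app_versions, 1)             # before any version
```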
Chapter 3: Querying hRaven
• Using Pig’s HBaseStorage (or direct HBase APIs)
• Through Client API
• Through REST API
Chapter 4: Current Uses
Chapter 4: Current Uses
• Pig reducer optimizations
• Cluster utilization / capacity planning
• Application performance trending over time
• Identifying common job anti-patterns
• Ad-hoc analysis for troubleshooting cluster problems
Chapter 4: Cluster reads-writes
Chapter 4: Pool / Application reads/writes
• Pool view
  • Spike in File size read
  • Indicates jobs spilling
• Application view
  • Spike in HDFS size read
  • Indicates spiking input
Chapter 4: Pool usage: Used vs. Allocated
Chapter 4: Compute cost
Appendix: Future Work
Appendix: Future Work
• Real-time data loading from Job Tracker / Application Master
• Full flow-centric UI (Job Tracker UI replacement)
• Hadoop 2.0 compatibility (in-progress)
• Ambrose integration
Additional Resources
• hRaven on Github: https://github.com/twitter/hraven
• hRaven Mailing List: [email protected]
Afterword
Now will thou drop your job data on the floor?
Quoth the hRaven, 'Nevermore.'
#TheEnd
@gario and @joep
Come visit us at booth #26 to continue the story
Sort order
Variable length job_id
• Desired order
  job_201306271100_9999
  job_201306271100_10000
  ...
  job_201306271100_99999
  job_201306271100_100000
  ...
  job_201306271100_999999
  job_201306271100_1000000
• Lexical order
  job_201306271100_10000
  job_201306271100_100000
  job_201306271100_1000000
  job_201306271100_9999
  job_201306271100_99999
  job_201306271100_999999
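The two orderings can be reproduced in a few lines. This sketch compares plain string sorting with sorting on a fixed-width binary form of the ID. As a simplification, it parses the digits of the ID directly as integers instead of using the Job Tracker start time as a long, and `fixed_width_key` is a hypothetical helper name.

```python
import struct

# Job IDs from the slide, listed in the desired (numeric) order.
job_ids = [
    "job_201306271100_9999",
    "job_201306271100_10000",
    "job_201306271100_99999",
    "job_201306271100_100000",
    "job_201306271100_999999",
    "job_201306271100_1000000",
]


def fixed_width_key(job_id):
    """Encode the epoch and sequence parts as two 8-byte big-endian longs."""
    _, epoch, sequence = job_id.split("_")
    return struct.pack(">qq", int(epoch), int(sequence))


lexical = sorted(job_ids)                       # plain string comparison
numeric = sorted(job_ids, key=fixed_width_key)  # byte comparison of fixed-width keys
```

Here `lexical` comes out in the "Lexical order" shown above, while `numeric` matches the desired order, because a fixed-width encoding never lets a shorter number sort after a longer one by accident.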