TRANSCRIPT
NEARING THE EVENT HORIZON: HADOOP WAS PREDICTABLE, WHAT’S NEXT?
Mike Miller, [email protected]
@mlmilleratmit, May 23, 2012
Mike Miller, GlueCon May 2012
What I Am
Cloudant Founder, Chief Scientist (we’re hiring at all positions)
Affiliate Assistant Professor, Particle Physics (UW)
Background: machine learning, analysis, big data, globally distributed systems
What I Am
A CDN for your Application Data
What I Am Not
Didn’t see these coming:
• Superluminal neutrinos
• Red Sox epic collapse in September
• Red Wings losing in the first round
...
But here I go anyway
My First Postulate of Big-Data
What matters for Google... matters for the internet... and therefore matters for the enterprise... will therefore be re-architected by Apache... and therefore matters to you.
Google Matters
Evidence
Business Week, 12/24/2007
The Old Canon
• Google File System (the important one): http://labs.google.com/papers/gfs.html
• MapReduce (the big one): http://labs.google.com/papers/mapreduce.html
• BigTable (clone me!): http://labs.google.com/papers/bigtable.html
• Dynamo (OK, it’s AWS, but masterless quorum): http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
copy these. use these. print $$$
MapReduce: The Awesome
• Approachable interface: “What do I do with a single piece of data?”
• Data Parallel: developers can basically forget about scatter-gather
• Fault Tolerant: failure at scale is the norm! Protects both user and system operator
• IO Optimized: built for sequential IO; commodity disks spinning forward at O(20 MB/sec) each
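The approachable interface above can be sketched with a toy, single-machine word count. This is an illustrative Python sketch of the MapReduce programming model, not Hadoop’s or Google’s API: the developer writes only `map_fn` and `reduce_fn`, and the framework (stubbed here by `run_mapreduce`) handles the shuffle between them.

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # One input record in -> zero or more (key, value) pairs out.
    for word in line.split():
        yield (word.lower(), 1)

def reduce_fn(key, values):
    # All values for one key in -> one aggregated result out.
    return (key, sum(values))

def run_mapreduce(records, map_fn, reduce_fn):
    # Stand-in for the framework's distributed shuffle/sort phase.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(r) for r in records):
        groups[key].append(value)
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))

counts = run_mapreduce(["the quick fox", "the lazy dog"], map_fn, reduce_fn)
# counts == {"dog": 1, "fox": 1, "lazy": 1, "quick": 1, "the": 2}
```

Note that scatter-gather, retries, and data placement live entirely inside `run_mapreduce`; that separation is what makes the model both data-parallel and fault-tolerant.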
So... is that it?
http://gigaom.com/cloud/democratizing-big-data-is-hadoop-our-only-hope/
http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop/
http://mackiemathew.com/2012/02/25/the-problems-in-hadoop-when-does-it-fail-to-deliver/
MapReduce: The Not So Awesome
• Hadoop doesn’t power big data applications: not a transactional datastore; data sloshes back and forth via ETL
• Processing latency: non-incremental; must re-slurp the entire dataset on every pass
• Ad-hoc queries: bare-metal interface, data import
• Graphs: only a handful of graph problems are amenable to MR (http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.120)
To the Event Horizon
Enter The New Canon
• Percolator (incremental processing): http://research.google.com/pubs/pub36726.html
• Dremel (ad-hoc analysis queries): http://research.google.com/pubs/pub36632.html
• Pregel (big graphs): http://dl.acm.org/citation.cfm?id=1807184
Scalable, Fault Tolerant, Approachable
Percolator
Percolator: incremental processing
• Replaced MapReduce as the tool used to build the search index
“However, reprocessing the entire web discards the work done in earlier runs and makes latency proportional to the size of the repository, rather than the size of the update.”
• Bigtable alone can’t do it
“BigTable scales... but doesn’t provide tools to help programmers maintain data invariants in the face of concurrent updates.”
• Applicability: incrementally updating data; computational output can be broken down into small pieces; computation large in some dimension (data size, CPU, etc.)
• Does it matter?
“...Converting the indexing system to an incremental system... reduced the average document processing latency by a factor of 100...”
Percolator: incremental processing
• Bigtable plus...
• Multi-row ACID transactions: snapshot isolation, lazy locks, up to 10 s write latencies
• Timestamps: start timestamp (read), commit timestamp (write)
• Notifications: do not maintain invariants
• Observer framework: your code is run upon notification of an update
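The notification/observer pairing above can be sketched in miniature. This is a toy single-process Python sketch, with invented names rather than Google’s API: a write to a watched column enqueues a notification, and registered observer code then runs incrementally against just the changed rows instead of reprocessing the whole repository.

```python
from collections import defaultdict

class ToyPercolator:
    def __init__(self):
        self.table = defaultdict(dict)      # row -> {column: value}
        self.observers = defaultdict(list)  # column -> [callback]
        self.dirty = []                     # pending (row, column) notifications

    def observe(self, column, callback):
        # Register observer code to run when `column` changes.
        self.observers[column].append(callback)

    def write(self, row, column, value):
        self.table[row][column] = value
        self.dirty.append((row, column))    # a notification, not invariant maintenance

    def run_observers(self):
        # Drain notifications; observers may write, triggering further observers.
        while self.dirty:
            row, column = self.dirty.pop(0)
            for cb in self.observers[column]:
                cb(self, row)

db = ToyPercolator()
# Observer: whenever a raw document changes, (re)index only that document.
db.observe("raw", lambda db, row: db.write(row, "index",
                                           db.table[row]["raw"].lower()))
db.write("doc1", "raw", "Hello World")
db.run_observers()
# db.table["doc1"]["index"] == "hello world"
```

The key property matches the paper’s pitch: work done per update is proportional to the size of the update, since only notified rows are revisited.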
Percolator: incremental processing
Near Linear Scaling to 15k Cores
Percolator: incremental processing
Latency lower than MapReduce by 100x
Dremel
Dremel: ad-hoc Query
• Scalable, interactive ad-hoc query system for read-only nested data
“...capable of running aggregation queries over trillion-row tables in seconds.”
• ...on nested data structures in situ: web and scientific data is often non-relational; nested data (protobufs) underlies most structured data at Google
• Usage:
DEFINE TABLE t AS /path/to/data/*
SELECT TOP(signal1, 100), COUNT(*) FROM t
• Applicability: analysis of crawled documents; tracking of install data for apps on Android Market; crash reports; spam analysis; ...
Dream BI Tool
Dremel: ad-hoc Query
• Ingredients: in situ data; SQL-like interface; serving trees for query execution; column-striped data (3-10x); analysis catalogs
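Why column striping buys a 3-10x (and on the next slide, ~10x) speedup can be shown with a toy layout comparison. This Python sketch is illustrative only, not Dremel’s actual storage format: an aggregation over one field in a column-striped layout touches a single contiguous array, while a row-oriented scan must walk every full record.

```python
records = [
    {"signal1": 7, "url": "a.example", "body": "x" * 100},
    {"signal1": 3, "url": "b.example", "body": "y" * 100},
    {"signal1": 9, "url": "c.example", "body": "z" * 100},
]

# Row-oriented layout: records stored whole; a scan touches everything,
# including the large "body" field the query never asked for.
def row_scan_sum(records, field):
    return sum(r[field] for r in records)

# Column-striped layout: one contiguous array per field.
columns = {k: [r[k] for r in records] for k in records[0]}

def column_scan_sum(columns, field):
    # Touches only the bytes of the requested column.
    return sum(columns[field])

assert row_scan_sum(records, "signal1") == column_scan_sum(columns, "signal1") == 19
```

Both scans return the same answer; the difference is IO: the column scan reads three small integers, the row scan drags hundreds of bytes of unrelated fields past the CPU.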
Dremel: ad-hoc Query
Columns ~10x faster than Records
Dremel: ad-hoc Query
Benchmark: MapReduce (via Sawzall) vs. Dremel (via SQL)
Dremel: ad-hoc Query
Dremel ~100x Faster than Stock MR
Significant Optimization Possible
Dremel: ad-hoc Query
Most Production Queries Executed in <10 seconds
Pregel
Pregel: Big Graphs
• Massively parallel processing of big graphs: billions of vertices, trillions of edges
• Bulk synchronous parallel model: a sequence of vertex-oriented iterations; send/receive messages to/from other vertex computations; read/modify the state of the vertex, its outgoing edges, and the graph topology
• Expressive, easy to program: distribution details hidden behind an abstract API
• Iterative: computation continues until each vertex votes to halt
• In production: PageRank in 15 lines of code
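The bulk synchronous parallel model above can be sketched with PageRank, the example Pregel itself is famous for. This is a toy single-machine Python sketch of the vertex-centric superstep loop, with invented names rather than Google’s API: in each superstep every vertex consumes its inbox, updates its value, and sends messages along its outgoing edges, with a barrier between supersteps.

```python
def pregel_pagerank(out_edges, num_supersteps=30, damping=0.85):
    # out_edges: vertex -> list of target vertices.
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(num_supersteps):
        inbox = {v: [] for v in out_edges}
        # "Compute" phase: each vertex sends rank/out-degree to its neighbors.
        for v, targets in out_edges.items():
            for t in targets:
                inbox[t].append(rank[v] / len(targets))
        # Barrier; the next superstep consumes the delivered messages.
        rank = {v: (1 - damping) / n + damping * sum(msgs)
                for v, msgs in inbox.items()}
    return rank

graph = {"a": ["b"], "b": ["c"], "c": ["a"]}
ranks = pregel_pagerank(graph)
# A 3-cycle is symmetric, so each vertex converges to rank 1/3.
```

A real Pregel run distributes the vertices across machines and replaces the dict shuffles with network message passing, but the per-vertex program is the same few lines.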
Pregel: Big Graphs
• Master “name” node: connects processes for messaging
• Message passing: no remote procedures or remote reads
• Graph hashed across nodes: vertex and outgoing edges stored in RAM
• Aggregators: global mechanism for aggregation; all but the final reduce computed on node-local data
• Checkpointing: configurable, enables automatic recovery
Pregel: Big Graphs
Pregel: Big Graphs
Near Linear Scaling to 1B nodes
Learn More
• Incremental processing: incremental, in-database map/reduce in Cloudant’s BigCouch; HBase 0.92 supports observers/coprocessors; stream processing via Storm, HStreaming, etc.
• Ad-hoc query: Google BigQuery; column stores (Vertica, etc.); OpenDremel (stalled?)
• Big graphs: Giraph on Hadoop (Apache Incubator); Golden Orb (stalled?)
Lessons Learned
• Hire Jeff Dean and Sanjay Ghemawat
• GFS enables everything
• There is massive opportunity on the horizon