“Data! Data! Data! I Can’t Make Bricks Without Clay!”*
Shai Fine
Principal Engineer, Advanced Analytics, Intel
(*) Sherlock Holmes, The Adventure of the Copper Beeches
Analytics to the Rescue
• “Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway”
  • Geoffrey Moore, author of Crossing the Chasm
• … and who will lead the way?!
  • “Big Data's High-Priests of Algorithms”, The Wall Street Journal, Aug. 2014
Adoption of Analytics Faces Hurdles
• Developing Analytics solutions
  • Far from being an engineering process
  • There is a chasm to cross between “traditional” BI and Advanced Analytics
• Consumability of Analytics
  • Deploying Analytics solutions is difficult
  • Reliability, “Self Maintenance”
• Analytics Workloads are Challenging
  • Speed (latency, time-to-solution), Throughput, Scalability, …
The ML Building Blocks Concept
There is an “infinite” number of algorithms and datasets
But there is a finite set of Building Blocks
Building Blocks: a finite set of elements that can be mapped into HW and SW primitives and patterns
Building Blocks
[Figure: stack diagram with layers labeled Usages, High-level Libraries, Low-level Libraries, Hardware Platforms]
• Usages: Tier-1, Cloud, HPC, Enterprise, Academia
• Hardware Platforms: Xeon, Xeon Phi, Xeon FPGA, Iris Pro Graphics, Xeon Accel., New ISA
Machine Learning Building Blocks
• ML basic building blocks
  1. Linear Algebra
  2. Measures
  3. Special Functions
  4. Mathematical Optimization
  5. Data Characteristics
  6. Data-dependent Compute
  7. Memory Access
  8. Very large models
  9. Hybrid Methods
• ML Meta building blocks
  1. Learning Protocols
  2. Learning Phases
  3. Algorithmic Flow and Structure
[Figure labels grouping the blocks: Compute; Data; Compute - Data Interplay; Process]
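The idea that many algorithms reduce to a small, fixed set of building blocks can be sketched in a few lines of Python. The nine block names below come from the list above; the assignment of blocks to each algorithm is a hypothetical illustration, not a classification taken from the talk.

```python
from enum import Enum


class Block(Enum):
    """The nine basic ML building blocks named on the slide."""
    LINEAR_ALGEBRA = "Linear Algebra"
    MEASURES = "Measures"
    SPECIAL_FUNCTIONS = "Special Functions"
    MATH_OPTIMIZATION = "Mathematical Optimization"
    DATA_CHARACTERISTICS = "Data Characteristics"
    DATA_DEPENDENT_COMPUTE = "Data-dependent Compute"
    MEMORY_ACCESS = "Memory Access"
    VERY_LARGE_MODELS = "Very large models"
    HYBRID_METHODS = "Hybrid Methods"


# ASSUMPTION: illustrative tagging only; the talk does not give this mapping.
ALGORITHM_BLOCKS = {
    "k-means": {Block.LINEAR_ALGEBRA, Block.MEASURES},
    "logistic regression": {
        Block.LINEAR_ALGEBRA,
        Block.SPECIAL_FUNCTIONS,
        Block.MATH_OPTIMIZATION,
    },
    "decision tree": {
        Block.MEASURES,
        Block.DATA_DEPENDENT_COMPUTE,
        Block.MEMORY_ACCESS,
    },
}


def blocks_used(algorithms):
    """Union of the building blocks exercised by the named algorithms."""
    used = set()
    for name in algorithms:
        used |= ALGORITHM_BLOCKS[name]
    return used


def uncovered_blocks(algorithms):
    """Building blocks that none of the named algorithms exercises."""
    return set(Block) - blocks_used(algorithms)


if __name__ == "__main__":
    missing = uncovered_blocks(ALGORITHM_BLOCKS)
    print(sorted(block.value for block in missing))
```

Adding a new algorithm only adds a dictionary entry; the set of blocks stays fixed, which is the property that lets the blocks be mapped once onto hardware and software primitives.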
Towards a Comprehensive ML Workload Suite
• Workload design should cover elements of
  • Compute
  • Data Characteristics
  • Data – Compute interplay
• Each workload includes
  • Multiple data sets x Multiple algorithms
  • Coverage of relevant data characteristics
  • Coverage of compute patterns
The Building Block concept provides a means for designing the ML Workload Suite
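A workload defined as "multiple data sets x multiple algorithms" is a cross product, and coverage can be checked mechanically. The following is a minimal sketch under stated assumptions: the `Dataset` and `Algorithm` types, the placeholder algorithm names, and the required-coverage sets are all hypothetical and only illustrate the design rule.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Dataset:
    name: str
    characteristics: frozenset  # e.g. {"dense", "numeric"}


@dataclass(frozen=True)
class Algorithm:
    name: str
    compute_patterns: frozenset  # e.g. {"linear algebra"}


def build_workload(datasets, algorithms):
    """Every (dataset, algorithm) pairing: data sets x algorithms."""
    return list(product(datasets, algorithms))


def coverage_gaps(workload, required_characteristics, required_patterns):
    """Return (missing data characteristics, missing compute patterns)."""
    seen_chars, seen_patterns = set(), set()
    for dataset, algorithm in workload:
        seen_chars |= dataset.characteristics
        seen_patterns |= algorithm.compute_patterns
    return (
        set(required_characteristics) - seen_chars,
        set(required_patterns) - seen_patterns,
    )


if __name__ == "__main__":
    datasets = [
        Dataset("clustered", frozenset({"dense", "numeric"})),
        Dataset("graphs", frozenset({"sparse", "numeric"})),
    ]
    # ASSUMPTION: placeholder algorithms, not taken from the talk.
    algorithms = [
        Algorithm("algo_a", frozenset({"linear algebra"})),
        Algorithm("algo_b", frozenset({"linear algebra", "measures"})),
    ]
    workload = build_workload(datasets, algorithms)
    gaps = coverage_gaps(
        workload,
        required_characteristics={"dense", "sparse", "numeric", "categorical"},
        required_patterns={"linear algebra", "measures"},
    )
    print(len(workload), gaps)
```

A non-empty gap set signals that the workload needs another data set or algorithm before it covers the intended characteristics and patterns.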
Machine Learning Workloads Suite
ALGORITHMS
Columns: Workload; Linear Algebra; Measure Calc.; Special Funcs; Math Optim.; Data Characteristics; Data-dep. Compute; Mem. Access; Large Model
• Linear Algebra (Sparse, Dense): X X X; Un/Supervised, Numeric
• Data Dependency: X X X; Un/Supervised, Num/Cat; X X
• Large Models: X X X; Un/Supervised, Numeric; X
DATASETS
Columns: Workload; Dataset Type; Characteristics
• Linear Algebra: Clustered (Dense, Numeric); Graphs (Sparse, Numeric)
• Data Dependency: Bioinformatics (High Dep - Dense/Sparse Clustered Dense); Text (High Dep - Sparse); Manufacturing (High Dep - Numeric, Dense)
• Large Models: Images (Dense, Numeric)
ML Bench 1.0
• Algorithm X Data
• Reference Models
• Data Generator
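The talk does not describe how the ML Bench data generator works. As a rough illustration of what such a component might do, the sketch below produces two of the data characteristics listed in the workload suite (clustered dense numeric data and sparse numeric data) using only the Python standard library. The function names, parameters, and value ranges are assumptions made for this example.

```python
import random


def clustered_dense(n_points, n_features, n_clusters, spread=0.5, seed=0):
    """Dense numeric rows drawn around randomly placed cluster centers."""
    rng = random.Random(seed)
    centers = [
        [rng.uniform(-10.0, 10.0) for _ in range(n_features)]
        for _ in range(n_clusters)
    ]
    rows = []
    for i in range(n_points):
        center = centers[i % n_clusters]  # round-robin cluster assignment
        rows.append([rng.gauss(mu, spread) for mu in center])
    return rows


def sparse_numeric(n_rows, n_cols, density, seed=0):
    """Sparse numeric rows as {column index: value} dictionaries.

    Each cell is kept with probability `density`; absent cells are
    implicit zeros.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        row = {
            j: rng.uniform(1.0, 2.0)
            for j in range(n_cols)
            if rng.random() < density
        }
        rows.append(row)
    return rows


if __name__ == "__main__":
    dense = clustered_dense(n_points=6, n_features=3, n_clusters=2)
    sparse = sparse_numeric(n_rows=4, n_cols=10, density=0.2)
    print(len(dense), len(dense[0]), [len(row) for row in sparse])
```

Fixing the seed makes every generated data set reproducible, which matters when the same data is paired with several algorithms and reference models.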
The “Dwarfs” Connection
• Phil Colella’s “Seven Dwarfs” (2004)
  • Patterns of computation and communication that are important for science and engineering
• Berkeley’s view (2006)
  • Extended to 13 Dwarfs after examining the original 7 Dwarfs outside the HPC scope
• US National Research Council’s Committee, “Frontiers in Massive Data Analysis” (2013)
  • Chapter 10: “The Seven Computational Giants of Massive Data Analysis”
• The ML Building Blocks provide a further extension and a different perspective
  • Introducing data characteristics and the interplay with compute, communication, memory