Optimization with Big Data



Page 1: Optimization with Big Data

Peter Richtarik, School of Mathematics

Optimization with Big Data = Extreme* Mountain Climbing

* in a billion dimensional space on a foggy day

Page 2: Optimization with Big Data

BIG DATA

Sources
• digital images & videos
• transaction records
• government records
• health records
• defence
• internet activity (social media, wikipedia, ...)
• scientific measurements (physics, climate models, ...)

BIG Volume, BIG Velocity, BIG Variety

Page 3: Optimization with Big Data

• Western General Hospital (Creutzfeldt-Jakob Disease)
• Arup (Truss Topology Design)
• Ministry of Defence dstl lab (Algorithms for Data Simplicity)
• Royal Observatory (Optimal Planet Growth)

Page 4: Optimization with Big Data

GOD’S Algorithm = Teleportation

Page 5: Optimization with Big Data

If you are not a God...

[Figure labels: x0, x1, x2, x3]

Page 6: Optimization with Big Data

Optimization as Lock Breaking

Setup: Combination maximizing F opens the lock

x = (x1, x2, x3, x4)

F(x) = F(x1, x2, x3, x4): a number representing the “quality” of a combination

Optimization Problem: Find combination maximizing F
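
The lock picture can be made concrete with a minimal sketch. Everything specific here is an assumption for illustration: the four dials have 10 positions each, and the quality function F and the SECRET combination are hypothetical. The sketch simply tries every combination, which is feasible only because the lock is tiny.

```python
import itertools

# Hypothetical secret combination; F is largest exactly at this point.
SECRET = (3, 1, 4, 1)

def F(x):
    """Quality of combination x: negative squared distance to SECRET."""
    return -sum((xi - si) ** 2 for xi, si in zip(x, SECRET))

def brute_force(num_dials=4, positions=10):
    """Evaluate F on every combination and return the best one."""
    all_combinations = itertools.product(range(positions), repeat=num_dials)
    return max(all_combinations, key=F)

best = brute_force()
print(best, F(best))
```

With 4 dials this costs 10^4 evaluations of F. The count grows exponentially in the number of dials, which is why exhaustive search is not an option for large problems.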

Page 7: Optimization with Big Data

Optimization Algorithm

Page 8: Optimization with Big Data

How to Open a Lock with Billion Interconnected Dials?

F : R^n → R

# variables/dials = n = 10^9

Assumption: F = F1 + F2 + ... + Fn

Fj depends on the neighbours of xj only

Example: F1 depends on x1, x2, x3 and x4; F2 depends on x1 and x2, ...

[Figure labels: dials x1, x2, x3, x4, ..., xn]
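
One way to read the assumption above: when F is a sum of terms that each involve only a few dials, the effect of turning a single dial can be computed from the few terms that involve it, without re-evaluating all of F. The sketch below assumes a hypothetical chain structure (each Fj couples xj with the next dial, wrapping around) and a hypothetical term formula; neither comes from the slides.

```python
def make_terms(n):
    """Return one (indices, function) pair per term F_j.

    Hypothetical structure: F_j depends on x_j and x_{(j+1) mod n} only.
    """
    terms = []
    for j in range(n):
        k = (j + 1) % n
        terms.append(((j, k), lambda a, b: (a - b) ** 2 + a ** 2))
    return terms

def F(x, terms):
    """Full objective: the sum of all terms (cost grows with n)."""
    return sum(f(*(x[i] for i in idx)) for idx, f in terms)

def terms_touching(terms, n):
    """For each dial i, list the positions of the terms that depend on x_i."""
    touching = [[] for _ in range(n)]
    for t, (idx, _) in enumerate(terms):
        for i in set(idx):
            touching[i].append(t)
    return touching

def delta_F(x, j, new_value, terms, touching):
    """Change in F if x[j] were set to new_value, using only nearby terms."""
    def local_sum():
        return sum(terms[t][1](*(x[i] for i in terms[t][0])) for t in touching[j])

    old_value = x[j]
    before = local_sum()
    x[j] = new_value
    after = local_sum()
    x[j] = old_value  # restore, so the caller's x is left unchanged
    return after - before

n = 6
x = [1, 2, 3, 4, 5, 6]
terms = make_terms(n)
touching = terms_touching(terms, n)
print(delta_F(x, 2, 10, terms, touching))
```

In this chain example each dial appears in only two terms, so the cost of evaluating one move does not depend on n. That locality is what makes coordinate-wise updates cheap even when n is very large.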

Page 9: Optimization with Big Data

Optimization Methods

Computing Architectures
• Multicore CPUs
• GPGPU accelerators
• Clusters / Clouds

Optimization Methods
• Effectivity
• Efficiency
• Scalability
• Parallelism
• Distribution
• Asynchronicity
• Randomization

Page 10: Optimization with Big Data

Optimization Methods for Big Data

• Randomized Coordinate Descent
  – P. R. and M. Takac: Parallel coordinate descent methods for big data optimization, ArXiv:1212.0873 [can solve a problem with 1 billion variables in 2 hours using 24 processors]
• Stochastic (Sub) Gradient Descent
  – P. R. and M. Takac: Randomized lock-free methods for minimizing partially separable convex functions [can be applied to optimize an unknown function]
• Both of the above
  – M. Takac, A. Bijral, P. R. and N. Srebro: Mini-batch primal and dual methods for SVMs, ArXiv:1302.xxxx
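
To give a feel for the first family of methods, here is a minimal serial sketch of randomized coordinate descent. It is not the parallel algorithm from the papers listed above. It assumes a small convex quadratic f(x) = 1/2 x^T A x - b^T x with a symmetric positive definite matrix A (the matrix and vector below are made up), picks one coordinate uniformly at random per step, and minimizes f exactly along that coordinate.

```python
import random

def randomized_coordinate_descent(A, b, iters=5000, seed=0):
    """Minimize 0.5 * x^T A x - b^T x by updating one random coordinate per step.

    Assumes A is symmetric positive definite, given as a list of rows.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        i = rng.randrange(n)  # pick a dial uniformly at random
        # Partial derivative of f with respect to x_i: (A x - b)_i
        grad_i = sum(A[i][k] * x[k] for k in range(n)) - b[i]
        # Exact minimization along coordinate i
        x[i] -= grad_i / A[i][i]
    return x

# Hypothetical test problem (A is symmetric and diagonally dominant).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

x = randomized_coordinate_descent(A, b)
# At the minimizer, A x = b, so the residual should be close to zero.
residual = max(abs(sum(A[i][k] * x[k] for k in range(3)) - b[i]) for i in range(3))
print(x, residual)
```

Each step reads only row i of A. When that row is sparse, as in the "neighbours only" assumption, a step is cheap, and steps on coordinates that do not interact are natural candidates for running in parallel.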

Page 11: Optimization with Big Data

Theory vs Reality

Page 12: Optimization with Big Data

Parallel Coordinate Descent

[Figure labels: start, settle for this, holy grail]

Page 13: Optimization with Big Data

TOOLS
• Probability
• Machine Learning
• Matrix Theory
• HPC

Page 14: Optimization with Big Data