Tools for fast numerical optimization in Python
Stephen Tu
SF Python
Goals
• Python is a fun language for prototyping.
• C/C++ is useful for optimizing performance.
• This talk: How far can we push the boundary?
Warmup
• Why is the RHS over 2x faster?
Warmup
[Figure: elements 0, 1, 2 drawn individually, then side by side as 0 1 2; original image not recovered]
Warmup
• Also:
• LHS has many code inefficiencies, e.g. it has to check whether each element is a double (lots of branches).
• RHS can take advantage of optimized BLAS routines.
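The LHS/RHS code on the warmup slides was an image and is not recovered. As a rough sketch of the kind of comparison being made, assuming the LHS is a loop over a Python list of floats and the RHS is a NumPy dot product (the names `dot_list`, `xs`, `ys`, `xa`, `ya` are illustrative, not from the talk):

```python
import timeit

import numpy as np


def dot_list(xs, ys):
    """Dot product over plain Python lists: every element is a boxed object."""
    total = 0.0
    for a, b in zip(xs, ys):
        total += a * b  # type checks and unboxing happen on every iteration
    return total


n = 100_000
xs = [float(i) for i in range(n)]
ys = [1.0] * n

# The same data stored contiguously as raw doubles.
xa = np.array(xs)
ya = np.array(ys)

t_list = timeit.timeit(lambda: dot_list(xs, ys), number=5)
t_numpy = timeit.timeit(lambda: xa.dot(ya), number=5)
print(f"list loop: {t_list:.4f}s   numpy dot: {t_numpy:.4f}s")
```

The actual ratio depends on the machine and the BLAS that NumPy is linked against, so no specific speedup is claimed here.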
Two important ideas
• Keep in mind the memory layout.
• Avoid unnecessary code overhead.
• Memory is especially important for numerical software.
• FLOPs are cheaper than cache misses.
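To make the memory-layout point concrete, here is a small sketch using NumPy (an assumption; the slide itself names no library). It shows how the same 2-D array can be stored row-major or column-major, and which slices are contiguous in memory:

```python
import numpy as np

# Row-major (C order): elements of a row sit next to each other in memory.
c = np.zeros((1000, 1000), dtype=np.float64)
# Column-major (Fortran order): elements of a column sit next to each other.
f = np.asfortranarray(c)

# Strides are the byte steps needed to move one index along each axis.
print(c.strides)  # (8000, 8)
print(f.strides)  # (8, 8000)

# A row of the C-ordered array is contiguous; a column is not, so walking
# down a column touches a new cache line on every element.
row_is_contiguous = c[0].flags["C_CONTIGUOUS"]
col_is_contiguous = c[:, 0].flags["C_CONTIGUOUS"]
print(row_is_contiguous, col_is_contiguous)  # True False
```

Inner loops that follow the contiguous axis tend to make far better use of the cache than loops that stride across it.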
Latency numbers every programmer should know
http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html
Latency numbers every programmer should know
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html
The domain of numerical programs is vastly different
• User-facing applications perform tasks such as sending an HTTP request, making a database query, parsing a string, etc.
• Numerical applications perform tasks such as multiplying matrices, taking an SVD, solving Ax=b, etc.
Applications
• Machine learning
• Recommendation systems
• Scientific computing
• Image processing
• Algorithmic trading
• List goes on and on…
Many great tools
Numba
This talk
• Cython + use cases.
• Numba + use cases.
• Cython/Numba in action on real examples motivated from machine learning and optimization.
Cython
Cython
• Glue between Python and C/C++.
• Two main purposes:
• Wrapping existing C/C++ libraries.
• Writing Python-like code that gets compiled down to C/C++.
Cython
• Wrapping libraries manually is a pain.
https://github.com/mblondel/svmlight-loader/blob/master/_svmlight_loader.cpp
Cython
• Helps reduce the amount of boilerplate when wrapping.
http://docs.cython.org/src/tutorial/clibraries.html
Cython
• Reducing boilerplate is a big deal!
• As professional software devs, wrapping a library by hand already feels painful.
• Imagine how much worse it would be if you were not an expert developer.
Cython
• Second purpose: a Python-like language for writing code that gets compiled.
http://docs.cython.org/src/tutorial/cython_tutorial.html
Cython
http://notes-on-cython.readthedocs.org/en/latest/std_dev.html
Cython
• Drawback: the workflow is not as transparent anymore; you need to remember to re-compile (very easy to forget!).
• Drawback: software distribution can get annoying, for all the same reasons distributing binaries is annoying.
• pip install can be made to work, but only if the user has a compatible C/C++ compiler.
Numba
Numba
• Really cool project from Continuum Analytics.
• Write regular Python, but decorate functions with @jit.
• Watch magic happen.
Numba
Over 100x faster by adding 4 characters of source!
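The benchmarked code on this slide was an image and is not recovered. As a stand-in, here is a minimal sketch of the kind of loop-heavy function that `@jit` is meant for. The function `pairwise_dist_sum` and the test data are hypothetical, and the import fallback is only there so the sketch still runs (unaccelerated) where Numba is not installed:

```python
import numpy as np

try:
    from numba import jit
except ImportError:
    # Fallback (assumption for portability): a no-op decorator so the
    # example still runs as plain Python when Numba is unavailable.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@jit(nopython=True)
def pairwise_dist_sum(X):
    """Sum of Euclidean distances over all pairs of rows of X."""
    n, d = X.shape
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            sq = 0.0
            for k in range(d):
                diff = X[i, k] - X[j, k]
                sq += diff * diff
            total += np.sqrt(sq)
    return total


X = np.random.default_rng(0).standard_normal((50, 3))
print(pairwise_dist_sum(X))
```

The first call includes compilation time, so any timing comparison should be made on a second call. The speedup seen in practice depends on the function and the data, so the figure quoted on the slide should not be read as applying to this sketch.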
Numba
• Drawback: Code that can be JIT-ed is a somewhat limited subset of Python.
• Drawback: Hard to debug perf problems.
Applications
Hidden Markov models
• Hidden Markov model (HMM): a latent (unknown) state sequence Z_t and an observed sequence X_t, where the Z_t are assumed to form a Markov chain and each X_t depends only on Z_t.
http://blog.oliverparson.co.uk/2011/06/using-single-hidden-markov-model-to.html
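The generative story above can be sketched by sampling from a small discrete HMM. The parameter names (`pi` for the initial distribution, `A` for transitions, `B` for emissions) and their values are assumptions chosen for illustration, not taken from the talk:

```python
import numpy as np


def sample_hmm(pi, A, B, T, rng):
    """Draw a latent path z and observations x of length T from a discrete HMM."""
    n_states, n_symbols = B.shape
    z = np.empty(T, dtype=np.int64)
    x = np.empty(T, dtype=np.int64)
    z[0] = rng.choice(n_states, p=pi)
    x[0] = rng.choice(n_symbols, p=B[z[0]])
    for t in range(1, T):
        # Markov property: z[t] depends only on z[t-1].
        z[t] = rng.choice(n_states, p=A[z[t - 1]])
        # Emission: x[t] depends only on z[t].
        x[t] = rng.choice(n_symbols, p=B[z[t]])
    return z, x


pi = np.array([0.6, 0.4])                 # initial state distribution
A = np.array([[0.9, 0.1], [0.2, 0.8]])    # A[i, j] = P(z_t = j | z_{t-1} = i)
B = np.array([[0.7, 0.2, 0.1],            # B[i, k] = P(x_t = k | z_t = i)
              [0.1, 0.3, 0.6]])

z, x = sample_hmm(pi, A, B, T=50, rng=np.random.default_rng(0))
print(z[:10], x[:10])
```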
Hidden Markov models
• Learning problem: Given observed sequence (X_1, …, X_T), estimate both the transition and emission probabilities.
• Most common algorithm is expectation-maximization (EM).
http://www.cs.berkeley.edu/~stephentu/writeups/mixturemodels.pdf
Hidden Markov models
• Specialized to HMMs, the updates look like:
http://www.cs.berkeley.edu/~stephentu/writeups/hmm-baum-welch-derivation.pdf
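The update formulas on the slide were an image and are not recovered. The sketch below implements the textbook Baum-Welch (EM) step for a discrete HMM with scaled forward-backward recursions, written in plain NumPy. It is not the talk's implementation; the function name `baum_welch_step`, the random initialization, and the synthetic observations are all assumptions:

```python
import numpy as np


def baum_welch_step(pi, A, B, x):
    """One EM update; returns new (pi, A, B) and the log-likelihood of the old ones."""
    T, S = len(x), len(pi)
    alpha = np.empty((T, S))
    beta = np.empty((T, S))
    c = np.empty(T)  # per-step scaling factors; sum(log c) is the log-likelihood

    # Forward pass (scaled so each alpha[t] sums to 1).
    alpha[0] = pi * B[:, x[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]

    # Backward pass, using the same scaling factors.
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, x[t + 1]] * beta[t + 1])) / c[t + 1]

    # E-step: posterior state marginals (gamma) and summed pairwise marginals (xi).
    gamma = alpha * beta
    xi_sum = np.zeros((S, S))
    for t in range(T - 1):
        xi_sum += alpha[t][:, None] * A * (B[:, x[t + 1]] * beta[t + 1])[None, :] / c[t + 1]

    # M-step: re-estimate parameters from expected counts.
    pi_new = gamma[0]
    A_new = xi_sum / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        B_new[:, k] = gamma[x == k].sum(axis=0)
    B_new /= gamma.sum(axis=0)[:, None]
    return pi_new, A_new, B_new, np.log(c).sum()


rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=200)                      # synthetic observations
pi = np.full(2, 0.5)
A = rng.random((2, 2)); A /= A.sum(axis=1, keepdims=True)
B = rng.random((2, 3)); B /= B.sum(axis=1, keepdims=True)

logliks = []
for _ in range(10):
    pi, A, B, ll = baum_welch_step(pi, A, B, x)
    logliks.append(ll)
print(logliks[0], logliks[-1])
```

The explicit loops over `t` are the kind of small, hot kernel that the talk proposes compiling with Numba or Cython.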
Hidden Markov models
Hidden Markov models
Hidden Markov models
• The result: fairly fast implementation with a nice Pythonic interface, with minimal effort.
data-microscopes
• Project to bring non-parametric Bayesian models to Python land.
datamicroscopes.github.io
data-microscopes
• Core inference procedure is Gibbs sampling.
• In a nutshell: iteratively condition on all but one coordinate of your current estimate, and sample the one left out.
• For our mixture models, this requires computing [equation image not recovered].
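The mixture-model conditional from the slide is not recovered, so the sketch below illustrates the Gibbs idea on a toy target instead: a standard bivariate normal with correlation `rho`, where each coordinate's conditional given the other is a known 1-D normal. The function name and the choice of target are assumptions made purely for illustration:

```python
import numpy as np


def gibbs_bivariate_normal(rho, n_samples, rng):
    """Gibbs sampler for a zero-mean, unit-variance bivariate normal."""
    cond_sd = np.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    samples = np.empty((n_samples, 2))
    for t in range(n_samples):
        # Condition on y, resample x:  x | y ~ N(rho * y, 1 - rho^2)
        x = rng.normal(rho * y, cond_sd)
        # Condition on the new x, resample y:  y | x ~ N(rho * x, 1 - rho^2)
        y = rng.normal(rho * x, cond_sd)
        samples[t] = (x, y)
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_samples=20_000, rng=np.random.default_rng(0))
est_rho = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]
print(est_rho)
```

The sampler is an inherently sequential loop of cheap scalar operations, which is why the inner loop of such a procedure benefits from being compiled.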
data-microscopes
• User-facing API written in Python (e.g. model specification, config options, etc.).
• Core inference engine written in C++.
• Glued together with Cython.
• Uses protobuf to pass complex data structures around (kind of hacky).
data-microscopes
data-microscopes
• The result is a very nice Pythonic interface, backed by a powerful C++ implementation.
Low rank matrix recovery
• We focus now on a class of optimization problems with applications to recommendation systems.
$$\min_{X \in \mathbb{R}^{n_1 \times n_2}} \operatorname{rank}(X) \quad \text{subj. to} \quad \langle A_i, X \rangle = b_i, \quad i = 1, \dots, m.$$
Low rank matrix recovery
• Predominant algorithm for solving these problems is the following iterative local search:
• Hold y, σ fixed, and minimize w.r.t. R.
• Hold R fixed, and update y, σ accordingly.
$$L(R, y, \sigma) := \langle C, RR^T \rangle - \sum_{i=1}^{m} y_i \left( \langle A_i, RR^T \rangle - b_i \right) + \frac{\sigma}{2} \sum_{i=1}^{m} \left( \langle A_i, RR^T \rangle - b_i \right)^2$$
Low rank matrix recovery
• Minimizing w.r.t. R is a numerically intensive task, so we would like to offload as much work as possible.
• SciPy's fmin_l_bfgs_b is a good candidate, requiring only gradients and function evaluations.
• Hence, implement gradients / function evals in Numba.
$$\nabla_R L(R, y, \sigma) = 2CR - 2 \sum_{i=1}^{m} y_i A_i R + 2\sigma \sum_{i=1}^{m} \left( \langle A_i, RR^T \rangle - b_i \right) A_i R$$
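A rough sketch of how the pieces fit together, under several assumptions: the function and gradient are written in plain NumPy here rather than Numba, C and the A_i are taken to be symmetric (which the gradient formula requires), the problem data are synthetic, σ is held fixed, and the multiplier update y ← y − σ(⟨A_i, RRᵀ⟩ − b_i) is the standard augmented-Lagrangian rule rather than something stated on the slides. Only `fmin_l_bfgs_b` comes from the talk; every other name is hypothetical.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b


def inner(As, X):
    """Vector of inner products <A_i, X> for a stack As of shape (m, n, n)."""
    return np.einsum("kij,ij->k", As, X)


def lagrangian_and_grad(r_flat, shape, C, As, b, y, sigma):
    """Return L(R, y, sigma) and its gradient w.r.t. R (flattened)."""
    R = r_flat.reshape(shape)
    X = R @ R.T
    resid = inner(As, X) - b
    value = np.sum(C * X) - y @ resid + 0.5 * sigma * (resid @ resid)
    # Gradient is 2 * (C + sum_i (sigma * resid_i - y_i) A_i) @ R.
    S = C + np.einsum("k,kij->ij", sigma * resid - y, As)
    return value, (2.0 * S @ R).ravel()


# Synthetic, feasible problem data (assumption for illustration).
rng = np.random.default_rng(0)
n, r, m = 6, 2, 8
G = rng.standard_normal((n, n))
C = G @ G.T / n                               # symmetric PSD cost matrix
As = rng.standard_normal((m, n, n))
As = (As + As.transpose(0, 2, 1)) / 2         # symmetrize each A_i
R_true = rng.standard_normal((n, r))
b = inner(As, R_true @ R_true.T)

R = rng.standard_normal((n, r))
y = np.zeros(m)
sigma = 10.0
resid_start = np.linalg.norm(inner(As, R @ R.T) - b)

for outer in range(5):
    # Hold y, sigma fixed and minimize over R with L-BFGS.
    r_flat, f_val, info = fmin_l_bfgs_b(
        lagrangian_and_grad, R.ravel(), args=(R.shape, C, As, b, y, sigma)
    )
    R = r_flat.reshape(n, r)
    # Hold R fixed and update the multipliers.
    y = y - sigma * (inner(As, R @ R.T) - b)

resid_end = np.linalg.norm(inner(As, R @ R.T) - b)
print(resid_start, resid_end)
```

Because `fmin_l_bfgs_b` only needs a callable returning the value and gradient, the body of `lagrangian_and_grad` is the one piece that would be moved into a compiled kernel.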
Low rank matrix recovery
To conclude
• Can get the best of both worlds, even for numerical software, by using great tools such as Cython and Numba.
• Often, only a small kernel needs to be fast, and it is usually not hard to identify.
• Many thanks to the great developers who build such useful, practical tools!
References
In case you are interested in the algorithmic aspects of the talk:
Mixture models, clustering, HMMs:
K. Murphy. Machine learning: a probabilistic perspective. MIT Press, 2012.
Dirichlet processes, non-parametric Bayes:
R. Neal. Markov chain sampling methods for Dirichlet process mixture models. Tech report, U of Toronto, 1998.
Y. W. Teh et al. Hierarchical Dirichlet processes. J. Am. Stat. Assoc., 2006.
Algorithms for low rank matrix recovery:
S. Burer and R. Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 2001.
L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 1996.
S. Tu and J. Wang. Practical first order methods for large scale semidefinite programming. Unpublished, 2014.