Python at Yhat (August 2013)
Python at Yhat
Dev StackUp
August 2013
Agenda
About Yhat
How we use Python
Questions
We need to reduce churn.
Okay. I'll look into it.
Lots of conversations like this
I figured out that... some complex stuff about vector space that'll improve...
...and that's how we'll reduce churn.
Sounds good. Let's do that...
The "aha" moment isn't the end.
Now what?
Any of you know what Gradient Boosting is?
So when can we go live with the new model?
What goes on in the Kludge?
Rewriting Code
Batch Jobs
PMML
How can we...
- eliminate implementation time
- let data scientists use their favorite tools
...without altering your workflow
How do we do this?
R: great for analysis
● Built for analysis and statistics
● Everything is tabular
● Active community; 4000+ packages
R: bad for applications
● Not web friendly
● Everything is tabular
● Slow
● A list of R grievances:
○ https://github.com/tdsmith/aRrg
Hooking R up to Python
R code > Compile to Bytecode > Execute from Python
{
  "data": {
    "foo": 100,
    "bar": 200
  }
}
Incoming data for prediction
Make prediction from Python using compiled R
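A minimal sketch of this step, assuming the request arrives as a JSON string shaped like the example payload. The weights, intercept, and the `predict` function are hypothetical stand-ins; in the talk the scoring is done by R code executed from Python, which is not reproduced here.

```python
import json
import math

# Hypothetical coefficients standing in for a fitted model. In the setup the
# slides describe, this scoring would be done by compiled R code instead.
WEIGHTS = {"foo": 0.01, "bar": 0.005}
INTERCEPT = -1.0


def predict(payload: str) -> str:
    """Parse a JSON request and return a JSON prediction."""
    features = json.loads(payload)["data"]
    # Linear score followed by a logistic transform to get a probability.
    score = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    prob = 1.0 / (1.0 + math.exp(-score))
    return json.dumps({"prob": round(prob, 4)})


request = '{"data": {"foo": 100, "bar": 200}}'
response = predict(request)
print(response)
```

The point is the shape of the exchange: JSON in, a single probability out. The model behind `predict` can be swapped without the caller noticing.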
R code > Compile to Bytecode > Execute from Python
Returned via REST API
Prediction sent back to Python web server
{ "prob": 0.95 }
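One way the REST side could look, sketched with the standard library's `wsgiref`. The `model` function is a placeholder that returns the fixed value from the slide, and the helper names and port are assumptions; the slides do not say which web framework Yhat used.

```python
import io
import json
from wsgiref.simple_server import make_server


def model(features: dict) -> float:
    # Placeholder for the call into the R model; returns the slide's value.
    return 0.95


def app(environ, start_response):
    """WSGI app: read a JSON body, reply with a JSON prediction."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    features = json.loads(body)["data"]
    out = json.dumps({"prob": model(features)}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(out)))])
    return [out]


def call_locally(payload: dict) -> dict:
    """Invoke the app in-process, without opening a socket."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    chunks = app(environ, lambda status, headers: None)
    return json.loads(b"".join(chunks))


def serve(port: int = 8000) -> None:
    """Serve the app over HTTP (the port is an arbitrary choice)."""
    with make_server("", port, app) as server:
        server.serve_forever()
```

Calling `call_locally({"data": {"foo": 100, "bar": 200}})` exercises the same code path as an HTTP POST, which is handy for checking the endpoint without starting a server.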
Approach
Same Python server
Plug in different scientific environments
{ "prob": 0.87 }
Predictions sent back up the chain and to the client
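The "plug in different environments" idea can be sketched as a registry of backends behind one routing function. Everything named here is hypothetical: the `env` request field, the `register` decorator, and both backends, which return fixed values rather than calling a real R or Python model.

```python
import json
from typing import Callable, Dict

# A backend takes the feature dict and returns a probability.
Backend = Callable[[dict], float]
BACKENDS: Dict[str, Backend] = {}


def register(name: str):
    """Decorator that adds a backend to the registry under `name`."""
    def wrap(fn: Backend) -> Backend:
        BACKENDS[name] = fn
        return fn
    return wrap


@register("r")
def r_backend(features: dict) -> float:
    # Stand-in: a real version would hand the features to an R model.
    return 0.95


@register("python")
def python_backend(features: dict) -> float:
    # Stand-in for a model scored natively in Python.
    return 0.87


def route(request_json: str) -> str:
    """Pick a backend by the (assumed) `env` field and return its prediction."""
    request = json.loads(request_json)
    backend = BACKENDS[request["env"]]
    return json.dumps({"prob": backend(request["data"])})


print(route('{"env": "python", "data": {"foo": 100, "bar": 200}}'))
```

Because the server only depends on the `Backend` signature, adding another environment means registering one more function; the request and response formats stay the same.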
Result
● Ensures cross-environment validation
● Extensible to other languages
yhathq.com
@YhatHQ
blog.yhathq.com