Data science-summit MTL 2015 - The end of IT departments and data-science empowered by IPython notebook
Plan
1. Topics
  ○ Why the end of IT departments will help data-scientists
  ○ Data-Science empowered by IPython notebooks
2. Use cases
  ○ Algo trading
  ○ Clustering visualization
  ○ Confusion matrix visualization
  ○ Outlier inspection
  ○ Session clustering (idstats)
  ○ Amazing data-science platform: Quantopian
Reminder: Data Maturity Levels
Barriers of entry
QA: just another barrier of entry
ML
● Sampling
● Big-Data
Level 1 | Level 2 | Level 3 | Level 4 | Level 5
The end of IT departments
● Car > 30K
● Gas + parking = 5K
● Max speed = 180 km/h
● Avg speed = 10 km/h
● ROI = 29%

● Bike < 1K
● Max speed = 45 km/h
● Avg speed = 30 km/h
● ROI = 3000%
IT department
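The slide does not say how ROI is computed. The two figures are consistent with average speed (km/h) divided by cost in thousands of dollars, expressed as a percentage, so the sketch below assumes that definition. It also assumes the car costs 30K plus 5K for gas and parking, and the bike 1K; the function name `speed_roi` is hypothetical.

```python
def speed_roi(avg_speed_kmh: float, cost_k: float) -> float:
    """Average speed per thousand dollars spent, as a percentage (assumed definition)."""
    return 100.0 * avg_speed_kmh / cost_k


# Car: 30K purchase + 5K gas and parking, 10 km/h average in traffic.
car_roi = speed_roi(avg_speed_kmh=10, cost_k=30 + 5)
# Bike: 1K purchase, 30 km/h average.
bike_roi = speed_roi(avg_speed_kmh=30, cost_k=1)

print(f"car:  {car_roi:.0f}%")   # car:  29%
print(f"bike: {bike_roi:.0f}%")  # bike: 3000%
```

Under this reading, the bike delivers roughly 100 times more average speed per dollar than the car.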
The IT department's only argument
Strategies to get rid of the IT department*
*If they don't cooperate, are too slow, or always have an excuse -> union approach
1. Bypass/ignore them -> workarounds
http://fraka6.blogspot.ca/2014/08/dev-principle-you-should-apply-every.html
2. Play their game -> help them hang themselves
Strategy: Play the game, don't fight
1. Dialogue = explain goals
2. Listen to their proposal
3. Explain why it's not a good idea, if it's not
4. Do as they say (don't fight too much) -> Try
5. Evaluate: Failure + cost + lost 3 months
6. Who will be fired?
The NLU pipeline
virtual assistant
Why?
● Measure, understand and improve virtual assistant user experience
What?
● Measure user experience (task completion), retention, ...
● Understand good/bad user experience ->
  ○ Speech
  ○ UX
  ○ Dialog
  ○ User
  ○ Client vs server side
  ○ Latency...
IT layer: R&D hadoop cluster
SQL layer of abstraction
Hook -> hadoop streaming
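Hadoop streaming lets any program that reads lines on stdin and writes tab-separated key/value lines on stdout act as a mapper or reducer. The sketch below shows that contract with a hypothetical job counting events per user. The log layout (`user_id<TAB>session_id<TAB>event`) and the function names are assumptions for illustration, not the pipeline from the talk.

```python
from itertools import groupby
from typing import Iterable, Iterator


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit 'user_id<TAB>1' for every well-formed log line; skip the rest."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[0]:
            yield f"{fields[0]}\t1"


def reducer(lines: Iterable[str]) -> Iterator[str]:
    """Sum counts per key; the framework delivers reducer input sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for user_id, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{user_id}\t{sum(int(count) for _, count in group)}"


# Local simulation of map -> shuffle/sort -> reduce on hypothetical log lines.
log = ["u1\ts1\tstart\n", "u2\ts2\tstart\n", "u1\ts1\tquery\n", "broken line\n"]
mapped = sorted(mapper(log))  # stands in for the shuffle phase
counts = list(reducer(mapped))
print(counts)  # ['u1\t2', 'u2\t1']
```

On a cluster, each function would sit in a small script that iterates over `sys.stdin` and prints its output, and the scripts would be passed to the streaming jar through its `-mapper` and `-reducer` options.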
Data-Science empowered by ipython notebook
wsgi
Proto to Prod
Exploration to Proto
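The slide only names WSGI as the step from prototype to production. As a rough sketch of what that step could look like, the code below wraps a function prototyped in a notebook in a WSGI callable using only the standard library. The `score` function, the `x` query parameter, and the port are hypothetical.

```python
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def score(x: float) -> float:
    """Hypothetical stand-in for a model prototyped in a notebook."""
    return 2.0 * x + 1.0


def app(environ, start_response):
    """WSGI callable: GET /?x=3 answers with a JSON body holding x and its score."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        x = float(query["x"][0])
    except (KeyError, ValueError):
        start_response("400 Bad Request", [("Content-Type", "application/json")])
        return [json.dumps({"error": "expected numeric query parameter x"}).encode()]
    body = json.dumps({"x": x, "score": score(x)}).encode()
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]


def serve(port: int = 8000) -> None:
    """Run the development server; a production WSGI server would host `app` instead."""
    with make_server("", port, app) as httpd:
        httpd.serve_forever()
```

Because `app` is a plain callable, the same object can be exercised in a notebook cell with a fake `environ` dictionary before it is ever deployed.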
IPython Notebook
The IPython Notebook is an interactive computational environment in which you can combine code execution, rich text, mathematics, plots and rich media.
ipython notebook
http://fraka6.blogspot.ca/2015/04/how-to-create-your-ipython-datascience.html
Extends to all languages
Notebook - train an algo trading strategy
trading/example/run.py (pytrade -> pandas + sklearn + theanets + matplotlib)
notebook integration in git
Default queries
PCA, LLE, Isomap, t-SNE, etc. -> mlboost/clustering/visu.py (scikit-learn + matplotlib)
http://fraka6.blogspot.ca/2013/04/simplifying-clustering-visualization.html
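As an illustration of the idea behind the slide, here is a minimal sketch that projects one dataset to 2D with several scikit-learn reducers so the resulting views can be compared side by side. It is not the code in `mlboost/clustering/visu.py`. The synthetic swiss-roll data, the `project_2d` helper, and the neighbor counts are assumptions. Plotting is left out, and t-SNE (`sklearn.manifold.TSNE`) could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, LocallyLinearEmbedding


def project_2d(X):
    """Return a dict mapping method name to an (n_samples, 2) embedding of X."""
    reducers = {
        "PCA": PCA(n_components=2),
        "LLE": LocallyLinearEmbedding(n_components=2, n_neighbors=10, random_state=0),
        "Isomap": Isomap(n_components=2, n_neighbors=10),
    }
    return {name: reducer.fit_transform(X) for name, reducer in reducers.items()}


# Synthetic 3D data (assumption: any numeric feature matrix would do here).
X, _ = make_swiss_roll(n_samples=300, random_state=0)
embeddings = project_2d(X)
for name, emb in embeddings.items():
    print(name, emb.shape)
```

Each embedding can then be passed to a matplotlib scatter plot, one subplot per method.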
Confusion matrices visu
/mlboost/util/sklearn_confusion_matrix.py (sklearn + matplotlib)
http://fraka6.blogspot.ca/2013/05/generating-confusion-matrix-great.html
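To make the counting behind a confusion matrix concrete, here is a plain-Python sketch with toy labels. The script named on the slide relies on sklearn and matplotlib. This version is an assumption-free re-derivation of the table itself, and the helper names and data are hypothetical. `sklearn.metrics.confusion_matrix` computes the same counts.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def format_matrix(matrix, labels):
    """Render the matrix as aligned plain text for quick inspection."""
    width = max(len(str(label)) for label in labels) + 2
    header = " " * width + "".join(f"{label:>{width}}" for label in labels)
    rows = [
        f"{label:<{width}}" + "".join(f"{n:>{width}}" for n in row)
        for label, row in zip(labels, matrix)
    ]
    return "\n".join([header] + rows)


# Toy data (hypothetical): six examples over three classes.
y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
y_pred = ["cat", "cat", "cat", "bird", "dog", "dog"]
labels = ["bird", "cat", "dog"]

matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix)  # [[1, 0, 0], [0, 2, 1], [0, 1, 1]]
print(format_matrix(matrix, labels))
```

Off-diagonal cells show which classes get confused with each other: here one "cat" was predicted as "dog" and one "dog" as "cat".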
Simple way to inspect outliers?
mlboost/clustering/visu.py (matplotlib + scipy)
http://fraka6.blogspot.ca/2013/09/a-simple-way-to-identify-outliers-and.html
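The slide does not describe the outlier method, so the sketch below swaps in a common stand-in rather than the approach from the linked post: the modified z-score, which measures distance from the median in units of the median absolute deviation (MAD). The 3.5 threshold is a conventional choice, and the latency values and the `mad_outliers` name are hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds the threshold."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread around the median, so the score is undefined
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(0.6745 * (v - center) / mad) > threshold
    ]


# Hypothetical request latencies in milliseconds, with one obvious spike.
latencies_ms = [120, 130, 125, 118, 122, 900, 127]
print(mad_outliers(latencies_ms))  # [(5, 900)]
```

Returning the index along with the value makes it easy to go back to the original record and inspect it.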
How to see session clusters?
mlboost/utils/idstats.py (mlboost)
http://fraka6.blogspot.ca/2013/09/a-simple-way-to-identify-outliers-and.html
The Quantopian use case
Community + Research -> Experiment -> Deploy
Quantopian = leader in data-science platforms & the fintech revolution
Self-disrupt or be disrupted
Python as leverage
Conclusion -> disrupt or be disrupted
● IT department = constraint on efficient data-science
  ○ IT -> business solution but also biggest problem
  ○ IT departments will die: it's not a question of if but when
  ○ Last argument = security
  ○ Strategy = outsource (Amazon) or be inefficient
  ○ Why they hire old CIO …
  ○ IPython notebook = efficient exploration
● Follow the lead of Quantopian
  ○ Community + Python (Research -> Experiment -> Deploy)
● To be data-driven, we need data efficiency at any cost