![Page 1: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/1.jpg)
Building a Machine Learning Platform at Quora
Nikhil Garg @nikhilgarg28
@Quora @MLconf 11/11/16
The Quora Answer To “Build vs Buy” For ML Platforms
![Page 2: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/2.jpg)
A bit about me...
● At Quora since 2012
● Currently leading two ML engineering teams:
○ Content Quality
○ ML Platform
@nikhilgarg28
![Page 3: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/3.jpg)
![Page 4: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/4.jpg)
To Grow And Share The World’s Knowledge
![Page 5: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/5.jpg)
![Page 6: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/6.jpg)
![Page 7: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/7.jpg)
Over 100 million monthly uniques
Millions of questions & answers
In hundreds of thousands of topics
Supported by 80 engineers
![Page 8: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/8.jpg)
What Slows Down ML Innovation?
![Page 9: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/9.jpg)
Curse Of Complexity
● Pipeline jungles
● Lots of glue code to get data in/out of general purpose packages.
● Strong coupling between business logic, data, ML algorithms and configuration.
![Page 10: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/10.jpg)
Clash Of Titans
● Online vs offline
● Production vs experimentation
● C++ vs Python
● Engineering vs research
● ...even more glue code and pipeline jungles.
![Page 11: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/11.jpg)
Getting New Applications Off The Ground
● Hard to reuse existing features, data, algorithms, tooling etc.
● Too costly to even get off the ground.
http://www.qvidian.com/blog/resistance-to-change-sales-organizations
![Page 12: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/12.jpg)
Many Faces Of Chaos
![Page 13: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/13.jpg)
One ring to bring them all and in
the darkness bind them!
![Page 14: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/14.jpg)
Machine Learning Platform
Collection of systems to sustainably increase the business impact of ML at scale.
![Page 15: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/15.jpg)
ML Platform: Build or Buy?
![Page 16: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/16.jpg)
The Quora Answer: Build
For Seven Reasons
![Page 17: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/17.jpg)
Reason # 7
Just Can’t Buy Everything!
![Page 18: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/18.jpg)
Degree Of Integration & Delegation
● No matter how powerful the external platform is, we still need to maintain some form of integration.
● This thin integration layer then becomes the platform.
● The real questions:
○ How much does this in-house layer delegate?
○ How much control does it have over delegation?
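The idea of a thin in-house layer that controls delegation can be sketched as follows. This is only an illustration, not Quora's actual design: `ModelGateway`, the backend names, and both stub backends are hypothetical. The point it shows is that callers talk to the in-house layer, and swapping the backend behind a model is a one-line routing change.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical backend signature: takes feature rows, returns one score per row.
Backend = Callable[[Sequence[Sequence[float]]], List[float]]


class ModelGateway:
    """Hypothetical in-house layer that decides which backend serves each model."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self._routes: Dict[str, str] = {}

    def register_backend(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def route(self, model: str, backend_name: str) -> None:
        # Changing a route is the "control over delegation": callers are untouched.
        if backend_name not in self._backends:
            raise KeyError(f"unknown backend: {backend_name}")
        self._routes[model] = backend_name

    def predict(self, model: str, rows: Sequence[Sequence[float]]) -> List[float]:
        return self._backends[self._routes[model]](rows)


def in_house_mean(rows: Sequence[Sequence[float]]) -> List[float]:
    # Stand-in for a model implemented and served in-house.
    return [sum(r) / len(r) for r in rows]


def vendor_stub_max(rows: Sequence[Sequence[float]]) -> List[float]:
    # Stand-in for a call to an external platform (no real vendor API is used).
    return [float(max(r)) for r in rows]


gateway = ModelGateway()
gateway.register_backend("in_house", in_house_mean)
gateway.register_backend("vendor", vendor_stub_max)
gateway.route("answer_quality", "in_house")
```

Calling `gateway.route("answer_quality", "vendor")` later would move that model to the other backend without changing any code that calls `gateway.predict`.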
![Page 19: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/19.jpg)
Reason # 6
Fast Scalable Production Systems
![Page 20: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/20.jpg)
End-To-End Online Production Systems
● External platforms can at best deploy “predictive models” as services, not end-to-end online systems.
● Gains come from optimizing the whole pipeline, not just algorithms.
● Latency budgets are tens of milliseconds, which means managing sharding, batching, data locality, caching, streaming, stragglers, graceful degradation...
● Real world systems -- boosts, diversity constraints, holes in data, skipping stages, hard filters… sounds familiar?
Pipeline diagram: Candidate Generation, Feature Extraction, Scoring, Post Processing, Data
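A toy version of the four stages on this slide might look like the sketch below. Everything in it is assumed for illustration: the in-memory `DATA` store, the feature names, the linear scoring rule, and the boost values are invented, and a real system would add the sharding, caching, and latency handling listed above. It does show where a hole in the data, a hard filter, and a boost each enter the pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Candidate:
    item_id: str
    features: Dict[str, float] = field(default_factory=dict)
    score: float = 0.0


# Toy in-memory "Data" store; all values are invented for illustration.
DATA: Dict[str, Dict[str, Optional[float]]] = {
    "a1": {"upvotes": 10.0, "is_spam": 0.0},
    "a2": {"upvotes": None, "is_spam": 0.0},  # hole in the data
    "a3": {"upvotes": 50.0, "is_spam": 1.0},  # removed by a hard filter
}


def generate_candidates() -> List[Candidate]:
    return [Candidate(item_id) for item_id in DATA]


def extract_features(cands: List[Candidate]) -> List[Candidate]:
    for c in cands:
        raw = DATA[c.item_id]
        # Fall back to a default of 0.0 when a value is missing.
        c.features = {k: (0.0 if v is None else v) for k, v in raw.items()}
    return cands


def score(cands: List[Candidate]) -> List[Candidate]:
    for c in cands:
        c.score = 0.1 * c.features["upvotes"]  # placeholder for a real model
    return cands


def post_process(cands: List[Candidate], boosts: Dict[str, float]) -> List[Candidate]:
    kept = [c for c in cands if c.features["is_spam"] == 0.0]  # hard filter
    for c in kept:
        c.score += boosts.get(c.item_id, 0.0)  # business-rule boost
    return sorted(kept, key=lambda c: c.score, reverse=True)


def rank(boosts: Optional[Dict[str, float]] = None) -> List[Candidate]:
    cands = generate_candidates()
    cands = extract_features(cands)
    cands = score(cands)
    return post_process(cands, boosts or {})
```

Because the boost and the filter live in post processing rather than in the model, tuning the scoring algorithm alone cannot capture their effect, which is one reading of "gains come from optimizing the whole pipeline".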
![Page 21: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/21.jpg)
Reason # 5
Blurry Line Between Experimentation & Production
![Page 22: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/22.jpg)
● We want the same code/systems/tools to work for both experimentation & production.
● But we need to carefully “control” the production code to keep it fast.
● So we need to “control” offline experimentation systems too.
Pipeline diagrams: Candidate Generation, Feature Extraction, Scoring, Post Processing, Data; and Candidate Generation, Feature Extraction, Training
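One way to read "the same code for experimentation and production" is that both paths call a single feature definition. The sketch below assumes that reading; the feature names, the toy examples, and the hand-rolled logistic regression trained by stochastic gradient descent are all illustrative choices, not taken from the talk.

```python
import math
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, float], int]


def extract_features(answer: Dict[str, float]) -> List[float]:
    """Single feature definition shared by offline training and online scoring."""
    return [1.0, math.log1p(answer["upvotes"]), answer["length"] / 1000.0]


def _sigmoid_dot(weights: List[float], x: List[float]) -> float:
    return 1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(weights, x))))


def train(examples: List[Example], epochs: int = 200, lr: float = 0.1) -> List[float]:
    """Offline path: logistic regression fitted with stochastic gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for answer, label in examples:
            x = extract_features(answer)
            p = _sigmoid_dot(weights, x)
            weights = [w + lr * (label - p) * xi for w, xi in zip(weights, x)]
    return weights


def score(weights: List[float], answer: Dict[str, float]) -> float:
    """Online path: reuses extract_features, so there is no train/serve skew."""
    return _sigmoid_dot(weights, extract_features(answer))


# Invented training data: label 1 means a good answer.
EXAMPLES: List[Example] = [
    ({"upvotes": 100.0, "length": 800.0}, 1),
    ({"upvotes": 80.0, "length": 600.0}, 1),
    ({"upvotes": 1.0, "length": 50.0}, 0),
    ({"upvotes": 0.0, "length": 20.0}, 0),
]
weights = train(EXAMPLES)
```

If `extract_features` had to be fast enough for the online path, the same constraint would automatically apply to whatever experimenters add to it offline, which is the sense in which the offline system ends up "controlled" too.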
![Page 23: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/23.jpg)
Reason # 4
Openly Using Open Source
![Page 24: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/24.jpg)
![Page 25: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/25.jpg)
Production ML Algorithms At Quora
● Logistic Regression
● Elastic Nets
● Random Forests
● Gradient Boosted Decision Trees
● Matrix Factorization
● (Deep) Neural Networks
● LambdaMart
● Clustering
● Random walk based methods
● Word Embeddings
● LDA
● ...
Pipeline diagram: Candidate Generation, Feature Extraction, Training/Scoring, Post Processing, Data
![Page 26: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/26.jpg)
● Open source is great -- lots of great technologies!
● Commercial ML platforms are also open sourcing stuff.
● Learning from and cherry-picking our favorite parts of ANY open source system.
● May write our own algorithms too (e.g. QMF).
● Building own platform = controlling the delegation, not lack of delegation.
![Page 27: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/27.jpg)
Reason # 3
Commercial Platforms’ Offerings Are Not Super Valuable To Us
![Page 28: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/28.jpg)
● Main offerings of external platforms are:
○ Lower operational overhead of running machines
○ Out-of-the-box distributed training.
● Operational overhead
○ Gets amortized over time
○ Shared with non-ML infrastructure.
● Can often train most models on a single multi-core machine.
![Page 29: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/29.jpg)
Reason # 2
Blurry Line Between ML & Product Dev
![Page 30: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/30.jpg)
Blurry Line Between ML/Non-ML Product
● Answer ranking
● Feed ranking
● Search ranking
● User recommendations
● Topic recommendations
● Duplicate questions
● Email Digest
● Request Answers
● Trending now
● Topic expertise prediction
● Spam, abuse detection
● ….
![Page 31: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/31.jpg)
Blurry Line Between ML/Non-ML Data
Entity graph diagram: Users, Questions, Answers, Topics, Votes, Comments, linked by the relationships Follow, Ask, Write, Cast, Have, Contain, Get
Billions of relationships and words
![Page 32: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/32.jpg)
Blurry Line Between ML/Non-ML Codebase
● Integration with other utility libraries/services, e.g. A/B testing, debug tools, monitoring, alerting, data transfer, ...
● Empowering all product engineers to do ML.
![Page 33: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/33.jpg)
Reason # 1
ML As Quora’s Core Competency
![Page 34: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/34.jpg)
ML: Critical For Our Strategic Focus
● ML gives us a strategic competitive advantage.
● Want to control and develop deep expertise in the whole stack.
● Quora has a long term focus -- investment in platform more than pays off in the long term.
● Single most important reason to build ML Platform!
Diagram labels: Relevance, Quality, Demand
![Page 35: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/35.jpg)
Summary
![Page 36: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/36.jpg)
● Anyone doing non-trivial ML needs an ML platform to sustain innovation at scale.
● Build vs buy decision is not all-or-nothing.
● Surface area and importance of ML are deciding factors in the build vs buy decision.
![Page 37: Building A Machine Learning Platform At Quora (1)](https://reader030.vdocuments.mx/reader030/viewer/2022021423/587e14ab1a28abbc2e8b4f81/html5/thumbnails/37.jpg)
Nikhil Garg
@nikhilgarg28
Thank You!
YES, WE ARE HIRING :)