Copyright © 2015 Criteo
New machine learning challenges at Criteo
Olivier Koch
Engineering Program Manager, Criteo
Rythm Meetup
June 15, 2016
Banners… what else?
Advertiser Publisher
Machine learning applications at Criteo
• Bidding (2nd price auctions)
• Product recommendation
• Banner look and feel selection
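The second-price rule named in the bidding bullet can be sketched in a few lines. This is a textbook illustration, not Criteo's bidder: the function name `second_price_auction` and the bidder labels are hypothetical, and real exchanges add reserve prices and fees that are left out here.

```python
def second_price_auction(bids):
    """Return (winner, price) for a sealed-bid second-price auction.

    bids: dict mapping a bidder label to its bid (hypothetical inputs).
    The highest bidder wins but pays the second-highest bid.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # Rank bidders from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


winner, price = second_price_auction({"a": 1.20, "b": 0.90, "c": 0.40})
print(winner, price)  # a 0.9
```

Under this rule the price paid does not depend on the winner's own bid, which is why the bid can be set from the predicted value of the display.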
Machine learning at Criteo
• Supervised learning using standard regression methods / optimization algorithms (SGD, L-BFGS)
• Distribution on Hadoop (MapReduce, Spark)
• 3B displays / day
• 40 PB of data -- 15,000 servers
• 7 data centers worldwide
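As a toy illustration of "standard regression methods" trained with SGD, the sketch below fits a logistic regression on sparse binary features in plain Python. The model choice, the feature encoding, the learning rate and the data are all assumptions made for the example; the slides do not describe the production model, and nothing here is distributed.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sgd_logistic(examples, n_features, lr=0.1, epochs=20, seed=0):
    """Fit logistic regression with plain SGD.

    examples: list of (active_feature_indices, label) with label in {0, 1}.
    Only the weights of active features are touched at each step.
    """
    rng = random.Random(seed)
    w = [0.0] * n_features
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for active, y in data:
            p = sigmoid(sum(w[i] for i in active))
            g = p - y  # gradient of the log loss with respect to the score
            for i in active:
                w[i] -= lr * g
    return w


def predict(w, active):
    return sigmoid(sum(w[i] for i in active))


# Hypothetical data: feature 0 co-occurs with clicks, feature 1 with no click.
toy = [([0], 1), ([1], 0)] * 10
weights = sgd_logistic(toy, n_features=2)
print(predict(weights, [0]) > 0.5, predict(weights, [1]) < 0.5)  # True True
```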
Data sparsity
10 000 displays → 50 clicks → 1 sale
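The funnel figures translate into the following rates. The variable names, including the click-through rate and conversion rate labels, are chosen for this example and do not come from the slide.

```python
displays, clicks, sales = 10_000, 50, 1

click_through_rate = clicks / displays   # 0.005, i.e. 0.5% of displays are clicked
conversion_rate = sales / clicks         # 0.02, i.e. 2% of clicks lead to a sale
sales_per_display = sales / displays     # 0.0001, i.e. 1 sale per 10 000 displays

print(click_through_rate, conversion_rate, sales_per_display)
```

Positive labels for sales are therefore about 10 000 times rarer than displays, which is the sparsity problem the slide points at.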
Now what?
Challenges in online advertising
• We have an impact on users
• A user is seen more than 20 times a day on average
• Every bid has an influence on our competitors
• We want to provide a better online advertising experience
• Personalized
• Cross-device
• Long tail (new users, new products)
Machine learning challenges
• Optimal bidding strategies under uncertainty -- reinforcement learning, policy learning
• Probabilistic match of devices
• Classification/prediction of time series
• Long tail (users, products) -- transfer learning, factorization
• Offline metrics -- counterfactual analysis
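To make the last bullet concrete, here is a minimal sketch of one common counterfactual estimator, inverse propensity scoring (IPS). The slide only names counterfactual analysis, so the choice of IPS, the log format and the toy policies are assumptions made for illustration.

```python
def ips_estimate(logs, target_prob):
    """Estimate the average reward of a new policy from logged data.

    logs: list of (context, action, reward, logging_prob), where
          logging_prob is the probability with which the logging policy
          chose `action` (must be > 0).
    target_prob: function (context, action) -> probability that the new
                 policy would choose the same action.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = target_prob(context, action) / logging_prob
        total += weight * reward
    return total / len(logs)


# Hypothetical logs: banners "A" and "B" were shown uniformly at random.
logs = [
    ("u1", "A", 1, 0.5),
    ("u2", "B", 0, 0.5),
    ("u3", "A", 1, 0.5),
    ("u4", "B", 0, 0.5),
]


def always_a(context, action):
    # Candidate policy that always shows banner "A".
    return 1.0 if action == "A" else 0.0


estimate = ips_estimate(logs, always_a)
print(estimate)  # 1.0
```

The estimate is obtained without running the candidate policy online, which is the point of an offline metric. In practice the importance weights can have high variance when logging probabilities are small.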
The good news
• New generations of algorithms
• NLP (word embeddings), reinforcement learning, policy learning, deep networks
• Releases of ML infrastructures
• Caffe on Spark, TensorFlow, Torch, PhotonML, GPUs inside clusters
→ strong traction in the academic/industrial community
The good news (cont'd)
• A lot of data is available
• Interactions with banners: clicks
• Interactions with products/advertisers: sales, baskets, home views, listings, visit history
→ faster decision-making in A/B tests, feature engineering for ML models
• New data is coming: mobile, cross-device, (offline)
→ we need to make sense of it
Conclusions
• Machine learning applies well to online advertising at scale
• Yet we still need to improve the users’ experience significantly
• The community is pushing new algorithms and new infrastructures forward
• Lots of new data is coming: we need to make sense of it