
Jun Li, Peng Zhang, Yanan Cao, Ping Liu, Li Guo

Chinese Academy of Sciences; State Grid Energy Institute, China

Efficient Behavior Targeting Using SVM Ensemble Indexing

Behavior Targeting (BT) uses users’ historical behavior data to select the most relevant ads for display.

Example from Yahoo! Research [Figure labels: behavior targeting, user behavior data, targeted users]

Regression for BT

Poisson regression model (Ye Chen, eBay, 2009).
• x: ad clicks and views, page views, search queries and clicks.
• y: click-through rate (CTR).
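For reference, the Poisson assumption behind this model can be written as follows. This is only a notational sketch: the rate symbol \lambda and the reading of CTR as a ratio of a click rate to a view rate are assumptions made for illustration, and the slides do not spell out the exact parameterization.

```latex
% y: an observed count (e.g. clicks or views); \lambda: Poisson rate (assumed symbol)
P(y \mid \lambda) = \frac{\lambda^{y} e^{-\lambda}}{y!}, \qquad y = 0, 1, 2, \dots

% CTR read as the ratio of the modeled click rate to the modeled view rate (assumption)
\widehat{\mathrm{CTR}} = \frac{\lambda_{\mathrm{click}}}{\lambda_{\mathrm{view}}}
```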

Ye Chen et al., Large-scale behavior targeting (KDD’09 best paper award)

[Figure: view data and click data; labels include "Poisson dis.", "Poisson reg. on view", a truncated "Poisson reg. on", and a truncated "ad catego"]

Limitations

• Parameter tuning is very difficult.
• The Poisson assumption is not always true for real-world behavior data.
• Clicks are typically several orders of magnitude fewer than views.
• User interests are not always fixed, but rather transient.

Classification for BT

SVM for classification

Example 1: 3 users on Nikon (www.nikon.com)’s ad a

[Figure: view data and click data feed an SVM classifier. Positive examples (+): view and click data. Negative examples (-): view but no click data.]
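The labeling rule for the classification setting can be sketched as below. This is a minimal illustration, not the authors' code: the `Event` record, its field names, and the three sample users are hypothetical.

```python
from collections import namedtuple

# Hypothetical event record: one user's interaction with one ad.
Event = namedtuple("Event", ["user", "ad", "viewed", "clicked"])

def label_events(events):
    """Return (user, ad, label) triples: +1 for view and click, -1 for view without click."""
    labeled = []
    for e in events:
        if not e.viewed:
            continue  # an ad that was never shown yields no training example
        labeled.append((e.user, e.ad, 1 if e.clicked else -1))
    return labeled

# Three made-up users who all saw the same ad.
events = [
    Event("u1", "nikon", viewed=True, clicked=True),
    Event("u2", "nikon", viewed=True, clicked=False),
    Event("u3", "nikon", viewed=True, clicked=True),
]
labels = label_events(events)
```

The resulting triples would then be joined with each user's behavior features to form the SVM training set.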

Challenges 1,2,3

Classification for BT

Ensemble SVM on data streams

Merits
• No complicated parameters
• No statistical assumptions
• Dynamic model on data streams

Challenge 4

Limitations

Online ensemble prediction is expensive.

Time cost: A (advertisers) × W (ensemble size) × N (support vectors) × T (features)
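To make the product concrete, the sketch below counts multiply-add operations for one incoming user record. All sizes are hypothetical values chosen for illustration; only the formula itself comes from the slide.

```python
def prediction_cost(advertisers, ensemble_size, support_vectors, features):
    """Operation count for naive ensemble prediction on one record: A * W * N * T."""
    return advertisers * ensemble_size * support_vectors * features

# Hypothetical sizes; only the ensemble size differs between the two calls.
cost_single = prediction_cost(advertisers=100, ensemble_size=1,
                              support_vectors=1000, features=500)
cost_ensemble = prediction_cost(advertisers=100, ensemble_size=10,
                                support_vectors=1000, features=500)

print(cost_ensemble)                 # 500000000
print(cost_ensemble // cost_single)  # 10
```

The cost grows linearly in every factor, so a tenfold larger ensemble costs ten times as much per record.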

Example 2: We collect 2 million behavior events in 1 minute (W = 10), and computing the predictions takes 53 minutes.

Solutions

Construct an index structure for the ensemble SVM.

Why does the index work?
• It trades space for time.
• Multiple support vectors share features.
• Support vectors are sparse.

[Figure: analogy to text indexing. Features correspond to text terms, a support vector to a document, and the ensemble SVM to a document set.]

P. Zhang et al., knowledge index for online data streams (KDD 2011 & ICDM 2011)

The index structure

The SVM-index structure

Example 3: Based on Example 1, consider an SVM with 3 support vectors.

[Figure labels: "Ensemble informati" (truncated), support vectors, inverted hashing]

Time complexity O(T)
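The idea of an inverted index over support vectors can be sketched as follows. This is an illustrative toy, not the authors' structure: it assumes a linear kernel, sparse vectors stored as dicts, and made-up feature names and coefficients. `build_index` and `predict` are hypothetical helper names.

```python
from collections import defaultdict

def build_index(support_vectors):
    """Map each feature to the (support vector id, value) pairs that contain it."""
    index = defaultdict(list)
    for sv_id, sv in enumerate(support_vectors):
        for feature, value in sv.items():
            index[feature].append((sv_id, value))
    return index

def predict(index, coefs, bias, x):
    """Linear-kernel SVM decision that visits only postings of x's non-zero features."""
    dots = defaultdict(float)
    for feature, x_value in x.items():
        for sv_id, sv_value in index.get(feature, ()):
            dots[sv_id] += sv_value * x_value
    score = bias + sum(coefs[sv_id] * dot for sv_id, dot in dots.items())
    return 1 if score >= 0 else -1

# Three hypothetical sparse support vectors (feature name -> value).
svs = [{"camera": 1.0, "lens": 1.0}, {"camera": 1.0, "travel": 1.0}, {"news": 1.0}]
coefs = [0.5, 0.5, -1.0]  # stands in for alpha_i * y_i; made-up values
index = build_index(svs)

label_pos = predict(index, coefs, 0.0, {"camera": 1.0})  # touches support vectors 0 and 1 only
label_neg = predict(index, coefs, 0.0, {"news": 2.0})    # touches support vector 2 only
```

Support vectors that share no feature with the incoming record are never visited, which is where the sparsity and the shared features pay off.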

The index structure

Operations
– Search: predict the label of each incoming user record x
• Step 1: search for support vectors in the left inverted indexes
• Step 2: calculate x’s class label
– Insert: integrate new classifiers into the ensemble
– Delete: drop outdated classifiers from the ensemble
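The three operations can be sketched on a toy index as below. Everything here is an assumption made for illustration: the `EnsembleIndex` class, linear kernels, the posting layout, and the unweighted majority vote are not taken from the authors' source code.

```python
from collections import defaultdict

class EnsembleIndex:
    """Toy inverted index over the support vectors of several linear SVMs."""

    def __init__(self):
        self.postings = defaultdict(dict)  # feature -> {(clf_id, sv_id): value}
        self.models = {}                   # clf_id -> (coefs, bias, support vectors)

    def insert(self, clf_id, support_vectors, coefs, bias):
        """Integrate a new classifier into the ensemble."""
        self.models[clf_id] = (coefs, bias, support_vectors)
        for sv_id, sv in enumerate(support_vectors):
            for feature, value in sv.items():
                self.postings[feature][(clf_id, sv_id)] = value

    def delete(self, clf_id):
        """Drop an outdated classifier and all of its postings."""
        _, _, support_vectors = self.models.pop(clf_id)
        for sv_id, sv in enumerate(support_vectors):
            for feature in sv:
                del self.postings[feature][(clf_id, sv_id)]
                if not self.postings[feature]:
                    del self.postings[feature]

    def search(self, x):
        """Predict the label of record x."""
        # Step 1: find support vectors sharing a feature with x; accumulate dot products.
        dots = defaultdict(float)
        for feature, x_value in x.items():
            for key, sv_value in self.postings.get(feature, {}).items():
                dots[key] += sv_value * x_value
        # Step 2: score every classifier, then take an unweighted majority vote.
        scores = {clf_id: bias for clf_id, (_, bias, _) in self.models.items()}
        for (clf_id, sv_id), dot in dots.items():
            scores[clf_id] += self.models[clf_id][0][sv_id] * dot
        votes = sum(1 if s >= 0 else -1 for s in scores.values())
        return 1 if votes >= 0 else -1

ens = EnsembleIndex()
ens.insert("c1", [{"news": 1.0}], [1.0], -2.0)
ens.insert("c2", [{"camera": 1.0, "lens": 1.0}], [-1.0], 0.0)
ens.insert("c3", [{"camera": 1.0}], [1.0], 0.0)
x = {"camera": 1.0}
before = ens.search(x)   # c1 and c2 vote -1, c3 votes +1

ens.delete("c1")         # drop the outdated classifier
ens.insert("c4", [{"camera": 2.0}], [0.5], 0.0)
after = ens.search(x)    # c2 votes -1, c3 and c4 vote +1
```

Keeping each classifier's support vectors alongside its coefficients lets `delete` remove exactly that classifier's postings without scanning the whole index.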

Memory

See our source code.

Experiments

• Data sets: search engine data
• Comparisons
– Poisson
– E-SVM
– E-Index (our method)

Observations

Comparisons

E-Index has sub-linear prediction time

E-SVM consumes more memory

Comparisons

Ensemble models are more accurate than Poisson regression model

Comparisons

The index method can significantly improve efficiency, especially when the ensemble size is large.

Related Work

• Behavior targeting: regression models vs. classification models
• Stream indexing: Boolean expression indexing in publish/subscribe systems
• Ensemble models: concept drifting

Conclusions

Contributions
• Identify and address the prediction efficiency problem of ensemble models for behavior targeting.
• Convert the ensemble SVM model to a document set, and propose a new type of inverted text index structure to achieve sub-linear prediction time.

Future work
• Index more complicated SVM models with non-linear kernels.

For source code, visit our website: streamming.org/homepages/lijun.html

Questions?
