1
Internet Advertising
Ramana Yerneni, Yahoo! Labs
August 17, 2010
2
Advertisers and Publishers
• Goal of advertisers: reach the target users in a relevant context
• Goal of publishers: derive revenue by presenting ads, along with content, to target users
3
Advertising Media
• Variety of media
  • Print: Ads in newspapers, magazines, …
  • TV: Show ads on TV programs
  • Internet: Display ads on Web pages
• Offline vs. Online advertising
  • Print and TV are considered offline
  • Internet advertising is considered online
4
Search vs. Display Advertising
• Keyword search
  • Strong signal of user intent
  • Show ads along with search results
  • Advertisers set up campaigns by indicating target keywords
  • Ad opportunities are characterized by the keywords specified by search users
5
Search vs. Display Advertising
• Display ads
  • User profiles and content context are crucial
  • Display ads embedded in content pages
  • Advertisers specify target user profiles and the desired content context
  • When a user visits a Web page, the ad opportunity is characterized by the profile of the user and the content context of the Web page
6
Ad Opportunities
• Specified by attribute values
  • User information (e.g., Gender, Age)
  • Content information (e.g., Web page)
  • Context information (e.g., Zip code)
• Example:
  • Content = finance, Gender = male, Age = 21-30, Zip = 95051, Platform = IPhone, Interest = sports, …
7
Ad Campaigns
• Specify a constraint that each qualifying ad opportunity must satisfy
  • (Attr1 = value1 and Attr2 = value2) or (Attr3 = value3 and Attr4 != value4) …
• Examples:
  • Gender = male and Zip = 95051
  • Age = 31-40 and Zip != 95051
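The constraint form above (an "or" of "and" groups over = and != conditions) can be evaluated directly against an ad opportunity. The following is a minimal sketch, not the system's actual implementation: the tuple representation, the function names, and the rule that a missing attribute satisfies "!=" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A condition is (attribute, operator, value); the operator is "=" or "!=".
Condition = Tuple[str, str, str]
# A constraint is a list of conjunctions; it holds if any one conjunction holds.
Constraint = List[List[Condition]]


def condition_holds(cond: Condition, opportunity: Dict[str, str]) -> bool:
    attr, op, value = cond
    actual = opportunity.get(attr)
    if op == "=":
        return actual == value
    if op == "!=":
        # Assumption: an attribute missing from the opportunity satisfies "!=".
        return actual != value
    raise ValueError(f"unsupported operator: {op}")


def campaign_matches(constraint: Constraint, opportunity: Dict[str, str]) -> bool:
    # "or" across conjunctions, "and" within each conjunction.
    return any(
        all(condition_holds(c, opportunity) for c in conjunction)
        for conjunction in constraint
    )


# The example opportunity and the two example campaigns from the slides.
opportunity = {
    "Content": "finance", "Gender": "male", "Age": "21-30",
    "Zip": "95051", "Platform": "IPhone", "Interest": "sports",
}
campaign_a = [[("Gender", "=", "male"), ("Zip", "=", "95051")]]
campaign_b = [[("Age", "=", "31-40"), ("Zip", "!=", "95051")]]

print(campaign_matches(campaign_a, opportunity))  # True
print(campaign_matches(campaign_b, opportunity))  # False
```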
8
Campaign Specification
• Duration is an important constraint
  • Start and end timestamps are specified
• Volume goals
  • Campaigns typically specify the number of ad opportunities they seek
9
Setting up a Campaign
• Determine inventory for specified ad campaign
• Forecast ad opportunities that match the campaign
• Factor in contention from other existing campaigns
• Compute the price of the targeted ad opportunities
• Book campaign
  • If sufficient inventory is available, set up the ad campaign in the system
10
Serving Ads to Campaigns
• An ad opportunity arises when a user visits a Web page
• Identify the set of matching campaigns
  • Attribute values in ad opportunity satisfy the constraints specified by these campaigns
• Select a campaign, and an ad to serve
  • Based on campaign goals for volume of impressions, budget considerations, etc.
• Deliver the ad and log the event
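The slides do not say how a campaign is selected among the matching ones. As one possible illustration only, the sketch below picks the campaign that is furthest behind an even delivery schedule. The `CampaignState` fields, the even-pacing rule, and the function names are assumptions, not the policy of the system described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CampaignState:
    name: str
    goal: int        # impressions booked for the whole campaign
    delivered: int   # impressions served so far
    elapsed: float   # fraction of the campaign duration already elapsed, 0..1


def delivery_deficit(c: CampaignState) -> float:
    # Impressions an evenly paced campaign would have by now, minus actual.
    return c.goal * c.elapsed - c.delivered


def select_campaign(matching: List[CampaignState]) -> Optional[CampaignState]:
    # Skip campaigns that have already met their volume goal.
    eligible = [c for c in matching if c.delivered < c.goal]
    if not eligible:
        return None
    return max(eligible, key=delivery_deficit)


matching = [
    CampaignState("A", goal=1000, delivered=400, elapsed=0.5),   # 100 behind
    CampaignState("B", goal=2000, delivered=1200, elapsed=0.5),  # 200 ahead
    CampaignState("C", goal=500, delivered=500, elapsed=0.9),    # goal met
]
chosen = select_campaign(matching)
print(chosen.name if chosen else None)  # A
```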
11
Display-Advertising System
[Diagram: CampaignSetup receives a Campaign Query/Booking and returns Query/Booking Results; AdServing receives an Ad Opportunity and returns a Campaign Ad; also shown: Campaign List, Ad Logs]
13
Inventory Forecasting
• Need to forecast future ad opportunities that match targeted ad campaigns
• First step in the campaign setup process
• Once the set of matching opportunities is forecast, the available inventory, its allocation, and its pricing are determined
14
Forecast Accuracy
• Over-forecasting causes failure to execute on advertiser campaigns
• Penalties for not fulfilling the campaign goals
• Under-forecasting leads to loss of revenue for publisher
• Not being able to monetize all ad opportunities
15
Performance Requirements
• Real-time considerations
  • Latency requirements of the order of 100 ms
• Inventory space is enormous
  • Trillions of ad opportunities in play
16
Projecting from History
• Time-series analysis
  • Start with logs of ad opportunities
  • Generate historical time-series
  • Project onto the future
• Trend factors
  • Weekly patterns: day-to-day variance within a week
  • Seasonal variations: e.g., Christmas shopping
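One simple way to project a historical series forward while respecting the weekly pattern is to average the history by day of week and multiply by a seasonal factor. This is a minimal sketch under those assumptions: the averaging model, the `seasonal_factor` parameter, and the synthetic volumes are illustrative and are not taken from the talk. The history is assumed to cover every day of the week at least once.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, List


def weekday_means(history: Dict[date, int]) -> Dict[int, float]:
    # Average daily volume per day of week (0 = Monday ... 6 = Sunday).
    totals: Dict[int, int] = defaultdict(int)
    counts: Dict[int, int] = defaultdict(int)
    for day, volume in history.items():
        totals[day.weekday()] += volume
        counts[day.weekday()] += 1
    return {wd: totals[wd] / counts[wd] for wd in totals}


def project(history: Dict[date, int], start: date, days: int,
            seasonal_factor: float = 1.0) -> List[float]:
    # Forecast each future day as its weekday mean times a seasonal factor.
    means = weekday_means(history)
    return [
        means[(start + timedelta(days=i)).weekday()] * seasonal_factor
        for i in range(days)
    ]


# Two weeks of synthetic history: weekdays are busier than weekends,
# and the second week is 20 opportunities per day higher than the first.
history = {}
for i in range(14):
    day = date(2010, 11, 1) + timedelta(days=i)
    base = 100 if day.weekday() < 5 else 60
    history[day] = base + (20 if i >= 7 else 0)

# Project one week in December with a hypothetical 20% seasonal uplift.
forecast = project(history, date(2010, 12, 1), 7, seasonal_factor=1.2)
print(forecast)
```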
17
Computational Challenges
• Set of inventory queries not known a priori
• Enormous number of possible queries
  • Precomputation time is too large
  • Storage of time series information prohibitively expensive
18
Solution Approach
• Narrow the space of profiles for time-series computation to a subset of attributes
• Derive forecast for advertiser queries on the fly
• Example
  • Given query: Content = finance and Gender = male and Duration = [12/01/10 – 12/31/10]
  • Forecast inventory for core profile: Content = finance and Duration = [12/01/10 – 12/31/10]
  • Derive forecast for given query by using a scaling factor for Gender = male (perhaps 0.5)
19
Forecast Derivation
• Scaling-factor computation for attribute conditions beyond core profiles
• Scaling factors sensitive to core profiles
  • Consider scaling factor for Gender = male
    • 60% of users on sports sites may be males
    • 40% of users on shopping sites may be males
  • For a given query “Content = finance and Gender = male and Duration = [12/01/10 – 12/31/10]”, if we have time series for “Content = finance and Duration = [12/01/10 – 12/31/10]”, what scaling factor should be employed?
20
Scaling Factors based on Sampling
• Correlation across multiple conditions
  • Deriving forecasts for multiple conditions (e.g., Gender = male and Age = 31-40) is tricky
  • Using simplistic independence assumptions leads to significant errors in computing scaling factors
• Sampling ad opportunities to represent correlation
  • Scaling-factor computation based on sample matches for full query and for core-profile query can be much more accurate
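The difference between the two ways of computing a scaling factor can be seen on a small set of sampled opportunities. The sketch below is illustrative: the sample data, the core-profile forecast of 1,000,000, and the function names are made up, and the sample is deliberately built so that Gender and Age are correlated within finance content.

```python
from typing import Dict, List, Optional

Sample = Dict[str, str]


def count(samples: List[Sample], conditions: Dict[str, str]) -> int:
    # Number of samples that satisfy every attribute = value condition.
    return sum(
        1 for s in samples
        if all(s.get(a) == v for a, v in conditions.items())
    )


def sampled_scaling_factor(samples: List[Sample], core: Dict[str, str],
                           full: Dict[str, str]) -> Optional[float]:
    # Ratio of full-query matches to core-profile matches in the sample.
    core_hits = count(samples, core)
    if core_hits == 0:
        return None  # no samples for this core profile
    return count(samples, full) / core_hits


def independent_scaling_factor(samples: List[Sample], core: Dict[str, str],
                               extra: Dict[str, str]) -> Optional[float]:
    # Multiplies per-condition ratios, i.e. treats the conditions as independent.
    core_hits = count(samples, core)
    if core_hits == 0:
        return None
    factor = 1.0
    for attr, value in extra.items():
        factor *= count(samples, {**core, attr: value}) / core_hits
    return factor


samples = (
    [{"Content": "finance", "Gender": "male", "Age": "31-40"}] * 4
    + [{"Content": "finance", "Gender": "male", "Age": "21-30"}] * 1
    + [{"Content": "finance", "Gender": "female", "Age": "31-40"}] * 1
    + [{"Content": "finance", "Gender": "female", "Age": "21-30"}] * 4
    + [{"Content": "sports", "Gender": "male", "Age": "21-30"}] * 5
)
core = {"Content": "finance"}
extra = {"Gender": "male", "Age": "31-40"}
full = {**core, **extra}

core_forecast = 1_000_000  # hypothetical core-profile forecast for the duration
by_sampling = sampled_scaling_factor(samples, core, full)           # 4 / 10
by_independence = independent_scaling_factor(samples, core, extra)  # 0.5 * 0.5
print(core_forecast * by_sampling, core_forecast * by_independence)
```

In this made-up sample, half of the finance opportunities are male and half are aged 31-40, so the independence assumption gives 0.25, while 4 of the 10 finance samples actually satisfy both conditions.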
21
Forecast-Computation Flow
• Offline computation
  • Precompute time series, based on historical logs, for core profiles
  • Generate sample ad opportunities to enable forecast derivation
• Online query processing
  • Generate forecast for the query’s core profile
  • Using sample ad opportunities for full-query and core-profile, determine scaling factor
  • Determine forecast for the query based on the core-profile forecast and the scaling factor
22
Forecasting System Diagram
[Diagram: GenerateForecastModel (offline) reads Ad Logs and produces Core-Profile Time Series and Opportunity Samples; ProvideInventoryForecast (online) takes an Inventory Query and returns Forecasted Inventory]
23
Performance Challenges
• Offline computation
  • Billions of ad opportunities per day: time-series computation is complex and arduous
  • Solution approach: incremental computation of time series
• Online query processing
  • Need large number of sample opportunities to cover a large space of queries
  • Solution approach: memory-resident bit-vector indexing, partitioned across multiple servers
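A bit-vector index over the sample opportunities keeps one bit vector per attribute-value pair, so that counting the samples that match a query becomes a bitwise AND followed by a population count. The sketch below is a toy illustration that uses Python integers as bit vectors. The class name, the shard split, and the sample data are assumptions, and a production system would use a far more compact and parallel representation.

```python
from collections import defaultdict
from typing import Dict, List

Sample = Dict[str, str]


class BitVectorIndex:
    """One bit vector per (attribute, value); bit i is set if sample i has it."""

    def __init__(self, samples: List[Sample]) -> None:
        self.size = len(samples)
        self.bits: Dict[tuple, int] = defaultdict(int)
        for i, sample in enumerate(samples):
            for attr, value in sample.items():
                self.bits[(attr, value)] |= 1 << i

    def count(self, conditions: Dict[str, str]) -> int:
        # Start with all samples selected, then AND in each condition.
        result = (1 << self.size) - 1
        for feature in conditions.items():
            result &= self.bits.get(feature, 0)
        return bin(result).count("1")


def total_count(shards: List[BitVectorIndex], conditions: Dict[str, str]) -> int:
    # With samples partitioned across servers, per-shard counts simply add up.
    return sum(shard.count(conditions) for shard in shards)


samples = [
    {"Content": "finance", "Gender": "male", "Age": "21-30"},
    {"Content": "finance", "Gender": "female", "Age": "31-40"},
    {"Content": "sports", "Gender": "male", "Age": "31-40"},
    {"Content": "finance", "Gender": "male", "Age": "31-40"},
]
# Two in-process "shards" stand in for indexes held on separate servers.
shards = [BitVectorIndex(samples[:2]), BitVectorIndex(samples[2:])]

print(total_count(shards, {"Content": "finance"}))                    # 3
print(total_count(shards, {"Content": "finance", "Gender": "male"}))  # 2
```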
24
Accuracy Challenges
• Opportunity sampling
  • Highly-targeted queries may have few or no samples
• Time-series computation
  • Low-volume core profiles can have significant variance in future projection
• Forecast derivation
  • Large “distance” between inventory queries and corresponding core profiles can lead to significant forecast errors
26
Ad Opportunities
• Attribute values for user, content and context information
• Also included are attributes like timestamp
• Example
  • Timestamp = 12/12/10 10:30:15, Content = finance, Gender = male, Age = 21-30, Zip = 95051, Platform = IPhone, Interest = sports, …
27
Campaign-Opportunity Matching
• Matching campaigns have constraints on attribute values
• Campaigns typically specify a few constraints
• Example
  • Duration = [12/01/10 – 12/31/10]; Content = finance and Gender = male and Age = 21-30
28
Campaign-Matching Challenge
• Matching a campaign to an opportunity is a complex operation
• Campaign constraints can be complex Boolean expressions
• Ad opportunities can have large number of attributes
• Low-latency requirement
  • Given an opportunity, need to identify all matching campaigns fast (within a few milliseconds)
• Large number (millions) of campaigns in play
29
Naive Solution
• Scan all campaigns
  • For each campaign, examine if it matches the ad opportunity
• Problem
  • Takes too long: large number of attributes to match; large number of campaigns to examine
  • Typically, cannot scale beyond a small number of (say 100) campaigns, with the low-latency requirement
30
Semi-naïve Solution
• Scan a relevant subset of campaigns
  • Set up buckets for each attribute-value condition (feature) that appears in some campaign
  • Populate the buckets with campaigns that specify the attribute-value condition
  • Scan the campaigns in the buckets that match the attribute values in the ad opportunity
• Problem
  • Each opportunity can specify a large number of (say 100) attribute values; each feature bucket can have a large number of campaigns (e.g., Gender = male)
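The bucket-based scan can be sketched as follows: build one bucket per feature, take the union of the buckets hit by the opportunity as the candidate set, then check each candidate in full. This is a minimal illustration that assumes campaigns are plain conjunctions of attribute = value conditions. The campaign names and data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Conjunction = Dict[str, str]  # attribute -> required value


def build_buckets(campaigns: Dict[str, Conjunction]) -> Dict[Tuple[str, str], Set[str]]:
    # One bucket per (attribute, value) feature that appears in some campaign.
    buckets: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for cid, conditions in campaigns.items():
        for feature in conditions.items():
            buckets[feature].add(cid)
    return buckets


def match_semi_naive(campaigns: Dict[str, Conjunction],
                     buckets: Dict[Tuple[str, str], Set[str]],
                     opportunity: Dict[str, str]) -> List[str]:
    # Candidates: every campaign in a bucket hit by the opportunity.
    candidates: Set[str] = set()
    for feature in opportunity.items():
        candidates |= buckets.get(feature, set())
    # Each candidate still needs a full check against all of its conditions.
    return sorted(
        cid for cid in candidates
        if all(opportunity.get(a) == v for a, v in campaigns[cid].items())
    )


campaigns = {
    "A": {"Gender": "male", "Zip": "95051"},
    "B": {"Age": "31-40", "Content": "finance"},
    "C": {"Content": "sports"},
}
buckets = build_buckets(campaigns)
opportunity = {"Content": "finance", "Gender": "male", "Age": "21-30", "Zip": "95051"}

# B is scanned as a candidate (Content = finance) but fails the full check.
print(match_semi_naive(campaigns, buckets, opportunity))  # ['A']
```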
31
Campaign-Indexing Insights
• Most campaigns have few constraints
• Most campaigns specify constraints in simple conjunctions
• Indexing campaigns that are conjunctions can be done efficiently
• Can easily extend the solution to cover DNF-structured campaigns
32
Inverted Indexing
• Set up buckets for features and identify the buckets that match the ad opportunity
• March down the lists of campaigns in these buckets in an organized manner
• Order each list by campaign id
• Skip through each list to find campaigns that appear in the target number of lists
33
Feature Buckets
Campaign C1 : Content = finance and Age = 31-40
Campaign C2 : Content = sports and Age = 31-40
Campaign C3 : Content = finance and Age = 31-40 and Zip = 95051
Campaign C4 : Content = finance and Age = 21-30
Content = finance : C1, C3, C4
Age = 31-40 : C1, C2, C3
Zip = 95051 : C3
Age = 21-30 : C4
Content = sports : C2
34
Opportunity Query
Campaign C1 : Content = finance and Age = 31-40
Campaign C2 : Content = sports and Age = 31-40
Campaign C3 : Content = finance and Age = 31-40 and Zip = 95051
Campaign C4 : Content = finance and Age = 21-30
Content = finance : C1, C3, C4
Age = 31-40 : C1, C2, C3
Zip = 95051 : C3
Age = 21-30 : C4
Content = sports : C2
Opportunity : Content = finance, Gender = female, Age = 31-40, Zip = 95051, Platform = IPhone
Match: {C1, C3}
Union: {C1, C2, C3, C4} has false positives
Intersection: {C3} has false negatives
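The example shows why neither the union nor the intersection of the matching buckets is correct: a campaign matches only when it appears in as many matching buckets as it has conditions. The sketch below is a simplified illustration of that counting idea, assuming the target number of lists for a campaign is the size of its conjunction. It groups campaigns by that size and merges the id-ordered lists rather than skipping through them, so it is not the optimized algorithm from the referenced paper. All function names are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import groupby
from typing import Dict, List

Conjunction = Dict[str, str]  # attribute -> required value

campaigns = {
    "C1": {"Content": "finance", "Age": "31-40"},
    "C2": {"Content": "sports", "Age": "31-40"},
    "C3": {"Content": "finance", "Age": "31-40", "Zip": "95051"},
    "C4": {"Content": "finance", "Age": "21-30"},
}


def build_index(campaigns: Dict[str, Conjunction]):
    # index[k][feature] is the id-ordered list of campaigns that have exactly
    # k conditions and include that feature.
    index = defaultdict(lambda: defaultdict(list))
    for cid in sorted(campaigns):
        conditions = campaigns[cid]
        for feature in conditions.items():
            index[len(conditions)][feature].append(cid)
    return index


def match(index, opportunity: Dict[str, str]) -> List[str]:
    matched = []
    for k, buckets in index.items():
        # Lists for the features that the opportunity actually has.
        lists = [buckets[f] for f in opportunity.items() if f in buckets]
        if len(lists) < k:
            continue  # too few lists for any size-k campaign to match
        # March down the ordered lists together; a campaign of size k matches
        # when its id appears in exactly k of them.
        for cid, run in groupby(heapq.merge(*lists)):
            if sum(1 for _ in run) == k:
                matched.append(cid)
    return sorted(matched)


index = build_index(campaigns)
opportunity = {
    "Content": "finance", "Gender": "female", "Age": "31-40",
    "Zip": "95051", "Platform": "IPhone",
}
print(match(index, opportunity))  # ['C1', 'C3']
```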
35
Postprocessing Computation
• Not all constraints of campaigns are easy to index
• E.g., the Duration constraint
• First, compute the set of campaigns satisfying the indexed constraints
• Then, process other constraints in a postprocessing phase
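The two-phase approach can be sketched as a filter applied to the output of the index lookup. In this minimal illustration the candidate list, the campaign durations, and the choice to treat both ends of the duration as inclusive are assumptions.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Duration = Tuple[datetime, datetime]  # (start, end), both treated as inclusive


def postprocess(candidates: List[str], durations: Dict[str, Duration],
                timestamp: datetime) -> List[str]:
    # Keep only candidates whose duration contains the opportunity timestamp.
    return [
        cid for cid in candidates
        if durations[cid][0] <= timestamp <= durations[cid][1]
    ]


# Candidates as they might come back from the indexed-constraint phase.
candidates = ["C1", "C3"]
durations = {
    "C1": (datetime(2010, 12, 1), datetime(2010, 12, 31, 23, 59, 59)),
    "C3": (datetime(2011, 1, 1), datetime(2011, 1, 31, 23, 59, 59)),
}
timestamp = datetime(2010, 12, 12, 10, 30, 15)

print(postprocess(candidates, durations, timestamp))  # ['C1']
```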
36
Campaign-Matching System
[Diagram: GenerateCampaignIndex (offline) reads the Campaign List and produces Feature buckets and Meta data; MatchCampaigns (online) takes an Ad Opportunity and returns Matching Campaigns]
37
Campaign-Matching Challenges
• Scale
  • Number of attributes in impressions
  • Number of campaigns
• Constraint complexity
  • Number of attributes in campaign constraints
  • Complexity of the Boolean expressions in constraints
• Complexity of the conditions
  • Range and other operators
  • Confidence intervals
39
Summary
• Focus area: Online/Internet Display Advertising
• Modeling ad campaigns and ad opportunities
• Forecasting inventory for campaigns
• Allocation and pricing aspects
• Matching opportunities and campaigns
40
References
• Forecasting High-Dimensional Data
  • Proceedings of SIGMOD 2010
• Indexing Boolean Expressions
  • Proceedings of VLDB 2009
• Adaptive Bidding for Display Advertising
  • Proceedings of WWW 2009