
The Marketer's Dilemma: Focusing on a Target or a Demographic?

The Utility of Data-integration Techniques

MIKE HESS
Nielsen
[email protected]

PETE DOE
Nielsen
[email protected]

DOI: 10.2501/JAR-53-2-231-236

Data-integration techniques can be useful tools as marketers continue to improve overall efficiency and return on investment. This is true because of the value of the techniques themselves and also because the current advertising market, based on demographic buying, has major opportunities for arbitrage in the range of 10 percent to 25 percent (where a given buy falls in that range depends on the nature of the vertical). The current study reviews different methods of data integration in pursuing such negotiations.

INTRODUCTION

Advertisers, agencies, and content providers all are looking for improvement in the placement of advertisements in content. If an advertiser can reach more of its customers and potential customers by spending less money, or an agency can help an advertiser to do the same, this yields a positive effect on the advertiser's bottom line. Conversely, a content supplier can enhance its value if it can demonstrate that its content is attractive to particular types of people (e.g., those disposed to a particular brand or category, or even a particular psychographic target).

In this quest for improved advertising efficiency and return on investment (ROI), a number of different methods have evolved. Most marketers and their agencies use targeting rather than mass-marketing strategies (Sharp, 2010). Beyond this, many agencies have their own "secret-sauce" formulas whereby they adjust the value of an advertising buy as a function of how much "engagement" can be attributed to that vehicle, whether it be a specific television program or a magazine title. A more recent in-market approach, exemplified by TRA (Harvey, 2012) and Nielsen Catalina Services, has also shown that buying can be improved through the identification of programs that have more brand and category heavy users.

The authors' own work since 2007 with data-integration techniques has shown that fused data sets also can improve targeting efficiency by a range from about 10 percent to 25 percent, depending on the category vertical. A number of firms employ data fusion and integration techniques on the provider side (e.g., Nielsen, Telmar, Kantar, and Simmons) and in the agency business (Hess and Fadeyeva, 2008).

In this study, the authors share some of the definitions and empirical generalizations that have accumulated in the past five years of working with these techniques.

The practical application of data integration already has begun to appear in the marketplace. A large snack-manufacturing company presented some of its findings at a recent Advertising Research Foundation (ARF) conference (Lion, 2009); a global software supplier took the stage at a Consumer 360 event (Nielsen C-360, 2011); and a media-planning and buying agency has indicated that it is using its custom fusion data set to verify and fine-tune commitments made in the 2012 Upfront and in all of its competitive pitches for new business (personal communication to M. Hess, 2012).

In the next section, the various data-integration techniques are defined, and some of the advantages and disadvantages of each are discussed.

TYPES OF DATA INTEGRATION

There are three broad types of data integration used in media and consumer research for advertising planning.


EMPIRICAL GENERALIZATION

Analysis with integrated data sets and the national people meter panel has shown us that if an advertising buy is made based on a marketing target and the programs that its members view, rather than against a demographic target, there is empirically a range of between 10 percent and 25 percent improvement in the efficiency of that buy. This marketing target can be based either on consumption-pattern segmentation (e.g., heavy/light category users) or on psychographic/lifestyle segmentation (e.g., prudent savers versus financial risk takers).

Directly Matched Data

Data sets are matched using a common key (e.g., name and address, or cookies). Very often, this requires the use of personally identifiable information, and appropriate privacy measures must be in place. Some of the key technical aspects that must be evaluated are completeness and accuracy of matching.

For marketing purposes, databases that are integrated via direct matching of address are often referred to as single-source data, but there is a distinction between true single-source and this form of integrated data, as the completeness and accuracy of the match are usually not perfect. However, it can be considered the next best thing to single source, assuming the data sets being integrated are of good quality and relevance.

An example of this sort of database is the Nielsen Catalina Services integration of Catalina frequent-shopper data with television data obtained from Nielsen National People Meter data and Return Path set-top-box data.
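To make the mechanics concrete, the following is a minimal sketch of a direct match in Python. The data frames, column names, and the hashed name-and-address key are hypothetical illustrations, not the Nielsen Catalina implementation; production systems typically rely on specialist matching services and stricter privacy controls.

```python
# Minimal illustration of direct matching on a common key (hypothetical data).
# A hashed, normalized name + address serves as the join key so that raw
# personally identifiable information is not carried through the analysis.
import hashlib

import pandas as pd


def match_key(name: str, address: str) -> str:
    """Create a privacy-preserving key from normalized name and address."""
    normalized = f"{name.strip().lower()}|{address.strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def direct_match(tv_panel: pd.DataFrame, shopper_data: pd.DataFrame) -> pd.DataFrame:
    """Join TV-panel households to frequent-shopper records on the hashed key."""
    tv = tv_panel.assign(
        key=[match_key(n, a) for n, a in zip(tv_panel["name"], tv_panel["address"])]
    )
    shop = shopper_data.assign(
        key=[match_key(n, a) for n, a in zip(shopper_data["name"], shopper_data["address"])]
    )
    matched = tv.merge(shop.drop(columns=["name", "address"]), on="key", how="inner")
    # Completeness: share of TV households that found a shopper record.
    # Accuracy of the key itself would still need a separate audit.
    print(f"Match rate: {len(matched) / len(tv):.1%}")
    return matched
```

The printed match rate speaks to completeness only; accuracy of the match would be assessed separately, as noted above.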

Unit-Level (e.g., Respondent-Level) Ascription

In many cases, direct matching of data is unfeasible, perhaps because of privacy concerns or because the intersection between the data sets is minimal (this is usually the case with samples, where population sampling fractions are very small); assuming no exclusion criteria for research eligibility, the chance of a respondent being in two samples with sampling fractions of 1/10,000 is 1 in 100 million.

In these cases, statistical ascription techniques can be used to impute data. For example, product-purchase data can be ascribed onto the members of a research panel that measures television audiences, using variables common to the television panel and the product-purchase database to guide the ascription. This enables the viewing habits of product users to be estimated.
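The sketch below illustrates one way such an ascription could be modeled. The logistic-regression choice, the variable names, and the "heavy_buyer" flag are assumptions made purely for illustration, not a description of any particular provider's method.

```python
# Minimal illustration of unit-level ascription (hypothetical data and columns).
# A model is trained on the purchase database using variables that both sources
# share, then used to impute purchase propensity onto the television panel.
import pandas as pd
from sklearn.linear_model import LogisticRegression

COMMON_VARS = ["age", "household_size", "income_band", "region"]  # assumed shared fields


def ascribe_heavy_buyers(purchase_db: pd.DataFrame, tv_panel: pd.DataFrame) -> pd.DataFrame:
    """Impute a heavy-buyer propensity onto TV-panel members."""
    x_train = pd.get_dummies(purchase_db[COMMON_VARS])
    x_panel = pd.get_dummies(tv_panel[COMMON_VARS]).reindex(
        columns=x_train.columns, fill_value=0
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(x_train, purchase_db["heavy_buyer"])   # flag observed in the purchase data only
    tv_panel = tv_panel.copy()
    tv_panel["heavy_buyer_prob"] = model.predict_proba(x_panel)[:, 1]
    return tv_panel  # viewing of likely heavy buyers can now be tabulated
```

Whatever the estimator, validation of the ascription model (discussed later in this article) matters more than the specific modeling choice.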

Data fusion is one example of a unit-level ascription technique that is increasingly being used to create integrated databases. (The topic is discussed in more detail later in this article.)

    Some of the advantages of this approach:

There is no additional burden on the respondent. Because the ascription is statistical, it can be applied to anonymized data. Additional data are obtained without affecting existing response rates or worsening respondent fatigue.

There are no privacy concerns. Along with the previous point, this makes it a particularly valuable approach to adding data fields to media currency measurements, which typically have tight constraints on respondent access and measurement specifications.

As the ascription is applied at the unit/respondent level, the database created delivers complete analytic flexibility. A particularly relevant and valuable consequence of this for media databases is that advertising reach and frequency analyses can be created.

The cost of ascription is low in comparison to the cost of additional primary research.

    Caveats associated with this approach:

Ascription techniques contain the possibility of model bias. This needs to be carefully assessed. Model validation is essential.

In the majority of cases, ascription models have aggregate- rather than respondent-level validity. For example, a model that overlays brand purchasing onto a television measurement panel may not be able to predict the actual brand purchases of an individual household on the panel, but it will be able to reliably predict the viewing of brand purchasers as a group. This means that the approach is relevant to advertising planning but less applicable to test-control ROI analyses, where direct assessment of purchase versus exposure is required.

Aggregate-Level Integration

Aggregate-level integration uses segmentation to group and then link types of respondent across data sets. The segmentation typically uses combinations of demographics and geography, though any information common to the data sets can be employed.

An example of a commonly used segmentation is Prizm, which segments the population into 60 geo-demographic groups. An assessment of the viewing habits of brand users can be obtained by identifying Prizm codes strongly associated with particular brands (using a consumer panel) and looking at viewing traits associated with these groups (using a television panel with Prizm classification). Alternatively, purchase propensity scores across all segments can be calculated on the consumer panels and used as media weights on television audiences.
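A rough sketch of the propensity-weighting route follows. The segment codes, column names, and base-100 indexing are illustrative assumptions, not the Prizm methodology or any provider's weighting scheme.

```python
# Minimal illustration of aggregate-level integration (hypothetical data).
# Segment-level brand propensities from a consumer panel are used to re-weight
# program audiences measured on a television panel carrying the same segment codes.
import pandas as pd


def brand_index_by_segment(consumer_panel: pd.DataFrame, brand_col: str) -> pd.Series:
    """Brand purchase rate per segment, indexed to the overall average (100 = average)."""
    segment_rate = consumer_panel.groupby("segment")[brand_col].mean()
    return 100 * segment_rate / consumer_panel[brand_col].mean()


def weighted_program_audience(tv_viewing: pd.DataFrame, index: pd.Series) -> pd.Series:
    """Sum brand-weighted viewing by program; tv_viewing has one row per viewing occasion."""
    weights = tv_viewing["segment"].map(index).fillna(100) / 100
    return (
        tv_viewing.assign(weight=weights)
        .groupby("program")["weight"]
        .sum()
        .sort_values(ascending=False)
    )
```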

    Advantages of this approach:

Segmentations can cover a wide scope: linking data sets through geo-demographic segmentation, for example, allows consumer and media-research databases to be connected and subsequently linked with geographical data such as retail areas.

Understanding a brand through the lens of a suitably constructed segmentation delivers insights beyond basic purchase facts, perhaps guiding advertising creativity as well as media touch-points.

    Limitations of this approach:

Segmentations, by nature, assume homogeneity within segments, and this delivers less precision and less sensitivity than other approaches.

Because the integration of data sources is not at the unit/respondent level, there are restrictions on analysis: in particular, campaign reach and frequency.

The Pros and Cons of Each Approach

Direct match, unit-level ascription, and aggregate-level ascription each can be considered a tool for users of research, to be used in the appropriate way (see Table 1). For example, respondent-level ascription of brand-user attributes on a television panel may be used to plan advertising for a specific brand target; a direct-match database may then be used to estimate the advertising effectiveness of the campaign; and product-distribution tactics may be informed by the use of geo-demographic segmentation.

    TABLE i "Overview of Integration Approaches

    Direct Match (e.g..Address Matching)

    Applications Advertising ROI

    Media Reach andFrequency

    Media Planning

    Ad Sales

    Accuracy/ High - near singleprecision source

    Unit-Level Ascription(e.g., Data Fusion)

    Media Reach andFrequency

    Media Planning

    Ad Sales

    Dependent on model:can be near singlesource

    Aggregate Level(e.g.. Segment Matching)

    Media Planning

    Ad Sales

    Relating media and salesactivity to geographicallocations e.g., stores.catchment areas

    Dependent onsegmentation but typicallylower than unit-levelascription

    Caveats Privacy Aggregate-levelvalidity: not suited todirect ROI estimation

    Completeness and Model BiasAccuracy of Matching

    Aggregate-level validity:not suited to direct ROIestimation

    Reach and Frequency notavailable

    Assumption of homogeneitywithin segments reducessensitivity

DATA FUSION

The term data fusion is used to describe many different data-integration methods. The most common definition, and the one we shall use in this study, is as follows: "Data fusion is a respondent-level integration of two or more survey databases to create a simulated single-source data set."

Essentially, two surveys (or panels) are merged at the respondent level to create a single database (e.g., the U.S. Nielsen television/Internet data fusion overlays data from the Nielsen Online Audience Measurement Panel onto the National People Meter television audience measurement panel, creating a database of respondents with television-viewing measures and online-usage measures).

[Figure: The Data Fusion Process (TV/Internet Fusion). The TV panel (common characteristics; TV viewing) and the Internet panel (common characteristics; online use) are matched via their common characteristics, producing an integrated data set containing the common characteristics, TV viewing, and online use.]


Linking Variables

The creation of this single database matches respondents on common variables to link the data sets. Common variables (also known as "linking variables" or "fusion hooks") typically are demographic, geographic, and media-related. For example, men aged 18 to 24 years, in full-time employment within a certain geographical region, who have a particular defined set of media habits (defined across the two panels), may be matched across the two databases.

The importance of linking variables in the data fusion cannot be overstressed. In the case of media-based data fusion, Nielsen data fusions adhere to the generally accepted idea that linking variables must encompass more than standard demographic measures to ensure reliability of results.

The importance of employing measures directly related to the phenomena being fused (in this case, television viewing) was emphasized by Susanne Rassler (2002) in Statistical Matching:

Within media and consuming data the typical demographic and socioeconomic variables will surely not completely explain media exposure and consuming behavior. Variables already concerning media exposure and consuming behavior have to be asked as well. Thus, the common variables also have to contain variables concerning television and consuming behaviors....

Linking variables are the key to the statistical validity of the fusion, which operates on the assumption of conditional independence; in the case of the television/Internet fusion, this would mean that variations in the way that television viewing and online use interact are random within each group of respondents defined by the interlaced common variables.

Where this condition does not hold, model regression to the mean occurs, and there will be some bias in the fused results. This bias can be estimated using fold-over tests or comparison to single-source data (if available) and is an important part of assessing a data fusion's validity and utility.

In addition, a smart fusion practitioner also will test the congruence of the linking variables across the two databases, checking that the two sample structures are matched well enough to enable the fusion to work well and assessing the closeness of matching of the two samples post fusion.

Matching the Samples

In practice, it is rarely possible to find a match for every respondent across every characteristic in the linking-variable set. In the absence of a perfect match, the objective, therefore, becomes finding the best match. And although fusion algorithms vary, this requirement typically is achieved using statistical distance measurements (including assessment of the relative importance of the linking variables in predicting behavior) and identifying the respondents with the smallest distance.

At the same time, checks should occur in the fusion algorithm to ensure that the fusion uses all the respondents in both samples as equitably as possible. In some cases, the two samples to be fused may have very different sample sizes, and consideration needs to be given to how best to use the samples: whether all respondents will contribute to the fused database or just the closest matches, to create a database with a respondent base equal in size to the smaller of the two samples. This decision often is driven by logistical factors, such as the analysis system capabilities, rather than being a purely statistical consideration.
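A stripped-down sketch of such a matching step, covering both the weighted distance and a simple donor-reuse constraint, is shown below. The linking variables, importance weights, and reuse cap are illustrative assumptions (with the variables assumed to be pre-coded as numbers); Nielsen's production fusion algorithms are considerably more elaborate.

```python
# Minimal illustration of respondent-level fusion by weighted nearest-neighbour
# matching on linking variables (hypothetical data; variables pre-coded as numbers).
import numpy as np
import pandas as pd

LINKING_VARS = ["age", "sex", "region", "employment", "tv_weight_group"]  # assumed hooks
IMPORTANCE = np.array([1.0, 2.0, 1.0, 0.5, 2.0])  # assumed predictive-importance weights


def fuse(recipients: pd.DataFrame, donors: pd.DataFrame, donate_cols: list[str],
         max_uses: int = 3) -> pd.DataFrame:
    """Copy donate_cols from each recipient's closest donor, limiting donor reuse."""
    r_mat = recipients[LINKING_VARS].to_numpy(dtype=float)
    d_mat = donors[LINKING_VARS].to_numpy(dtype=float)
    uses = np.zeros(len(donors), dtype=int)       # keep donor usage roughly equitable
    donated = []
    for row in r_mat:
        dist = np.sqrt(((d_mat - row) ** 2 * IMPORTANCE).sum(axis=1))
        dist[uses >= max_uses] = np.inf           # cap how often any one donor is reused
        best = int(np.argmin(dist))
        uses[best] += 1
        donated.append(donors.iloc[best][donate_cols])
    fused = recipients.reset_index(drop=True).copy()
    fused[donate_cols] = pd.DataFrame(donated).reset_index(drop=True)
    return fused
```

In this sketch the smaller sample would normally act as the recipient base, echoing the design decision described above.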

Validation

Data fusion has been used in media research for planning purposes for more than 20 years, and a body of knowledge has been built up over that time. Valuable guidance as to the validity levels that may hold given various data-integration approaches also can be found in industry guidelines developed by the Advertising Research Foundation (2003).

Validation studies have demonstrated that data fusion provides valid results with acceptably low levels of model bias, assuming the following hold:

The samples are well defined and structurally similar;

there is a sufficient set of relevant linking variables; and

the fusion matches the samples closely across the linking variables.

The authors of the current article believe that it is important to validate every data fusion across these three criteria and to create formal fold-over validation tests and/or single-source comparisons where possible. In addition, offering methodological transparency and welcoming external validation of data-fusion processes have contributed to greater acceptance of data fusion by the industry. As such, the method is viewed by many as a useful tool in the researchers' tool box.
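As an illustration of what a basic fold-over check could look like, the sketch below reuses the hypothetical fuse() function from the previous section; the 50/50 split and the aggregate-level comparison are assumptions for illustration, not a prescribed procedure.

```python
# Minimal illustration of a fold-over validation test (hypothetical data).
# A panel that carries both the linking variables and the behaviour of interest
# is split in half; the behaviour is fused from one half onto the other, and the
# fused aggregates are compared with what the recipients actually did.
import numpy as np
import pandas as pd


def fold_over_test(panel: pd.DataFrame, behaviour_cols: list[str], seed: int = 0) -> pd.DataFrame:
    """Return actual vs fused aggregate estimates and the implied bias."""
    rng = np.random.default_rng(seed)
    is_recipient = rng.random(len(panel)) < 0.5
    recipients, donors = panel[is_recipient], panel[~is_recipient]
    actual = recipients[behaviour_cols].mean()       # behaviour the recipients really reported
    # behaviour_cols are assumed to be distinct from the linking variables.
    fused = fuse(recipients.drop(columns=behaviour_cols), donors, behaviour_cols)
    estimate = fused[behaviour_cols].mean()          # behaviour the fusion ascribed to them
    return pd.DataFrame({"actual": actual, "fused": estimate, "bias": estimate - actual})
```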


ANALYSIS OF LEARNINGS AND EMPIRICAL GENERALIZATIONS

Although the authors have been working in this space since 2007, it is not easy to obtain specific learning from every data integration because of the proprietary nature of the service. The generalizations below are offered in the spirit of industry advancement while, at the same time, protecting the proprietary aspects of the outcomes.

Analysis with integrated data sets and the national people meter panel has shown us that if an advertising buy is made based on a marketing target and the programs that its members view, rather than on a demographic target, there is empirically a range of 10 percent to 25 percent improvement in the efficiency of that buy.

This marketing target can be based either on consumption-pattern segmentation (e.g., heavy/light category users) or on psychographic/lifestyle segmentation (e.g., prudent savers versus financial risk takers). The increase in efficiency is explained as follows:

A campaign planned to deliver X demographic GRPs will deliver Y brand-target GRPs. An alternate plan can be developed that delivers X demographic GRPs and Z brand-target GRPs, where Z > Y. Equivalently, an alternate plan can be developed to deliver X2 demographic GRPs and Y brand-target GRPs, where X2 < X.
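As an arithmetic illustration only, assuming a 15 percent gain taken as a rough midpoint of the reported 10 percent to 25 percent range: at the same demographic delivery X, the alternate plan yields Z = 1.15 × Y brand-target GRPs; or, equivalently, the same Y brand-target GRPs can be bought with X2 = X / 1.15 ≈ 0.87 × X, roughly 13 percent fewer demographic GRPs for the same brand-target delivery.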


also is important for cross-platform campaigns. If the reach against the marketing target, for example, is already enhanced via this approach as part of the television buy, the key performance indicator (KPI) of the cross-platform campaign might be based more on frequency and recency than on an effort to attain additional unduplicated reach.

    CONCLUSION

In sum, the authors believe that data-integration techniques are acting as the latest wave of services that are bringing greater overall efficiency and, in turn, ROI to the industry. They follow in the footsteps of predictive new-product models in the 1970s and 1980s and marketing-mix modeling in the 1990s and 2000s.

MIKE HESS is EVP in Nielsen's Media Analytics group. He also serves as the Nielsen spokesperson for Social Television and is currently directing a comprehensive analysis of the relationship between social buzz and television ratings. Before joining Nielsen in 2011, Hess was research director for the media agencies of Carat and OMD. Hess's publications include an American Association of Advertising Agencies-sponsored monograph on "Short and Long Term Effects of Advertising and Promotion" (2002) and a review of quantitative methods in advertising research for the Fiftieth Anniversary issue of the Journal of Advertising Research (2011). He currently acts as project co-lead for the quantification of brand equity for the MASB and this year became a trustee of the Marketing Science Institute.

PETE DOE is SVP, Data Integration, at Nielsen. In that role, he has global responsibility for Nielsen's data-fusion methodologies and is involved with such data-integration methods as STB modeled ratings and online hybrid audiences. Prior to moving to the United States in 2003, Doe was a board director at RSMB Television Research in the United Kingdom, where he worked on the BARB television audience measurement currency and numerous data-fusion projects.

    REFERENCES

ADVERTISING RESEARCH FOUNDATION. ARF Guidelines for Data Integration. Advertising Research Foundation, 2003.

COLLINS, J., and P. DOE. "Making Best Use of Brand Target Audiences." Print and Digital Research Forum, San Francisco, CA, 2011.

HARVEY, B. Panelist at the Wharton Empirical Generalizations Conference-II, Philadelphia, May 31, 2012.

HESS, M., and I. FADEYEVA. ARF Forum on Data Fusion and Integration. New York: Advertising Research Foundation, 2008.

LION, S. "Marketing Laws in Action." AM 4.0. New York, NY: Advertising Research Foundation, 2009.

NIELSEN ANNUAL CUSTOMER C-360 CONFERENCE. Orlando, June 2011.

RASSLER, S. Statistical Matching: A Frequentist Theory, Practical Applications, and Alternative Bayesian Approaches. New York: Springer-Verlag, 2002.

SHARP, B. How Brands Grow. Australia and New Zealand: Oxford University Press, 2010.

