google analytics and adwords optimisation with gnu r

Google Analytics and AdWords optimisation with GNU R Hinnerk Gnutzmann & Piotr Śpiewanowski flexponsive UG Booster Conference, 9th March 2016

Upload: piotr-spiewanowski

Post on 24-Jan-2017


TRANSCRIPT

Page 1: Google Analytics and AdWords optimisation with GNU R

Google Analytics and AdWords optimisation with GNU R

Hinnerk Gnutzmann & Piotr Śpiewanowski

flexponsive UG

Booster Conference, 9th March 2016

Page 2: Google Analytics and AdWords optimisation with GNU R

About flexponsive UG

• e-commerce consulting
• Big Data focus
• Qualitative user testing
• Academic (PhD in economics) and programming background

Contact
• mailto: [email protected]
• web: https://www.flexponsive.net/
• t: @flexponsive

Page 3: Google Analytics and AdWords optimisation with GNU R

Topic of the day

• Marketing outcomes
  • difficult to define
  • even more difficult to measure

• Before Big Data: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” (John Wanamaker, 1838-1922)

• With Big Data: “AdWords brand keyword ads have no measurable short-term benefits” (Blake et al., 2015) - 100% wasted?

• Open Questions:
  • Incrementality Debate: Do AdWords campaigns cannibalise organic traffic?
  • Quality: Are bought visitors good or bad customers?
  • Heterogeneity: Do campaign effects differ between customers?

Page 4: Google Analytics and AdWords optimisation with GNU R

Agenda

1. Case study Brand Keyword: The Secret of vanishing AdWords ROI
2. What can we do?
  • attribution models
  • controlled experiments
  • GNU R & Analytics: A Dream Team
3. How to do that?
  • Google Core Reporting API & GNU R
  • GA Query Explorer
  • Configuring an experiment in AdWords
4. Analysis with GNU R
  • Data wrangling, sampling, etc.
  • GA replicate metrics
  • Regression Analysis
5. Case Study II: adClicks and rain in Bergen

Page 5: Google Analytics and AdWords optimisation with GNU R

Example - Skandiabanken

Page 6: Google Analytics and AdWords optimisation with GNU R

Example - Skandiabanken

Page 7: Google Analytics and AdWords optimisation with GNU R

What happened?

• the AdWord is highly relevant to the search
  • Navigational Query: The visitor wants to visit Skandiabanken.
  • Customer knows the bank and maybe even has a service in mind

• Result: Probably the best keyword in the account
  • Excellent CTR
  • Very good conversion on-site
  • CPC perhaps not so high

• Any questions?

• Organic result is the same!
  • What would you click if there was no ad?

Page 10: Google Analytics and AdWords optimisation with GNU R

Example - Skandiabanken (without AdWords)

Page 11: Google Analytics and AdWords optimisation with GNU R

ROI

Problem: SEM expenditure is a function not only of the campaign, but also of the behavior and intent of the consumer

Page 12: Google Analytics and AdWords optimisation with GNU R

The eBay study

• Blake et al. (2015), “Consumer Heterogeneity and Paid Search Effectiveness: A Large Scale Field Experiment”

• Field Experiment: Does AdWords work for eBay?

• Very controversial results:
  1. Conventional methods used to measure the causal (incremental) impact of SEM vastly overstate its effect.
  2. True effectiveness of SEM is small for a well-known company like eBay.
  3. Click substitution: When the brand keyword AdWord disappeared, almost all users clicked on the organic result instead.
  4. Informative Advertising: AdWords work if a visitor gains additional information through the advertisement. AdWords had almost no effect on revenues from existing customers: they found their own way to eBay!

Page 13: Google Analytics and AdWords optimisation with GNU R

What can be done? Attribution modelling

But how to know the true channel’s impact?

Page 14: Google Analytics and AdWords optimisation with GNU R

Attribution modelling

• a way to divide the “credit” for a sale between different marketing channels

• if you don’t know what attribution model you are using, it’s “last click” => you believe the sale only depends on the last ad the customer saw before purchasing

• that’s probably not true: perhaps the customer had been following the company blog for a long time, heard friends talk off-line about the product, or saw many banner ads on different sites before making a purchase

• problem: no good way to decide how to “attribute” between different marketing channels

• results depend a lot on assumptions, which you cannot test

• similar problem: if you advertise your brick-and-mortar store on TV and on radio, what drives the customer to your store?
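
To make the dependence on assumptions concrete, here is a minimal sketch in R that splits the credit for three sales under two rules: "last click" and an equal ("linear") split across all touches. The customer journeys and channel names are invented for illustration, and the linear rule is only one of many possible alternatives, not a model taken from the talk.

```r
# Hypothetical customer journeys: ordered channel touches before one purchase each.
paths <- list(
  c("blog", "banner", "adwords"),
  c("organic"),
  c("banner", "organic")
)
channels <- sort(unique(unlist(paths)))

# Last click: the final touch of each journey gets the full credit for the sale.
last.touch <- vapply(paths, function(p) p[length(p)], character(1))
last.click <- vapply(channels, function(ch) sum(last.touch == ch), numeric(1))

# Linear: every touch in a journey gets an equal share of the sale.
# (Assumes no channel appears twice within one journey.)
linear <- setNames(numeric(length(channels)), channels)
for (p in paths) linear[p] <- linear[p] + 1 / length(p)

print(rbind(last.click, linear))
```

Both rules hand out the same three sales, yet the ranking of channels changes, and the data alone cannot tell which rule is right.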

Page 15: Google Analytics and AdWords optimisation with GNU R

What can be done? Controlled experiments

• Randomly select a treatment and a control group, for example:
  • Per user: A/B testing
  • By geographical region

• Assumption: Without the experiment, both groups would behave similarly
• Evaluation: difference in differences
  • difference in the control group: Noise
  • difference in the treatment group: Effect + Noise

• Metric: ∆TREATED − ∆UNTREATED

• Advantages of a geographical experiment:
  • no multi-device tracking necessary
  • easy integration with external data

• Caveat: Geographical groups really need to be comparable (e.g. commuters)
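
The metric ∆TREATED − ∆UNTREATED can be worked through on a 2x2 table of group means. The sketch below uses invented numbers (they are not results from the talk) to show the arithmetic.

```r
# Toy numbers (hypothetical): average daily revenue per group and period.
revenue <- data.frame(
  group        = c("treated", "treated", "control", "control"),
  period       = c("before", "after", "before", "after"),
  mean_revenue = c(100, 52, 102, 51)
)

# Helper to look up one cell of the 2x2 table.
cell <- function(g, p) {
  revenue$mean_revenue[revenue$group == g & revenue$period == p]
}

delta_treated   <- cell("treated", "after") - cell("treated", "before")  # effect + noise
delta_untreated <- cell("control", "after") - cell("control", "before")  # noise only

# Difference in differences: what is left after removing the common change.
did <- delta_treated - delta_untreated
print(did)
```

In this toy table both groups drop by about half, so the large raw change in the treated group mostly disappears once the control group's change is subtracted.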

Page 16: Google Analytics and AdWords optimisation with GNU R

Difference in Differences

Page 17: Google Analytics and AdWords optimisation with GNU R

GNU R and Google Analytics: Dream Team

1. Selection of the treated and control group
  • Install R, generate a sample with GNU R
  • Export: Copy & paste to AdWords
2. Data collection
  • Google Analytics already configured
3. Aggregation and query
  • In the cloud: Google Analytics Query Explorer
  • Integration with RGoogleAnalytics
4. Evaluation: Estimation and Visualization
  • All necessary functions available as packages in R

Page 18: Google Analytics and AdWords optimisation with GNU R

About R

• Programming language and software environment for statistical computing and graphics, a dialect of S
• Quite lean; functionality is divided into modular packages
• Graphics better than in most stat packages
• Useful for interactive work, but contains a powerful programming language for developing new tools (user -> programmer)
• Very active and vibrant user community; R-help and R-devel mailing lists and Stack Overflow
• Markdown packages for reproducible research and automated reporting
• It’s free!

Page 19: Google Analytics and AdWords optimisation with GNU R

Install R

• Open Source for Windows / Mac / Linux etc.
  • GNU R: https://www.r-project.org/
  • RStudio IDE: http://www.rstudio.com
• Cheat Sheets to help!
  • R Reference Card
  • RStudio cheatsheets
• Package management via CRAN

install.packages('RGoogleAnalytics',repos = "http://cran.no.r-project.org");

install.packages('plm',repos = "http://cran.no.r-project.org");

install.packages('ggplot2',repos = "http://cran.no.r-project.org");

Page 20: Google Analytics and AdWords optimisation with GNU R

Selecting Treatment Group

download.file('https://goo.gl/qVgiYp', destfile = 'geoid.csv');

# Kommune level selection, but Fylke level also possible
regions <- read.csv('geoid.csv');
norway <- regions[which(regions$Country.Code == 'NO'
                        & regions$Target.Type == 'County'
                        & regions$Status == 'Active'), ];

# fixed seed so that the random assignment is reproducible
set.seed(1);

# assign each region to control (0) or treatment (1) with equal probability
norway$isTreatment <- sample(c(0, 1), nrow(norway), replace = TRUE)

write.csv(norway, file = 'norway.csv');

# paste into AdWords
writeLines(as.vector(norway[which(norway$isTreatment == 1), ]$Canonical.Name), 'treatment.csv');
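
Drawing 0/1 independently per region leaves the size of the treatment group to chance. If a fixed group size is wanted, one option is to sample row indices instead. The sketch below is an assumption-laden variant, not the code used in the talk: `counties` is a made-up stand-in for the `norway` data frame, and the group size of 10 is chosen only for illustration.

```r
set.seed(1)

# Hypothetical stand-in for the `norway` data frame built above.
counties <- data.frame(Name = paste0("County", 1:19), stringsAsFactors = FALSE)

# Pick exactly n.treat rows at random for the treatment group.
n.treat <- 10
counties$isTreatment <- 0
counties$isTreatment[sample(nrow(counties), n.treat)] <- 1

# Check the resulting group sizes.
print(table(counties$isTreatment))
```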

Page 21: Google Analytics and AdWords optimisation with GNU R

Configuring Google AdWords I

Page 22: Google Analytics and AdWords optimisation with GNU R

Configuring Google AdWords II

Page 23: Google Analytics and AdWords optimisation with GNU R

Configuring Google AdWords III

Page 24: Google Analytics and AdWords optimisation with GNU R

Configuring Google AdWords IV

Page 25: Google Analytics and AdWords optimisation with GNU R

Configuring Google AdWords V

Page 26: Google Analytics and AdWords optimisation with GNU R

Done!!

Page 27: Google Analytics and AdWords optimisation with GNU R

Wait

. . . for the results

Page 28: Google Analytics and AdWords optimisation with GNU R

Google Analytics Core Reporting API & R

1. Create an “app”
  • Google Developers page
  • Enable Google Analytics API
  • Create Credentials: OAuth client ID, Application type: Other
  • Result: Client ID and Client Secret
2. Find your GA Profile ID

Page 29: Google Analytics and AdWords optimisation with GNU R

Setting up GNU R

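client.id <- 'xxxxxxxxxxxxxxx.apps.googleusercontent.com';
client.secret <- 'xxxxxxxxxxxxxxx';
analyticsProfileId <- '111111111';

# redirects to Google; paste the authorization code back into R
require(RGoogleAnalytics);
token <- Auth(client.id, client.secret)

# save the token for later sessions
save(token, file = 'gatoken.txt');

# next time: load() restores the object `token` under its saved name
# (assigning the result of load() would overwrite it with the string "token")
load("./gatoken.txt")
ValidateToken(token);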

Page 30: Google Analytics and AdWords optimisation with GNU R

Create a query

query.list <- Init(start.date = "2015-10-01",
                   end.date = "2016-02-29",
                   dimensions = "ga:region,ga:date,ga:medium",
                   metrics = "ga:sessions,ga:transactionRevenue",
                   filters = "ga:country==Norway",
                   max.results = 50000,
                   sort = "-ga:date,ga:region",
                   table.id = paste0("ga:", analyticsProfileId));

# build the query and fetch the data with the saved OAuth token
ga.query <- QueryBuilder(query.list);
ga.data <- GetReportData(ga.query, token);

Page 31: Google Analytics and AdWords optimisation with GNU R

Real Data Example - www.flexponsive.net

kable(head(ga.data))

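| region               | date     | medium   | country   | sessions | transactionRevenue |
|:---------------------|:---------|:---------|:----------|---------:|-------------------:|
| Brussels             | 20160229 | referral | Belgium   |        1 |                  0 |
| State of Parana      | 20160229 | referral | Brazil    |        1 |                  0 |
| Baden-Wurttemberg    | 20160229 | organic  | Germany   |        1 |                  0 |
| Baden-Wurttemberg    | 20160229 | referral | Germany   |        1 |                  0 |
| Rhineland-Palatinate | 20160229 | referral | Germany   |        1 |                  0 |
| (not set)            | 20160229 | (none)   | Hong Kong |        5 |                  0 |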

Page 32: Google Analytics and AdWords optimisation with GNU R

Tip: Query Explorer

Page 33: Google Analytics and AdWords optimisation with GNU R

Tip2: Dimensions & Metrics Explorer

Page 34: Google Analytics and AdWords optimisation with GNU R

Tip3: Avoiding sampling

> ga.data <- GetReportData(ga.query, token)
Status of Query:
The API returned 1393 results
The query response contains sampled data. It is based on XX.XX % of your visits. You can split the query day-wise in order to reduce the effect of sampling.

Set split_daywise = T in the GetReportData function
Note that split_daywise = T will automatically ....

• “Sampling occurs automatically when more than 500,000 sessions (25M for Premium) are collected for a report, allowing Google Analytics to generate reports more quickly for those large data sets.”
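
The idea behind splitting a query day-wise is that each single-day request covers far fewer sessions and is therefore less likely to cross the sampling threshold. The sketch below only illustrates that splitting logic in plain R; `fetch_one_day` is a hypothetical placeholder that returns an empty row instead of calling the API, and the date range is arbitrary.

```r
# One (start, end) pair per day of the reporting period.
days <- seq(as.Date("2016-02-01"), as.Date("2016-02-07"), by = "day")
daily.ranges <- data.frame(start.date = format(days),
                           end.date   = format(days),
                           stringsAsFactors = FALSE)

# Hypothetical placeholder for a per-day API request; it returns a dummy row
# so that the sketch runs without credentials.
fetch_one_day <- function(start.date, end.date) {
  data.frame(date = start.date, sessions = NA_integer_, stringsAsFactors = FALSE)
}

# Run one request per day and stack the results into a single data frame.
parts <- Map(fetch_one_day, daily.ranges$start.date, daily.ranges$end.date)
ga.daily <- do.call(rbind, unname(parts))
print(nrow(ga.daily))
```

With RGoogleAnalytics itself, the console message above suggests passing split_daywise = T to GetReportData rather than looping by hand.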

Page 35: Google Analytics and AdWords optimisation with GNU R

Data Integration

• Wide Format: one row for each region and time
• Long Format: one line per region / time / dimension (EAV)

require(reshape2);

## Loading required package: reshape2

# one row per region and date; each medium becomes its own set of columns
w <- reshape(ga.data, timevar = 'medium',
             idvar = c('region', 'date'), direction = 'wide');
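
To see what the long-to-wide step produces, here is a small self-contained sketch with invented numbers. It uses `reshape()` from base R with the same arguments as above; the resulting column names follow the pattern `metric.medium`.

```r
# Hypothetical long-format data: one row per region, date and medium.
long <- data.frame(
  region             = rep(c("Hordaland", "Sor-Trondelag"), each = 2),
  date               = "20160229",
  medium             = rep(c("cpc", "organic"), times = 2),
  sessions           = c(10, 40, 12, 35),
  transactionRevenue = c(5, 20, 6, 18),
  stringsAsFactors   = FALSE
)

# Wide format: one row per region and date, one column per metric and medium.
wide <- reshape(long, timevar = "medium",
                idvar = c("region", "date"), direction = "wide")

print(names(wide))
print(wide)
```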

Page 36: Google Analytics and AdWords optimisation with GNU R

Data Integration: Almost finished

• Merge: Who is in which group?

ds <- merge(w, norway[, c('Name', 'isTreatment')],
            by.x = 'region', by.y = 'Name', all.x = TRUE)

• Data set is ready!
  • Comfortable DSL for data manipulation
  • Use packages to minimize code

Page 37: Google Analytics and AdWords optimisation with GNU R

Case Study: Wanderlust

Page 38: Google Analytics and AdWords optimisation with GNU R

Case Study: Wanderlust

• an app “developed” for this presentation
• mysterious weekend getaway and short holidays booking engine
• supports inventory management of hotels and airlines
• seasonal demand fluctuations

Page 39: Google Analytics and AdWords optimisation with GNU R

Evaluation

• Simulated data for illustration: 3 summer months
• 1st August: experiment starts in 10 random provinces (fylke) - AdWords stopped
• 1st August: start of school, search volume falls everywhere by 50%
• Scenario: 100% of visitors click the organic result when the AdWord is not shown
• Randomization has decided:
  • Sor-Trondelag (Trondheim): In the treatment group - from 1st August no AdWords
  • Hordaland (Bergen): In the control group - AdWords continue

Page 40: Google Analytics and AdWords optimisation with GNU R

Revenues in Sor-Trondelag (treatment)

[Figure: transactionRevenue.total (y-axis, ticks from 60 to 120) by date (x-axis, Jun 01 to Sep 01)]

Page 41: Google Analytics and AdWords optimisation with GNU R

Revenues in Hordaland (control)

[Figure: transactionRevenue.total (y-axis, ticks from 60 to 120) by date (x-axis, Jun 01 to Sep 01)]

Page 42: Google Analytics and AdWords optimisation with GNU R

Revenues in both Fylke

[Figure: transactionRevenue.total (y-axis, ticks from 60 to 120) by date (x-axis, Jun 01 to Sep 01); legend "region": Hordaland, Sor-Trondelag]

Page 43: Google Analytics and AdWords optimisation with GNU R

ROI Calculation - standard regression

require(stargazer);
out <- lm(transactionRevenue.total ~ isTreatment.cpc,
          data = sd.w)
stargazer(out, header=FALSE, type='latex')

Table 2

|                     | Dependent variable: transactionRevenue.total |
|:--------------------|:---------------------------------------------|
| isTreatment.cpc     | −48.358∗∗∗ (1.350)                           |
| Constant            | 111.350∗∗∗ (0.569)                           |
| Observations        | 1,748                                        |
| R2                  | 0.424                                        |
| Adjusted R2         | 0.423                                        |
| Residual Std. Error | 21.560 (df = 1746)                           |
| F Statistic         | 1,282.996∗∗∗ (df = 1; 1746)                  |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Page 44: Google Analytics and AdWords optimisation with GNU R

ROI Calculation - standard regression

• Standard OLS regression with a binary variable == comparing means
• But not the right ones. In this case:

Revenues = β0 + β1 ∗ treatment

• The treatment variable takes value 1 for the treatment group after the AdWords were stopped in Sor-Trondelag, otherwise 0

• As a result β1 represents the difference between the average revenues in Sor-Trondelag in August and the average revenues in Hordaland and Sor-Trondelag in June and July

• That’s clearly not what we are looking for!!
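
The claim that OLS on a single binary variable is the same as comparing two means can be checked directly. The sketch below uses simulated data (the coefficients 110 and -48 are arbitrary choices, not the figures from the talk) and compares the regression slope with the raw difference in group means.

```r
set.seed(42)

# Simulated (not real) data: a 0/1 indicator and a revenue-like outcome.
d <- data.frame(treatment = rep(c(0, 1), times = c(150, 50)))
d$revenue <- 110 - 48 * d$treatment + rnorm(nrow(d), sd = 5)

# Slope of the regression on the binary variable ...
fit <- lm(revenue ~ treatment, data = d)
beta1 <- unname(coef(fit)["treatment"])

# ... versus the plain difference between the two group means.
mean.diff <- mean(d$revenue[d$treatment == 1]) - mean(d$revenue[d$treatment == 0])

print(c(beta1 = beta1, mean.diff = mean.diff))
print(all.equal(beta1, mean.diff))
```

The two numbers agree up to floating-point error, so the regression is only as meaningful as the two groups being compared.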

Page 45: Google Analytics and AdWords optimisation with GNU R

Difference in Differences

Page 46: Google Analytics and AdWords optimisation with GNU R

ROI Calculation - Differences in Differences

require(plm)
out <- plm(transactionRevenue.total ~ isTreatment.cpc,
           data=sd.w, index=c("region", "date"), model="between")
stargazer(out, header=FALSE, type='latex')

Table 3

|                 | Dependent variable: transactionRevenue.total |
|:----------------|:---------------------------------------------|
| isTreatment.cpc | 0.189 (0.254)                                |
| Constant        | 102.741∗∗∗ (0.062)                           |
| Observations    | 19                                           |
| R2              | 0.032                                        |
| Adjusted R2     | 0.028                                        |
| F Statistic     | 0.556 (df = 1; 17)                           |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Page 47: Google Analytics and AdWords optimisation with GNU R

ROI Calculation - Difference in Differences

• The Difference in Differences estimator, using a fixed effects model with binary variables, allows us to calculate the true effect of the treatment

• Econometrically we estimate this equation:

Revenues = β0 + β1 ∗ treatment + β2 ∗ before + γ ∗ fylke

• fylke is a matrix of binary variables, one for each district
• before is a binary variable that takes value 0 in the period in which AdWords were running in all districts and value 1 in the period in which the experiment was running in some regions

• treatment takes value 1 for the treatment group in the period in which the experiment was running, i.e. after the AdWords were stopped in Sor-Trondelag, otherwise 0

• The estimation result reveals the true impact of AdWords on revenues in this data set
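
As a rough illustration of the equation above, the sketch below simulates a panel in the spirit of the scenario (19 regions, 10 of them treated from 1st August, revenue falling by 50% everywhere, and a true AdWords effect of zero) and estimates the equation with `lm()` and region dummies. All numbers, the year, and the region labels are invented; this is not the data set or the exact estimation code behind the tables in the talk.

```r
set.seed(7)

# Hypothetical panel: 19 regions, daily data for three summer months.
regions <- paste0("fylke", 1:19)
treated.regions <- regions[1:10]
dates <- seq(as.Date("2015-06-01"), as.Date("2015-08-31"), by = "day")
panel <- expand.grid(region = regions, date = dates, stringsAsFactors = FALSE)

# `before` is named as in the equation above: it is 1 once the experiment runs.
panel$before    <- as.numeric(panel$date >= as.Date("2015-08-01"))
panel$treatment <- panel$before * as.numeric(panel$region %in% treated.regions)

# Revenue halves everywhere in August; the assumed true treatment effect is 0.
true.effect <- 0
panel$revenue <- 100 * (1 - 0.5 * panel$before) +
  true.effect * panel$treatment + rnorm(nrow(panel), sd = 5)

# Naive regression versus the difference-in-differences specification.
naive <- lm(revenue ~ treatment, data = panel)
did   <- lm(revenue ~ treatment + before + factor(region), data = panel)

print(coef(naive)["treatment"])
print(coef(did)["treatment"])
```

In this simulation the naive coefficient picks up the seasonal drop and is strongly negative, while the difference-in-differences coefficient stays close to the assumed true effect of zero.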

Page 48: Google Analytics and AdWords optimisation with GNU R

Discussion

• The missing counterfactual - we do not know what else could be happening - help: Experiment

• Challenge: Big Data without Big Code - Google Analytics & GNU R - Very rich toolbox

• Result: Differences in Differences can work - note assumptions

Page 49: Google Analytics and AdWords optimisation with GNU R

Table of Contents

Intro

Brand Keywords
  The eBay Study
  Calculating the true ROI

Brand keywords with R
  Configuring Experiment
  Using Google Analytics API

AdWords experiment: an example
  Regression Results