Marketing Analytics with R
Disclaimer: All logos, photos, etc. used in this presentation are the property of their respective copyright owners and are used here for educational purposes only.
© Stephan Sorger 2013. www.StephanSorger.com; Marketing Analytics: Analytics with R; R.1
Statistical Analysis Software: Introduction
Topic Definition
Definition: Software designed for in-depth analysis, unlike MS Excel (a general-purpose spreadsheet)
Origins: SAS conceived in 1966 by Anthony J. Barr; placed statistical procedures in a formatted file framework
Uses: Advanced statistical techniques; nonlinear functions; multiple regression; conjoint analysis
Advantages: Powerful; accurate; specific tools
Disadvantages: Command line interface; steep learning curve; very expensive
Statistical Analysis Software: Supplier Companies
Topic Definition
Statistical Software:
  SAS: Market leader, especially in Fortune 500
  SPSS: Strong in education market (IBM)
  R: Open source
  Others: StatPac, StatSoft STATISTICA, etc.
Business Intelligence:
  Overall market size: $13.8B (2013); $17.1B (2016, forecast)
  IBM Cognos (2011: 12.1% of market)
  Microsoft BI (2011: 8.7% of market)
  Oracle Hyperion (2011: 15.6% of market)
  SAP Business Objects (2011: 23.6% of market)
  SAS Business Intelligence (2011: 12.6% of market)
  SPSS Modeler (2011: 0.4% of market)
  Others: GoodData, Panorama, Tableau, etc.
Gartner Press Release, “Gartner Says Worldwide Business Intelligence Software Revenue to Grow 7% in 2013.” February 19, 2013. http://www.gartner.com/newsroom/id/2340216
SAS Press Release, “SAS in Leaders Quadrant for Business Intelligence Platforms.” February 3, 2010. http://www.sas.com/news/preleases/biplatformsgartnerleader.html
Statistical Analysis Software: Supplier Companies
Kalakota, Ravi. PracticalAnalytics.Wordpress.com, “Gartner Says - BI and Analytics a $12.2B Market.” April 24, 2011. http://practicalanalytics.wordpress.com/2011/04/24/gartner-says-bi-and-analytics-a-10-5-bln-market/
[Figure: Gartner Magic Quadrant, Business Intelligence, April 2011 (excerpts). Axes: Completeness of Vision (horizontal) and Ability to Execute (vertical). Quadrants: Challengers, Leaders, Niche, Visionaries. Vendors shown: Tableau, Microsoft, Oracle, IBM, SAS, SAP, Actuate, Panorama, GoodData, Jaspersoft.]
Statistical Analysis Software: Supplier Companies
Kalakota, Ravi. PracticalAnalytics.Wordpress.com, “Gartner Says - BI and Analytics a $12.2B Market.” April 24, 2011. http://practicalanalytics.wordpress.com/2011/04/24/gartner-says-bi-and-analytics-a-10-5-bln-market/
[Figure: Business Intelligence Technologies, TDWI Model. Axes: Complexity (low to high) and Business Value (low to high). Stages, in order of increasing complexity and business value:]
Reporting (What happened?): Query, reporting, and search tools
Analysis (Why did it happen?): OLAP, visualization tools
Monitoring (What’s happening now?): Dashboards, scorecards
Prediction (What might happen?): Predictive analytics
Statistical Analysis Software: Major Suppliers
Criteria    SAS             SPSS               R
Market      Fortune 500     Universities       Universities
Focus       Power           Ease of use        Price
User        Power user      Student            Price-sensitive
Origins     Industry        Education          Open Source
Learning    Difficult       Moderate           Moderate
Cost        $86,600/yr+     $16,000/yr+        Free
UI          Command Line    Point & Click      Command Line
Database    32,768 var.     1 file at a time
Graphics    SAS/Graph       High quality       Different packages
Analogy     Microsoft       Apple              Linux
UCLA, Statistical Software Packages Comparison, ats.ucla.edu: http://www.ats.ucla.edu/stat/mult_pkg/compare_packages.htm
MineQuest Business Analytics, “Cost of Licensing WPS 3.0 vs. SAS 9.3.” February 2013. http://www.minequest.com/downloads/Pricing_Comparisons_Between_WPS_and_SAS.pdf
IBM SPSS Statistics website, “Buy IBM SPSS Statistics Now.” http://www-01.ibm.com/software/analytics/spss/products/statistics/buy-now.html
R: Introduction
Topic Description
Description: Free statistical computing and graphics software package
  Widely used among statisticians and data miners
  Increased popularity from 2010 on
History: Started in 1993
  Implementation of the S programming language (1976)
  S offered an interactive alternative to Fortran programs
  S developed by John M. Chambers (later of Stanford University) and colleagues at Bell Laboratories
  R developed by Ross Ihaka and Robert Gentleman
  “R” from Ross & Robert, as well as a play on “S”
Commercial: Revolution Analytics offers an enterprise version ($)
References:
1. Venables, W.N., Smith, D.M., “An Introduction to R.” Version 3.0.1. May 16, 2013. http://www.cran.r-project.org/doc/manuals/R-intro.pdf
R: Introduction
Topic Description
Features: Variety of statistical and graphical techniques
  Distributed under the GNU GPL (General Public License)
  GNU: “GNU’s Not Unix”; a recursive acronym
Advantages: Free
  Powerful
  Extensible through functions and extensions
  R community noted for its active contributions
  Different graphical user interfaces (GUIs) available
Disadvantages: Can be slow and memory-hungry
  Uses a command line interpreter; no native GUI
R: Basics
Topic Description
Commands: Case sensitive, like most UNIX-based packages
  Commands separated by “;” or by a newline
  Compound expressions grouped in braces: { and }
  Comments begin with a hash mark: # Comment
Data Structure: Vector assignment:
  > x <- c(1, 2, 3, 4, 5.8)
  > : Prompt at beginning of line
  <- : Assignment operator
  c() : Function that combines its arguments into a vector
Class: “numeric”; “logical”; “character”; “list”
Reading Data: read.table() function
  > HousePrice <- read.table("houses.data")
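The basics above can be tried directly in the console. The following is a minimal sketch; the names n, total, and avg are illustrative and do not come from the slides.

```r
# Vector assignment with the c() (combine) function
x <- c(1, 2, 3, 4, 5.8)

# class() reports the class of an object
class(x)             # "numeric"
class(x > 2)         # "logical"
class("house")       # "character"
class(list(x, "a"))  # "list"

# Two commands on one line, separated by ";"
n <- length(x); total <- sum(x)

# A compound expression in braces evaluates to its last value
avg <- {
  s <- sum(x)
  s / length(x)
}
avg                  # 3.16
```

R is case sensitive, so Avg and avg would be two different objects.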
R: Basics
Topic Description
Function: R features a rich set of functions
  Statistics functions: mean(x); median(x); range(x); etc.
  Arithmetic operators and functions: 4^2; log(10); sqrt(16)
Plots:
  > hist(x)  # generates a default histogram
  > plot(x, y)  # generates a quick x-y plot
  > quartz(height=4, width=10)  # opens a wide plot window (macOS only)
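A short sketch of these functions in use. The values in x and y are made up for illustration, and because quartz() exists only on macOS, this sketch swaps in the portable pdf() device and draws to a temporary file instead of a screen window.

```r
x <- c(2, 4, 6, 8)
y <- c(1, 3, 2, 5)   # made-up values for the x-y plot

# Statistics functions
mean(x)     # 5
median(x)   # 5
range(x)    # 2 8

# Arithmetic
4^2         # 16
log(10)     # natural logarithm, about 2.302585
sqrt(16)    # 4

# Draw both plots side by side in a wide (10 x 4 inch) PDF file
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 10, height = 4)
par(mfrow = c(1, 2))
hist(x)      # default histogram
plot(x, y)   # quick x-y plot
invisible(dev.off())
```

In an interactive session, hist(x) and plot(x, y) can be called on their own and R opens a plot window automatically.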
R: Getting Started
Topic Description
Download R: Windows: http://cran.r-project.org/bin/windows/base/
  Mac: http://cran.r-project.org/bin/macosx/
Launch R: Double-click to launch; the “R Console” window shows the prompt: >
New Script: Select File > New Script; the Editor will open
  Arrange the Editor window on the left and the Console on the right
[Screenshot: empty R Editor window (“Untitled - R Editor”) on the left; R Console on the right showing the “>” prompt]
R: Getting Started
Topic Description
Enter Vector: <- is the assignment operator: it sets the name on the left equal to the value on the right (<- looks like an arrow)
  Example: vector<-c(2, 4, 6, 8)
Run Line: Execute (run) the line
  Highlight the line in the R Editor; click the “Run Line” icon (3rd from left)
  The console echoes the “vector” assignment
RGui icons (left to right): Open Script; Save Script; Run Line; Return focus to Console
R Editor (Untitled):
  vector<-c(2,4,6,8)
R Console:
  > vector<-c(2,4,6,8)
R: Getting Started
Topic Description
Statistics: Find statistics
  mean(vector) <RUN LINE> (mean)
  var(vector) <RUN LINE> (variance)
  sd(vector) <RUN LINE> (standard deviation)
R Editor (Untitled):
  vector<-c(2,4,6,8)
  mean(vector)
  var(vector)
  sd(vector)
R Console:
  > vector<-c(2,4,6,8)
  > mean(vector)
  [1] 5
  > var(vector)
  [1] 6.666667
  > sd(vector)
  [1] 2.581989
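To see where these numbers come from, the sample variance and standard deviation can be recomputed by hand. This is a sketch for checking the console output; the helper names deviations, manual_var, and manual_sd are illustrative.

```r
vector <- c(2, 4, 6, 8)
n <- length(vector)

# Deviations from the mean: -3 -1 1 3
deviations <- vector - mean(vector)

# var() uses the sample formula: sum of squared deviations / (n - 1)
manual_var <- sum(deviations^2) / (n - 1)   # 20 / 3
manual_sd  <- sqrt(manual_var)

# Both agree with the built-in functions
all.equal(manual_var, var(vector))   # TRUE
all.equal(manual_sd, sd(vector))     # TRUE
```

Note the divisor n - 1 rather than n: var() and sd() return sample statistics, not population statistics.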
R: Getting Started
Topic Description
Directory: Load a data file into R; data is typically entered as CSV
CSV File: Comma-Separated Values; use “Save As” CSV in Excel
Example: Datafile.csv
  A, B, C, D (identifiers)
  1, 2, 3, 4 (data for observation #1)
  2, 4, 6, 8 (data for observation #2)
  …
Load Data: Drag the CSV file and drop it into the R Console
  R will show the filepath: "C:\\My Documents\\R Files\\..."
  Type the read command with that filepath into the R Editor, then Run:
  Example<-read.csv("C:\\My Documents…", header=TRUE)
R Editor (Untitled):
  Datafile<-read.csv("C:\\My ..",header=TRUE)
R Console (after drag and drop):
  > load("C:\\My Documents\\...")
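The CSV round trip can be sketched without depending on any particular folder. The example below writes the two sample observations shown above to a temporary file (csv_path is a stand-in for your own filepath, not a path from the slides) and reads them back with read.csv.

```r
# Create a small CSV file with a header row and two observations
csv_path <- tempfile(fileext = ".csv")
writeLines(c("A,B,C,D",
             "1,2,3,4",
             "2,4,6,8"), csv_path)

# header = TRUE tells read.csv that the first row holds the column names
Datafile <- read.csv(csv_path, header = TRUE)

Datafile      # prints the data frame
Datafile$B    # one column, selected by name: 2 4
```

On Windows, backslashes inside a quoted filepath must be doubled (as in "C:\\My Documents\\..."), or forward slashes can be used instead.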
R: Getting Started
Topic Description
Directory: Alternative approach: set up a working directory for the dataset
  A working directory allows shorter filepaths
  Windows: See “Windows Explorer help” for more info
  Mac: See “Finder help” for more info
Data: Commands for structured datasets: str; summary; fix
str(): Structure; shows the structure of Datafile, e.g. “data.frame: 4 obs. of 4 variables”
summary(): Summary; shows summary statistics: Min; Max; Mean; Median
fix(): Fix; shows the data in a spreadsheet-like grid so entries can be changed (fixed)
R Editor (Untitled):
  str(Datafile)
  summary(Datafile)
  fix(Datafile)
R Console:
  > (shows structure of datafile)
  > (shows summary of datafile)
  > (allows fixing of datafile)
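A sketch of the working-directory approach, using R's getwd() and setwd() functions. The temporary folder data_dir stands in for your own data folder, and the small data frame is made up for illustration. fix() is left out because it opens an interactive editor window.

```r
old_dir  <- getwd()      # remember the current working directory
data_dir <- tempdir()    # stand-in for your own data folder

# Put a small example file in the data folder
write.csv(data.frame(A = c(1, 2), B = c(2, 4)),
          file.path(data_dir, "Datafile.csv"), row.names = FALSE)

# With the working directory set, the bare filename is enough
setwd(data_dir)
Datafile <- read.csv("Datafile.csv", header = TRUE)
setwd(old_dir)           # restore the previous working directory

str(Datafile)       # structure: number of observations and variables
summary(Datafile)   # Min, quartiles, Median, Mean, Max per column
```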
R: Getting Started
Topic Description
Help: Get help with the “read.csv” command:
  ?(read.csv)
  help(read.csv)
Help Results: help(read.csv) shows the defaults:
  read.csv(file, header=TRUE, sep=",", quote="\"", dec=".", fill=TRUE, comment.char="", ...)
  Followed by explanations of the command and its arguments
R Editor (Untitled):
  help(read.csv)
R Console:
  <Opens new window with help>
R: Getting Started
Topic Description
Packages: Install and load packages when the functions you need are missing
Load: Select “Packages” > “Load Package” from the RGui top menu
  Select CRAN mirror: USA (CA 1), UK (London), Vietnam, etc.
Install: Select “Packages” > “Install Package”
  Select the package from the scrolling list
  Note: lm() [regression analysis] belongs to the stats package that ships with R, so it needs no installation
[Screenshot: RGui menu bar (File, Edit, Packages, Windows, Help) above the R Editor and R Console]
R Console:
  > chooseCRANmirror()
  > utils:::menuInstallPkgs()
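The menu steps also have command equivalents: install.packages() downloads a package from a CRAN mirror and library() loads it. The sketch below wraps both in a helper; load_or_install is a hypothetical name, not a function that ships with R, and the repos URL is one possible mirror choice.

```r
# Hypothetical helper: load a package, optionally installing it first
load_or_install <- function(pkg, install_missing = FALSE) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    if (!install_missing) {
      stop("Package '", pkg, "' is not installed")
    }
    # Needs an internet connection; the mirror URL is an assumption
    install.packages(pkg, repos = "https://cloud.r-project.org")
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# The stats package ships with R, so nothing is downloaded here
load_or_install("stats")
exists("lm")   # TRUE: lm() is available
```

Installing is needed once per machine; loading with library() is needed once per R session.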
R: Regression
Topic Description
Data: Created dataset “RealData” of real estate values
  Data captures Price, House Size, and Lot Size for 20 houses
  Convert data to CSV format; Excel: “Save As” CSV
Load: Drag and drop the data file into the R Console
  R Console: Copy the filepath
  R Editor: Paste the filepath; add the read.csv command
  R Editor: Run Line
Structure: Check the structure of the dataset
  str(RealData)
  'data.frame': 20 obs. of 3 variables:
  Price: num 6 5.8 5.6 …; House: num 6.9 8 …; Lot: num 42.7 …
R Editor (Untitled):
  RealData<-read.csv("C:\\My ..",header=TRUE)
R Console (after drag and drop):
  > load("C:\\My Documents\\...")
R: Regression
Topic Description
Dependent: Price is the dependent variable
Explanatory: Price is a function of the explanatory variables House and Lot
Equation: Price = c1 + c2*(House Size) + c3*(Lot Size)
lm: Regression analysis in R; stands for Linear Model
Syntax: lm(Dependent~Independent+Independent, Dataset)
Equation: lm(Price~House+Lot,RealData)
  Type into R Editor; Run Line
  See results in R Console
R Editor (Untitled):
  lm(Price~House+Lot,RealData)
R Console:
  > lm(Price~House+Lot,RealData)
  (Intercept)        House          Lot
     -0.55415      0.64680      0.02763
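Since the RealData file is not reproduced here, the sketch below runs the same lm() call on a made-up data set. ToyData and its values are assumptions for illustration only: the prices are constructed to follow Price = 1 + 0.5*House + 0.02*Lot exactly, so the regression should recover those three coefficients.

```r
# Made-up data, NOT the RealData set from the slides
ToyData <- data.frame(
  House = c(5, 6, 7, 8, 9, 10),
  Lot   = c(40, 35, 50, 45, 60, 55)
)
ToyData$Price <- 1 + 0.5 * ToyData$House + 0.02 * ToyData$Lot

# Same syntax as above: Dependent ~ Independent + Independent, Dataset
fit <- lm(Price ~ House + Lot, data = ToyData)

# c1 (Intercept), c2 (House), c3 (Lot): approximately 1, 0.5, 0.02
coef(fit)

# Predicted price for a new house: 1 + 0.5*7.5 + 0.02*48 = 5.71
predict(fit, newdata = data.frame(House = 7.5, Lot = 48))
```

The tilde (~) separates the dependent variable from the explanatory variables; writing + in its place is a syntax mistake that lm() rejects.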
R: Regression
Topic Description
Results: Compare results from R with those from Excel
  Method    Intercept    House Size    Lot Size
  Excel     -0.554       +0.646        +0.027
  R         -0.55415     +0.64680      +0.02763
Interpretations: R results are the same as those from Excel
  House size is an important factor when assessing price
  Lot size is not as important
Statistics: Option of calculating regression statistics
  RealReg<-lm(Price~House+Lot,RealData)
  summary(RealReg)
  Gives significance codes, R-squared, F-statistic
R Editor (Untitled):
  lm(Price~House+Lot,RealData)
R Console:
  > lm(Price~House+Lot,RealData)
  (Intercept)        House          Lot
     -0.55415      0.64680      0.02763
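The regression statistics from summary() can also be pulled out as individual values. The sketch below again uses made-up data (ToyData, with a small fixed noise term so the fit is not perfect); it is an illustration of the summary() components, not a reproduction of the RealData results.

```r
# Made-up data with a little fixed noise, NOT the RealData set
ToyData <- data.frame(
  House = c(5, 6, 7, 8, 9, 10),
  Lot   = c(40, 35, 50, 45, 60, 55)
)
noise <- c(0.1, -0.1, 0.05, -0.05, 0.1, -0.1)
ToyData$Price <- 1 + 0.5 * ToyData$House + 0.02 * ToyData$Lot + noise

ToyReg <- lm(Price ~ House + Lot, data = ToyData)
s <- summary(ToyReg)

s$coefficients   # estimates, standard errors, t values, p-values
s$r.squared      # R-squared
s$fstatistic     # F-statistic with its degrees of freedom
```

Printing s itself shows the full report, including the significance codes (the stars next to each coefficient).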