TRANSCRIPT
Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Hortonworks & SAS Analytics everywhere.
A change in focus.
Allow organizations to shift interactions from reactive (post-transaction) to proactive (pre-decision):
A shift in Advertising: from mass branding to 1x1 targeting
A shift in Financial Services: from educated investing to automated algorithms
A shift in Healthcare: from mass treatment to designer medicine
A shift in Retail: from static branding to real-time personalization
A shift in Telco: from break then fix to repair before break
We estimate that within 3 years 50% of the world's data will reside on Hadoop…
Data is doubling in size every 2 – 3 years…. Traditional or not?
Applications: Business Analytics, Custom Applications, Packaged Applications
Data System: Repositories (RDBMS, EDW, MPP)
Sources: Existing Sources (CRM, ERP, Clickstream, Logs)
Source: IDC
2.8 ZB in 2012
85% from New Data Types
15x Machine Data by 2020
40 ZB by 2020
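As a rough consistency check on the IDC figures above, the growth from 2.8 ZB in 2012 to 40 ZB by 2020 can be converted into an implied doubling time. The sketch below assumes smooth exponential growth between the two data points, which is an assumption of this example and not something the slide states. Under that assumption the result comes out at roughly two years, in line with the "doubling every 2 – 3 years" claim.

```python
import math

# Figures quoted on the slide (source: IDC).
start_zb, start_year = 2.8, 2012
end_zb, end_year = 40.0, 2020

# Assumption: smooth exponential growth between the two data points.
growth_factor = end_zb / start_zb        # total growth over the period
doublings = math.log2(growth_factor)     # how many doublings that growth needs
doubling_time_years = (end_year - start_year) / doublings

print(f"growth factor: {growth_factor:.1f}x")
print(f"implied doubling time: {doubling_time_years:.2f} years")
```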
Hadoop stores and processes the data your customers currently do not or cannot…. 1: Cost profile. 2: Data Structure.
OLTP, ERP, CRM Systems
Unstructured documents, emails
Clickstream
Server logs
Sentiment, Web Data
Sensor, Machine Data
Geolocation
Hadoop enables scalable compute & storage with a compelling cost profile….
[Chart: Fully-loaded Cost Per Raw TB of Data (Min–Max Cost), on a scale from $0 to $180,000, comparing MPP, SAN, Engineered System, NAS, Hadoop and Cloud Storage]
Hadoop enables scalable compute & storage for all data structures….
Current Reality: Apply schema on write
Dependent on IT
Repeatable Process: SQL
• Determine list of questions
• Design solutions
• Collect structured data
• Ask questions from list
• Detect additional questions
Augment w/ Hadoop: Apply schema on read
Support range of access patterns to data stored in HDFS: polymorphic access
Hadoop: Iterate over structure, Transform and Analyze
Right Engine, Right Job: Batch, Interactive, Real-time, In-memory
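The difference between schema on write and schema on read can be sketched in a few lines. In this minimal illustration, raw records are stored untouched and a schema is applied only when a question is asked. The record layout, field names, and the `read_with_schema` helper are all hypothetical, and plain Python lists stand in for files in HDFS.

```python
import json

# Raw events are stored exactly as they arrive; no schema is enforced on write.
raw_events = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/cart", "ms": 340, "coupon": "SPRING"}',
    '{"user": "u1", "page": "/cart"}',
]

def read_with_schema(lines, schema):
    """Apply a schema at read time: pick the named fields, coerce their types,
    and fill fields that are absent from a record with None."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }

# Two different questions, two different schemas over the same stored data.
latency_view = list(read_with_schema(raw_events, {"page": str, "ms": int}))
coupon_view = list(read_with_schema(raw_events, {"user": str, "coupon": str}))

print(latency_view)
print(coupon_view)
```

Because the schema lives with the query and not with the storage layer, a new question only needs a new schema dictionary; the stored data does not have to be reloaded or remodelled.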
The Net Result: A modern data architecture capable of storing, processing, correlating, analysing, matching, aggregating, searching and exposing….
….all data & insights….
…….when integrated with the right tools capable of delivering the right results
Data System: Repositories (RDBMS, EDW, MPP); Governance & Integration; Security; Operations; Data Access; Data Management
Sources: OLTP, ERP, CRM; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geo-location Data
Applications: Enterprise Miner, Base SAS
The Modern Data Architecture is a Plus +1.
From Hadoop: SAS accesses and extracts data from Hadoop to a SAS server for processing, and writes results back.
With Hadoop: SAS accesses and processes Hadoop data on SAS servers while keeping the data and computations massively parallel.
In Hadoop: SAS processes data directly in the Hadoop cluster.
From… With… and In… Hadoop
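The practical difference between extracting data from the cluster and processing it in the cluster is how much data has to move. The toy sketch below contrasts the two ends of that spectrum with a simple mean. The partition contents and function names are hypothetical, nested lists stand in for cluster nodes, and no SAS or Hadoop API is involved. The intermediate "with" pattern is not modelled here.

```python
# Hypothetical cluster: each inner list stands for the rows held on one node.
partitions = [
    [12.0, 7.5, 3.0],
    [8.0, 1.5],
    [20.0, 4.0, 6.0, 2.0],
]

def mean_from_cluster(parts):
    """'From' pattern: pull every row to one server, then compute there.
    Returns the mean and the number of items that had to move."""
    rows = [value for part in parts for value in part]
    return sum(rows) / len(rows), len(rows)

def mean_in_cluster(parts):
    """'In' pattern: each node computes a partial (sum, count) locally,
    so only one small partial result per node has to move."""
    partials = [(sum(part), len(part)) for part in parts]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count, len(partials)

mean_a, moved_a = mean_from_cluster(partitions)
mean_b, moved_b = mean_in_cluster(partitions)

print(f"from: mean={mean_a:.3f}, items moved={moved_a}")
print(f"in:   mean={mean_b:.3f}, items moved={moved_b}")
```

Both approaches give the same answer; what changes is that the in-cluster version ships one partial result per node, while the extract version ships every row.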
SAS + from Hadoop
[Diagram: Enterprise Miner, Base SAS, Data Management; SAS/ACCESS to Hadoop; disk]
Access to Hadoop
• Uses existing SAS interfaces
• Standard LIBNAME syntax
• PROC HADOOP
• DATA step and PROC SQL translated to Hive
• Filename support
• Execute Pig scripts and MapReduce
• Push-down of certain procedures
• Custom SerDe
SAS + with Hadoop | SAS Rack architecture
[Diagram: SAS Rack; Enterprise Hadoop]
SAS + in Hadoop | in-memory analytics (and BI)
Visual Analytics, Visual Statistics, In-memory Statistics for Hadoop
[Diagram: Root Node; MPI; LASR (memory) ×4; SASHDAT (disk) ×4]
Turns Big Data Into Real-time Customer Insights
Challenge: Unable to analyze huge amounts of data to optimize and improve real-time customer insights
• Understand Audience: Having the largest volume of data sets, audience segments/profile in Canada while leading the Canadian marketplace in privacy and governance.
• Find Audience: Being leaders in identifying and targeting audiences across channels, platforms and devices.
• Engage Audience: Driving engagement across platforms and formats.
• Measure Audience: Exceeding client expectations with transparent reporting and accurate attribution models.
Solution: Rogers’ Media Audience Platform: Integration of all data collected across organizations
• Query all data in one location:
– Blend of online and offline data, subscription, ecommerce, loyalty programs, etc.
• Land massive click stream log files:
– 100+ M records / day
– 30 million unique IDs / month
• Use 100% of the data for Analysis and Visualization instead of smaller random samples (over sampling)
Telcos
• Rogers Media is a subsidiary of Rogers Communications, which owns Canada's largest publishing company.
• Has more than 70 consumer and business publications.
• Rogers Media Inc. also owns 54 radio stations, and several television properties including terrestrial television stations and cable television channels.
Resources
Customer Video: Rogers Media discusses SAS and Hadoop
Webinars: SAS and the Modern Data Architecture; SAS and Hortonworks use cases
Demos: SAS Visual Analytics, Ingest SAS to Hive
www.hortonworks.com/SAS
Thank you. Questions?