
Post on 08-Aug-2020


Page 1: Hortonworks & SAS

Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved

Hortonworks & SAS Analytics everywhere.

Page 2:

A change in focus.

…allow organizations to shift interactions from Reactive (Post Transaction) to Proactive (Pre Decision):

•  A shift in Advertising: from mass branding to 1x1 Targeting
•  A shift in Financial Services: from Educated Investing to Automated Algorithms
•  A shift in Healthcare: from mass treatment to Designer Medicine
•  A shift in Retail: from static branding to Real-time Personalization
•  A shift in Telco: from break then fix to repair before break

Page 3:

We estimate that within 3 years 50% of the world's data will reside on Hadoop….

Page 4:

Data is doubling in size every 2 – 3 years…. Traditional or not?

[Diagram: traditional data architecture]

•  Applications: Business Analytics, Custom Applications, Packaged Applications
•  Data System: Repositories (RDBMS, EDW, MPP)
•  Sources: Existing Sources (CRM, ERP, Clickstream, Logs)

•  2.8 ZB in 2012
•  85% from New Data Types
•  15x Machine Data by 2020
•  40 ZB by 2020

Source: IDC
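As a rough consistency check on the slide's figures, the sketch below works out the growth rate and doubling time implied by going from 2.8 ZB (2012) to 40 ZB (2020). It assumes constant compound growth between the two endpoints, which the slide does not state; the function names are illustrative only.

```python
import math

# Figures quoted on the slide (attributed there to IDC).
ZB_2012 = 2.8   # zettabytes in 2012
ZB_2020 = 40.0  # zettabytes projected for 2020
YEARS = 2020 - 2012


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoints."""
    return (end / start) ** (1 / years) - 1


def implied_doubling_time(start: float, end: float, years: int) -> float:
    """Years needed to double at the constant rate implied by the endpoints."""
    return years * math.log(2) / math.log(end / start)


growth = implied_annual_growth(ZB_2012, ZB_2020, YEARS)
doubling = implied_doubling_time(ZB_2012, ZB_2020, YEARS)
print(f"Implied annual growth: {growth:.1%}")
print(f"Implied doubling time: {doubling:.2f} years")
```

Under that assumption the implied doubling time comes out at roughly 2.1 years, which sits at the fast end of the "every 2 – 3 years" claim.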

Page 5:

Hadoop stores and processes the data your customers currently do not or cannot…. 1: Cost profile. 2: Data Structure.

•  OLTP, ERP, CRM Systems
•  Unstructured documents, emails
•  Clickstream
•  Server logs
•  Sentiment, Web Data
•  Sensor, Machine Data
•  Geolocation

Page 6:

Hadoop enables scalable compute & storage with a compelling cost profile….

[Chart: Fully-loaded Cost Per Raw TB of Data (Min–Max Cost), on an axis from $0 to $180,000, comparing MPP, SAN, Engineered System, NAS, Hadoop, and Cloud Storage]

Page 7:

Hadoop enables scalable compute & storage for all data structures….

       

Current Reality: Apply schema on write

•  Dependent on IT
•  Repeatable Process: SQL
•  Steps: Determine list of questions; Design solutions; Collect structured data; Ask questions from list; Detect additional questions

Augment w/ Hadoop: Apply schema on read

•  Support range of access patterns to data stored in HDFS: polymorphic access
•  Hadoop: Iterate over structure; Transform and Analyze
•  Right Engine, Right Job: Batch, Interactive, Real-time, In-memory
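The contrast between schema on write and schema on read can be illustrated with a small sketch. This is not Hadoop or SAS code: an in-memory list stands in for raw files in HDFS, and `read_with_schema`, the field names, and the sample events are all hypothetical. The point it shows is that data is stored as it arrives, and each question applies its own structure when reading.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator

# Raw events are stored exactly as they arrive; nothing is validated on write.
# (Hypothetical sample data; a list stands in for files in HDFS.)
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": "120"}',
    '{"user": "u2", "page": "/cart", "ms": "340", "referrer": "ad"}',
    'not json at all',
]

# A schema maps each wanted field to a function that casts its raw value.
Schema = Dict[str, Callable[[Any], Any]]


def read_with_schema(lines: Iterable[str], schema: Schema) -> Iterator[Dict[str, Any]]:
    """Apply a schema at read time: parse, project, and cast each record."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable lines are skipped at read time
        if not isinstance(record, dict):
            continue
        try:
            yield {field: cast(record[field]) for field, cast in schema.items()}
        except (KeyError, ValueError, TypeError):
            continue  # record does not fit this particular schema


# Two different questions, two different schemas, same stored data.
latency_schema: Schema = {"page": str, "ms": int}
referral_schema: Schema = {"user": str, "referrer": str}

latency_rows = list(read_with_schema(RAW_EVENTS, latency_schema))
referral_rows = list(read_with_schema(RAW_EVENTS, referral_schema))
print(latency_rows)
print(referral_rows)
```

With schema on write, the second question would have required the `referrer` column to be designed in before any data was collected; here it is answered from the same raw records after the fact.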

Page 8:

The Net Result: A modern data architecture capable of storing, processing, correlating, analysing, matching, aggregating, searching and exposing….

….all data & insights….

Page 9:

…….when integrated with the right tools capable of delivering the right results

Page 10:

The Modern Data Architecture is a Plus +1.

[Diagram: modern data architecture]

•  Applications: Enterprise Miner, Base SAS
•  Data System: Repositories (RDBMS, EDW, MPP); Governance & Integration; Data Access; Data Management; Security; Operations
•  Sources:
   –  OLTP, ERP, CRM Systems
   –  Unstructured documents, emails
   –  Web logs, clickstreams, server logs
   –  Social networks, sentiment, web data
   –  Machine-generated and sensor data
   –  Geolocation data

Page 11:

From… With… and In… Hadoop

•  From: SAS accesses and extracts data from Hadoop to a SAS server for processing, and writes results back.
•  With: SAS accesses and processes Hadoop data on SAS servers while keeping the data and computations massively parallel.
•  In: SAS processes data directly in the Hadoop cluster.

Page 12:

SAS + from Hadoop

[Diagram labels: Enterprise Miner; Base SAS; Data Management; SAS/ACCESS to Hadoop; disk]

Page 13:

Access to Hadoop

•  Uses existing SAS interfaces
•  Standard LIBNAME syntax
•  PROC HADOOP
•  DATA step and PROC SQL translated to Hive
•  Filename support
•  Execute Pig scripts and MapReduce
•  Push-down of certain procedures
•  Custom SerDe

Page 14:

SAS + with Hadoop | SAS Rack architecture

[Diagram labels: SAS Rack; Enterprise Hadoop]

Page 15:

SAS + in Hadoop | in-memory analytics (and BI)

[Diagram labels: Visual Analytics; Visual Statistics; In-memory Statistics for Hadoop; Root Node; MPI; LASR (x4, memory); SASHDAT (x4, disk)]

Page 16:

Turns Big Data Into Real-time Customer Insights

Challenge: Unable to analyze huge amounts of data to optimize and improve real-time customer insights
•  Understand Audience: Having the largest volume of data sets, audience segments/profile in Canada while leading the Canadian marketplace in privacy and governance.
•  Find Audience: Being leaders in identifying and targeting audiences across channels, platforms and devices.
•  Engage Audience: Driving engagement across platforms and formats.
•  Measure Audience: Exceeding client expectations with transparent reporting and accurate attribution models.

Solution: Rogers' Media Audience Platform: Integration of all data collected across organizations
•  Query all data in one location:
   –  Blend of online and offline data, subscription, ecommerce, loyalty programs, etc.
•  Land massive click stream log files:
   –  100+ M records / day
   –  30 million unique IDs / month
•  Use 100% of the data for Analysis and Visualization instead of smaller random samples (over sampling)

Telcos
•  Rogers Media is a subsidiary of Rogers Communications, which owns Canada's largest publishing company.
•  Has more than 70 consumer and business publications.
•  Rogers Media Inc. also owns 54 radio stations, and several television properties including terrestrial television stations and cable television channels.

Page 17:

Resources

Customer Video: Rogers Media discusses SAS and Hadoop

Webinars: SAS and the Modern Data Architecture; SAS and Hortonworks use cases

Demos: SAS Visual Analytics, Ingest SAS to Hive

www.hortonworks.com/SAS

Page 18:

Thank you. Questions?