Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit 2014)

David Smith, Chief Community Officer (@revodavid). Gartner BI Conference, April 2014.

Upload: revolution-analytics

Post on 27-Aug-2014

DESCRIPTION

Presented by David Smith, Chief Community Officer, Revolution Analytics, at the Gartner Business Intelligence and Analytics Summit, April 2014. In this presentation, I'll introduce the open source R language, the modern standard for Data Science, and the enhanced performance, scalability and ease-of-use capabilities of Revolution R Enterprise. Customer case studies will illustrate Revolution R Enterprise as a component of the real-time analytics deployment process, via integration with Hadoop, database warehousing systems and Cloud platforms, to implement data-driven end-user applications.

TRANSCRIPT

Page 1: Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit 2014)

Big Data Predictive Analytics with Revolution R Enterprise

David Smith, Chief Community Officer (@revodavid)

Gartner BI Conference, April 2014

Page 2

OUR COMPANY

The leading provider of advanced analytics software and services based on open source R, since 2007

OUR SOFTWARE

The only Big Data, Big Analytics software platform based on the data science language R

KUDOS

Visionary, Gartner Magic Quadrant for Advanced Analytics Platforms, 2014

Page 3

What is R?

Most widely used data analysis software
• Used by 2M+ data scientists, statisticians and analysts

Most powerful statistical programming language
• Flexible, extensible and comprehensive for productivity

Create beautiful and unique data visualizations
• As seen in New York Times, Twitter and Flowing Data

Thriving open-source community
• Leading edge of analytics research

Fills the talent gap
• New graduates prefer R

WHITE PAPER: R is Hot, bit.ly/r-is-hot

Page 4

Exploding growth and demand for R

• R is the highest paid IT skill
• R most-used data science language after SQL
• R is used by 70% of data miners
• R is #15 of all programming languages
• R growing faster than any other data science language
• R is the #1 Google Search for Advanced Analytics software
• R has more than 2 million users worldwide

R Usage Growth: Rexer Data Miner Survey, 2007-2013

• 70% of data miners report using R
• R is the first choice of more data miners than any other software

Source: www.rexeranalytics.com

Page 5

Technical Support for Open Source R: AdviseR™ from Revolution Analytics

Technical support for open source R, from the R experts.

• 24x7 email and phone support
• On-line case management and knowledgebase
• Access to technical resources, documentation and user forums
• Exclusive on-line webinars from community experts
• Guaranteed response times

Also available: expert hands-on and on-line training for R, from Revolution Analytics AcademyR.

www.revolutionanalytics.com/AdviseR
www.revolutionanalytics.com/AcademyR

Page 6

Revolution R Enterprise is the only big data big analytics platform based on open source R

• High Performance, Scalable Analytics
• Portable Across Enterprise Platforms
• Easier to Build & Deploy Analytics

Page 7

Enhancing Open Source R for the Enterprise

Big Data
• In-memory bound
• Hybrid memory & disk scalability
• Operates on bigger volumes & factors

Speed of Analysis
• Single threaded
• Parallel threading
• Shrinks analysis time

Enterprise Readiness
• Community support
• Commercial support
• Delivers full service production support

Analytic Breadth & Depth
• 5000+ innovative analytic packages
• Leverage open source packages plus Big Data ready packages
• Supercharges R

Commercial Viability
• Risk of deployment of open source
• GPL-compatible licensing
• Eliminate risk with open source

Page 8

Powering Next Generation Analytics: Parallel External Memory Algorithms

[Diagram; the only surviving label is "COMBINE INTERMEDIATE RESULTS"]

Page 9

Unique PEMAs: Parallel, external-memory algorithms

• High-performance, scalable replacements for R/SAS analytic functions
• Parallel/distributed processing eliminates CPU bottleneck
• Data streaming eliminates memory size limitations
• Works with in-memory and disk-based architectures

Eliminates Performance and Capacity Limits of Open Source R and Legacy SAS
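
The idea behind a parallel, external-memory algorithm can be sketched in base R. This is a minimal illustration, not Revolution R Enterprise code. It assumes the data arrives in chunks, computes a small intermediate result per chunk, and combines those results into a linear regression fit. The names chunk_stats and combine_stats, and the simulated data, are hypothetical.

```r
# Hypothetical sketch: least squares computed chunk by chunk.
# Only the small matrices X'X and X'y are kept between chunks.

# One pass over one chunk: the sufficient statistics for least squares.
chunk_stats <- function(chunk) {
  X <- cbind(Intercept = 1, x = chunk$x)
  list(xtx = crossprod(X), xty = crossprod(X, chunk$y))
}

# Combine the intermediate results of two chunks by addition.
combine_stats <- function(a, b) {
  list(xtx = a$xtx + b$xtx, xty = a$xty + b$xty)
}

# Simulated data split into 10 chunks. In a real external-memory setting
# each chunk would be read from disk in turn instead of being held in memory.
set.seed(1)
dat <- data.frame(x = rnorm(1000))
dat$y <- 2 + 3 * dat$x + rnorm(1000)
chunks <- split(dat, rep(1:10, each = 100))

# The per-chunk step is independent, so it could also run in parallel.
partials <- lapply(chunks, chunk_stats)
total <- Reduce(combine_stats, partials)

# Solve the normal equations using the combined statistics.
beta_chunked <- drop(solve(total$xtx, total$xty))

# Reference fit with all data in memory, for comparison.
beta_lm <- coef(lm(y ~ x, data = dat))
```

Because each chunk contributes only a 2 x 2 matrix and a 2 x 1 vector, memory use does not grow with the number of rows, and the chunked coefficients agree with the in-memory fit up to floating-point error.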

Page 10

Revolution R Enterprise is the Big Data Big Analytics Platform

All of Open Source R plus:

• Big Data scalability
• High-performance analytics
• Development and deployment tools
• Data source connectivity
• Application integration framework
• Multi-platform architecture
• Support, Training and Services

Page 11

DESIGNED FOR SCALE, PORTABILITY & PERFORMANCE

Components: R+CRAN, RevoR, DistributedR, ScaleR, ConnectR, DeployR, DevelopR

• In the Cloud: Amazon AWS
• Workstations & Servers: Windows, Red Hat and SUSE Linux
• Clustered Systems: IBM Platform LSF, Microsoft HPC
• EDW: IBM Netezza, Teradata
• Hadoop: Hortonworks, Cloudera

Write Once. Deploy Anywhere.

Page 12

Write Once, Deploy Anywhere

Set the desired compute context for code execution…..

Local System (default):

rxSetComputeContext("local") # DEFAULT

rxSetComputeContext(RxHadoopMR(<data, server environment arguments>))

rxSetComputeContext(RxHpcServer(<data, server environment arguments>))

rxSetComputeContext(RxLsfCluster(<data, server environment arguments>))

rxSetComputeContext(RxTeradata(<data, server environment arguments>))

Same code to be run anywhere …..

# Summarize and calculate descriptive statistics from the airDS data set
adsSummary = rxSummary(~ArrDelay+CRSDepTime+DayOfWeek, data = airDS)

# Fit Linear Regression Model
arrDelayLm1 = rxLinMod(ArrDelay ~ DayOfWeek, data = airDS); summary(arrDelayLm1)
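
The compute-context pattern on this slide can be imitated in base R to show the design idea: the analysis code stays the same, and a single setting decides how the work is executed. The sketch below is a toy under stated assumptions. It does not use or reproduce the RevoScaleR API. The names set_compute_context, chunked_mean and the "multicore" context are hypothetical.

```r
# Hypothetical sketch of "write once, deploy anywhere" in base R.
# None of these names are RevoScaleR functions.

# The current context is stored as the function used to apply work to chunks.
.ctx <- new.env()
.ctx$apply_chunks <- lapply  # default "local" context: sequential

set_compute_context <- function(name = c("local", "multicore")) {
  name <- match.arg(name)
  # Forking is unavailable on Windows, so fall back to one core there.
  n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
  .ctx$apply_chunks <- switch(
    name,
    local = lapply,
    multicore = function(X, FUN) parallel::mclapply(X, FUN, mc.cores = n_cores)
  )
  invisible(name)
}

# Analysis code: written once, unaware of where it runs.
chunked_mean <- function(chunks) {
  parts <- .ctx$apply_chunks(chunks, function(ch) {
    c(sum = sum(ch), n = length(ch))
  })
  totals <- Reduce(`+`, parts)
  unname(totals["sum"] / totals["n"])
}

chunks <- split(1:100, rep(1:4, each = 25))

set_compute_context("local")
mean_local <- chunked_mean(chunks)

set_compute_context("multicore")
mean_multicore <- chunked_mean(chunks)
```

Only the call to set_compute_context changes between the two runs. The call to chunked_mean is identical, which is the property the slide claims for rxSummary and rxLinMod.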

Page 13

In-Hadoop Big Data Big Analytics

• Eliminate data movement latency
• Speed model development
• Use commodity Hadoop nodes as analytics engine

[Diagram: Hadoop cluster with a Name Node and Job Tracker, and five Data Nodes each paired with a Task Tracker; layers labeled MapReduce and HDFS]

Page 14

Revolution Analytics coupled with the Teradata Unified Data Architecture accelerates big data analytics with the R language.

In-Database Analytics:

• Parallel R in-database for big data analytics on Teradata
• Build parallel R models completely in R
• Use Teradata appliance as analytics engine
• No need to move data

Teradata 14.10 + Revolution R Enterprise V7

Page 15

RRE7 in the Cloud

Revolution R Enterprise 7, on the industry-leading cloud platform

• Pay as you go, priced by cores x hours
– No long-term commitment required
• Launch Windows and Linux servers on demand
– Windows 2008 R2 with DevelopR
– RHEL 6 with RStudio Server Professional
– Server instances from 2 to 32 cores
– Analyze data sets up to 2 TB
• Convenient, consistent and reliable
– Available globally, accessible anywhere
– Forum-based support with registration
• Free 14-day trial available

CLOUD SERVERS: $0.70 PER CORE/HOUR, PLUS AWS INFRASTRUCTURE COSTS
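
The "cores x hours" pricing can be worked through with a short calculation. This is a minimal sketch using the $0.70 per core/hour rate quoted on the slide. The function name software_cost and the example workload are hypothetical, and AWS infrastructure costs are not included because the slide does not state them.

```r
# Hypothetical helper: software charge only, excluding AWS infrastructure costs.
software_cost <- function(cores, hours, rate_per_core_hour = 0.70) {
  stopifnot(cores > 0, hours >= 0)
  cores * hours * rate_per_core_hour
}

# Example workload (an assumption for illustration): 8 cores for 10 hours.
example_cost <- software_cost(cores = 8, hours = 10)
```

For that example the software charge is 8 x 10 x $0.70 = $56, to which the AWS infrastructure charges would be added.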

Page 16

Revolution R Enterprise Ecosystem: Integration with the Big Data Analytics Stack

• Deployment / Consumption
• Data / Infrastructure
• Advanced Analytics
• ETL
• SI / Service MSP / DSP

Page 17

How Customers Revolutionize their Business

Power

“We’ve combined Revolution R Enterprise and Hadoop to build and deploy customized exploratory data analysis and GAM survival models for our marketing performance management and attribution platform. Given that our data sets are already in the terabytes and are growing rapidly, we depend on Revolution R Enterprise’s scalability and power – we saw about a 4x performance improvement on 50 million records. It works brilliantly.”   - CEO, John Wallace, DataSong

• 4X performance
• 50M records scored daily

Scalability

“We’ve been able to scale our solution to a problem that’s so big that most companies could not address it. If we had to go with a different solution we wouldn’t be as efficient as we are now.” - SVP Analytics, Kevin Lyons, eXelate

• TB’s data from 200+ data sources
• 10’s thousands attributes
• 100’s millions of scores daily

• 2X data
• 2X attributes
• no impact on performance

Performance

“We need a high-performance analytics infrastructure because marketing optimization is a lot like a financial trading. By watching the market constantly for data or market condition updates, we can now identify opportunities for our clients that would otherwise be lost.” - Chief Analytics Officer, Leon Zemel, [x+1]

Page 18

Why Revolution R Enterprise?

Platform Independence

Take Big Cost Out of Big Data

Supercharge R for Massive Data

Power R for the Enterprise

Page 19

Thank You

David Smith
Chief Community Officer
[email protected]