big data hadoop briefing hosted by cisco, wwt and mapr: mapr overview presentation

© 2014 MapR Technologies 1 © 2014 MapR Technologies

Upload: ervogler

Post on 27-Jan-2015


Category:

Technology



DESCRIPTION

Learn more about how MapR gives you the most technologically advanced distribution for Hadoop, with the product, services, and partner network to ensure initial and continued success in production.

TRANSCRIPT

Page 1: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Page 2: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


MapR Overview

BIG DATA: Hadoop

BEST PRODUCT: Top Ranked

BUSINESS IMPACT: Production Success

Page 3: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


3 Trends

Forcing a revolution in enterprise architecture

Page 4: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


TREND 1: Industry Leaders Compete and Win with Data

More Data Beats Better Algorithms

Collecting interaction data from ecommerce, social media, offline, and call centers enables a “customer 360 view” and consumer intimacy

Competitive Advantage is Decided by 0.5%

Consumer financial services: 1% improvement in fraud detection means hundreds of millions of dollars

Advertising and retail: 0.5% improvement in lift means millions of dollars increase in profitability

Page 5: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


TREND 2: Big Data is Overwhelming Traditional Systems

Enterprise Data Architecture: OUTSIDE SOURCES, OPERATIONAL SYSTEMS, ANALYTICAL SYSTEMS, ENTERPRISE USERS

PRODUCTION REQUIREMENTS (OPERATIONAL SYSTEMS)

• Mission-critical reliability

• Transaction guarantees

• Deep security

• Real-time performance

• Backup and recovery

PRODUCTION REQUIREMENTS (ANALYTICAL SYSTEMS)

• Interactive SQL

• Rich analytics

• Workload management

• Data governance

• Backup and recovery

Page 6: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


TREND 3: Hadoop: The Disruptive Technology at the Core of Big Data

[Chart: job trends from Indeed.com, Jan ‘06 to Jan ‘14]

Page 7: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


And 3 Realities

Page 8: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


REALITY 1: Hadoop Relieves the Pressure from Enterprise Systems

OPERATIONAL SYSTEMS, ANALYTICAL SYSTEMS, ENTERPRISE USERS

• Data staging

• Archive

• Data transformation

• Data exploration

• Streaming, interactions

Keys for Production Success

1 Reliability and DR

2 Interoperability

3 High performance

4 Supports operations and analytics

Page 9: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


REALITY 2: Hadoop is Being Used to Drive Small, Rapid Decisions

High Arrival Rate Data

• Clickstream

• Social media

• Sensor data, …

Business Impact

• Revenue optimization

• Risk mitigation

• Operational efficiency

Page 10: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


REALITY 3: Architecture Matters for Success

FOUNDATION

Page 11: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


REALITY 3: Architecture Matters for Success

FOUNDATION

• Data protection & security

• High performance

• Multi-tenancy

• Workload management

• Open standards for integration

NEW APPLICATIONS, SLAs, TRUSTED INFORMATION, LOWER TCO

Page 12: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


World-Record Performance on Cisco UCS

NEW MINUTESORT WORLD RECORD: 1.65 TB IN 1 MINUTE on 298 NODES

PREVIOUS RECORD: 1.6 TB with 2200 nodes

MapR: With a Fraction of the Hardware

Get the most out of your hardware infrastructure
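
To put the "fraction of the hardware" claim in per-node terms, the two MinuteSort figures quoted on this slide can be divided by their node counts. This is only a back-of-the-envelope sketch: the function name is illustrative, 1 TB is assumed to be 1000 GB, and both records are treated as one-minute runs (which is what the MinuteSort benchmark measures). Hardware differences between the two clusters are not accounted for.

```python
def tb_sorted_per_node(total_tb: float, nodes: int) -> float:
    """Return the data sorted per node (in TB) during the one-minute run."""
    return total_tb / nodes


# Figures quoted on the slide.
new_record = tb_sorted_per_node(1.65, 298)        # MapR on Cisco UCS
previous_record = tb_sorted_per_node(1.6, 2200)   # previous record

# Assumption: decimal units, 1 TB = 1000 GB.
new_gb = new_record * 1000
previous_gb = previous_record * 1000
ratio = new_record / previous_record

print(f"New record:      {new_gb:.2f} GB per node per minute")
print(f"Previous record: {previous_gb:.2f} GB per node per minute")
print(f"Per-node ratio:  {ratio:.1f}x")
```

On these numbers each node in the new record sorted roughly 7.6 times as much data per minute as a node in the previous record.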

Page 13: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


MapR: Hadoop Real World Examples

Page 14: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Largest Biometric Database in the World

PEOPLE

20 BILLION BIOMETRICS

National identification system in India for all citizens

Fingerprint and retinal scan images and citizen data

1 trillion+ ID verifications per week, geographically dispersed across 8 data centers

About 600m “residents” enrolled

Requires 100ms response times; zero data loss and cross-datacenter replication

Page 15: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Helping Farmers: Software and Insurance

• Help farmers protect and improve their farming operations

• Use machine learning to predict weather & other agribusiness elements

• Combine hyper-local weather monitoring, agronomic data modeling, and high-resolution weather simulations

• Project weather for 2.5 years at every 20x20 plot across the US

• Climatology simulations need to quickly experiment at small scale and then scale reliably

• MapR Hadoop to analyze >10 trillion data points from 2.5 million sensors

• Faster machine learning performance enables more/faster simulations

• MapR M7 enables geospatial database backed by Amazon S3

OBJECTIVES

CHALLENGES

SOLUTION

Lower risk with new insurance products through better data analytics

Business Impact

“85% of farmer risk is weather-related. MapR has enabled us to provide a class of weather insurance that was not available before, helping farmers protect their operations.” IT Director, Climate Corporation

Page 16: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Cisco: 360° Customer View

Cisco uses integrated customer data to increase revenues

OBJECTIVES

• Create shared view of customer & operations across 75,000 employees

• Increase revenue opportunities with sales partners

CHALLENGES

• Customer information was siloed in different divisions

• Customer interactions were inconsistent and not satisfying

• Missed opportunities for upselling/cross selling

SOLUTION

• Use MapR to collect customer information across touch points

• Integrate billing, support, manufacturing, social media, websites, dial-in data

• Generate new sales leads internally and for partners

Architecture for Sales Partner Opportunities

Business Impact

Cisco was able to analyze service sales opportunities in 1/10 the time, at 1/10 the cost, and generated $40 million in incremental service bookings in the first year.

Page 17: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Financial Services: Recommendation Engine & Real-time Targeting

Making personalized real-time offers to credit card customers

GLOBAL FINANCIAL SERVICES CORPORATION

OBJECTIVES

• Increase revenue and customer loyalty with real-time personalized offers

CHALLENGES

• Many different CRM tools and siloed targeting engines

• Developers and analysts are unable to access all customer data

• Want to increase speed and relevance of recommendations

SOLUTION

• MapR M7 centralizes analytics and operational apps on one platform

• Integrates all customer online and offline data into HBase in real-time: card member spend graph, merchant data, location, and feedback

• Centralized customer data repository provides more accurate insights

• Uses Mahout machine learning to provide real-time personalized offers

Business Impact

• Increases revenue and improves customer experience through real-time targeting

• A more flexible, scalable platform that’s a fraction of the cost of traditional technologies

• Ensures reliability with MapR’s high availability and disaster recovery features

Page 18: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Rubicon Project: Ad Optimization

Rubicon Project runs a real-time automated advertising platform

• Create open ad platform for over 100K global advertising brands and over 500 of the world’s premium publishers

• To keep up with their rapid growth, they needed to move to a fault-tolerant, high-availability Hadoop production system

• Hadoop had become central to their operations but they were having problems with instability

• Their 330-node Hadoop cluster processes 1M records/second

• They chose MapR for enterprise features such as high availability, data protection and recoverability, disaster recovery, redundancy, and support

OBJECTIVES

CHALLENGES

SOLUTION

“Our company cannot run without Hadoop and MapR. We rely on MapR’s self-healing HA, disaster recovery and advanced monitoring features to conduct 90 billion real-time auctions on our global transaction platform.” Jan Gelin, VP of Engineering, Rubicon Project

Business Impact

Page 19: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Operational Apps: Push Messaging Platform

MapR: Enabling the “smartest, most aware, precise, easy-to-use, scalable, secure and powerful push messaging platform on the planet”

OBJECTIVES

• Enable organizations to build one-on-one brand relationships

• Push messaging and geo-location targeting that

CHALLENGES

• Support large numbers of customers in a multi-tenant platform

• Target specific consumers in real time with relevant offers

• Increase reliability of push messaging while lowering data center costs

SOLUTION

• MapR Distribution for Hadoop with Apache HBase for operational workloads

• Data placement control enables efficient cluster resource management

Business Impact

• Increasing engagement and customer loyalty for 100’s of leading brands

• Reduced hardware footprint by 50%

• Consolidated 8 Hadoop clusters into 1 MapR cluster

Page 20: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Enterprise Data Hub Case Studies

Page 21: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Data Warehouse Optimization

Improve data services to customers while reducing enterprise architecture costs

FORTUNE 100 TELCO

OBJECTIVES

• Provide cloud, security, managed services, data center, & comms

• Report on customer usage, profiles, billing, and sales metrics

• Improve service: Measure service quality and repair metrics

• Reduce customer churn – identify and address IP network hotspots

CHALLENGES

• Cost of ETL & DW storage for growing IP and clickstream data; >3 months

• Reliability & cost of Hadoop alternatives limited ETL & storage offload

SOLUTION

• MapR Data Platform for data staging, ETL, and storage at 1/10th the cost

• MapR provided smallest datacenter footprint with best DR solution

• Enterprise-grade: NFS file management, consistent snapshots & mirroring

Business Impact

• Increased scale to handle network IP and clickstream data

• Reduced workload on DW to maintain reporting SLA’s to business

• Unlocked new insights into network usage and customer preferences

Page 22: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Mainframe Offload & Optimization

Free up MIPS with Hadoop to Lower Cost and Modernize Data Architecture

OBJECTIVES

• Reduce costs: defer expensive mainframe upgrades and reduce MIPS

• Maintain business SLAs

• Open standards: convert gradually to next-gen data architecture (Hadoop)

CHALLENGES

• Connect and transform unique data formats (EBCDIC vs. ASCII)

• Skills shortage: Hadoop and mainframe (COBOL & JCL)

• Reliability and flexibility of alternate systems

SOLUTION

• Syncsort connectivity and data conversions on MapR

• MapR uniquely handles small files without additional ETL steps to meet SLA

• MapR is the only Hadoop distribution with the reliability mainframe customers expect

Business Impact

Reduce storage costs: Go from $100K/TB to $1K/TB by migrating data to Hadoop

Use MIPS wisely: Save average of $7K per MIPS by offloading batch jobs to Hadoop

Deliver powerful new insights: combine mainframe data with big data for deep insights
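
The "EBCDIC vs. ASCII" challenge on this slide can be illustrated with a small character-set conversion. The sketch below is not the Syncsort tooling the slide refers to; it assumes plain text fields encoded in EBCDIC code page 037 (Python's built-in `cp037` codec), and the helper names are made up for illustration. Real mainframe records usually also contain packed-decimal and binary fields, which need layout-aware handling and are not covered here.

```python
# Assumption: text fields use EBCDIC code page 037 (US/Canada).
EBCDIC_CODEC = "cp037"


def text_to_ebcdic(text: str) -> bytes:
    """Encode a Python string as EBCDIC bytes (hypothetical helper)."""
    return text.encode(EBCDIC_CODEC)


def ebcdic_to_text(raw: bytes) -> str:
    """Decode EBCDIC bytes into a Python string (hypothetical helper)."""
    return raw.decode(EBCDIC_CODEC)


record = text_to_ebcdic("HELLO 123")
ascii_bytes = "HELLO 123".encode("ascii")

print("EBCDIC bytes:", record.hex())
print("ASCII bytes: ", ascii_bytes.hex())
print("Decoded:     ", ebcdic_to_text(record))
```

The same characters map to entirely different byte values in the two encodings, which is why mainframe extracts have to be converted before most Hadoop tools can read them as text.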

Page 23: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Security and Risk Mgmt. Case Studies

Page 24: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Solutionary: Managed Security Services Provider

Threat detection on real-time streaming data via platform as a service (PaaS)

Leader in Magic Quadrant

OBJECTIVES

• To address their growing customer base by processing trillions of messages (petabyte) per year while continuing to provide reliable security services

• To improve data analytics by leveraging newer, more granular unstructured data sources

CHALLENGES

• Expanding existing database solution to meet demand was cost prohibitive

• The existing technology could not process unstructured data at scale

SOLUTION

• Replaced RDBMS with MapR M7 to scale while retaining reliability requirements

Business Impact

• Reduced time needed to investigate security events for relevance and impact

• Improved data analytics, enabling new services and security analytics

• 2x faster performance compared to competing solutions

“MapR has taken Apache Hadoop to a new level of performance and manageability. It integrates into our systems seamlessly to help us boost the speed and capacity of data analytics for our clients.” - Dave Caplinger, Director of Architecture, Solutionary

Page 25: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Zions Bank: From SIEM to Fraud Detection

Cost effective security analytics and fraud detection on one platform

• To operationalize big data fraud detection: Fraud Operations and Security Analytics team at Zions maintains data stores, builds statistical models to detect fraud, and then uses these models to data mine and evaluate suspicious activity

• (Global bank fraud costs $200B annually)

“We initially got into centralizing all of our data from an information security perspective. We then saw that we could use this same environment to help with fraud detection” Michael Fowkes - SVP Fraud Operations and Security Analytics

• Existing technology infrastructure could not scale

• Timeliness of reports degraded over the last several years

• Chose MapR and cut storage costs by 50%

• Gained huge performance advantage – Querying time reduced from 24 hours to 30 min on 1.2 PB of data

• Leverage MapR scale for increased model accuracy and deeper insights

OBJECTIVES

CHALLENGES

SOLUTION

Business Impact

Page 26: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Cisco: Global Security Intelligence Operations (MSSP)

Operational and analytical security applications on one platform

OBJECTIVES

• To protect customer networks through early-warning intelligence & vulnerability analysis

• To better react to evolving security threats in real-time

• Collect additional telemetry data from customers' firewalls, intrusion prevention systems

CHALLENGES

• Different analytical teams derived security intelligence in silos and lacked synergy

• Inability to scale with existing infrastructure to a million events per second from nearly 100 different channels over tens of thousands of distributed sensors

SOLUTION

• MapR M7: Central hub for all of the security analytics teams

• Stream, interactive, graph and batch processing on MapR with the flexibility to perform closed-loop analytics across these functions in real time

• Key Features: Scale, enterprise-grade, operational efficiency and high performance

Business Impact

• All analytic teams leverage a common platform leading to operational efficiencies

• Capability to scale - aggregating and analyzing millions of data points in real time

• Update customer networks with new threat footprints within a 2 to 5 minute window

Page 27: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Cisco SIO Hadoop Stack

Globally Dispersed Datacenters: 1 million events/sec. Over 100 channels

• SENSOR DATA

• FIREWALL LOGS

• INTRUSION PROTECTION SYSTEM LOGS

• SECURITY APPLIANCE LOGS

MapR M7 Distribution for Hadoop

• Spark Streaming for known threats & aggregation

• SQL Queries and Reporting: Shark, Impala

• Batch Processing: Mahout, MLLib

• Graph Processing: GraphX & TitanDB

Closed-Loop Operations: New Threat Footprint within 2-5 min

Benefits:

• Unified platform for Analytics

• Low Operational Costs

• Faster Response Times

• Better Algorithms
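
The "aggregation" step in this stack can be pictured as counting events per channel within fixed time windows. The sketch below is a plain-Python illustration of that idea only; it is not Cisco's pipeline and does not use the Spark Streaming API. The function name, the 60-second window, and the sample events are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def count_events_per_window(events, window_seconds=60):
    """Count events per channel in fixed (tumbling) time windows.

    `events` is an iterable of (timestamp_in_seconds, channel_name) pairs.
    Returns {window_start_second: {channel_name: count}}.
    """
    windows = defaultdict(Counter)
    for timestamp, channel in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start][channel] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


# Hypothetical sample events: (seconds since start, source channel).
sample_events = [
    (0.5, "firewall"),
    (12.0, "ips"),
    (59.9, "firewall"),
    (60.0, "firewall"),
    (125.3, "appliance"),
]

window_counts = count_events_per_window(sample_events)
for start, counts in window_counts.items():
    print(f"window starting at {start}s: {counts}")
```

At the event rates quoted on the slide this work would be partitioned across a cluster by a streaming engine rather than run in a single process, but the per-window grouping logic is the same in spirit.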

Page 28: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


MapR is the Hadoop Technology Leader

BIG DATA

HADOOP

Page 29: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


MapR Distribution for Hadoop

APACHE HADOOP AND OSS ECOSYSTEM

EXECUTION ENGINES

• Batch: MapReduce v1 & v2, Pig, Cascading, Spark

• Streaming: Spark Streaming, Storm*

• NoSQL & Search: HBase, Solr, Accumulo*

• ML, Graph: Mahout, MLLib, GraphX

• SQL: Hive, Impala, Shark, Drill*

• Also shown: YARN, Tez*

DATA GOVERNANCE AND OPERATIONS

• Categories: Security, Provisioning & Coordination, Workflow & Data Governance, Data Integration & Access

• Components: Juju, Savannah*, Sentry*, Oozie, ZooKeeper, Sqoop, Knox*, Whirr, Falcon*, Flume, HttpFS, Hue

MapR Data Platform (Random Read/Write)

Data Hub, Enterprise Grade, Operational

• MapR-FS (POSIX)

• MapR-DB (High-Performance NoSQL)

• Access: NFS, HDFS API, HBase API, JSON API

Page 30: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


MapR Summary

BIG DATA: Hadoop

BEST PRODUCT: Top Ranked

BUSINESS IMPACT: Production Success

Page 31: Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Presentation


Q & A

@mapr maprtech

[email protected]

Engage with us!

MapR

maprtech

mapr-technologies