Get Started Quickly with IBM's Hadoop as a Service


Upload: dashdb

Post on 14-Aug-2015


© 2015 IBM Corporation

BigInsights on Cloud Hadoop-as-a-Service July 28th, 2015


Disclaimer

IBM’s statements regarding its plans, directions, and intent are subject to change or

withdrawal without notice at IBM’s sole discretion. Information regarding potential future

products is intended to outline our general product direction and it should not be relied on in

making a purchasing decision. The information mentioned regarding potential future products

is not a commitment, promise, or legal obligation to deliver any material, code or functionality.

Information about potential future products may not be incorporated into any contract. The

development, release, and timing of any future features or functionality described for our

products remains at our sole discretion.


Agenda

• Evolution of the Big Data Analytics space

• Open Data Platform and IBM’s BigInsights

• Hadoop as a Service – BigInsights on Cloud Options

• IBM Analytics for Hadoop – Free, 14-day trial

• BigInsights for Apache Hadoop – Bare Metal option for Production

• Demo

• Questions & Answers

• Resources


“At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic. A report by the forum, ‘Big Data, Big Impact,’ declared data a new class of economic asset, like currency or gold.”

“Companies are being inundated with data—from information on customer-buying habits to supply-chain efficiency. But many managers struggle to make sense of the numbers.”

“Increasingly, businesses are applying analytics to social media such as Facebook and Twitter, as well as to product review websites, to try to ‘understand where customers are, what makes them tick and what they want’, says Deepak Advani, who heads IBM’s predictive analytics group.”

“Big Data has arrived at Seton Health Care Family, fortunately accompanied by an analytics tool that will help deal with the complexity of more than two million patient contacts a year…”

“Data is the new oil.”

Clive Humby

“The Oscar Senti-meter, a tool developed by the L.A. Times, IBM and the USC Annenberg Innovation Lab, analyzes opinions about the Academy Awards race shared in millions of public messages on Twitter.”

Big Data continues to be a hot topic in the market

“…now Watson is being put to work digesting millions of pages of research, incorporating the best clinical practices and monitoring the outcomes to assist physicians in treating cancer patients.”


Big Data implementations are driving real business value for IBM customers

An automotive company is running a series of experiments to better understand and adapt to the shifting landscape of urban transportation. It streams data from sensors on cars using InfoSphere Streams and analyzes it on Hadoop using BigInsights on Cloud.

An industrial manufacturer in the United States reduces errors and the time required for engine calibrations by 90 percent, and improves reliability and new product design, by using sensors to collect information on its products in the field and analyzing it using InfoSphere BigInsights.


Rich capabilities in IBM’s Big Data Portfolio mean lower risk and more successful projects

On premise, Cloud, and “as a Service”

BigInsights


Open Data Platform and IBM BigInsights


Open Data Platform Initiative

Why is IBM involved?

• Strong history of leadership in open source & standards

• Supports our commitment to open source currency in all future releases

• Accelerates our innovation within Hadoop & surrounding applications

Open Data Platform (ODP) vs. Apache Software Foundation (ASF)

• ODP supports the ASF mission

• ASF provides a governance model around individual projects without looking at the ecosystem

• ODP aims to provide a vendor-led consistent packaging model for core Apache components as an ecosystem

All Standard Apache Open Source Components

HDFS, YARN, MapReduce, Ambari, HBase, Spark, Flume, Hive, Pig, Sqoop, HCatalog, Solr/Lucene

ODP


BigInsights for Apache Hadoop

IOP + IBM Value Adds = BigInsights

• SQL on Hadoop: Big SQL, optimized ANSI-compliant SQL

• Application Tooling: toolkits and accelerators

• Search & Entity Matching: Watson Explorer, Big Match

• Data Visualization: BigSheets spreadsheet interface

• Predictive Modeling: Big R, Machine Learning

• Text Analytics: advanced text processing with AQL, text extraction web interface

• Real-time Analytics: Streams

• Data Governance and Security: DataClick, LDAP, secure cluster

• Storage Integration: GPFS, a POSIX distributed filesystem

• Enterprise Manageability: Adaptive MapReduce, multi-tenant scheduling

IBM Open Platform (IOP)

Knox, Ambari, Snappy, OpenJDK, Avro, Solr, Oozie, Flume, Slider, Pig, Hadoop (HDFS/MapReduce/YARN*), ZooKeeper, Parquet, HBase, Spark, Hive, Sqoop

ODP


BigInsights Users & Role-Based Modules

IBM Open Platform

BigInsights for Apache Hadoop


BigInsights on Cloud


IBM Open Platform uses Ambari


BigInsights Home


IBM BigInsights – BigSheets

Spreadsheet-style analysis tool for business users

Easily visualize big data using rich built-in graphing and analytic functions


Big SQL in BigInsights

Data Sources

Hive Tables, HBase Tables

BigSQL Engine

BigInsights

Application

SQL Language

JDBC / ODBC Driver

JDBC / ODBC Server

Native Sources: CSV, SEQ, Parquet, RC, AVRO, ORC, JSON, Custom

ANSI SQL 2011 Compliant

IBM’s SQL for Hadoop

• Makes Hadoop data accessible to a wider audience

• Familiar, widely known syntax

• Leverage native Hadoop data sources

Complements the Data Warehouse

• Exploratory analytics

• Sandbox, Data Lake

Included in BigInsights

Use familiar SQL tools

• Cognos, SPSS, Tableau, MicroStrategy
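To make the "familiar, widely known syntax" point concrete, here is a minimal sketch of the kind of exploratory aggregate query an analyst might send through a SQL tool. It is only an illustration: Python's built-in sqlite3 module stands in for the SQL engine (it is not Big SQL, and no Big SQL driver or connection string is shown), and the `sales` table, its columns, and the `totals_by_product` helper are hypothetical names chosen for this example.

```python
import sqlite3


def totals_by_product(rows):
    """Return (product, total_quantity) pairs, largest total first.

    `rows` is an iterable of (product, quantity) tuples. An in-memory
    SQLite database is used purely as a stand-in SQL engine.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table; in a Hadoop setting this would be an
        # existing table exposed by the SQL engine.
        conn.execute("CREATE TABLE sales (product TEXT, quantity INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT product, SUM(quantity) AS total "
            "FROM sales "
            "GROUP BY product "
            "ORDER BY total DESC, product ASC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("widget", 5), ("gadget", 2), ("widget", 7), ("gizmo", 4)]
    for product, total in totals_by_product(sample):
        print(product, total)
```

The query itself is plain SELECT / GROUP BY / ORDER BY, so the same statement text could in principle be issued from any tool that speaks standard SQL over JDBC or ODBC; only the connection layer would differ.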


IBM BigInsights – Text Analytics

Information Extraction Framework for Text Analytics

Example of text analytic tooling: a graphical interface to describe the structure of various textual formats, from log file data to natural language. Users do not need to know AQL.


IBM BigInsights – Big R

End-to-end integration of R into BigInsights

Explore, visualize, transform, and model big data using familiar R syntax and paradigm

Scale out R

• Partitioning of large data (“divide”)

• Parallel cluster execution of pushed down R code (“conquer”)

• All of this from within the R environment (Jaql, Map/Reduce are hidden from you)

• Almost any R package can run in this environment

1. Pull data summaries to R client

2. Or, push R functions right on the data

Diagram labels: R Clients, Embedded R Execution, R Packages, Data sources
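The "divide" and "conquer" steps described for Big R can be sketched in a few lines. This is a rough illustration in Python rather than R, and it does not use or imitate the Big R API: a local thread pool stands in for cluster nodes, and `partition`, `summarize`, and `parallel_mean` are hypothetical helper names. The point is only that each partition returns a small summary, and the client combines the summaries instead of pulling all the raw data.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_parts):
    """Split values into n_parts round-robin chunks (the "divide" step)."""
    return [values[i::n_parts] for i in range(n_parts)]


def summarize(chunk):
    """Per-partition summary (count, sum), computed next to the data."""
    return len(chunk), sum(chunk)


def parallel_mean(values, n_parts=4):
    """Mean computed from per-partition summaries (the "conquer" step)."""
    if not values:
        raise ValueError("values must not be empty")
    # Drop empty chunks that appear when there are fewer values than parts.
    chunks = [chunk for chunk in partition(values, n_parts) if chunk]
    # Threads are a local stand-in for parallel execution on a cluster.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        summaries = list(pool.map(summarize, chunks))
    total_count = sum(count for count, _ in summaries)
    total_sum = sum(part_sum for _, part_sum in summaries)
    return total_sum / total_count


if __name__ == "__main__":
    print(parallel_mean(list(range(1, 101))))
```

Only the (count, sum) pairs travel back to the caller, which mirrors the "pull data summaries to R client" path; the alternative path on the slide pushes the function itself to where the data lives.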


IBM BigInsights – Cloud deployment options

Manage less, analyze more

IBM Analytics for Hadoop

• Prototype, create mash-ups in the cloud for non-production use

• Empowers developers to rapidly drive insight from all data

• Two-node Docker instance

• Enterprise features – BigSheets, Big SQL, Text, and Big R

• Delivered via IBM Bluemix

• 50 GB – input data space

• Extendable, free 14-day trial

BigInsights for Apache Hadoop

• For production deployments at scale in the cloud

• Delivers flexibility and efficiency with BYOL and PAYG pricing

• Scale to meet spikes in demand without on-premise infrastructure

• Perform enterprise-class, complex analytics on Big Data

• Available via the IBM Cloud Marketplace

• Web-based UI for sizing/pricing


IBM Analytics for Hadoop Details

Free 14-day trial on www.bluemix.net


BigInsights for Apache Hadoop – Options

Secure, Dedicated Bare-metal Infrastructure

IBM Open Platform

BigInsights for Apache Hadoop


IBM BigInsights on Cloud – Security

Dedicated, isolated environment for every client

Administrative control owned by customer at Hadoop and BigInsights level

Native HDFS encryption; optional Guardium encryption

Firewalls provide perimeter security and private network isolation

Aiming for ISO 27001 compliance in 2015

Example Configuration…

Non-shared physical machines for added security & performance


BigInsights on Cloud

Demonstration


The IBM Difference

IBM delivers the foundation for Big Data – now and in the future

Embraces open source

Establishes standards

Integrates with familiar interfaces and established systems

Delivers advanced analytic capabilities

IBM is the only vendor providing…

Hadoop as a Managed Service in the Cloud

A single company providing Hadoop-based software, cloud and services

Provides expertise to help you on your journey

6,000 partners

Analytics services and solution centers


IBM BigInsights on Cloud – unique capability

Built-in Twitter Decahose service

Scaled down random sample of Twitter Firehose

Easily land Twitter data into BigInsights HDFS

Manipulate and visualize data using BigSheets

Incorporate sentiment data into analytic models

Easily store and accommodate vast data sets
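The idea of a "scaled down random sample" of a message stream can be illustrated with a short sketch. It assumes a sampling rate of roughly 10 percent, which is how the Decahose is commonly described; the rate, the `sample_stream` name, and the in-memory message list are assumptions made for this example, and no Twitter or BigInsights API is involved.

```python
import random


def sample_stream(messages, rate=0.10, seed=None):
    """Yield each message independently with probability `rate`.

    `seed` makes the sample reproducible, which is useful for testing.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)
    for message in messages:
        # rng.random() is uniform on [0, 1), so rate=1.0 keeps everything
        # and rate=0.0 keeps nothing.
        if rng.random() < rate:
            yield message


if __name__ == "__main__":
    firehose = ["message %d" % i for i in range(10000)]
    decahose = list(sample_stream(firehose, rate=0.10, seed=42))
    print(len(decahose), "of", len(firehose), "messages kept")
```

Because each message is kept or dropped independently, the sample size varies slightly from run to run around the target rate, and the sampler never needs to hold the full stream in memory.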


Check out more data management services at www.bluemix.net

Cloudant, dashDB, BigInsights on Cloud, DB2 on Cloud


Resources

Big Data University – Free Training http://bigdatauniversity.com/

Powered by Hadoop http://wiki.apache.org/hadoop/PoweredBy

Free Trial Software (both for on-premise and cloud) http://www-01.ibm.com/software/data/infosphere/hadoop/trials.html

YouTube Videos

Watson

• The Science Behind the Answer (~7 minutes)

• Watson: Final Jeopardy (~11 minute summary)

Big Data Channel

• http://www.youtube.com/user/ibmbigdata


English: Thank You

French: Merci

Italian: Grazie

Spanish: Gracias

Portuguese: Obrigado

German: Danke

Romanian: Mulțumesc

Turkish: Teşekkür ederim