Big Data and Hadoop: Key Drivers, Ecosystem and Use Cases

© Wikibon 2011 | Confidential www.wikibon.org [[The Wikibon Project]] Big Data and Hadoop: Key Drivers, Ecosystem and Use Cases November 2011

Upload: jeff-kelly

Post on 24-Jan-2015

Category:

Technology


DESCRIPTION

Overview of the Big Data market.

TRANSCRIPT

Page 1: Big Data and Hadoop - key drivers, ecosystem and use cases

© Wikibon 2008 © Wikibon 2011 | Confidential www.wikibon.org

[[The Wikibon Project]]

Big Data and Hadoop: Key Drivers, Ecosystem and Use Cases November 2011

What is Big Data?

Big Data (n.): Data sets whose size, type and/or speed make them impractical to process and analyze with traditional database technologies and related data management tools.

Why is Big Data Important?

Big Data is the new definitive source of competitive advantage across industries …

… For those organizations that embrace Big Data, the possibilities for innovation, improved agility, and increased profitability are nearly endless.

Three Key Big Data Drivers

1.  Volume, Variety, Velocity

2.  Hardware Commoditization

3.  Cloud Computing

Characteristics of Big Data

Sources of Big Data

Hadoop

Open source framework for processing, storing and analyzing Big Data.

Fundamental concept: Rather than banging away at one huge block of data with a single machine, Hadoop breaks up Big Data into multiple parts so that each part can be processed and analyzed in parallel.
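The split-and-process-in-parallel idea can be illustrated with a minimal Python sketch. This is not Hadoop's API: the function names (`split_into_parts`, `map_part`, `reduce_counts`, `word_count`) are hypothetical, the word-count task is chosen only for illustration, and local threads stand in for the cluster nodes that Hadoop would actually use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(records, num_parts):
    """Break the input into roughly equal parts (a stand-in for data blocks)."""
    return [records[i::num_parts] for i in range(num_parts)]


def map_part(part):
    """Map step: count the words in one part, independently of the others."""
    counts = Counter()
    for line in part:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partial_counts):
    """Reduce step: merge the per-part results into one total."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def word_count(records, num_parts=4):
    """Split the records, process each part concurrently, then combine."""
    parts = split_into_parts(records, num_parts)
    # Assumption: threads stand in for the separate machines of a real cluster.
    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        partial_counts = list(pool.map(map_part, parts))
    return reduce_counts(partial_counts)


if __name__ == "__main__":
    logs = [
        "big data and hadoop",
        "hadoop breaks up big data",
        "each part is processed in parallel",
    ]
    print(word_count(logs, num_parts=2))
```

Because each part is counted without reference to any other part, the map step can run on as many workers as there are parts; only the final merge needs to see all of the partial results.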

Hadoop: The Pros and Cons

First the pros … Hadoop is a time- and cost-effective approach to storing, processing and analyzing large volumes of unstructured data, allowing for new types of analytics.

Now the cons … Hadoop is complex and difficult to deploy and manage; there’s a dearth of Hadoop-savvy engineers and Data Scientists on the job market; the risk of forking and vendor lock-in remains.

Hadoop: The Pros and Cons cont.

More pros … Many bright minds are contributing to Hadoop, resulting in rapid development, and an ecosystem of vendors is emerging to make Hadoop enterprise-ready.

The Big Data Ecosystem

Big Data Pioneers

•  Largest Hadoop instance on the planet … 40,000 nodes handling 200+ PB of data.

•  Used to support research for ad systems and Web search.

•  Match ads with users, detect spam in Yahoo! Mail, pick relevant top stories.

Big Data Pioneers cont.

•  Two major clusters processing and storing over 30 PB of data.

•  Uses HDFS to store copies of internal log and dimension data.

•  Developed Hive to perform large-scale analytics on user data.

•  Using HBase to store, manage and retrieve Facebook Messenger data.

Big Data Pioneers cont.

•  Uses Hadoop to support “People You May Know” feature.

•  Tailors its search engine to return most relevant results for recruiters, employers and job seekers.

•  Created a visualization tool to allow users to explore their professional network to discover hidden patterns.

Big Data in Financial Services

•  Over 30,000 databases and 15,000 applications spread across 7 business units.

•  Using Hadoop as the basis of its Common Data Platform.

•  Looking to establish a 360-degree view of the customer for upsell and cross-sell opportunities.

Big Data in Financial Services cont.

•  Risk management and analysis to understand financial exposure.

•  Detecting fraudulent transactions and potentially criminal activity.

•  Conducting sentiment analysis on social media data.

Thank You

Jeffrey F. Kelly Principal Research Contributor

The Wikibon Project

@jeffreyfkelly

www.wikibon.org www.siliconangle.com