putting business intelligence to work on hadoop data stores
TRANSCRIPT
![Page 1: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/1.jpg)
Putting Business Intelligence to Work on Hadoop Data Stores
Ian Fyfe, Chief Technology Evangelist, Pentaho
© 2010, Pentaho. All Rights Reserved. www.pentaho.com. Worldwide: +1 (866) 660-7555 | Slide 1
![Page 2: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/2.jpg)
Session Abstract
This presentation will cover how to overcome Hadoop's constraints to get more out of your business data analysis. An inexpensive way of storing large volumes of data, Hadoop is also scalable and redundant. But getting data out of Hadoop is tough due to the lack of a built-in query language. Also, because users experience high latency (up to several minutes per query), Hadoop is not appropriate for ad hoc query, reporting, and business analysis with traditional tools.
The first step in overcoming Hadoop's constraints is connecting to Hive, a data warehouse infrastructure built on top of Hadoop, which provides the relational structure necessary for scheduled reporting on large datasets stored in Hadoop files. Hive also provides a simple query language called Hive QL, which is based on SQL and enables users familiar with SQL to query this data.
But to really unlock the power of Hadoop, you must be able to efficiently extract data stored across multiple (often tens or hundreds of) nodes with a user-friendly ETL (extract, transform and load) tool that will then allow you to move your Hadoop data into a relational data mart or warehouse where you can use BI tools for analysis.
Attendees will learn how an IT person without Java programming skills can:
• Integrate with Hadoop and Hive to bring ETL, data warehousing and BI applications to the tasks of analyzing Big Data;
• Provide key data integration and transformation functionality to Hadoop data;
• Manage and control Hadoop jobs using a graphical interface;
• Integrate Hadoop data with data from other sources to drive compelling reporting and analytics for today's massive volumes of data.
![Page 3: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/3.jpg)
THE CASE FOR BIG DATA
![Page 4: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/4.jpg)
The Case for Big Data
Enterprises increasingly face needs to store, process and maintain larger and larger volumes of structured and unstructured data
• Compliance
• Competitive Advantage
Challenges associated with big data
• Cost – storage and processing power
• Timeliness of data processing
Why Hadoop?
• Low cost, reliable scale-out architecture for storing massive amounts of data
• Parallel, distributed computing framework for processing data
• Proven success in solving Big Data problems at Fortune 500 companies like Google, Yahoo!, IBM and GE
• Vibrant community, exploding interest, strong commercial investments
[Chart: Google Trends for ‘Hadoop’]
![Page 5: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/5.jpg)
Hadoop for Data Integration and BI
Top Use Cases for Hadoop*
1. “mine data for improved business intelligence”
2. “reducing cost of data analysis”
3. “log analysis”
Top Challenges with Hadoop*
1. Steep technical learning curve
2. Hiring qualified people
3. Availability of appropriate products and tools
Unfortunately, Hadoop was not designed specifically for ETL and BI use cases:
• It’s not a database
• High latency queries and jobs not ideal for all BI use cases
• Skill set mismatch for traditional ETL users and BI Solution architects
*Based on a survey of 100+ Hadoop users conducted by Karmasphere, Sept. 2010
![Page 6: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/6.jpg)
ESTABLISHING AN ARCHITECTURE FOR BIG DATA
![Page 7: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/7.jpg)
Example Use Cases Today
Transactional
• Fraud detection
• Financial services/stock markets
Sub-Transactional
• Weblogs
• Social/online media
• Telecoms events
![Page 8: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/8.jpg)
Example Use Cases Today
Non-Transactional
• Web pages, blogs etc
• Documents
• Physical events
• Application events
• Machine events
In most cases structured or semi-structured
![Page 9: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/9.jpg)
Traditional Business Intelligence (BI)
[Diagram labels: Data Source; Data Mart(s); Tape/Trash; several "?" marks]
![Page 10: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/10.jpg)
Data Lake
• Single source
• Large volume
• Not distilled
• Typically no more than 0-2 lakes per company
• Known and unknown questions
• Multiple user communities
• Don’t fit in traditional RDBMS with a reasonable cost
![Page 11: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/11.jpg)
Data Lake Requirements
• Store all the data
• Satisfy routine reporting and analysis
• Satisfy ad-hoc query / analysis / reporting
• Balance performance and cost
![Page 12: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/12.jpg)
What if...
[Diagram labels: Data Source; Data Lake(s); Data Mart(s); Ad-Hoc; Data Warehouse]
![Page 13: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/13.jpg)
Big Data Does Not Replace Data Marts
• It’s not a database
• High latency
• Optimized for massive data-crunching
• Big Data databases are immature
• Databases are no-SQL
![Page 14: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/14.jpg)
What Hadoop Really is….
Core Components
HDFS
• A distributed file system allowing massive storage across a cluster of commodity servers
MapReduce
• Framework for distributed computation; common use cases include aggregating, sorting, and filtering BIG data sets
• Problem is broken up into small fragments of work that can be computed or recomputed in isolation on any node of the cluster
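The MapReduce model described on this slide can be illustrated without a cluster. The following is a minimal in-memory sketch, not Hadoop code: the class name `MapReduceSketch` and the sample log lines are assumptions made for illustration. It shows each input line being mapped in isolation, the emitted pairs being grouped by key, and each group being reduced to an aggregate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory illustration of the map, shuffle and reduce phases (no Hadoop dependency). */
final class MapReduceSketch {

    /** Map phase: turn one input line into (word, 1) pairs, independently of every other line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    /** Reduce phase: aggregate all values that were emitted for a single key. */
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    /** Runs map over every line, groups the pairs by key (the "shuffle"), then reduces each group. */
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical web-log fragments standing in for files spread across a cluster.
        List<String> lines = List.of("GET /home", "GET /cart", "POST /cart");
        System.out.println(run(lines));
    }
}
```

Because `map` looks at one line at a time and `reduce` looks at one key at a time, either call can be repeated on another machine after a failure, which is the property the slide refers to as work that can be "computed or recomputed in isolation".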
![Page 15: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/15.jpg)
What Hadoop Really is….
Related Projects
Hive – a data warehouse infrastructure on top of Hadoop
• Implements a SQL-like query language, including a JDBC driver
• Allows MapReduce developers to plug in custom mappers and reducers
HBase – the Hadoop database – AH HA!
• A variant of NoSQL databases, problematic for traditional BI
• Best at storing large amounts of unstructured data
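Since the slide notes that Hive exposes a SQL-like language through a JDBC driver, a BI tool or ETL job can in principle query it through the standard `java.sql` interfaces. The sketch below is written under stated assumptions: the `weblogs` table and its `page` column are hypothetical, the class name is invented, and the connection URL is supplied by the caller. Its exact form depends on the Hive version and driver in use (later HiveServer2 releases use URLs beginning with `jdbc:hive2://`).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of querying Hive through plain JDBC; table and column names are hypothetical. */
final class HiveQuerySketch {

    /** Builds a Hive QL aggregate over a hypothetical "weblogs" table. */
    static String topPagesQuery(int limit) {
        return "SELECT page, COUNT(*) AS hits FROM weblogs "
                + "GROUP BY page ORDER BY hits DESC LIMIT " + limit;
    }

    /** Runs the query on an already opened connection and returns page -> hit count, in result order. */
    static Map<String, Long> topPages(Connection connection, int limit) throws SQLException {
        Map<String, Long> hits = new LinkedHashMap<>();
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(topPagesQuery(limit))) {
            while (rows.next()) {
                hits.put(rows.getString("page"), rows.getLong("hits"));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // No JDBC URL supplied: only show the query that would be sent.
            System.out.println(topPagesQuery(10));
            return;
        }
        // args[0] is a JDBC URL for a reachable Hive server; the Hive driver must be on the classpath.
        try (Connection connection = DriverManager.getConnection(args[0])) {
            topPages(connection, 10).forEach((page, count) -> System.out.println(page + "\t" + count));
        }
    }
}
```

Each such query is still compiled into batch jobs on the cluster, so the latency caveat from the earlier slides applies: this pattern suits scheduled reporting better than interactive analysis.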
![Page 16: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/16.jpg)
Hadoop and BI?
• Distributed processing
• Distributed file system
• Commodity hardware
• Platform independent (in theory)
• Scales out beyond technology and/or economy of a RDBMS
In many cases it’s the only viable solution
![Page 17: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/17.jpg)
Hadoop and BI?
90% of new Hadoop use cases are transformation of semi/structured data*
* of those companies we’ve talked to...
![Page 18: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/18.jpg)
Hadoop and BI?
“The working conditions within Hadoop are shocking”
ETL Developer
![Page 19: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/19.jpg)
Hadoop and BI?
Instead of this...
![Page 20: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/20.jpg)
Hadoop and BI?
You have to do this in Java...

    public void map(
        Text key,
        Text value,
        OutputCollector output,
        Reporter reporter)

    public void reduce(
        Text key,
        Iterator values,
        OutputCollector output,
        Reporter reporter)
![Page 21: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/21.jpg)
People don’t use Hadoop for BI because they want to...
![Page 22: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/22.jpg)
...they do it because they have to...
![Page 23: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/23.jpg)
... and unfortunately it wasn’t designed for most BI requirements
![Page 24: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/24.jpg)
Why not add to Hadoop the things it’s missing...
![Page 25: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/25.jpg)
... until it can do what we need it to?
![Page 26: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/26.jpg)
If only we had a Java, embeddable, data transformation engine...
![Page 27: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/27.jpg)
A Data Integration Engine for Hadoop
[Diagram labels: Data Marts, Data Warehouse, Analytical Applications; Data Integration Engine; Hadoop; Design; Deploy; Orchestrate]
![Page 28: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/28.jpg)
[Diagram, top to bottom]
Visualize: Reporting / Dashboards / Analysis (Web Tier)
Optimize: DM & DW (RDBMS)
Optimize: Hive (Hadoop)
Files / HDFS (Hadoop)
Load: Applications & Systems
![Page 29: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/29.jpg)
[Diagram, top to bottom, with "Metadata" labelled alongside]
Reporting / Dashboards / Analysis (Web Tier)
DM & DW (RDBMS)
Hive (Hadoop)
Files / HDFS (Hadoop)
Applications & Systems
![Page 30: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/30.jpg)
[Diagram labels: Data Source; Data Lake(s); Data Mart(s); Ad-Hoc; Data Warehouse]
![Page 31: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/31.jpg)
[Diagram labels: Reporting / Dashboards / Analysis; Web Tier; Data Lake; RDBMS; Hadoop; Applications & Systems]
![Page 32: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/32.jpg)
Product Requirements for BI Against Hadoop
• Lower technical barriers through graphical ETL environment for creating and managing Hadoop MapReduce jobs
• Extreme ETL scalability through deployment across the Hadoop cluster
• Easily spin-off high performance data marts for interactive analysis
• Easily integrate data from Hadoop with data from other sources
• Provide end-to-end BI addressing common BI use cases with Hadoop including reporting, ad hoc query and interactive analysis
• Reduce costs through subscription-based pricing, reduced dependency on scarce technical resources, and easier maintainability
[Diagram labels: Agile BI; Interactive Analysis; Batch Reporting and Ad Hoc Query; Data Marts; Hadoop; Hive; Data Integration Jobs; Log Files; DBs and other sources]
![Page 33: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/33.jpg)
THE ROAD AHEAD
![Page 34: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/34.jpg)
The Road Ahead
Other NoSQL Integration
• Facilitate BI use cases on top of HBase, possibly others like MongoDB, Cassandra
Streaming Data Source Support
• In support of near-realtime use cases
• Long/always running data processing jobs
Contiguous Meta-data
• Data Lineage and Impact Analysis covering the entire big data architecture
The End of MapReduce (… as a concept ETL users need to understand)
• Push down optimization of Transformations that generate native MapReduce tasks in Hadoop
![Page 35: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/35.jpg)
Hadoop Distro Wars
The Apache Software Foundation
![Page 36: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/36.jpg)
Tools That Make Hadoop Easier
e.g. Apache Pig
• Pig is a platform for analyzing large data sets
• Produces sequences of MapReduce programs
• Integrate Pig scripts into enterprise data integration workflows e.g.
1. Submit and monitor a series of Pig and MapReduce jobs
2. Process a database bulk load step to ready data for ad-hoc analysis or report bursting
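The workflow on this slide (run a series of Pig and MapReduce jobs, then a bulk load) comes down to running named steps in order and stopping when one fails. The following is a rough sketch of that control flow only. The class `WorkflowSketch` and the step names are hypothetical, and each step is a stand-in lambda rather than a real job submission; it does not represent how any particular product implements orchestration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Minimal sequential workflow: each step reports success or failure; a failure stops the run. */
final class WorkflowSketch {

    // LinkedHashMap keeps the steps in the order they were registered.
    private final Map<String, BooleanSupplier> steps = new LinkedHashMap<>();

    /** Registers a step; the supplier returns true when the step succeeded. */
    WorkflowSketch step(String name, BooleanSupplier action) {
        steps.put(name, action);
        return this;
    }

    /** Runs the steps in order and returns the names of the steps that completed successfully. */
    List<String> run() {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> step : steps.entrySet()) {
            boolean succeeded = step.getValue().getAsBoolean();
            System.out.println(step.getKey() + ": " + (succeeded ? "succeeded" : "failed"));
            if (!succeeded) {
                break; // do not bulk-load data that an earlier job failed to produce
            }
            completed.add(step.getKey());
        }
        return completed;
    }

    public static void main(String[] args) {
        // The lambdas are placeholders for real job submissions and status polling.
        List<String> done = new WorkflowSketch()
                .step("pig: clean weblogs", () -> true)
                .step("mapreduce: aggregate sessions", () -> true)
                .step("bulk load: data mart", () -> true)
                .run();
        System.out.println("completed " + done.size() + " steps");
    }
}
```

Placing the bulk load last means the data mart is only refreshed when every upstream job has finished cleanly.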
![Page 37: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/37.jpg)
Growth in Adoption of Other NoSQL Big Data Platforms
• HBase – the Hadoop database
• mongoDB – scalable, high-performance, document-oriented database
• LexisNexis HPCC – a data intensive computing system platform
• Many others
![Page 38: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/38.jpg)
Summary
Hadoop and other Big Data NoSQL platforms
• Great at storing and processing large diverse data volumes
• Not designed for Business Intelligence
Choosing the right BI technology can unlock your Big Data to drive actionable insights
• Graphical user interfaces
• Scalable
• Spin-off data marts
• Integrate data into data warehouses
• Integrated dashboards, reporting, data analysis, data integration
![Page 39: Putting Business Intelligence to Work on Hadoop Data Stores](https://reader037.vdocuments.mx/reader037/viewer/2022110204/55d50fc0bb61eb632e8b45a6/html5/thumbnails/39.jpg)
Thank You!
ifyfe@pentaho.com