big data advance topics - part 2.pptx
© 2016 Ness SES. All Rights Reserved
BIG DATA advanced topics
Cloudera vs Hortonworks
MOLDOVAN Radu Adrian
Timisoara, May 2016
Who am I? :)
❏ passionate about technology
❏ 20 years of programming using open source
❏ last 4 years in Big Data
❏ Big Data Architect @
Cloudera and Hortonworks: The Similarities
- both are built on top of Apache Hadoop
- both are mature and offer security features
- both provide paid consulting, training and services
- strong development communities
- master-slave architecture
- support for MapReduce
- YARN as resource manager
- both reduce deployment time
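To make the "support MapReduce" point concrete, here is a minimal sketch of the MapReduce programming model as a word count in plain Python. It does not use Hadoop or either vendor's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the framework runs the map and reduce steps in parallel across nodes and performs the shuffle itself.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce sequentially on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["big data", "Big Data tools"])` counts words case-insensitively across both lines.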
Cloudera and Hortonworks: The Differences
Cloudera:
- commercial license (free 60-day trial)
- repositioned as an “enterprise data hub”
- founded in 2008 by people from Facebook, Google, Oracle and Yahoo
- 400+ customers
- funding: $1.04B
Hortonworks:
- open source license, completely free
- positioned as a Hadoop distro
- has no proprietary software
- founded in 2011; linked to Teradata, Yahoo & Microsoft
- funding: $248M
https://www.crunchbase.com
Security Solutions
http://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#7cd07887f26a
Hortonworks: Apache Ranger, Apache Knox, Apache Falcon
Cloudera: Project Rhino, Project Sentry
Cluster ecosystem - VISUALIZE
(C = Cloudera, H = Hortonworks)
Stages: COLLECT, PROCESS, STORE, VISUALIZE
- HADOOP (HDFS) (C+H)
- Res. Manager: YARN (C+H)
- Data Loader: Sqoop (C+H)
- Data Aggregation: Flume (C+H)
- Msg Brokers + Streams: Kafka (C+H)
- Data Streaming: Storm (H), Spark Streaming (C+H)
- MapReduce: PIG (C+H)
- In Memory: Spark (C+H), Tez (H)
- Machine Learning: Spark ML (C+H), Mahout (H)
- Warehouse DB: Presto (H), HIVE (C+H)
- Analytics / Columnar Store: Accumulo (C+H), Impala (C), HBase (C+H)
- Search Engines: SolrCloud (C+H)
- Data Governance: Atlas (H)
- VISUALIZE: Tableau, Logi, Jasper Reports, D3, Pentaho* Interactive Reporting, Crystal Reports
Cloudera
Cloudera Management Service
Hortonworks
Trends - Forbes report Q1 2016
http://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#7cd07887f26a
Big Data - Buzz words #TAGs
FAULT TOLERANCE
DATA LOCALITY
LAMBDA ARCHITECTURE
CRUD => CRUD
SHARDING
REPLICATION
RESILIENT SYSTEMS
DISRUPTIVE TECHNOLOGIES
Cloud Computing
Internet of Things
Data Analytics
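As a rough illustration of two of the buzz words above, SHARDING and REPLICATION, the sketch below assigns a key to a shard by hashing it and then places copies on consecutive nodes. This is a simplified assumption, not how HDFS, HBase or any specific product places data; the names `shard_for` and `replica_nodes` and the default replication factor of 3 are hypothetical.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash (sharding)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def replica_nodes(key, nodes, replication_factor=3):
    """Pick the nodes that hold copies of a key (replication).

    The first copy goes to the node chosen by the hash; the remaining
    copies go to the next nodes in list order, wrapping around.
    """
    start = shard_for(key, len(nodes))
    count = min(replication_factor, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]
```

With four nodes and the default factor, `replica_nodes("user-42", ["n1", "n2", "n3", "n4"])` returns three distinct nodes, so the key stays readable if one of them fails, which is the link to FAULT TOLERANCE.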
Thank you!
Skype: r.moldovan