BIG DATA simplified! Pravin Hanchinal pravinhanchinal.com

Post on 12-Apr-2017

TRANSCRIPT

Before we start...

Big Data

What can you do with Big Data?

Big Data

Big Data is a cluster of many technologies and tools that are used in various scenarios.

(Hadoop + HDFS + HCatalog + Flume + PowerView)

(Hortonworks + PowerView)

What can you do in Big Data?

Fetching

Processing

Visualizing

How Big is Big Data?

Byte of data: one grain of rice

Kilobyte: cup of rice

Megabyte: 8 bags of rice

Gigabyte: 3 container lorries

Terabyte: 2 container ships

Petabyte: covers Mumbai

Exabyte: covers India

Zettabyte: fills Indian Ocean
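The same ladder in plain numbers, as a small sketch. It assumes decimal (SI) prefixes, where each step up is 1000 times the previous one; the slides do not say whether powers of 1000 or powers of 1024 are meant. The names `UNITS` and `bytes_in` are made up for this illustration.

```python
# Assumption: decimal (SI) prefixes, so each unit is 1000x the one before.
# The slides do not state whether 1000 or 1024 is intended.
UNITS = [
    "Byte", "Kilobyte", "Megabyte", "Gigabyte",
    "Terabyte", "Petabyte", "Exabyte", "Zettabyte",
]


def bytes_in(unit):
    """Return how many bytes one `unit` holds (hypothetical helper)."""
    return 1000 ** UNITS.index(unit)


if __name__ == "__main__":
    for position, unit in enumerate(UNITS):
        # The exponent grows by 3 at every step of the ladder.
        print(f"{unit:<10} = 10^{3 * position} bytes")
```

Every step is a factor of a thousand, which is why the rice picture jumps so quickly from a cup to an ocean.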

Big Data Industry Overview

MapReduce

• MapReduce is a processing technique and a programming model for distributed computing, based on Java.
• The MapReduce algorithm contains two important tasks, namely Map and Reduce.
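To make the two tasks concrete, here is a minimal word-count sketch. It is written in Python and runs in a single process, so it only imitates the idea; it is not the Hadoop Java API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are invented for this illustration, and the grouping step in the middle is something a real framework would do for you between Map and Reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in order over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data simplified", "big data is big"]
    print(word_count(sample))
    # prints {'big': 3, 'data': 2, 'simplified': 1, 'is': 1}
```

On a cluster, many Map calls would run in parallel on different blocks of the input, and each Reduce call would receive all the values for one key.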

MapReduce

Hadoop Cluster

What can you do on Big Data?

Get started with these:

Cloudera

Hortonworks

Why Big Data?

Business Intelligence

Hortonworks

Cloudera

Why Hadoop?

-> Hadoop modeling and development: MapReduce, Pig, Mahout
-> Hadoop storage and data management: HDFS, HBase, Cassandra
-> Hadoop data warehousing, summarization and query: Hive, Sqoop
-> Hadoop data collection, aggregation and analysis: Chukwa, Flume
-> Hadoop metadata, table and schema management: HCatalog
-> Hadoop cluster management, job scheduling and workflow: ZooKeeper, Oozie and Ambari
-> Hadoop data serialization: Avro

Big Data in a Nutshell

Got questions?

Text/WhatsApp on 974-086-1099

What Next?

Dive in and Explore

Typical Use Case

Resources

MapReduce visual explanation: https://ayende.com/blog/4435/map-reduce-a-visual-explanation

MultiNode on Amazon: https://dzone.com/articles/how-set-multi-node-hadoop

Run Sample MapReduce Examples:

MapReduce examples: http://www.informit.com/articles/article.aspx?p=2190194&seqNum=3

https://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig/