Collecting and Analyzing Sensor Data: Big Data with Hadoop or Other NoSQL Databases

Upload: matteo-redaelli

Post on 01-Dec-2014

Category: Technology


DESCRIPTION

A scouting overview of how to collect and analyze sensor data with Hadoop or other NoSQL databases.

TRANSCRIPT

Page 1: Collecting and analyzing sensor data with hadoop or other no sql databases

Collecting and Analyzing Sensor Data

Big data with Hadoop or other NoSQL databases

Page 2: Collecting and analyzing sensor data with hadoop or other no sql databases

Who am I

I am an Open Source enthusiast!

matteo DOT redaelli AT gmail DOT com

http://www.redaelli.org/matteo/

Page 3: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop ecosystem (1 of 2)

● HDFS is the distributed file system of Hadoop: data are usually stored as text/CSV files, which are split into blocks and distributed across the nodes of the cluster

● Hive is the data warehouse of Hadoop: it runs SQL-like queries (HiveQL) over files stored in HDFS
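To make the two bullets concrete, here is a minimal sketch in plain Python of sensor readings kept as CSV rows and summarized per sensor. It does not talk to HDFS or Hive; the column names (`sensor_id`, `timestamp`, `temperature`) and the sample rows are assumptions made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sensor readings, one CSV row per measurement, as they
# might sit in a text file on HDFS (column names are assumptions).
RAW = """sensor_id,timestamp,temperature
s1,2014-12-01T10:00:00,20.0
s1,2014-12-01T10:01:00,22.0
s2,2014-12-01T10:00:00,18.5
"""

def average_by_sensor(text):
    """Average temperature per sensor, i.e. what a HiveQL query like
    SELECT sensor_id, AVG(temperature) ... GROUP BY sensor_id computes."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(text)):
        sums[row["sensor_id"]] += float(row["temperature"])
        counts[row["sensor_id"]] += 1
    return {sid: sums[sid] / counts[sid] for sid in sums}

averages = average_by_sensor(RAW)
print(averages)
```

On a real cluster the grouping would run in parallel over the blocks of the file; the single loop here only shows the shape of the computation.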

Page 5: Collecting and analyzing sensor data with hadoop or other no sql databases

Collecting

Apache Flume (originally developed at Cloudera)
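Flume moves events from a source through a channel to a sink. The sketch below imitates that flow in plain Python, assuming a list of text lines as the input and a list standing in for the destination file; it is not Flume's API or configuration, and the function names are hypothetical.

```python
import queue

def source(lines, channel):
    """Source: wrap each raw line in an event and put it on the channel."""
    for line in lines:
        channel.put({"body": line.strip()})

def sink(channel, store):
    """Sink: drain the channel and append each event body to the store
    (a list standing in for a file on HDFS)."""
    while not channel.empty():
        store.append(channel.get()["body"])

channel = queue.Queue()   # buffer that decouples source and sink
hdfs_file = []            # hypothetical destination
source(["s1,20.0\n", "s2,18.5\n"], channel)
sink(channel, hdfs_file)
print(hdfs_file)
```

The channel is the point of the design: the source can keep accepting readings even when the sink is temporarily slower.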

Page 9: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop evolution (1 of 2)

Page 10: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop evolution (2 of 2)

Page 12: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop top distributions: Hortonworks

Page 13: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop top distributions: MapR

Page 16: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop alternatives: Riak

Riak: http://docs.basho.com/riak/1.2.1/cookbooks/use-cases/sensor-data/
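In a key-value store, most of the modeling work for sensor data goes into the key. The sketch below shows one possible scheme, a key built from the sensor id and an hourly time bucket, using a plain dict in place of the store. The scheme and all names are assumptions for illustration and are not taken from the Riak documentation.

```python
from datetime import datetime

def reading_key(sensor_id, ts):
    """Build a key such as 's1:2014-12-01T10' (sensor id plus hour bucket)."""
    return f"{sensor_id}:{ts.strftime('%Y-%m-%dT%H')}"

store = {}  # plain dict standing in for a key-value bucket

def put_reading(sensor_id, ts, value):
    """Append a reading to the list stored under its hourly key."""
    key = reading_key(sensor_id, ts)
    store.setdefault(key, []).append((ts.isoformat(), value))

put_reading("s1", datetime(2014, 12, 1, 10, 0), 20.0)
put_reading("s1", datetime(2014, 12, 1, 10, 30), 22.0)
put_reading("s1", datetime(2014, 12, 1, 11, 5), 21.0)
print(sorted(store))
```

With keys of this form, all readings of one sensor for a given hour can be fetched with a single lookup, without scanning the store.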

Page 17: Collecting and analyzing sensor data with hadoop or other no sql databases

Hadoop alternatives: Kafka + Storm

Apache Kafka (from LinkedIn) for aggregating the incoming messages

Apache Storm (from Twitter) for real-time computing
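The split of roles can be sketched in plain Python: a generator stands in for messages read from a Kafka topic, and a function updates a per-window result as each message arrives, which is the kind of work a Storm topology would do. Neither library is used here; the message layout and the 60-second window are assumptions.

```python
def stream():
    """Stand-in for messages consumed from a topic:
    (epoch_seconds, sensor_id, value). Sample data is made up."""
    yield (0, "s1", 20.0)
    yield (30, "s1", 22.0)
    yield (65, "s1", 30.0)
    yield (70, "s2", 18.0)

def windowed_max(messages, window_seconds=60):
    """Keep the running maximum per (window start, sensor),
    updated one message at a time."""
    result = {}
    for ts, sensor_id, value in messages:
        key = (ts - ts % window_seconds, sensor_id)
        result[key] = max(value, result.get(key, value))
    return result

maxima = windowed_max(stream())
print(maxima)
```

Unlike a batch job over stored files, the result is available as soon as each message is processed.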

Page 18: Collecting and analyzing sensor data with hadoop or other no sql databases

Alternatives: time series databases

OpenTSDB (storage: Hadoop HBase)

InfluxDB

KairosDB (storage: Cassandra, Hadoop HBase)
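A time series database typically stores each reading as a data point (metric name, timestamp, value, tags) and answers queries by filtering on tags and downsampling over time. The following sketch shows that idea in plain Python with made-up data; it is not the query API of any of the databases listed above, and the function and field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data points: (metric, epoch_seconds, value, tags)
points = [
    ("temperature", 0, 20.0, {"sensor": "s1"}),
    ("temperature", 20, 22.0, {"sensor": "s1"}),
    ("temperature", 70, 24.0, {"sensor": "s1"}),
    ("temperature", 10, 18.0, {"sensor": "s2"}),
]

def downsample(points, metric, tags, interval):
    """Average the points matching metric and tags over fixed
    intervals of `interval` seconds, keyed by interval start."""
    buckets = defaultdict(list)
    for name, ts, value, point_tags in points:
        if name == metric and all(point_tags.get(k) == v for k, v in tags.items()):
            buckets[ts - ts % interval].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}

series = downsample(points, "temperature", {"sensor": "s1"}, 60)
print(series)
```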