Hadoop for Rubyists
TRANSCRIPT
-
8/3/2019 Hadoop for Rubyists
1/15
Hadoop for Rubyists
Loren [email protected]
Friday, October 7, 2011
-
2/15
GOVT SEARCH
-
3/15
FROM LOGS TO DATA
BizLogic
-
4/15
SUPER SIMPLE WINS
-
5/15
VERSION 1.0
-
6/15
One year later...
-
7/15
-
8/15
HIVE = HDFS + Schema + HQL
select ds, count(*) cnt
from logs
group by ds
order by cnt
-
9/15
HIVE WITH RUBY
HDFS + Schema + HQL with custom mapper
add file /local/path/to/queries_mapper.rb;
select transform(host, time, agent, ...)
using './queries_mapper.rb'
as host, time, agent, query, affiliate, locale, is_bot, ...
from logs
where ...
group by ...
having ...
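The slides do not show queries_mapper.rb itself, so what follows is only a rough sketch of the per-row logic such a mapper might contain. Hive's transform streams each row to the script as one tab-separated line and reads tab-separated lines back. The method name `map_row`, the `BOT_PATTERN` regex, and the sample row are assumptions made for illustration, and the columns elided with "..." on the slide are left out.

```ruby
# Hypothetical sketch of the per-row logic in a Hive transform mapper.
# This is not the script from the talk.

# Assumed pattern for spotting crawlers in the user-agent string.
BOT_PATTERN = /bot|crawler|spider/i

# Turn one tab-separated input row (host, time, agent) into one
# tab-separated output row (host, time, agent, is_bot).
def map_row(line)
  host, time, agent = line.chomp.split("\t", -1)
  is_bot = agent.to_s.match?(BOT_PATTERN)
  [host, time, agent, is_bot].join("\t")
end

# In a real mapper the rows would come from STDIN:
#   STDIN.each_line { |line| puts map_row(line) }
# A made-up row is used here so the sketch runs on its own.
puts map_row("10.0.0.1\t2011-10-07 09:00:00\tGooglebot/2.1\n")
```

Keeping the row logic in a plain method means it can be checked in isolation before the script is ever registered with `add file`.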
-
10/15
STDIN & STDOUT
% cat logfile | ./queries_mapper.rb
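Because the mapper only reads STDIN and writes STDOUT, the same plumbing can be exercised without a cluster. The sketch below is one possible way to structure that, not the approach from the talk: `run_filter` is a hypothetical helper, the rows are invented, and `StringIO` objects stand in for the `cat logfile |` pipe.

```ruby
require "stringio"

# Hypothetical helper: read tab-separated rows from `input`, hand each row's
# fields to the block, and write the block's result as a tab-separated row.
def run_filter(input, output)
  input.each_line do |line|
    fields = line.chomp.split("\t", -1)
    row = yield(fields)
    output.puts row.join("\t")
  end
end

# StringIO stands in for the pipe; these rows are made up.
sample = StringIO.new("10.0.0.1\t2011-10-07\tMozilla/5.0\n" \
                      "10.0.0.2\t2011-10-07\tGooglebot/2.1\n")
result = StringIO.new

# Keep the host and a lowercased agent, dropping the time column.
run_filter(sample, result) { |host, _time, agent| [host, agent.downcase] }

print result.string
```

In an actual mapper script, the same call would take `$stdin` and `$stdout` in place of the two `StringIO` objects.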
-
11/15
VERSION 2.0
-
12/15
BUT WHAT ABOUT...
Hive UDF
Hadoop streaming
Wukong/MRToolkit
Java MR (kidding!)
-
13/15
WHERE ARE YOU?
Add slaves
Cache layer
Larger boxes
Denormalize
Remove indexes
Shard
-
14/15
Parting words
The Hadoop ecosystem is rich and very complex
No one piece is too hard
You can leverage your Ruby/SQL skills with Hive
Start somewhere, it's fun!
-
15/15
THANK YOU!