Apache Bigtop Working Group Cluster stuff

Upload: brennen-haymes

Post on 29-Mar-2015


Page 1: Apache Bigtop Working Group Cluster stuff. Cloud computing

Apache Bigtop Working Group

Cluster stuff

Page 2

Cloud computing

Page 3

Bigtop Administration

• Make sure you are signed up on the bigtop-dev mailing list. A lot of information is posted there and will not be repeated if you miss it

• Lists: bigtop-user, bigtop-dev

Page 4

Bigtop Administration

• Sign up for JIRA

Page 5

Bigtop Administration

– Registration: join Biocurious. Registration pays for the space; nobody takes a cut of it

– Free drinks

– Registration = AWS credits

– Cancelling IntelliJ. Expires end of April.

– [email protected]

Page 6

Newbie Slide

• Structure:

– Do labs
    • Lab 1: Modified to take 1-2 weeks. Update the wiki with your findings
    • Lab 2: Build Bigtop 0.3.0
    • Can start projects here, do JIRA tickets
    • Lab 3: MapReduce program
    • Lab 4: Run the unit tests under the component downloads
    • Lab 5: Run the integration tests
    • Lab 6: Puppet, deploy and run
    • Lab 7: Port a module

– Labs are changing; this is not a class. Requires a time commitment

– Demo: doesn’t need to be working; it is for your benefit, not ours

Page 8

Lab 1

• Install Bigtop, run all the components: Hive/HBase/Pig/Hadoop/Mahout/Oozie

• There are bugs; document them

• Add the sample programs in quickstart to the wiki. Not all are included yet

Page 9

Lab 1

• Update the wiki

• Sqoop: open (user group meeting next week)

• Flume/Flume NG (open/nothing)

• ZooKeeper (open/nothing)

Page 10

Hadoop Components

• Old: Don’t stop at running Pi as a test of HDFS

• Still missing: Run TeraSort in Hadoop; needs a cluster

• https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop

• Whirr may need a patch depending on where you run it from
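
The TeraSort item above can be sketched as a three-step run of the stock Hadoop example programs (teragen, terasort, teravalidate). This is only a sketch: the examples jar location, the row count, and the HDFS paths are assumptions that do not come from the slides, and the script defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch of a TeraGen -> TeraSort -> TeraValidate run.
# ASSUMPTIONS (not from the slides): the examples jar path, the row count,
# and the HDFS paths are placeholders; adjust them for your install.
EXAMPLES_JAR="${EXAMPLES_JAR:-/usr/lib/hadoop/hadoop-examples.jar}"
ROWS="${ROWS:-10000000}"     # TeraGen rows are 100 bytes each, so about 1 GB
BASE="${BASE:-/tmp/terasort}"
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Generate input, sort it, then validate that the output is ordered.
terasort_all() {
    run hadoop jar "$EXAMPLES_JAR" teragen "$ROWS" "$BASE/in" &&
    run hadoop jar "$EXAMPLES_JAR" terasort "$BASE/in" "$BASE/out" &&
    run hadoop jar "$EXAMPLES_JAR" teravalidate "$BASE/out" "$BASE/report"
}

terasort_all
```

Set DRY_RUN=0 once the jar path has been checked against the actual install. On a single node the run mostly exercises the plumbing; the numbers only become meaningful on a cluster.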

Page 11

Mahout

• Don’t run the jar directly as you would in Hadoop

• Scripts handle downloading and clustering, the email demo, etc. They live under /examples/bin

• Bigtop puts examples/bin under /usr/share/doc/mahout. Is this correct? The scripts are not documentation

• Add documentation to the wiki

• Ticket filed
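
To see which example scripts a package install actually left on disk, something like the following may help. It is a sketch only: the default directory is the one named on the slide, the subdirectory layout below it is not given there, and the function name is made up for illustration.

```shell
# List the Mahout example scripts found under a documentation directory.
# ASSUMPTION: the default directory is the one named on the slide; the exact
# layout below it is unknown, so the whole tree is searched.
# list_mahout_examples is a hypothetical helper name, not part of Mahout.
list_mahout_examples() {
    dir="${1:-/usr/share/doc/mahout}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    find "$dir" -type f -name '*.sh' | sort
}
```

Call it as `list_mahout_examples` for the default location, or pass another directory as the first argument.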

Page 12

Oozie

• Oozie runs; ignore the error message. Set to the highest version

Page 13

Oozie

Page 14

Flume/Flume NG

• New patch check-in for Flume NG

• Testing

Page 15

Whirr

• sudo apt-get install whirr

• Run as: whirr launch-cluster --config /usr/lib/whirr/recipes/mahout-ec2.properties

• If successful, you will see a directory under ~/.whirr

• whirr.log

• mvn clean install
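
A quick check for the two artifacts mentioned above (a directory under ~/.whirr and whirr.log) could look like this. It is a sketch under assumptions: the function name is hypothetical, and the log is looked for in the current directory unless a path is passed in.

```shell
# Check for the artifacts the slide mentions after a whirr launch-cluster run.
# ASSUMPTIONS: check_whirr_launch is a hypothetical helper name; the log
# defaults to ./whirr.log, so pass the real path as $2 if it lives elsewhere.
check_whirr_launch() {
    state_dir="${1:-$HOME/.whirr}"
    log_file="${2:-./whirr.log}"
    status=0
    if [ -d "$state_dir" ] && [ -n "$(ls -A "$state_dir" 2>/dev/null)" ]; then
        echo "cluster state found under $state_dir:"
        ls "$state_dir"
    else
        echo "nothing under $state_dir; the launch probably did not complete"
        status=1
    fi
    if [ -f "$log_file" ]; then
        echo "last lines of $log_file:"
        tail -n 20 "$log_file"
    else
        echo "no log at $log_file"
    fi
    return "$status"
}
```

Run `check_whirr_launch` from the directory where the launch command was issued; a non-zero exit status means no cluster state was found.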

Page 16

Puppet

• sudo apt-get install puppet facter (fails)

Page 17

Ticket Questions/Demo

• Should the Bigtop install include stable for Ubuntu? Diff between stable and bigtop-0.3.0-incubating. There used to be a diff.

• Monitoring: metrics.properties -> metrics2

• Ganglia or JMX? All components w/daemon

• Bruno has Ganglia recipes to monitor the status of the cluster. Hadoop monitoring: performance and functionality. Hooked up to Kerberos / the commercial version is Cloudera Manager. Networking, I/O, block sizes, swap space, disk space.

• Stable vs. incubating?

• Anwar: LogMining (M/R, clickstream and FE log data, exceptions on a day-to-day basis)