Apache Apex & Bigtop

© 2016 DataTorrent Chinmay Kolhatkar Committer, Apache Apex Engineer, DataTorrent June 21, 2016 Apache Apex-Bigtop

Upload: apache-apex

Post on 12-Jan-2017


TRANSCRIPT

Page 1: Apache Apex & Bigtop

© 2016 DataTorrent

Chinmay Kolhatkar
Committer, Apache Apex
Engineer, DataTorrent

June 21, 2016

Apache Apex-Bigtop

Page 2: Apache Apex & Bigtop

Agenda

• About Apache Apex
• Apex Platform Overview
• Apex - Native Hadoop Integration
• Apex Malhar Library
• Apex as a Bigtop component
• Installing Bigtop Apex
• Apex Docker sandbox
• Apex Docker sandbox Demo

Page 3: Apache Apex & Bigtop

About Apache Apex

• Platform and runtime engine that enables development of scalable and fault-tolerant distributed applications
• Hadoop native (Hadoop >= 2.2)
  • No separate service to manage stream processing
  • Streaming Engine built into Application Master and Containers
• Process streaming or batch big data
• High throughput and low latency
• Library of commonly needed business logic
• Write any custom business logic in your application

Page 4: Apache Apex & Bigtop

Apex Platform Overview

Page 5: Apache Apex & Bigtop

Apex - Native Hadoop Integration

• YARN is the resource manager

• HDFS used for storing any persistent state

Page 6: Apache Apex & Bigtop

Apex Malhar Library

RDBMS
• Vertica
• MySQL
• Oracle
• JDBC

NoSQL
• Cassandra, HBase
• Aerospike, Accumulo
• Couchbase / CouchDB
• Redis, MongoDB
• Geode

Messaging
• Kafka
• Solace
• Flume, ActiveMQ
• Kinesis, NiFi

File Systems
• HDFS / Hive
• NFS
• S3

Parsers
• XML
• JSON
• CSV
• Avro
• Parquet

Transformations
• Filters
• Rules
• Expression
• Dedup
• Enrich

Analytics
• Dimensional Aggregations (with state management for historical data + query)

Protocols
• HTTP
• FTP
• WebSocket
• MQTT
• SMTP

Other
• Elasticsearch
• Script (JavaScript, Python, R)
• Solr
• Twitter

Page 7: Apache Apex & Bigtop

Apex as a Bigtop component

• Uses the Bigtop framework for ease of deployment
  • Deployment using Puppet recipes and Vagrant
  • Can spawn multi-node clusters for Docker, VM & OpenStack
• Generates deployable binaries for the Apex engine
  • RPM - CentOS 5 & 6, Fedora 20, openSUSE 42.1
  • DEB - Ubuntu 14.04 & 16.04, Debian 8
• Allows validating installations
  • Package Test
  • Smoke Test

Page 8: Apache Apex & Bigtop

Installing Bigtop Apex
Bigtop 1.1.0 (Current)

• Add Bigtop Repository
  • http://www.apache.org/dist/bigtop/bigtop-1.1.0/repos/
• Install bigtop-hadoop
  • For Debian: apt-get install hadoop\*
  • For RPM: yum install hadoop\*
• Download bigtop-apex from Bigtop CI
  • https://ci.bigtop.apache.org/job/Bigtop-trunk-packages/
• Install Apex:
  • For Debian: dpkg -i apex_3.4.0-1_all.deb
  • For RPM: rpm -i apex-3.4.0-1.el6.noarch.rpm
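
The Bigtop 1.1.0 steps on this slide can be sketched as a small shell script. This is only a sketch under stated assumptions, not an official installer: the names `run`, `install_apex_110` and the `DRY_RUN` switch are hypothetical, the package file names are the ones shown on the slide, and the script assumes the Bigtop repository has already been added and the Apex package has already been downloaded from Bigtop CI into the current directory (the slide does not give exact file paths for either step). By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper script; DRY_RUN=1 (the default) prints commands
# instead of executing them, so it is safe to try without root.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
# Note: the printed form does not show shell quoting.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 is the package family: "deb" (Debian/Ubuntu) or "rpm" (CentOS/Fedora).
install_apex_110() {
    family="$1"
    case "$family" in
        deb)
            # 'hadoop*' is quoted so the shell passes the pattern to apt-get
            # unexpanded (same effect as hadoop\* on the slide).
            run apt-get install 'hadoop*'
            run dpkg -i apex_3.4.0-1_all.deb
            ;;
        rpm)
            run yum install 'hadoop*'
            run rpm -i apex-3.4.0-1.el6.noarch.rpm
            ;;
        *)
            echo "unknown package family: $family" >&2
            return 1
            ;;
    esac
}
```

Calling `install_apex_110 deb` prints the two Debian commands in order; with `DRY_RUN=0` and root privileges the same call would execute them.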

Page 9: Apache Apex & Bigtop

Installing Bigtop Apex
Bigtop 1.2.0 (Next Release)

• Add Bigtop Repository (Future URL)
  • http://www.apache.org/dist/bigtop/bigtop-1.2.0/repos/
• Install apex
  • For Debian: apt-get install apex
  • For RPM: yum install apex
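
With Apex served from the Bigtop 1.2.0 repository, the install reduces to a single package-manager command. A minimal sketch of choosing that command follows; the helper names `detect_pkg_manager` and `apex_install_cmd` are hypothetical, and the repository is assumed to be configured already.

```shell
#!/bin/sh
# Print the name of the first supported package manager found on PATH.
detect_pkg_manager() {
    for pm in apt-get yum; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Map a package manager name to the install command shown on the slide.
apex_install_cmd() {
    case "$1" in
        apt-get) echo "apt-get install apex" ;;
        yum)     echo "yum install apex" ;;
        *)       echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}
```

For example, `apex_install_cmd "$(detect_pkg_manager)"` prints the command to run as root on the current machine.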

Page 10: Apache Apex & Bigtop

Apex Docker sandbox

• A quick starter Apex docker image: https://hub.docker.com/r/chinmayk/apex/
• Preconfigured and running components
  • HDFS (namenode, secondarynamenode, datanode)
  • YARN (resourcemanager, nodemanager, timelineserver)
• Preconfigured and installed component
  • Apex
• Get started:
  • Step 1: docker pull chinmayk/apex
  • Step 2: docker run -it chinmayk/apex:ubuntu-14.04
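
The two "Get started" steps can be wrapped in a short script. One detail worth noting: `docker pull` without a tag fetches the `latest` tag, while step 2 runs the `ubuntu-14.04` tag, so this sketch pulls the explicit tag instead (a small deviation from the slide; `docker run` would also pull a missing image on its own). The names `run`, `start_sandbox` and the `DRY_RUN` switch are hypothetical; by default the script only prints the commands.

```shell
#!/bin/sh
# Image and tag as given on the slide.
IMAGE="chinmayk/apex"
TAG="ubuntu-14.04"

# Hypothetical switch: DRY_RUN=1 (the default) prints commands only.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_sandbox() {
    # Pull the same tag that is started below, rather than "latest".
    run docker pull "${IMAGE}:${TAG}"
    # -it attaches an interactive terminal to the sandbox container.
    run docker run -it "${IMAGE}:${TAG}"
}
```

With `DRY_RUN=0`, calling `start_sandbox` downloads the image and opens an interactive session in the container.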

Page 11: Apache Apex & Bigtop

Apex Docker sandbox (contd.)

Page 12: Apache Apex & Bigtop

Resources

• Apache Apex website - http://apex.apache.org/
• Subscribe - http://apex.apache.org/community.html
• Download - http://apex.apache.org/downloads.html
• Twitter - @ApacheApex; Follow - https://twitter.com/apacheapex
• Facebook - https://www.facebook.com/ApacheApex/
• Meetup - http://www.meetup.com/topics/apache-apex
• SlideShare - http://www.slideshare.net/ApacheApex/presentations
• More Examples - https://github.com/DataTorrent/examples
• Startup Program – Free Enterprise License for Startups, Educational Institutions, Non-Profits - https://www.datatorrent.com/startups/
• Cloud Trial - https://www.datatorrent.com/download/cloud-trial/

Page 13: Apache Apex & Bigtop

We Are Hiring

[email protected]

• Back-End Engineers
• Front-End Engineers
• QA Automation Engineers
• Solutions Engineers