Real Time Analytics via Spark & Scala | Spark & Scala Fundamentals | Spark & Scala Architecture

© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com

Upload: skillspeed

Post on 07-Aug-2015

Category: Data & Analytics


Session Objectives

ᗍ Introduction to Big Data and Hadoop

ᗍ Understanding Spark

ᗍ Why Spark?

ᗍ Note on Scala

ᗍ Getting answers to interview questions on Spark

ᗍ Getting your doubts cleared

Big Data and its Challenges

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications

Systems and enterprises generate huge amounts of data, from terabytes to petabytes of information

Managing data at this scale is very difficult

Get Started with Spark and Scala

Who Generates Big Data?

Have you ever wondered how Google, Facebook, or LinkedIn manage to store and use such huge volumes of data? Today, managing big data at this scale is becoming a problem for all of us.

Spark - Lightning-Fast Cluster Computing

Apache Spark is an open source big data processing framework

Spark is used at a wide range of organizations to process large datasets

1. Easy to develop

2. Interactive shell

3. Out-of-the-box functionality

Why Spark?

Speed: Spark is advertised as running programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

Ease of Use: Lets you write applications quickly in Java, Scala, or Python.
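A minimal sketch of what writing an application quickly can look like in Scala: a word count over a small in-memory collection using Spark's RDD API. The object name `WordCount`, the helper `countWords`, the sample sentences, and the `local[*]` master are illustrative assumptions, not part of the slides, and the code assumes the `spark-core` library is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are assumptions.
object WordCount {
  // Counts how often each word occurs across the given lines.
  def countWords(sc: SparkContext, lines: Seq[String]): Map[String, Int] =
    sc.parallelize(lines)            // distribute the lines as an RDD
      .flatMap(_.split("\\s+"))      // split each line into words
      .filter(_.nonEmpty)            // drop empty tokens
      .map(word => (word, 1))        // pair each word with a count of 1
      .reduceByKey(_ + _)            // add up the counts per word
      .collect()                     // bring the results back to the driver
      .toMap

  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark inside this JVM, using all available cores.
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      countWords(sc, Seq("spark and scala", "spark streaming"))
        .foreach { case (word, n) => println(s"$word\t$n") }
    } finally {
      sc.stop()
    }
  }
}
```

The same chain of transformations can also be typed line by line into the interactive `spark-shell`, where a `SparkContext` named `sc` is already provided.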

Generality: Combines SQL, streaming, and complex analytics.
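As a rough illustration of SQL sitting next to ordinary Scala code, the sketch below registers an in-memory collection as a table and queries it with SQL. It assumes Spark 2.0 or later (the `SparkSession` entry point, which postdates these 2015 slides) and the `spark-sql` library on the classpath. The object `SqlDemo`, the helper `hitsPerPage`, the view name `views`, and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; table name and data are assumptions.
object SqlDemo {
  // Sums hits per page with a SQL query over an in-memory DataFrame.
  def hitsPerPage(spark: SparkSession, rows: Seq[(String, Int)]): Map[String, Long] = {
    import spark.implicits._ // enables .toDF on local collections
    rows.toDF("page", "hits").createOrReplaceTempView("views")
    spark.sql("SELECT page, SUM(hits) AS total FROM views GROUP BY page")
      .collect()
      .map(row => row.getString(0) -> row.getLong(1)) // SUM over ints yields a long
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SqlDemo").master("local[*]").getOrCreate()
    try {
      hitsPerPage(spark, Seq(("home", 3), ("about", 1), ("home", 2)))
        .foreach { case (page, total) => println(s"$page\t$total") }
    } finally {
      spark.stop()
    }
  }
}
```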

Runs Everywhere: Runs on Hadoop, Mesos, standalone, or in the cloud.
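In practice, where an application runs is mostly a matter of the master URL it is given, while the application code stays the same. The sketch below, which assumes `spark-core` on the classpath, reads the master URL from the first command-line argument and falls back to local mode. The object `RunsEverywhere`, the helper `buildConf`, and the host names in the comments are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; host names below are placeholders.
object RunsEverywhere {
  // Common master URL forms:
  //   local[*]             run inside this JVM, using all cores
  //   spark://host:7077    a Spark standalone cluster
  //   mesos://host:5050    a Mesos cluster
  //   yarn                 Hadoop YARN (Spark 1.x used yarn-client / yarn-cluster)
  def buildConf(master: String): SparkConf =
    new SparkConf().setAppName("RunsEverywhere").setMaster(master)

  def main(args: Array[String]): Unit = {
    val master = args.headOption.getOrElse("local[*]")
    val sc = new SparkContext(buildConf(master))
    try {
      // The job itself does not depend on which cluster manager is used.
      println(sc.parallelize(1 to 100).reduce(_ + _))
    } finally {
      sc.stop()
    }
  }
}
```

When an application is launched with `spark-submit`, the master is usually passed with the `--master` option rather than set in code.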

Spark Architecture

Spark Architecture includes the following three main components:

ᗍ Data Storage (HDFS, other formats): data management

ᗍ API (Scala, Java, Python): compute interface

ᗍ Management Framework (Stand-alone, Mesos, YARN): distributed computing

Spark Ecosystem

Spark Core Engine, with the following components built on top:

ᗍ Shark (SQL)

ᗍ Spark Streaming (streaming)

ᗍ MLlib (machine learning)

ᗍ GraphX (graph computation)

ᗍ SparkR (R on Spark)

ᗍ BlinkDB (approximate SQL)

Diagram legend: Alpha/Pre-alpha

The Spark Users

Scala

ᗍ Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way

ᗍ It integrates features of object-oriented and functional languages

ᗍ Publicly released in January 2004 on the JVM Platform and a few months later on the .NET platform
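As a small sketch of how the object-oriented and functional sides fit together, the snippet below defines a tiny class hierarchy and then processes instances with higher-order functions and pattern matching. The names `Shape`, `Circle`, `Rect`, and `ShapeDemo` are hypothetical, chosen only for illustration.

```scala
// Hypothetical example types; not from the slides.
sealed trait Shape { def area: Double } // object-oriented: a common interface

final case class Circle(radius: Double) extends Shape {
  def area: Double = math.Pi * radius * radius
}

final case class Rect(width: Double, height: Double) extends Shape {
  def area: Double = width * height
}

object ShapeDemo {
  // Functional: map and sum over an immutable collection.
  def totalArea(shapes: Seq[Shape]): Double = shapes.map(_.area).sum

  // Functional: pattern matching on the case classes.
  def describe(shape: Shape): String = shape match {
    case Circle(r)  => s"circle of radius $r"
    case Rect(w, h) => s"rectangle $w x $h"
  }

  def main(args: Array[String]): Unit = {
    val shapes = Seq(Circle(1.0), Rect(2.0, 3.0))
    shapes.foreach(s => println(describe(s)))
    println(totalArea(shapes))
  }
}
```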

Why Scala?

ᗍ Scala is a type-safe JVM language that combines object-oriented and functional programming in a concise, logical, and powerful language

ᗍ Scala offers a lot of functional syntactic sugar that has become popular with developers, which leads many of them to characterize Scala as a more functional language

ᗍ Scala is a statically typed language

ᗍ It is used heavily in big data and development frameworks such as Spark, Akka, and Scalding
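The points above about static typing and syntactic sugar can be sketched in a few lines. The object `TypedDemo` and its members are hypothetical and use only the Scala standard library.

```scala
// Hypothetical example object; not from the slides.
object TypedDemo {
  // Types are inferred, but still checked at compile time.
  val counts = Map("spark" -> 3, "scala" -> 2) // inferred as Map[String, Int]

  // Option makes a possibly missing key explicit in the return type.
  def lookup(word: String): Option[Int] = counts.get(word)

  // A for-comprehension is syntactic sugar for withFilter/map calls.
  def evenSquares(limit: Int): Seq[Int] =
    for (n <- 1 to limit if n % 2 == 0) yield n * n

  def main(args: Array[String]): Unit = {
    println(lookup("spark").getOrElse(0))
    println(evenSquares(6))
    // val n: Int = "three"   // would be rejected by the compiler: type mismatch
  }
}
```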

Job Trends – Spark and Scala

Why SkillSpeed?

ᗍ Course Curriculum from Industry Experts

ᗍ Instructor Led Live Virtual Sessions

ᗍ Lifetime access to Course Content via LMS

ᗍ 100% Placement Assistance

ᗍ 24x7 Support

Course Topics

Module 1: Getting Started / Introduction to Scala

Module 2: Scala – Essentials and Deep Dive

Module 3: Introducing Traits and OOPS in Scala

Module 4: Functional Programming in Scala

Module 5: Spark and Big Data

Module 6: Advanced Spark Concepts

Module 7: Understanding RDDs

Module 8: Shark, SparkSQL and Project Discussion

Corporate Partners

Lines open 24/7

To know more about the course, please contact:

IND: +91-90660-20904

USA: 1866-607-6547 (Toll Free)

Or reach us at

[email protected]

Contact us..
