
SYLLABUS

DR VISHWANATH KARAD MIT - WORLD PEACE UNIVERSITY

FACULTY OF SCIENCE

M.SC. Big Data Analytics

BATCH – 2018-19

Dr. Sudhir Gavhane, Dean, LASC


PROGRAMME STRUCTURE

Preamble:

Big Data Analytics is needed to address the problems industry faces today. Its techniques and tools are used to solve problems in a wide variety of industries, such as manufacturing, services, retail, banking and finance, sports, pharmaceuticals, and aerospace.

Big Data Analytics is interdisciplinary. It is used to analyse ever-growing data (growing in volume, velocity, and variety) by applying techniques such as data mining, machine learning, and deep learning, drawn from computer science, statistics, and mathematics.

Big Data Analytics must keep pace with rapid changes in both domain knowledge and technology. It is one of the fastest-growing and most promising technologies.

The first year provides a foundation in Big Data technology, mathematics, and statistics, including programming languages. The programme covers technology such as Hadoop, techniques such as data mining, and computer programming, mathematics, and statistics subjects that give students this foundation.

The second year includes subjects from the track each student chooses according to his or her own interest, relevant to Big Data Analytics. It also includes advanced topics and technologies in Big Data. A mini project and an internship give students industrial exposure.


Vision and Mission of the Programme

Vision:

To contribute to society through excellence in scientific and knowledge-based education, utilizing the potential of computer science with a deep passion for wisdom, culture, and values.

Mission:

To offer thorough professional training that prepares students to embark on careers in Big Data Analytics, one of the fastest-growing technologies, and to provide a very good foundation for further study at PhD level.

To prepare and equip students for opportunities in ever-changing technology through hands-on industrial training.

To transform students into globally competent professionals through international training/internship.

To nurture creativity and inculcate entrepreneurial skills among the students.

Programme Educational Objectives

To enable learners to develop expert knowledge and analytical skills in current and developing areas of analysis, statistics, and machine learning.

To provide learners with a deep and systematic knowledge of business and technical strategies for data analytics and the subsequent skills to implement solutions in these areas.

To facilitate the learner's development of applied skills that are directly complementary and relevant to the workplace.

To develop in the learner a deep and systematic understanding of current issues of research and analysis.

To enable learners to conduct independent research and analysis in the field of data analytics.

To enable the learner to identify, develop and apply detailed analytical, creative, problem-solving skills.

To provide the learner with a comprehensive platform for career development, innovation and further study.


Programme Specific Outcomes

A graduate with an M.Sc. in Big Data Analytics will be able to communicate computer science concepts, designs, and solutions effectively and professionally.

The course offers training that prepares students to embark on careers in Big Data Analytics, one of the fastest-growing technologies, and provides a very good foundation for further study at PhD level.

It prepares and equips students for opportunities in ever-changing technology through hands-on industrial training.

It transforms students into globally competent professionals through internship.

It nurtures creativity and inculcates entrepreneurial skills among the students.

Project work gives students hands-on experience in solving a real-world problem.

The syllabus also develops the professional skills and problem-solving abilities required to pursue a career in the software industry.


Programme Structure:

(a) Programme duration: 2 years full time.

(b) System followed: Trimester

(c) Credits System:

(i) Per term or per year, as applicable

(ii) Total in the programme, as applicable

(d) Credits for activities other than academics: NA

(e) Internship: Three months of full-time industrial training must be completed.

(f) Assessment Criteria: A minimum of 50% of the first-year credits is required for admission to the second year.

(g) Branches or Specialisations: NA

(h) Mandatory Attendance to appear for examination: Students are expected to attend every lecture, tutorial, and laboratory practical session in a course for academic excellence. However, to allow for contingencies, the attendance requirement is a minimum of 90% of the classes scheduled/held.

(j) Medium of Instruction and Examination: English

(k) Eligibility criteria for admission to the programme: B.Sc. (CS), BCS, B.Sc. (IT), BCA, BE-IT, Comp., E&TC with 50% marks (45% aggregate marks for candidates from backward class categories and persons with disability belonging to Maharashtra state only).


M.Sc. Big Data Analytics

2017-18

A. Definition of Credit:

4 Hr. Lecture / Tutorial per week | 3 credits
3 Hours Practical (Lab) per week | 3 credits

B. Credits:

The total number of credits for the two-year Post Graduate M.Sc. Programme is 120.

C. Structure of Credits for the Post Graduate M.Sc. Programme:

S. No. | Category | Suggested Breakup of Credits (Total 120)
1 | Humanities and Social Sciences and Peace Programmes, including Management courses | 10
2 | Professional core courses, including Laboratory/Mini Project Work | 84
3 | Professional Elective courses | 06
4 | Full Time Industrial Training | 20
Total | | 120


D. Course code and definition:

Course code | Definition
L | Lecture
T | Tutorial
WP | Humanities and Social Sciences and Peace Programmes, including Management courses
MBD | M.Sc. (Big Data Analytics)

E. Grading Scheme:

Grades & Grade Points

Marks (out of 100) | Grade | Grade Point
80-100 | O: Outstanding | 10
70-79 | A+: Excellent | 9
60-69 | A: Very Good | 8
55-59 | B+: Good | 7
50-54 | B: Above Average | 6
45-49 | C: Average | 5
40-44 | Pass | 4
0-39 | Fail | 0
Ab | Absent | NA
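The grading scheme above is a lookup from marks to a grade and grade point. The following is a minimal Python sketch of that lookup, not an official university tool. The names GRADE_BANDS and grade_for are hypothetical, marks are assumed to be whole numbers, and passing None to stand for an absent student is an assumption made only for this illustration.

```python
# Grade bands copied from the grading scheme:
# (lowest mark, highest mark, grade, grade point).
GRADE_BANDS = [
    (80, 100, "O: Outstanding", 10),
    (70, 79, "A+: Excellent", 9),
    (60, 69, "A: Very Good", 8),
    (55, 59, "B+: Good", 7),
    (50, 54, "B: Above Average", 6),
    (45, 49, "C: Average", 5),
    (40, 44, "Pass", 4),
    (0, 39, "Fail", 0),
]


def grade_for(marks):
    """Return (grade, grade_point) for whole-number marks out of 100.

    None stands for an absent student (a convention assumed for this sketch);
    the scheme lists the grade point for "Absent" as NA, returned here as None.
    """
    if marks is None:
        return ("Absent", None)
    if not 0 <= marks <= 100:
        raise ValueError("marks must be between 0 and 100")
    for low, high, grade, point in GRADE_BANDS:
        if low <= marks <= high:
            return (grade, point)
    # Only reached for fractional marks that fall between two bands.
    raise ValueError("marks must be a whole number")
```

For example, grade_for(72) falls in the 70-79 band and returns the grade "A+: Excellent" with grade point 9.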


M.Sc. Big Data Analytics (First Year) (Batch 2017-18) Trimester – I

Type: Core
Weekly Teaching Hours: 25
Total Credits: First Year M.Sc. Big Data Analytics Trimester I: 20

*CCA: Class Continuous Assessment
*LCA: Laboratory Continuous Assessment
**Assessment Marks are valid only if Attendance criteria are met

Sr. No. | Course Code | Name of Course | Type | Theory (Hrs) | Tutorial (Hrs) | Lab (Hrs) | Credits (Th) | Credits (Lab) | CCA* | LCA* | End Term Test | Total
1 | MIT-WPU-MBD-1101 | Data Warehousing & Data Mining | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
2 | MIT-WPU-MBD-1102 | Parallel And Distributed Computing | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
3 | MIT-WPU-MBD-1103 | Big Data Architecture & Ecosystem - Hadoop | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
4 | MIT-WPU-MBD-1104 | Python Programming | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
5 | MIT-WPU-MBD-1105 | Lab on Python | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
6 | MIT-WPU-MBD-1106 | Lab on Hadoop using HDFS | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
7 | WP | Philosophy of Science and Spirituality | SEC | 3 | - | - | 2 | - | 25 | 25 | - | 50
Total | | | | 17 | 02 | 06 | 14 | 06 | 225 | 125 | 300 | 650


M.Sc. Big Data Analytics (First Year) (Batch 2017-18) Trimester – II

Type: Core
Weekly Teaching Hours: 25
Total Credits: First Year M.Sc. Big Data Analytics Trimester II: 20

*CCA: Class Continuous Assessment
*LCA: Laboratory Continuous Assessment
**Assessment Marks are valid only if Attendance criteria are met

Sr. No. | Course Code | Name of Course | Type | Theory (Hrs) | Tutorial (Hrs) | Lab (Hrs) | Credits (Th) | Credits (Lab) | CCA* | LCA* | End Term Test | Total
1 | MIT-WPU-MBD-1201 | R Programming | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
2 | MIT-WPU-MBD-1202 | Distributed Processing using Hadoop | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
3 | MIT-WPU-MBD-1203 | Operation Research | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
4 | MIT-WPU-MBD-1204 | Next Generation Databases | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
5 | MIT-WPU-MBD-1205 | Lab on R Programming | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
6 | MIT-WPU-MBD-1206 | Lab on Hadoop and Tools | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
7 | WP | Philosophy of Science and Spirituality | SEC | 3 | - | - | 2 | - | 25 | 25 | - | 50
Total | | | | 17 | 02 | 06 | 14 | 06 | 225 | 125 | 300 | 650


M.Sc. Big Data Analytics (First Year) (Batch 2017-18) Trimester – III

Type: Core
Weekly Teaching Hours: 25
Total Credits: First Year M.Sc. Big Data Analytics Trimester III: 20
Total First Year M.Sc. Big Data Analytics Credits: 60

*CCA: Class Continuous Assessment
*LCA: Laboratory Continuous Assessment
**Assessment Marks are valid only if Attendance criteria are met

Sr. No. | Course Code | Name of Course | Type | Theory (Hrs) | Tutorial (Hrs) | Lab (Hrs) | Credits (Th) | Credits (Lab) | CCA* | LCA* | End Term Test | Total
1 | MIT-WPU-MBD-1301 | Statistical Computing | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
2 | MIT-WPU-MBD-1302 | Information Security | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
3 | MIT-WPU-MBD-1303 | Apache Spark | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
4 | MIT-WPU-MBD-1304 | Machine Learning Algorithm-I | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
5 | MIT-WPU-MBD-1305 | Lab on Statistical Computing | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
6 | MIT-WPU-MBD-1306 | Lab on Machine Learning Algorithms-I | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
7 | WP | Creativity and Innovation | SEC | 3 | - | - | 2 | - | 25 | 25 | - | 50
Total | | | | 17 | 02 | 06 | 14 | 06 | 225 | 125 | 300 | 650


M.Sc. Big Data Analytics (Second Year) (Batch 2017-18) Trimester – I

Type: Core/Elective
Weekly Teaching Hours: 26
Total Credits: Second Year M.Sc. Big Data Analytics Trimester I: 20

*CCA: Class Continuous Assessment
*LCA: Laboratory Continuous Assessment
**Assessment Marks are valid only if Attendance criteria are met

Sr. No. | Course Code | Name of Course | Type | Theory (Hrs) | Tutorial (Hrs) | Lab (Hrs) | Credits (Th) | Credits (Lab) | CCA* | LCA* | End Term Test | Total
1 | MIT-WPU-MBD-2101 | Principles Of Deep Learning | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
2 | MIT-WPU-MBD-2102 | Machine Learning Algorithm-II | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
3 | MIT-WPU-MBD-2103 | Data Science life cycle | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
4 | MIT-WPU-MBD-2104 | Lab on Machine Learning Algorithms II | Core | 4 | - | - | - | 3 | - | 50 | 50 | 100
5 | MIT-WPU-MBD-2105 | Lab Data Science life cycle | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
6 | | Elective I | Elective | 4 | - | - | 3 | - | 50 | - | 50 | 100
7 | WP | Scientific studies of Peace – Mind, matter, Spirit and consciousness | SEC | 3 | - | - | 2 | - | 25 | 25 | - | 50
Total | | | | 21 | 02 | 03 | 14 | 06 | 225 | 125 | 300 | 650


M.Sc. Big Data Analytics (Second Year) (Batch 2017-18) Trimester – II

Type: Core/Elective
Weekly Teaching Hours: 26
Total Credits: Second Year M.Sc. Big Data Analytics Trimester II: 20

*CCA: Class Continuous Assessment
*LCA: Laboratory Continuous Assessment
**Assessment Marks are valid only if Attendance criteria are met

Sr. No. | Course Code | Name of Course | Type | Theory (Hrs) | Tutorial (Hrs) | Lab (Hrs) | Credits (Th) | Credits (Lab) | CCA* | LCA* | End Term Test | Total
1 | MIT-WPU-MBD-2201 | Natural Language Processing | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
2 | MIT-WPU-MBD-2202 | Web & Social Intelligence | Core | 3 | 1 | - | 3 | - | 50 | - | 50 | 100
3 | MIT-WPU-MBD-2203 | Cloud Computing | Core | 4 | - | - | 3 | - | 50 | - | 50 | 100
4 | MIT-WPU-MBD-2204 | Lab on Web & Social Intelligence | Core | 3 | 1 | - | 3 | - | - | 50 | 50 | 100
5 | MIT-WPU-MBD-2205 | Mini Project | Core | - | - | 3 | - | 3 | - | 50 | 50 | 100
6 | | Elective II | Elective | 4 | - | - | 3 | - | 50 | - | 50 | 100
7 | WP | Business-strategic planning and finance | SEC | 3 | - | - | 2 | - | 25 | 25 | - | 50
Total | | | | 21 | 02 | 03 | 17 | 03 | 225 | 125 | 300 | 650


M.Sc. Big Data Analytics (Second Year) (Batch 2017-18) Trimester – III

Type: Core
Weekly Teaching Hours: 15
Total Credits: Second Year M.Sc. Big Data Analytics Trimester III: 20
Total Second Year M.Sc. Big Data Analytics Credits: 60

*CCA: Class Continuous Assessment
*LCA: Laboratory Continuous Assessment
**Assessment Marks are valid only if Attendance criteria are met

Sr. No. | Course Code | Name of Course | Type | Weekly Workload, Hrs | Credits | Assessment Marks** (CCA*/LCA*, End Term Test, Total)
1 | MIT-WPU-MS-2301 | Full Time Industrial Training | Core | 4 | 3 | 50, 50, 100
Total | | | | 4 | 3 | 50, 50, 100


Elective Courses:

Elective | Code (Big Data Analytics) | Title | Code (Big Data Analytics) | Title
Elect I | MIT-WPU-MBD-2106 | Internet Of Things | MIT-WPU-MBD-2206 | Marketing Analytics
Elect II | MIT-WPU-MBD-2107 | Introduction to image processing | MIT-WPU-MBD-2207 | HR Analytics

Name of Specialisation: Big Data Analytics


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1101

Course Category Core Big Data Analytics

Course Title Data Warehousing & Data Mining

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Understanding of: relational database normalization techniques, physical design of a database, concepts of algorithm design and analysis.

Basic understanding of: software engineering principles and techniques, probability and statistics (Bayesian theory, regression, hypothesis testing).

Course Objectives:

1. To understand the structure of a data warehouse.

2. To understand different data pre-processing techniques.

3. To understand basic descriptive and predictive data mining techniques.

4. To use a data mining tool on different data sets.

5. To understand classification algorithms.

6. To understand prediction algorithms.

7. To understand clustering algorithms.

Course Outcomes:

The student will gain knowledge of:

Data processing and data quality.

Modelling and design of data warehouses.

Basic and advanced concepts of algorithms for data mining.

A data mining tool, with practical experience of applying data mining algorithms.

Course Contents:

Introduction to Data Mining

Basic concepts of data mining, types of data to be mined.

Introduction to Data Warehouse

Data warehouse and DBMS, architecture of a data warehouse.

Data pre-processing

Need for data pre-processing, attributes and data types.

Data Mining Techniques: Association Rule Mining

Basic idea: item sets, frequent item sets.

Data Mining Techniques: Classification

Definition of classification, decision tree induction: information gain, gain ratio, Gini index.

Data Mining Techniques: Prediction

Definition of prediction, linear regression.

Data Mining Techniques: Clustering

Definition of clustering, partitioning methods.

Performance Measures

Precision, recall, F-measure.

Problem solving with R or Weka

Filters, discretization, mining association rules, decision trees, prediction, k-means.
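The classification topic above names three split-quality measures used in decision tree induction: information gain, gain ratio, and the Gini index. As an illustration only, the sketch below computes them from their standard textbook definitions. The function names are hypothetical and the code is not taken from the course material. Each candidate split is assumed to be passed in as a list of label groups.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini index: one minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the groups."""
    total = len(labels)
    remainder = sum(len(group) / total * entropy(group) for group in groups)
    return entropy(labels) - remainder


def gain_ratio(labels, groups):
    """Information gain divided by the split information of the partition."""
    total = len(labels)
    split_info = -sum(
        (len(group) / total) * log2(len(group) / total) for group in groups if group
    )
    return information_gain(labels, groups) / split_info if split_info else 0.0
```

A split that separates two "yes" labels from two "no" labels perfectly removes all uncertainty, so its information gain equals the parent's entropy of one bit.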

Learning Resources:

Reference Books:

1. Han, Data Mining: Concepts and Techniques, Elsevier, ISBN: 9789380931913 / 9788131205358.

2. Margaret H. Dunham, S. Sridhar, Data Mining – Introductory and Advanced Topics, Pearson Education.

3. Kimball, Data Warehousing: Fundamentals for IT Professionals, 3rd edition, Wiley Publication.

4. Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, Elsevier (Morgan Kaufmann), ISBN: 9789380501864.

5. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining (2005), Addison Wesley, ISBN: 0-321-32136-7.

6. [Research papers]: Relevant research papers containing recent results and developments in the data mining field.

Pedagogy: Participative learning, discussions, algorithm, flowchart and program writing, experiential learning through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA): 50 Marks

Assignments | Test | Presentations | Case study | MCQ | Oral | Attendance
10 | 10 | - | - | 10 | 10 | 10

Term End Examination : 50 Marks


Syllabus:

Module No. | Contents | Theory (Hrs) | Lab (Hrs) | Assess
1 | Introduction to Data Mining: Basic concepts of data mining, Types of Data to be mined, Stages of the Data Mining Process, Data Mining Techniques, Knowledge Discovery in Databases, Data Mining Issues, Applications of Data Mining | 4 | - | -
2 | Introduction to Data Warehouse: Data Warehouse and DBMS, Architecture of Data Warehouse, Multidimensional data model, Concepts of OLAP and Data Cube, OLAP operations, Dimensional Data Modelling (Star and Snowflake schemas) | 5 | - | -
3 | Data pre-processing: Need for Data pre-processing, Attributes and Data types, Statistical descriptions of Data, Handling missing Data, Data sampling, Data cleaning, Data Integration and transformation, Data reduction, Discretization and generating concept hierarchies | 6 | - | -
4 | Data Mining Techniques: Association Rule Mining: Basic idea: item sets, Frequent Item-sets, Association Rule Mining, Generating item sets and rules efficiently, FP-growth algorithm | 4 | - | -
5 | Data Mining Techniques: Classification: Definition of Classification, Decision tree Induction: Information gain, gain ratio, Gini Index, Issues: Over-fitting, tree pruning methods, missing values, continuous classes, Classification and Regression Trees (CART), Bayesian Classification: Bayes Theorem, Naïve Bayes classifier, Bayesian Networks, Linear classifiers, Least squares, SVM classifiers, Lazy Learners (or Learning from Your Neighbors) | 9 | - | -
6 | Data Mining Techniques: Prediction: Definition of Prediction, Linear regression, Non-linear regression, Logistic regression | 3 | - | -
7 | Data Mining Techniques: Clustering: Definition of Clustering, Partitioning Methods, Hierarchical Methods, Distance Measures in Algorithmic Methods, Density Based Clustering | 6 | - | -
8 | Performance Measures: Precision, recall, F-measure, confusion matrix, cross-validation, bootstrap | 3 | - | -
9 | Problem solving with R or Weka: filters, Discretization, mining association rules, decision trees, Prediction, k-means | 5 | - | -
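Module 8 lists precision, recall, the F-measure, and the confusion matrix. As a small illustration of how these relate, the sketch below derives all of them from paired lists of actual and predicted labels using the standard definitions. The function names and the choice of a single "positive" class are assumptions for this example. It is not the R or Weka workflow the course uses.

```python
def confusion_counts(actual, predicted, positive):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(actual, predicted, positive):
    """Return (precision, recall, F1); a ratio with a zero denominator is 0.0."""
    tp, fp, fn, _ = confusion_counts(actual, predicted, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Precision measures how many predicted positives were correct, recall measures how many actual positives were found, and the F-measure computed here is their harmonic mean.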

Prepared By: Ms. Devyani B Kamble, Assistant Professor

Checked By: Ms. Pradnya Mahadik, BOS Chairman

Approved By: Dr. Sudhir Gavhane, Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1102

Course Category Core Big Data Analytics

Course Title Parallel And Distributed Computing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

1. Ability to program well in C, C++ or Fortran.

2. Willingness to rethink how problems should be solved.

3. Algorithms & Data Structures

4. Basics of Computer Architecture

Course Objectives:

1. To learn basic models of parallel machines and tools.

2. To learn how to parallelize programs and how to use basic tools such as MPI and POSIX threads.

3. To learn core ideas behind parallel and distributed computing.

4. To explore the methodologies adopted for concurrent and distributed environments.

5. To understand the networking aspects of parallel and distributed computing.

6. To provide an overview of the computational aspects of parallel and distributed computing.

7. To learn parallel and distributed computing models.

Course Outcomes:

Students will be able to:

1. Explore the methodologies adopted for concurrent and distributed environment.

2. Analyse the networking aspects of Distributed and Parallel Computing.

3. Explore the different performance issues and tasks in parallel and distributed computing.

4. Develop parallel algorithms for solving real-world problems.

Course Contents:

1. Parallel and Distributed Computing: Introduction, Benefits and Needs, Programming Environment, Theoretical Foundations. Parallel Algorithms: Parallel Models and Algorithms, Sorting, Matrix Multiplication, Convex Hull, Pointer Based Data Structures.

2. Synchronization, Process Parallel Languages, Architecture of Parallel and Distributed Systems, Consistency and Replication, Security, Parallel Operating Systems.

3. Management of Resources in Parallel Systems, Tools for Parallel Computing, Parallel Database Systems and Multimedia Object Servers.

4. Networking Aspects of Distributed and Parallel Computing, Process, Parallel and Distributed Scientific Computing.

5. High-Performance Computing in Molecular Sciences, Communication, Multimedia Applications for Parallel and Distributed Systems, Distributed File Systems.

Learning Resources:

Reference Books:

1. Jacek Błażewicz, et al., “Handbook on Parallel and Distributed Processing”, Springer Science & Business Media, 2013.

2. Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, Prentice-Hall, 2007.

3. George F. Coulouris, Jean Dollimore, and Tim Kindberg, “Distributed Systems: Concepts and Design”, Pearson Education, 2005.

4. Gregor Kosec and Roman Trobec, “Parallel Scientific Computing: Theory, Algorithms, and Applications of Mesh Based and Meshless Methods”, Springer, 2015.

Supplementary Reading:

1. Quinn, M. J., Parallel Computing: Theory and Practice (McGraw-Hill Inc.).

2. Gibbons, A., W. Rytter, Efficient Parallel Algorithms (Cambridge Uni. Press).

3. Shameem Akhter and Jason Roberts, Multi-Core Programming, Intel Press, 2006.

Weblinks:

1. https://www.tutorialspoint.com/parallel_algorithm/parallel_algorithm_introduction.htm

Pedagogy: Participative learning, discussions, demonstrations, practical, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments | Test | Presentations | Case study | MCQ | Oral | Attendance
20 | 10 | 10 | - | - | - | 10

Term End Examination : 50 Marks


Syllabus:

Module No. | Contents | Theory (Hrs) | Lab (Hrs) | Assess
1 | Parallel and Distributed Computing: Introduction, Benefits and Needs, Parallel and Distributed Systems, Programming Environment, Theoretical Foundations. Parallel Algorithms: Introduction, Parallel Models and Algorithms, Sorting, Matrix Multiplication, Convex Hull, Pointer Based Data Structures | 10 | - | -
2 | Synchronization, Process Parallel Languages, Architecture of Parallel and Distributed Systems, Consistency and Replication, Security, Parallel Operating Systems | 10 | - | -
3 | Management of Resources in Parallel Systems, Tools for Parallel Computing, Parallel Database Systems and Multimedia Object Servers | 6 | - | -
4 | Networking Aspects of Distributed and Parallel Computing, Process, Parallel and Distributed Scientific Computing | 11 | - | -
5 | High-Performance Computing in Molecular Sciences, Communication, Multimedia Applications for Parallel and Distributed Systems, Distributed File Systems | 8 | - | -


Prepared By: Ms. Deepali Sonawane, Assistant Professor

Checked By: Ms. Pradnya Mahadik, BOS Chairman

Approved By: Dr. Sudhir Gavhane, Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1103

Course Category Core Big Data Analytics

Course Title Big Data Architecture & Ecosystem - Hadoop

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Some basic knowledge and experience of Java (Jars, Array, Classes, Objects, etc.)

Course Objectives:

1. To learn to ingest data into Hadoop.

2. To learn to build and maintain reliable, scalable, distributed systems with Hadoop.

3. To be able to apply Hadoop ecosystem components.

Course Outcomes:

1. Students will learn to ingest data into Hadoop.

2. They will learn about distributed systems with Apache Hadoop.

3. They will be able to apply Hadoop ecosystem components.

Course Contents:

1. Introduction to big data: Introduction, distributed file system, Big Data and its importance, drivers, Big Data analytics, Big Data applications, algorithms using MapReduce, matrix-vector multiplication by MapReduce.

2. Introduction to Hadoop: Big Data, Apache Hadoop and the Hadoop ecosystem, MapReduce, data serialization.

3. Hadoop architecture: Architecture, storage, task trackers, Hadoop configuration.

4. Hadoop ecosystem and YARN: Hadoop ecosystem components, Hadoop 2.0 new features: NameNode High Availability, HDFS Federation, MRv2, YARN, running MRv1 in YARN.


Syllabus:

Module No. | Contents | Theory (Hrs) | Lab (Hrs) | Assess
1 | Introduction to big data: Introduction, distributed file system, Big Data and its importance, Four Vs, Drivers for Big Data, Big Data analytics, Big Data applications, Algorithms using MapReduce, Matrix-Vector Multiplication by MapReduce | 11 | - | -
2 | Introduction to Hadoop: Big Data, Apache Hadoop & Hadoop Ecosystem, Moving Data in and out of Hadoop, Understanding inputs and outputs of MapReduce, Data Serialization | 11 | - | -

Learning Resources:

Reference Books:

1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN: 9788126551071, 2015.

2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.

3. Tom White, “Hadoop: The Definitive Guide”, O'Reilly, 2012.

4. Donald Miner & Adam Shook, “MapReduce Design Patterns: Building Effective Algorithms & Analytics for Hadoop”.

Supplementary Reading:

Weblinks:

https://cloudthat.in/course/processing-bigdata-with-apache-hadoop/

Pedagogy: Participative learning, discussions, algorithm, flowchart and program writing, experiential learning through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments | Test | Presentations | Case study | MCQ | Oral | Attendance
20 | 10 | 10 | - | - | - | 10

Term End Examination : 50 Marks


Module No. | Contents | Theory (Hrs) | Lab (Hrs) | Assess
3 | Hadoop Architecture: Hadoop Architecture, Hadoop Storage: HDFS, Common Hadoop Shell commands, Anatomy of File Write and Read, NameNode, Secondary NameNode, and DataNode, Hadoop MapReduce Paradigm, Map and Reduce tasks, Job, Task trackers, Cluster Setup, SSH & Hadoop Configuration, HDFS Administering, Monitoring & Maintenance | 12 | - | -
4 | Hadoop ecosystem and YARN: Hadoop ecosystem components, Schedulers (Fair and Capacity), Hadoop 2.0 New Features: NameNode High Availability, HDFS Federation, MRv2, YARN, Running MRv1 in YARN | 11 | - | -
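Module 1 of this course lists matrix-vector multiplication by MapReduce. The sketch below imitates that pattern in plain Python, without Hadoop, to show the idea. The map step emits one (row index, partial product) pair per matrix entry, and the reduce step sums the pairs that share a row index. The function names and the (row, column, value) entry format are assumptions made for this illustration, and the vector is assumed to fit in memory.

```python
from collections import defaultdict


def map_phase(matrix_entries, vector):
    """Map: each matrix entry (i, j, m_ij) emits the pair (i, m_ij * v_j)."""
    for i, j, value in matrix_entries:
        yield i, value * vector[j]


def reduce_phase(pairs):
    """Reduce: sum all partial products that share the same row index."""
    sums = defaultdict(float)
    for row, partial in pairs:
        sums[row] += partial
    return dict(sums)


def matrix_vector_multiply(matrix_entries, vector):
    """Run the map and reduce steps in sequence on in-memory data."""
    return reduce_phase(map_phase(matrix_entries, vector))
```

For the 2x2 matrix with rows (1, 2) and (3, 4) and the vector (5, 6), the entries are passed as (row, column, value) triples and the result maps row 0 to 17 and row 1 to 39. On a real cluster the grouping by row index would be done by the framework's shuffle step rather than by a dictionary.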


Prepared By: Ms. Deepali Sonawane, Assistant Professor

Checked By: Ms. Pradnya Mahadik, BOS Chairman

Approved By: Dr. Sudhir Gavhane, Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1104

Course Category Core Big Data Analytics

Course Title Python Programming

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Knowledge of any scripting language, XML.

Course Objectives:

1. To understand why Python is a useful scripting language for developers.

2. To learn how to design and program Python applications.

3. To learn how to use lists, tuples, and dictionaries in Python programs.

4. To learn how to identify Python object types.

5. To define the structure and components of a Python program.

6. To learn how to write loops and decision statements in Python.

Course Outcomes:

1. Students will demonstrate the ability to solve problems using system approaches, critical

and innovative thinking, and technology to create solutions.

2. Students will design, develop, and present their final project.

3. Students will understand the purpose and the process of code reviews.

4. Students will be able to create scripts in Python for Autodesk's Maya.

5. Students will understand and will be able to articulate and apply the principles of 3D

graphics

Course Contents:

Introduction to Python

Introduction to the Python language.

Conditional Statements & Looping

Introduction to conditional and looping statements in Python.

String Manipulation

Introduction to various operations on strings.

Lists, Tuple and Dictionaries

Introduction to various operations on Lists, Tuple and Dictionaries.


Functions

Introduction to functions in Python.

Modules

Introduction to modules and packages in Python.

Input-Output

Handling of input and output in Python.

Regular expressions

Use of regular expressions in Python.

CGI

Introduction to CGI and cookies.

Database

Handling of databases in Python.

Learning Resources:

Reference Books:

1. Dive into Python by Mark Pilgrim

2. Programming Python by Mark Lutz, O’Reilly Media

3. Python Programming: An Introduction to Computer Science by John Zelle

Supplementary Reading:

1. Python Testing Cookbook by Greg L. Turnquist

Web Resources:

1. www.tutorialspoint.com/python/

2. docs.python.org/3/tutorial/

3. www.learnpython.org

4. www.guru99.com/python-tutorials.html

5. www.tutorialspoint.com/cprogramming/

6. www.learn-c.org/

7. www.w3schools.in/c-tutorial/

Pedagogy: Participative learning, discussions, Problem Solving, experiential learning through

practical problem solving, assignment, PowerPoint presentation


Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to Python

History

Features

Setting up path

Working with Python

Basic Syntax

Variable and Data Types

Operator

4 - -

2

Conditional Statements & Looping

If, If-else, Nested if-else

For, While, Nested loops

Break, Continue, Pass

4 - -

3

String Manipulation

Accessing Strings

Basic Operations

String slices

Function and Methods

5 - -

4

Lists, Tuple and Dictionaries

Lists – Introduction, Accessing list, Operations, Working with lists, Function and Methods

Tuple – Introduction, Accessing tuples, Operations, Working, Functions and Methods

Dictionaries – Introduction, Accessing values in dictionaries, working with dictionaries, Properties, Functions

6 - -

5

Functions

Defining a function

calling a function

Types of functions

Function Arguments

Anonymous functions

Global and local variables

4 - -

6

Modules

Importing module

Math module

Random module

Packages

Composition

4 - -

7

Input-Output

Printing on screen

Reading data from keyboard

Opening and closing file

Reading and writing files

Functions

4 - -

8

Regular expressions

Match function

Search function

Matching VS Searching

Modifiers

Patterns

4 - -

9

CGI

Introduction

Architecture

CGI environment variable

GET and POST methods

Cookies

File upload

5 - -

10

Database

Introduction

Connections

Executing queries

Transactions

Handling error

5 - -
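Module 8 above lists the match function, the search function, and the difference between matching and searching. A minimal sketch of that difference using Python's standard `re` module is given below; the sample string and pattern are made up for illustration and are not part of the syllabus.

```python
import re

# Illustrative sample text (an assumption, not taken from course material).
text = "Big Data course code MBD-1104"

# re.match only succeeds if the pattern matches at the start of the string.
at_start = re.match(r"MBD-\d+", text)

# re.search scans the whole string and returns the first match anywhere.
anywhere = re.search(r"MBD-\d+", text)

# A modifier (flag) such as re.IGNORECASE changes how the pattern is applied.
ignore_case = re.search(r"big data", text, re.IGNORECASE)

print(at_start)             # None: the string does not start with "MBD-"
print(anywhere.group())     # MBD-1104
print(ignore_case.group())  # Big Data
```

Both functions return a match object on success and `None` on failure, so the result is usually tested before calling `group()`.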


Prepared By

Ms. Punam Nikam Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1105

Course Category Core Big Data Analytics

Course Title Lab on Python

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

- - 3 3

Pre-requisites:

Knowledge of any scripting language, XML.

Course Objectives:

1. To understand why Python is a useful scripting language for developers.

2. To learn how to design and program Python applications.

3. To learn how to use lists, tuples, and dictionaries in Python programs.

4. To learn how to identify Python object types.

5. To define the structure and components of a Python program.

6. To learn how to write loops and decision statements in Python.

Course Outcomes:

1. Students will demonstrate the ability to solve problems using system approaches, critical

and innovative thinking, and technology to create solutions.

2. Students will design, develop, and present their final project.

3. Students will understand the purpose and the process of code reviews.

4. Students will be able to create scripts in Python for Autodesk's Maya.

5. Students will understand and will be able to articulate and apply the principles of 3D

graphics

Course Contents:

Introduction to Python

Introduction to the Python language.

Conditional Statements & Looping

Introduction to conditional and looping statements in Python.

String Manipulation

Introduction to various operations on strings.

Lists, Tuple and Dictionaries

Introduction to various operations on Lists, Tuple and Dictionaries.


Functions

Introduction to functions in Python.

Modules

Introduction to modules and packages in Python.

Input-Output

Handling of input and output in Python.

Regular expressions

Use of regular expressions in Python.

CGI

Introduction to CGI and cookies.

Database

Handling of databases in Python.

Laboratory Exercises / Practical:

1. Introduction to Python : Assignment on simple programs in python

2. Conditional Statements & Looping: Assignment on conditional statements and looping

statements

3. String Manipulation: Assignment on string manipulations.

4. Lists, Tuple and Dictionaries: Assignment on lists, tuples and dictionaries

5. Functions: Assignment on functions.

6. Modules: Assignment on use of modules

7. Input-Output: Assignment on Input-Output operations

8. Regular expressions: Assignment on use of regular expressions

9. CGI: Assignment on CGI

10. Database: Assignment on database

Learning Resources:

Reference Books:

1. Dive into Python by Mark Pilgrim

2. Programming Python by Mark Lutz, O’Reilly Media

3. Python Programming: An Introduction to Computer Science by John Zelle

Supplementary Reading:

1. Python Testing Cookbook by Greg L. Turnquist

Web Resources:

1. www.tutorialspoint.com/python/

2. docs.python.org/3/tutorial/

3. www.learnpython.org

4. www.guru99.com/python-tutorials.html

5. www.tutorialspoint.com/cprogramming/

6. www.learn-c.org/

7. www.w3schools.in/c-tutorial/

Pedagogy: Participative learning, discussions, Problem Solving, experiential learning through

practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to Python

History

Features

Setting up path

working with Python

Basic Syntax

Variable and Data Types

Operator

- 2 -

2

Conditional Statements & Looping

If, If- else, Nested if-else

For, While, Nested loops

Break, Continue, Pass

- 2 -

3

String Manipulation

Accessing Strings

Basic Operations

String slices

Function and Methods

- 2 -

4

Lists, Tuple and Dictionaries

Lists – Introduction, Accessing list, Operations, Working with

lists, Function and Methods

Tuple – Introduction, Accessing tuples, Operations, Working,

Functions and Methods

Dictionaries - Introduction, Accessing values in dictionaries,

working with dictionaries, Properties, Functions

- 2 -

5

Functions

Defining a function

calling a function

Types of functions

Function Arguments

Anonymous functions

Global and local variables

- 3 -

6

Modules

Importing module

Math module

Random module

Packages

Composition

- 3 -

7

Input-Output

Printing on screen

Reading data from keyboard

Opening and closing file

Reading and writing files

Functions

- 3 -

8

Regular expressions

Match function, Search function

Matching VS Searching

Modifiers

Patterns

- 3 -

9

CGI

Introduction

Architecture

CGI environment variable

GET and POST methods

Cookies

File upload

- 2 -

10

Database

Introduction

Connections

Executing queries

Transactions

Handling error

- 2 -
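Module 10 above covers connections, executing queries, transactions and error handling. The sketch below walks through those four steps with Python's built-in `sqlite3` module. Using SQLite is an assumption made so the example runs without a server; the syllabus does not name a database, and the table and names are invented for illustration.

```python
import sqlite3

# Connection: an in-memory SQLite database stands in for the lab's database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Executing queries: create a table and insert a row using parameters.
cur.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("INSERT INTO students (id, name) VALUES (?, ?)", (1, "Asha"))
conn.commit()  # Transactions: make the insert permanent.

# Handling errors: a duplicate primary key raises IntegrityError,
# and rollback() discards the failed transaction.
try:
    cur.execute("INSERT INTO students (id, name) VALUES (?, ?)", (1, "Ravi"))
    conn.commit()
except sqlite3.IntegrityError as exc:
    conn.rollback()
    error_message = str(exc)

rows = cur.execute("SELECT id, name FROM students ORDER BY id").fetchall()
print(rows)  # [(1, 'Asha')]
conn.close()
```

The same connect, execute, commit or rollback pattern applies to other Python database drivers that follow the DB-API, although connection arguments and exception classes differ between them.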


Prepared By

Ms. Punam Nikam Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1106

Course Category Core Big Data Analytics

Course Title Lab on Hadoop using HDFS

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Some basic knowledge and experience of Java (Jars, Array, Classes, Objects, etc.)

Course Objectives:

1. Learn tips and tricks for Big Data use cases and solutions.

2. Learn to build and maintain reliable, scalable, distributed systems with Apache Hadoop.

3. Be able to apply Hadoop ecosystem components.

Course Outcomes:

1. Students will learn tips and tricks for Big Data use cases and solutions.

2. They will be able to build distributed systems with Apache Hadoop.

3. They will be able to apply Hadoop ecosystem components.

Course Contents:

1. Introduction to big data: Introduction, distributed file system, Big Data and its importance, Drivers, Big data analytics, Big data applications. Algorithms using MapReduce, Matrix-Vector Multiplication by MapReduce.

2. Introduction to HADOOP: Big Data, Apache Hadoop & Hadoop Ecosystem, MapReduce,

Data Serialization.

3. HADOOP Architecture: Architecture, Storage, Task trackers, Hadoop Configuration

4. HADOOP ecosystem and yarn: Hadoop ecosystem components, Hadoop 2.0 New Features

NameNode High Availability, HDFS Federation, MRv2, YARN, Running MRv1 in YARN.
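Item 1 of the course contents mentions matrix-vector multiplication by MapReduce. As a rough illustration of the idea, the sketch below simulates the map, shuffle and reduce steps in plain Python on one machine. The matrix, the vector and the function names `map_entry` and `reduce_row` are made up for this example; a real job would run on a Hadoop cluster rather than in a single process.

```python
from collections import defaultdict

# Sparse matrix M as (row, column, value) triples and a vector v (sample data).
matrix = [(0, 0, 2.0), (0, 1, 1.0), (1, 1, 3.0), (2, 0, 4.0), (2, 2, 5.0)]
vector = [1.0, 2.0, 3.0]

def map_entry(entry, v):
    """Map: emit (row, m_ij * v_j) for one matrix entry."""
    i, j, m_ij = entry
    return (i, m_ij * v[j])

def reduce_row(values):
    """Reduce: sum all partial products that share a row key."""
    return sum(values)

# Shuffle: group mapper output by key, as the framework would do.
grouped = defaultdict(list)
for key, value in (map_entry(e, vector) for e in matrix):
    grouped[key].append(value)

result = {row: reduce_row(vals) for row, vals in sorted(grouped.items())}
print(result)  # {0: 4.0, 1: 6.0, 2: 19.0}
```

Each output key is a row index of the product, so the reducers can work on different rows independently.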

Lab Assignments

1. Lab on Install and configure Hadoop cluster


2. Lab on Manipulating files in HDFS using hadoop fs commands.

3. Lab on Manipulating files in HDFS programmatically using the FileSystem API. Alternative Hadoop File Systems: IBM GPFS, MapR-FS, Lustre, Amazon S3 etc.

4. Lab on Writing an Inverted Index MapReduce Application with custom Partitioner and Combiner. Custom types and Composite Keys, Custom Comparators, InputFormats and OutputFormats, Distributed Cache, MapReduce Design Patterns, Sorting, Joins.

5. Lab on Writing a streaming MapReduce job in Python. YARN and Hadoop 2.0.

6. Lab on Exporting data from HDFS to an Other data integration tools: Flume, Kafka,

Informatica, Talend etc.
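Lab 5 above asks for a streaming MapReduce job in Python. The sketch below shows the usual shape of such a job as a word count: a mapper that emits tab-separated key/value records and a reducer that sums values for consecutive identical keys. It is a local simulation under stated assumptions. In an actual streaming job the mapper and reducer would normally be separate scripts reading standard input and writing standard output, and the sample lines here are invented.

```python
from itertools import groupby

def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.strip().lower().split():
            yield f"{word}\t1"

def reducer(sorted_records):
    """Sum counts for consecutive records with the same key (input sorted by key)."""
    pairs = (record.split("\t") for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Local stand-in for the map -> sort -> reduce pipeline.
sample = ["big data big ideas", "data in HDFS"]
output = list(reducer(sorted(mapper(sample))))
print(output)  # ['big\t2', 'data\t2', 'hdfs\t1', 'ideas\t1', 'in\t1']
```

The reducer relies on its input being sorted by key, which is what the framework's shuffle and sort phase provides between the two steps.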

Learning Resources:

Reference Books:

1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”,

Wiley, ISBN: 9788126551071, 2015.

2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.

3. Tom White, “Hadoop: The Definitive Guide”, O'Reilly, 2012.

4. MapReduce Design Patterns (Building Effective Algorithms & Analytics for Hadoop) by

Donald Miner & Adam Shook

Supplementary Reading:

Weblinks:

https://cloudthat.in/course/processing-bigdata-with-apache-hadoop/

Pedagogy: Participative learning, discussions, algorithm, Flowchart & Program writing,

experiential learning through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Laboratory Continuous Assessment (LCA) 50 Marks

Practical | Oral based on practical | Site Visit | Mini Project | Problem based Learning | Any other

10 | 20 | - | - | 20 | -

Term End Examination : 50 Marks


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to big data

Introduction – distributed file system – Big Data and its

importance, Four Vs, Drivers for Big data, Big data analytics, Big

data applications. Algorithms using map reduce, Matrix-Vector

Multiplication by Map Reduce.

11 - -

2

Introduction to HADOOP

Big Data, Apache Hadoop & Hadoop Ecosystem, Moving Data

in and out of Hadoop,

Understanding inputs and outputs of MapReduce, Data

Serialization.

11 - -

3

HADOOP Architecture

Hadoop Architecture, Hadoop Storage: HDFS, Common Hadoop

Shell commands, Anatomy of

File Write and Read, NameNode, Secondary NameNode, and

DataNode, Hadoop MapReduce

Paradigm, Map and Reduce tasks, Job, Task trackers - Cluster

Setup – SSH & Hadoop

Configuration – HDFS Administering – Monitoring &

Maintenance.

12 - -

4

HADOOP ecosystem and yarn

Hadoop ecosystem components - Schedulers - Fair and Capacity,

Hadoop 2.0 New Features NameNode High Availability, HDFS

Federation, MRv2, YARN, Running MRv1 in YARN.

11 - -


Prepared By

Ms. Deepali Sonawane

Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1201

Course Category Core Big Data Analytics

Course Title R Programming

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites: Knowledge of any Programming Language

Course Objectives:

1. Understand the basics of R programming including objects, classes, vectors etc.

2. Write functions including generic functions using various methods and loops

3. Install various packages and work effectively in the R environment

4. Become proficient in writing a fundamental program and perform analytics with R

Course Outcomes:

Students will be able to:

1. Recognize and make appropriate use of different types of data structures

2. Use R to create sophisticated figures and graphs

3. Identify and implement appropriate control structures to solve a particular programming

problem

4. Design and write functions in R and implement simple iterative algorithms.

Course Contents:

Introduction to R

Overview of R programming, Evolution of R, Applications of R programming, Basic syntax

Basic Concepts of R

Reserved Words, Variables & Constants

Data structures in R

Vectors, Matrix

Control flow

If...else, ifelse() Function

Functions

R Functions, Function Return Value

Strings


String construction rules

R packages

Study of different packages in R

R Data Reshaping

Joining Columns and Rows in a Data Frame

Working with files

Reading and writing different types of files

R object and Class

Object and Class, R S3 Class, R S4 Class

Data visualization in R and Data Management

Bar Chart, Dot Plot

Statistical modelling and Databases in R

Mean, mode, median

Learning Resources:

Reference Books:

1. The Art of R Programming-a tour of statistical software design by Norman Matloff

2. R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics (O'Reilly

Cookbooks) by Paul Teetor

3. R in Action by Rob Kabacoff

4. Practical Data Science with R by Nina Zumel, John Mount, Jim Porzak

5. Learning R: A Step-by-Step Function Guide to Data Analysis by Richard Cotton

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

10 10 - - 10 10 10

Term End Examination : 50 Marks


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1 Introduction to R: Overview of R programming, Evolution of R, Applications of R programming, Basic syntax 2 - -

2 Basic Concepts of R: Reserved Words, Variables & Constants, Operators, Operator Precedence, Data Types, Input and Output 4 - -

3 Data structures in R: Vectors, Matrix, List in R programming, Data Frame, Factor 5 - -

4 Control flow: If...else, ifelse() Function, Programming for loop, While Loop, Break & next, Repeat Loop 4 - -

5 Functions: R Functions, Function Return Value, Environment & Scope, R Recursive Function, R Infix Operator, R Switch Function 4 - -

6 Strings: String construction rules, String Manipulation functions 3 - -

7 R packages: Study of different packages in R 2 - -

8 R Data Reshaping: Joining Columns and Rows in a Data Frame, Merging Data Frames, Melting and Casting 4 - -

9 Working with files: Reading and writing different types of files 2 - -

10 R object and Class: Object and Class, R S3 Class, R S4 Class, R Reference Class, R Inheritance 2 - -

11 Data visualization in R and Data Management: Bar Chart, Dot Plot, Scatter Plot (3D), Spinning Scatter Plots, Pie Chart, Histogram (3D) [including colorful ones], Overlapping Histograms, Boxplot, Plotting with Base and Lattice Graphics, Missing Value Treatment, Outlier Treatment, Sorting Datasets, Merging Datasets, Binning variables 7 - -

12 Statistical modelling and Databases in R: Mean, mode, median, Linear regression, Decision tree, K-means Clustering, RODBC and DBI Package, Performing queries 6 - -

Prepared By

Preeti Adhav Lecturer

Checked By Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean


COURSE STRUCTURE

Course Code MIT-WPU-BA-1202

Course Category Core Big Data Analytics

Course Title Distributed Processing of Data using Hadoop

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

3 -- -- 3

Pre-requisites:

Some basic knowledge and experience of Java (Jars, Array, Classes, Objects, etc.)

Course Objectives:

What is Hadoop and how can it help process large data sets.

How to write MapReduce programs using Hadoop API.

How to use HDFS (the Hadoop Distributed Filesystem), from the command line and API,

for effectively loading and processing data in Hadoop.

How to ingest data from an RDBMS or a data warehouse to Hadoop.

Best practices for building, debugging and optimizing Hadoop solutions.

Get introduced to tools like Pig, Hive, HBase, Elastic MapReduce etc. and understand how

they can help in BigData projects.

Course Outcomes:

Understand Sqoop architecture and uses. Able to load real-time data from an RDBMS table/query on to HDFS. Able to write Sqoop scripts for exporting data from HDFS onto RDBMS tables.

Understand Apache Pig and the Pig Data Flow Engine. Understand data types, data model, and modes of execution.

Able to store the data from a Pig relation on to HDFS.

Able to load data into a Pig relation with or without schema.

Able to split, join, filter, and transform the data using Pig operators. Able to write Pig scripts and work with UDFs.

Understand the importance of Hive and the Hive architecture. Able to create Managed, External, Partitioned and Bucketed tables. Able to query the data and perform joins between tables. Understand storage formats of Hive. Understand Vectorization in Hive.


Course Contents

Data Storage

What is the Hadoop Distributed File System (HDFS). Architecture of HDFS. Architectural assumptions and goals. How data is stored in HDFS. How data is read from HDFS. Namenodes and Datanodes.

Data Processing

What is the use of MapReduce. Architecture of the MapReduce framework. Phases of a MapReduce job. MapReduce Design Patterns. YARN Architecture.

Data Integration

How to integrate Hadoop into your existing enterprise. Introduction to Sqoop.

Higher Level Tools

Workflows of Oozie. Introduction to Hive and its architecture. Data Types and File Formats. How to create tables and load data. How to read and query data. Introduction to Pig. Grunt Shell. Pig's Data Model. Introduction to HBase. Architecture, Client API and MapReduce Integration.

Learning Resources:

Reference Books:

1. Hadoop: The Definitive Guide by Tom White.

2. MapReduce Design Patterns (Building Effective Algorithms & Analytics for Hadoop) by

Donald Miner & Adam Shook

3. Professional Hadoop Solutions by Boris Lublinsky, Kevin Smith, and Alexey Yakubovich

Weblinks:

https://cloudthat.in/course/processing-bigdata-with-apache-hadoop/

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 marks

Assignments Test Attendance Viva Presentation Any other

10 10 10 10 10 -

Term End Examination : 50 marks


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Data Storage

File System Abstraction

Big Data and Distributed File Systems

Hadoop Distributed File System (HDFS)

HDFS Architecture

Architectural assumptions and goals

How data is stored in HDFS

How data is read from HDFS

Namenodes and Datanodes

Blocks

Data Replication

Fault Tolerance

Data Integrity

Namespaces

Federation in Hadoop 2.0

High Availability in Hadoop 2.0

Security and Encryption

HDFS Interfaces: FileSystem API, FSShell, WebHDFS, Fuse

etc.

13 - -

2

Data Processing

MapReduce

The fundamentals: map() and reduce()

Data Locality

Architecture of the MapReduce framework.

Phases of a MapReduce Job

Custom types and Composite Keys

Custom Comparators

InputFormats and OutputFormats

Distributed Cache

MapReduce Design Patterns

Sorting

Joins

YARN and Hadoop 2.0

Separating resource management and processing

YARN Applications: MapReduce, Tez, HBase, Storm, Spark, Giraph etc.

YARN Architecture, ResourceManager, NodeManagers, ApplicationMasters, Containers, Fault Tolerance

12 - -

3 Data Integration

Integrating Hadoop into your existing enterprise.

Introduction to Sqoop

10 - -


4

Higher Level Tools

Defining workflows with Oozie

An introduction to Hive

Architecture

Interfaces: Hive Shell, Thrift, JDBC, ODBC etc.

HiveQL: A dialect of SQL

Data Types and File Formats

Creating Tables and Loading Data

Schema at Read

Querying Data

User Defined Functions

An introduction to Pig

Grunt Shell

Pig's Data Model

Pig Latin

User Defined Functions

An introduction to HBase

Architecture

Client API

MapReduce Integration

Schema Design

10 - -

Prepared By

Ms. Varsha Gholave Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1203

Course Category Core Big Data Analytics

Course Title Operational Research

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

3 1 -- 3

Pre-requisites:

1. Linear algebra

2. Probability and Statistics

Course Objectives:

To introduce the students to the use of basic methodology for the solution of linear programs

and integer programs.

To introduce the students to the advanced methods for large-scale transportation and

assignment problems.

Course Outcomes:

Define and formulate linear programming problems and appreciate their limitations.

Solve linear programming problems using appropriate techniques and optimization solvers,

interpret the results obtained and translate solutions into directives for action.

Conduct and interpret post-optimal and sensitivity analysis and explain the primal-dual

relationship.

Identify the special features of the transportation problem, and assignment problem.

Course Contents

Introduction to Operation Research

Brief introduction about Optimization and the OR process. Descriptive vs. Simulation. Exact vs.

Heuristic techniques, Deterministic vs. Stochastic models.

LPP and Methods to solve LPP

Duality Theory and applications, Dual Simplex method. Sensitivity analysis in L.P., Parametric

Programming. Transportation, assignment and least cost transportation. Interior point methods:

scaling techniques, log barrier methods. Dual and primal dual extensions.

Non-Linear programming

Kuhn-Tucker conditions. Convex functions and convex regions. Convex programming

problems. Algorithms for solving convex programming problems.


PERT and CPM

Basic differences between PERT and CPM. Arrow Networks, time estimates, Earliest

expected time. Representation in Tabular Form, Critical Path. Probability of meeting scheduled

date of completion.

Calculation on CPM network, Various floats for activities. Critical path updating projects.

Operation time cost trade off Curve project. Time cost – trade off Curve- Selection of schedule

based on Cost.

Network Flow Problem

Formulation, Max-Flow Min-Cut theorem. Ford and Fulkerson’s algorithm. Exponential

behavior of Ford and Fulkerson’s algorithm.
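To make the network flow topic above concrete, here is a small sketch of the augmenting-path idea behind Ford and Fulkerson's algorithm. It uses breadth-first search to pick each augmenting path, which is the Edmonds-Karp variant rather than the basic method: with badly chosen paths the basic method can need a number of augmentations that grows with the capacity values, whereas shortest-path selection bounds it. The function name `max_flow` and the sample network are assumptions made for illustration.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Return the maximum flow value using BFS augmenting paths (Edmonds-Karp)."""
    # Residual capacities, with reverse edges initialised to 0.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left

        # Walk back from the sink, find the bottleneck, then push that much flow.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck

# Sample network: edge capacities keyed by start node, then end node.
network = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}
print(max_flow(network, "s", "t"))  # 5
```

In this network the cut separating `s` from all other nodes has capacity 3 + 2 = 5, which equals the flow found, as the Max-Flow Min-Cut theorem states.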

Learning Resources:

Reference Books:

1. Hadley G. (1969): Linear Programming, Addison Wesley.

2. Taha H. A. (1971): Operations Research an Introduction, Macmillan N. Y.

3. Kanti Swaroop, Gupta and Manmohan (1985): Operations Research, Sultan

Chand and Co.

4. Sharma J. K. (2003): Operations Research Theory and Applications, 2nd Ed., Macmillan India Ltd.

5. Sharma J. K. (1986): Mathematical Models Operations Research, McGraw Hill.

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 marks

Assignments Test Attendance Viva Presentation Any other

10 10 10 10 10 -

Term End Examination : 50 marks


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to Operation Research

The nature of O.R., History, Meaning, Models, Principles

Problem solving with mathematical models. Optimization and the

OR process. Descriptive vs. Simulation. Exact vs. Heuristic

techniques, Deterministic vs. Stochastic models.

5 - -

2

LPP and Methods to solve LPP

Linear Programming, Introduction. Graphical Solution and

Formulation of L.P. Models Simplex Method (Theory and

Computational aspects), Revised Simplex. Duality Theory and

applications Dual Simplex method. Sensitivity analysis in L.P.,

Parametric Programming. Transportation, assignment and least

cost transportation. Interior point methods: scaling techniques,

log barrier methods. Dual and primal dual extensions

10 - -

3

Non-Linear programming

Kuhn-Tucker conditions. Convex functions and convex regions.

Convex programming problems. Algorithms for solving convex

programming problems.

10 - -

4

PERT and CPM

Basic differences between PERT and CPM. Arrow Networks,

time estimates, Earliest expected time. Latest – allowable

occurrences time. Forward Pass Computation, Backward Pass

Computation. Representation in Tabular Form, Critical Path.

Probability of meeting scheduled date of completion. Calculation

on CPM network, Various floats for activities. Critical path

updating projects. Operation time cost trade off Curve project.

Time cost – trade off Curve- Selection of schedule based on Cost.

10 - -

5

Network Flow Problem

Formulation, Max-Flow Min-Cut theorem. Ford and Fulkerson’s

algorithm. Exponential behavior of Ford and Fulkerson’s

algorithm.

10 - -

Prepared By

Ms. Varsha Ghule

Assistant Professor

Approved By

Dr. Sudhir Gavhane Dean, LASC

Checked By

Pradnya Mahadik

BOS Chairman


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1204

Course Category Core Big Data Analytics

Course Title Next Generation Databases (NoSQL databases)

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Knowledge of RDBMS

Course Objectives:

1. To study the usage and applications of Object Oriented database

2. To acquire knowledge on variety of NoSQL databases

3. To attain inquisitive attitude towards research topics in NoSQL databases

Course Outcomes:

1: Master the basics of SQL and construct queries using PL/SQL efficiently and apply object-oriented features for developing database applications.

2: Compare and Contrast NoSQL databases with each other and Relational Database

Systems

3: Critically analyse and evaluate variety of NoSQL databases.

4: Demonstrate the knowledge of Key-Value databases, Document based Databases,

Column based Databases and Graph Databases.

Course Contents:

1. Introduction to NOSQL

Definition of NOSQL, History of NOSQL and Different NOSQL products, Exploring MongoDB with

Java/Ruby/Python, Interfacing and Interacting with NOSQL

2. NOSQL Basics

NOSQL Storage Architecture, CRUD operations with MongoDB, Querying, Modifying and

Managing NOSQL Data stores, Indexing and ordering datasets (MongoDB/CouchDB/Cassandra)

3. Advanced NOSQL

NOSQL in CLOUD, Parallel Processing with Map Reduce, Big Data with Hive

4. Working with NOSQL

Surveying Database Internals, Migrating from RDBMS to NOSQL, Web Frameworks and NOSQL,

using MySQL as a NOSQL


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to NOSQL

Definition of NOSQL, History of NOSQL and Different NOSQL

products, Exploring MongoDB with Java/Ruby/Python, Interfacing

and Interacting with NOSQL

6 - -

5. Developing Web Application with NOSQL and NOSQL Administration

PHP and MongoDB, Python and MongoDB, Creating Blog Application with PHP, NOSQL

Database Administration

Learning Resources:

Reference Books:

Dan Sullivan, "NoSQL for Mere Mortals", 1st Edition, Pearson Education, 2015. (ISBN-13:

978-9332557338)

Supplementary Reading:

Pramod J. Sadalage, Martin Fowler, "NoSQL Distilled: A Brief Guide to the Emerging

World of Polyglot Persistence", 1st Edition, Pearson Education, 2012. (ISBN-13: 978-8131775691)

Pedagogy: Participative learning, discussions, Problem Solving, experiential learning through

practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks


2

NOSQL Basics

NOSQL Storage Architecture, CRUD operations with MongoDB,

Querying, Modifying and Managing NOSQL Data stores,

Indexing and ordering datasets (MongoDB/CouchDB/Cassandra)

12 - 1

3

Advanced NOSQL

NOSQL in CLOUD, Parallel Processing with Map Reduce, Big

Data with Hive

8 - 1

4

Working with NOSQL

Surveying Database Internals, Migrating from RDBMS to

NOSQL, Web Frameworks and NOSQL, using MySQL as a

NOSQL

9 - 1

5

Developing Web Application with NOSQL and NOSQL

Administration

PHP and MongoDB, Python and MongoDB, Creating Blog

Application with PHP, NOSQL Database Administration

10 - 1
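Module 4 mentions using a relational engine as a NoSQL store, and Module 2 lists CRUD operations. The idea can be sketched as below. This is a minimal sketch under stated assumptions: Python's built-in `sqlite3` module stands in for MySQL, each document is stored as one JSON string under a primary key, and the class name `KeyValueStore` and its methods are hypothetical names chosen for illustration.

```python
import json
import sqlite3


class KeyValueStore:
    """Tiny document store layered on one relational table (illustrative)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS docs "
            "(key TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def put(self, key, doc):
        # Create or overwrite: the whole document is one JSON string.
        self.conn.execute(
            "INSERT OR REPLACE INTO docs (key, body) VALUES (?, ?)",
            (key, json.dumps(doc)),
        )
        self.conn.commit()

    def get(self, key):
        # Read: return the decoded document, or None if the key is absent.
        row = self.conn.execute(
            "SELECT body FROM docs WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def update(self, key, **fields):
        # Update: read-modify-write, since the table has no per-field columns.
        doc = self.get(key)
        if doc is None:
            raise KeyError(key)
        doc.update(fields)
        self.put(key, doc)

    def delete(self, key):
        # Delete: report whether a row was actually removed.
        cur = self.conn.execute("DELETE FROM docs WHERE key = ?", (key,))
        self.conn.commit()
        return cur.rowcount > 0


store = KeyValueStore()
store.put("user:1", {"name": "Asha"})
store.update("user:1", city="Pune")
print(store.get("user:1"))
```

The schema-free body column is what makes the table behave like a key-value or document store; the trade-off is that queries on individual fields need either a full scan or a separate index.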


Prepared By

Ms. Smita Patil

Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1205

Course Category Core Big Data Analytics

Course Title Lab on R Programming

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

- - 3 3

Pre-requisites: Knowledge of any Programming Language

Course Objectives:

1. Understand the basics of R programming including objects, classes, vectors etc.

2. Write functions including generic functions using various methods and loops

3. Install various packages and work effectively in the R environment

4. Become proficient in writing a fundamental program and perform analytics with R

Course Outcomes:

Students will be able to:

1. Recognize and make appropriate use of different types of data structures

2. Use R to create sophisticated figures and graphs

3. Identify and implement appropriate control structures to solve a particular programming

problem

4. Design and write functions in R and implement simple iterative algorithms.

Course Contents:

Basic Concepts of R: Variables, constants, Operators, datatypes, input output

Data structures in R:

Vectors, Matrix, List, Data Frame/ Factor

Control flow:

Decision making, Repeat, while, for

Functions: built-in, user defined

R packages, R Data Reshaping: Joining Columns and Rows in a Data Frame, Merging Data

Frames

Working with files, R object and Class: csv, excel, S3 and S4 Class, reference

Data visualization in R and Data Management: Bar Chart, Dot Plot, Scatter Plot (3D),Spinning


Scatter Plots, Pie Chart, Histogram (3D) [including colorful ones], Overlapping Histograms,

Boxplot, Plotting with Base and Lattice Graphics, Missing Value Treatment, Outlier Treatment,

Sorting Datasets, Merging Datasets, Binning variables

Statistical modelling and Databases in R: Mean, mode, median, Linear regression,

Decision tree, K-means Clustering

Laboratory Exercises / Practical:

1. Assignments on Basic Concepts of R

2. Assignments on Data structures in R

3. Assignments on Control flow

4. Assignments on Functions

5. Assignments on R packages, R Data Reshaping

6. Assignments on Working with files, R object and Class

7. Assignments on Data visualization in R and Data Management

8. Assignments on Statistical modelling and Databases in R

Learning Resources:

Reference Books:

1. The Art of R Programming-a tour of statistical software design by Norman Matloff

2. R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics (O'Reilly

Cookbooks) by Paul Teetor

3. R in Action by Rob Kabacoff

4. Practical Data Science with R by Nina Zumel , John Mount , Jim Porzak

5. Learning R: A Step-by-Step Function Guide to Data Analysis by Richard Cotton

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50

Assignments Test Presentations Case study MCQ Oral Attendance

10 10 - - 10 10 10

Term End Examination : 50 Marks External


Laboratory Continuous Assessment (LCA) 50

Practical | Oral based on practical | Site Visit | Mini Project | Problem based Learning | Attendance

10 | 10 | - | 10 | 10 | 10

Term End Examination : 50 Marks External


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1 Basic Concepts of R: Variables, constants, Operators,

datatypes, input output - 3 -

2 Data structures in R:

Vectors, Matrix, List, Data Frame/ Factor - 3 -

3 Control flow:

Decision making, Repeat, while, for - 3 -

4 Functions: built-in, user defined - 3 -

5 R packages, R Data Reshaping: Joining Columns and Rows in a

Data Frame, Merging Data Frames - 3 -

6 Working with files, R object and Class: csv, excel, S3 and S4

Class, reference

- 3 -

7

Data visualization in R and Data Management: Bar Chart, Dot

Plot, Scatter Plot (3D),Spinning Scatter Plots, Pie Chart,

Histogram (3D) [including colorful ones], Overlapping

Histograms, Boxplot, Plotting with Base and Lattice Graphics,

Missing Value Treatment, Outlier Treatment, Sorting Datasets,

Merging Datasets, Binning variables

- 3 -

8

Statistical modelling and Databases in R: Mean, mode,

median, Linear regression, Decision tree, K-means

Clustering

- 3 -

Prepared By

Preeti Adhav Lecturer

Checked By Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean


COURSE STRUCTURE

Course Code MIT-WPU-BA-1206

Course Category Core Big Data Analytics

Course Title Lab on Hadoop and Databases

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

-- -- 3 3

Pre-requisites:

Some basic knowledge and experience of Java (Jars, Array, Classes, Objects, etc.)

Course Objectives:

1. Understand what Hadoop is and how it can help process large data sets.

2. Write MapReduce programs using the Hadoop API.

3. Use HDFS (the Hadoop Distributed File System), from the command line and the API,

for effectively loading and processing data in Hadoop.

4. Ingest data from an RDBMS or a data warehouse into Hadoop.

5. Learn best practices for building, debugging and optimizing Hadoop solutions.

6. Get introduced to tools like Pig, Hive, HBase, Elastic MapReduce etc. and understand how

they can help in Big Data projects.

Course Outcomes:

1. Understand Sqoop architecture and uses. Able to load real-time data from an RDBMS

table/query onto HDFS. Able to write Sqoop scripts for exporting data from HDFS onto

RDBMS tables.

2. Understand Apache Pig and the Pig data flow engine. Understand data types, data model, and

modes of execution.

3. Able to store the data from a Pig relation onto HDFS.

4. Able to load data into a Pig relation with or without schema.

5. Able to split, join, filter, and transform the data using Pig operators. Able to write Pig scripts

and work with UDFs.

6. Understand the importance of Hive and the Hive architecture. Able to create Managed, External,

Partitioned and Bucketed tables. Able to query the data and perform joins between tables.

Understand the storage formats of Hive. Understand vectorization in Hive.

Course Contents

Data Storage

What is the Hadoop Distributed File System (HDFS). Architecture of HDFS. Architectural assumptions

and goals. How data is stored in HDFS. How data is read from HDFS.


Namenodes and Datanodes

Data Processing

Uses of MapReduce. Architecture of the MapReduce framework. Phases of a

MapReduce job. MapReduce design patterns. YARN architecture.

Data Integration

How to integrate Hadoop into your existing enterprise. Introduction to Sqoop.

Higher Level Tools

Workflows with Oozie. Introduction to Hive and its architecture. Data types and file formats.

How to create tables and load data. How to read and query data. Introduction to Pig.

Grunt shell. Pig's data model. Introduction to HBase. Architecture, client

API and MapReduce integration.

Lab Assignments

1. Lab on manipulating files in HDFS programmatically using the FileSystem API. Alternative

Hadoop file systems: IBM GPFS, MapR-FS, Lustre, Amazon S3 etc.

2. Lab on writing an Inverted Index MapReduce application with custom Partitioner and

Combiner. Custom types and composite keys, custom Comparators, InputFormats and

OutputFormats, Distributed Cache, MapReduce design patterns, sorting, joins.

3. Lab on writing a streaming MapReduce job in Python. YARN and Hadoop 2.0.

4. Lab on importing data from an RDBMS to HDFS using Sqoop.

5. Lab on exporting data from HDFS to an RDBMS. Other data integration tools: Flume, Kafka,

Informatica, Talend etc.
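The inverted index assignment in the list above can be prototyped in plain Python before moving to the Hadoop API. The sketch below simulates the map, combine, partition, shuffle and reduce steps inside one process; it does not use Hadoop itself. All function names, the first-letter partitioning rule and the sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Mapper: emit one (word, doc_id) pair per word occurrence."""
    for word in text.lower().split():
        yield word, doc_id


def combine(pairs):
    """Combiner: drop duplicate (word, doc_id) pairs from one mapper."""
    return sorted(set(pairs))


def partition(word, num_reducers):
    """Custom partitioner (illustrative): route words by first letter."""
    return ord(word[0]) % num_reducers


def reduce_phase(word, doc_ids):
    """Reducer: build the sorted posting list for one word."""
    return word, sorted(set(doc_ids))


def inverted_index(documents, num_reducers=2):
    # Shuffle: one bucket per reducer, with values grouped by key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for doc_id, text in documents.items():
        for word, d in combine(map_phase(doc_id, text)):
            buckets[partition(word, num_reducers)][word].append(d)

    index = {}
    for bucket in buckets:
        for word in sorted(bucket):  # each reducer sees its keys in sorted order
            key, postings = reduce_phase(word, bucket[word])
            index[key] = postings
    return index


docs = {"d1": "big data big ideas", "d2": "data lake"}
print(inverted_index(docs))
```

The combiner reduces the volume of pairs sent through the shuffle, and the partitioner decides which reducer owns each word; both choices affect performance but not the final index.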

Learning Resources:


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

Data Storage

File System Abstraction

Big Data and Distributed File Systems

Hadoop Distributed File System (HDFS)

HDFS Architecture,Architectural assumptions and goals

How data is stored in HDFS

How data is read from HDFS

Namenodes and Datanodes

Blocks,Data Replication

Fault Tolerance

Data Integrity Namespaces

Federation in Hadoop 2.0

High Availability in Hadoop 2.0

Security and Encryption

HDFS Interfaces: FileSystem API, FSShell, WebHDFS,

Fuse etc.

- 13 -
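The "Blocks, Data Replication" topic above lends itself to a short worked calculation. The sketch below assumes a 128 MB block size and a replication factor of 3, which are the usual Hadoop 2.x defaults but are configurable on a real cluster; the function name `hdfs_storage` is illustrative.

```python
import math


def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, raw storage in MB) for one file."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    # The last block is not padded to full size, so raw usage is simply
    # the file size multiplied by the replication factor.
    raw_mb = file_size_mb * replication
    return blocks, raw_mb


# A 1000 MB file: 7 full blocks plus one partial block, stored three times.
print(hdfs_storage(1000))
```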

Reference Books:

1. Hadoop: The Definitive Guide by Tom White.

2. MapReduce Design Patterns (Building Effective Algorithms & Analytics for Hadoop) by

Donald Miner & Adam Shook

3. Professional Hadoop Solutions by Boris Lublinsky, Kevin Smith, and Alexey Yakubovich

Weblinks:

https://cloudthat.in/course/processing-bigdata-with-apache-hadoop/

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 marks

Practical | Viva | Attendance | Mini Project | Any other

15 | 10 | 15 | 10 | -

Term End Examination : 50 marks


2

Data Processing

MapReduce

The fundamentals: map() and reduce()

Data Locality

Architecture of the MapReduce framework.

Phases of a MapReduce Job

Custom types and Composite Keys

Custom Comparators

InputFormats and OutputFormats

Distributed Cache

MapReduce Design Patterns

Sorting Joins

YARN and Hadoop 2.0

Separating resource management and processing

YARN Applications: MapReduce, Tez, HBase, Storm,

Spark, Giraph etc.

YARN Architecture

ResourceManager

NodeManagers

ApplicationMasters

Containers Fault Tolerance

12 -

3

Data Integration

Integrating Hadoop into your existing enterprise.

Introduction to Sqoop

- 10 -

4

Higher Level Tools

Defining workflows with Oozie

An introduction to Hive

Architecture Interfaces: Hive Shell, Thrift, JDBC, ODBC

etc. HiveQL: A dialect of SQL

Data Types and File Formats

Creating Tables and Loading Data

Schema at Read Querying Data

User Defined Functions

An introduction to Pig

Grunt Shell

Pig's Data Model

Pig Latin

User Defined Functions

An introduction to HBase

Architecture

Client API

MapReduce Integration

Schema Design

- 10

Prepared By

Ms. Varsha Ghule

Assistant Professor

Approved By

Dr. Sudhir Gavhane Dean, LASC

Checked By

Pradnya Mahadik

BOS Chairman


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1301

Course Category Core Big Data Analytics

Course Title Statistical Computing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

3 3

Pre-requisites:

1. Linear algebra

2. Probability and Statistics

Course Objectives:

1. To provide an understanding of concepts and techniques of Business Statistics

2. How to use Excel, Python or R to solve Business Statistics problems

3. To learn Experimental Design

Course Outcomes:

1. The student should be able to formulate and solve problems related to topics covered in this course.

2. The student should be able to solve the problems using Python or R

3. Perform statistical analysis on variety of data.

Course Contents:

1. Data and Statistics

2. Descriptive Statistics: Tabular and Graphical Presentations

3. Descriptive Statistics: Numerical Measures

4. Probability

5. Discrete Probability Distributions

6. Continuous Probability Distribution

7. Sampling and Sampling Distributions

8. Interval Estimation

9. Fundamentals of Hypothesis Testing

10. Two-Sample Tests

11. Inferences about Population Variances

12. Tests of Goodness of Fit and Independence

13. Experimental Design and ANOVA

14. Simple Linear Regression
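Items 5 and 6 in the list above (discrete and continuous probability distributions) meet in the normal approximation to binomial probabilities. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the continuity correction of 0.5 is the standard textbook adjustment.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k), summed exactly from the probability mass function."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """P(X <= k) by the normal approximation with continuity correction."""
    mu = n * p                      # expectation of the binomial
    sigma = sqrt(n * p * (1 - p))   # standard deviation of the binomial
    return NormalDist(mu, sigma).cdf(k + 0.5)


# Compare the exact and approximate values for n = 100, p = 0.5, k = 55.
print(binomial_cdf(55, 100, 0.5), normal_approx_cdf(55, 100, 0.5))
```

The approximation is usually considered adequate when both n·p and n·(1 − p) are at least about 5; for small n or extreme p the exact sum is preferable.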

Laboratory Exercises / Practical:

1. Discrete Probability Distributions

2. Continuous Probability Distribution

3. Sampling and Sampling Distributions


4. Interval Estimation

5. Fundamentals of Hypothesis Testing

6. Two-Sample Tests

7. Inferences about Population Variances

8. Tests of Goodness of Fit and Independence

9. Experimental Design and ANOVA

10. Simple Linear Regression
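The interval estimation and hypothesis testing exercises in the list above can be tried with a few lines of Python. This is a minimal sketch for the case where the population standard deviation σ is known, so the normal (Z) distribution applies; the function names and the sample figures are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sigma / sqrt(n)                        # margin of error
    return sample_mean - margin, sample_mean + margin


def z_test_mean(sample_mean, mu0, sigma, n):
    """Two-sided Z test of H0: mu = mu0; returns (z statistic, p-value)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative data: n = 36 observations, sigma = 15, sample mean = 105.
print(mean_confidence_interval(105, 15, 36))
print(z_test_mean(105, 100, 15, 36))
```

When σ is unknown, the same structure applies with the sample standard deviation and the t distribution, which the Python standard library does not provide; R or a statistics package would be needed for that case.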

Learning Resources:

Reference Books:

Text Book: David R Anderson, Dennis J Sweeney, Thomas A Williams, Jeffrey D. Camm and

James J. Cochran, Statistics for Business and Economics. 12th Edition. Cengage Learning. 2014

(note that a new edition, 13e, has recently come out but is mostly unavailable)

Pedagogy: Participative learning, discussions, demonstrations, practical, assignment

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments Test Presentations Case study Attendance

10% 10% 10% 10% 10%

Term End Examination : 50%


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Data and Statistics: Applications in Business and

Economics, Data, Data Sources, Descriptive Statistics,

Statistical Inference, Computers and Statistical Analysis,

Data Mining, and Ethical Guidelines for Statistical

Practice (Self Study)

4

2

Descriptive Statistics: Tabular and Graphical

Presentations, Summarizing Qualitative Data,

Summarizing Quantitative Data, Cross Tabulation and

Scatter Diagrams, Data Visualization Practices (Self

Study)

4

3

Descriptive Statistics: Numerical Measures

Measures of Location

Measures of Variability

Measures of Shape, Relative Location and Detecting Outliers

Exploratory Data Analysis

Measures of Association between Two Variables

Data Dashboards (Self Study)

4

4

Probability

Basic Probability Concepts

Conditional Probability

Bayes’ Theorem

4

5

Discrete Probability Distributions

Probability Distribution for a Discrete Random

Variable

Properties: Expectation, Variance

Binomial Distribution

Poisson Distribution

Hypergeometric Distribution

Discrete Bivariate Distributions: Covariance and Financial

Portfolios

4

6

Continuous Probability Distribution

Uniform Probability Distributions

Normal Probability Distribution

Normal Approximation to Binomial Probabilities

Exponential Probability Distribution

4


7

Sampling and Sampling Distributions

Simple Random Sampling

Point Estimation

Introduction to Sampling Distribution

Sampling Distribution of the Mean

Sampling Distribution of Proportion

Properties of Point Estimators

Other Sampling Methods

4

8

Interval Estimation

Confidence Interval Estimation for the Mean (σ

known)

Confidence Interval Estimation for the Mean (σ

unknown)

Determining Sample Size

Confidence Interval Estimation for the

Proportion

4

9

Fundamentals of Hypothesis Testing

Hypothesis Testing Methodology

Z test of Hypothesis for the Mean (σ known)

t test of Hypothesis for the Mean (σ unknown)

Z test of Hypothesis for the Proportion

Decision Making, Probability of Type-II Errors,

Sample Size Determination

4

10

Two-Sample Tests

Comparing Means of Two Independent

Populations

Comparing Means of Two Related Populations

Comparing Two Population Proportions

4

11

Inferences about Population Variances

Inferences about a Population Variance

Inferences about Two Population Variances

4

12

Tests of Goodness of Fit and Independence

Test the Equality of Three or More Population

Proportions

Test of Independence

Goodness of Fit Test: A Multinomial Population

(Self Study)

4


13

Experimental Design and ANOVA

An Introduction

ANOVA and the Completely Randomized

Design

Multiple Comparison Procedure

Randomized Block Design and Factorial

Experiment (Self Study)

4

14

Simple Linear Regression

Simple Linear Regression Model

Least Squares method

Coefficient of Determination

Model Assumptions

Testing for Significance

Computer Solution

Residual Analysis (Self Study)

4
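The least squares method and the coefficient of determination from Module 14 can be written out directly. The sketch below is a plain Python version of the textbook formulas b1 = Sxy / Sxx, b0 = ȳ − b1·x̄ and R² = 1 − SSE / SST; the function name `fit_line` and the sample data are illustrative.

```python
def fit_line(xs, ys):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, r_squared)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Slope and intercept from the sums of cross-products and squares.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Coefficient of determination: share of variation explained by the line.
    sst = sum((y - y_bar) ** 2 for y in ys)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - sse / sst
    return b0, b1, r_squared


# Illustrative data lying exactly on the line y = 1 + 2x.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
```

The function assumes the x values are not all equal and the y values are not all equal; otherwise Sxx or SST is zero and the ratios are undefined.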

Checked By Ms. Pradnya Mahadik BOS Chairman

Approved by Dr. Sudhir Gavhane Dean, LASC

Prepared by Ms. Pradnya Mahadik Assistant Professor


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1302

Course Category Core BigData Analytics

Course Title Information Security

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Basic concepts of Networking.

Course Objectives:

1. To provide an understanding of principal concepts, major issues, technologies and basic

approaches in information security.

2. Develop a basic understanding of cryptography, how it has evolved and some key encryption

techniques used today

Course Outcomes:

The students will have firm understanding of:

1. Basic concepts related to network and system level security.

2. Basics of cryptography, security management and network security techniques.

3. Information security governance, and related legal and regulatory issues

4. How threats to an organization are discovered, analyzed, and dealt with.

Course Contents:

UNIT - I Security Attacks (Interruption, Interception, Modification and Fabrication), Security Services

UNIT - II Conventional Encryption Principles, Conventional encryption algorithms

UNIT - III Public key cryptography principles, public key cryptography algorithms

UNIT - IV Email privacy: Pretty Good Privacy (PGP) and S/MIME.

UNIT - V IP Security Overview, IP Security Architecture

UNIT - VI Web Security Requirements, Secure Socket Layer (SSL) and Transport Layer Security (TLS)


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

UNIT - I Security Attacks (Interruption, Interception, Modification and

Fabrication), Security Services (Confidentiality, Authentication,

Integrity, Non-repudiation, access Control and Availability) and

9 - -

UNIT - VII Basic concepts of SNMP, SNMPv1 Community facility and SNMPv3.

UNIT - VIII Firewall Design principles, Trusted Systems. Intrusion Detection Systems.
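The public key cryptography principles listed in Unit III can be illustrated with a toy RSA key pair. This is a teaching sketch only: the primes are tiny, there is no padding, and it must never be used to protect real data. The function names are illustrative, and `pow(e, -1, phi)` for the modular inverse requires Python 3.8 or later.

```python
def make_keys(p, q, e):
    """Build a toy RSA key pair from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # private exponent: inverse of e modulo phi
    return (e, n), (d, n)


def encrypt(message, public_key):
    """Raise the message to the public exponent modulo n."""
    e, n = public_key
    return pow(message, e, n)


def decrypt(ciphertext, private_key):
    """Raise the ciphertext to the private exponent modulo n."""
    d, n = private_key
    return pow(ciphertext, d, n)


# Tiny illustrative primes; real keys use primes of a thousand bits or more.
public, private = make_keys(61, 53, 17)
cipher = encrypt(65, public)
print(public, private, cipher, decrypt(cipher, private))
```

Anyone may encrypt with the public pair (e, n), but only the holder of d can decrypt. Swapping the roles of the two exponents gives the basic idea behind digital signatures.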

Learning Resources:

TEXT BOOKS: 1. Network Security Essentials (Applications and Standards) by William Stallings Pearson

Education.

2. Hack Proofing your network by Ryan Russell, Dan Kaminsky, Rain Forest Puppy, Joe Grand,

David Ahmad, Hal Flynn, Ido Dubrawsky, Steve W. Manzuik and Ryan Permeh, Wiley Dreamtech

REFERENCES: 1. Fundamentals of Network Security by Eric Maiwald (Dreamtech press)

2. Network Security - Private Communication in a Public World by Charlie Kaufman, Radia

Perlman and Mike Speciner, Pearson/PHI.

3. Cryptography and network Security, Third edition, Stallings, PHI/Pearson

4. Principles of Information Security, Whitman, Thomson.

5. Network Security: The Complete Reference, Roberta Bragg, Mark Rhodes-Ousley, TMH

6. Introduction to Cryptography, Buchmann, Springer.

Pedagogy: Participative learning, discussions, algorithm, Flowchart & Program writing,

experiential learning through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

10 10 10 - 10 - 10

Term End Examination : 50 Marks


Mechanisms, A model for Internetwork security, Internet

Standards and RFCs, Buffer overflow & format string

vulnerabilities, TCP session hijacking, ARP attacks, route table

modification, UDP hijacking, and man-in-the-middle attacks.

2

UNIT - II Conventional Encryption Principles, Conventional encryption

algorithms, cipher block modes of operation, location of

encryption devices, key distribution, Approaches of Message

Authentication, Secure Hash Functions and HMAC.

6 - -

3

UNIT - III Public key cryptography principles, public key cryptography

algorithms, digital signatures, digital Certificates, Certificate

Authority and key management, Kerberos, X.509 Directory

Authentication Service.

6 - -

4 UNIT - IV Email privacy: Pretty Good Privacy (PGP) and S/MIME.

4 - -

5

UNIT - V IP Security Overview, IP Security Architecture, Authentication

Header, Encapsulating Security Payload, Combining Security

Associations and Key Management.

7 - -

6

UNIT - VI Web Security Requirements, Secure Socket Layer (SSL) and

Transport Layer Security (TLS), Secure Electronic Transaction

(SET).

5 - -

7 UNIT - VII Basic concepts of SNMP, SNMPv1 Community facility and

SNMPv3. Intruders, Viruses and related threats.

5 - -

8 UNIT - VIII Firewall Design principles, Trusted Systems. Intrusion Detection

Systems. 3 - -
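Message authentication with a secure hash function and HMAC, listed in Unit II, can be demonstrated with the Python standard library. The sketch below assumes SHA-256 as the hash and a shared secret key; the function names `sign` and `verify` and the sample key are illustrative.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of the message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared-secret"            # illustrative key, not for real use
tag = sign(key, b"transfer 100")
print(verify(key, b"transfer 100", tag))   # untouched message
print(verify(key, b"transfer 900", tag))   # modified message
```

A plain hash detects accidental changes only, because an attacker can recompute it. The secret key in HMAC is what lets the receiver detect deliberate modification and confirm the sender.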


Prepared By

Ms. Devyani B Kamble

Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1303

Course Category Core BigData Analytics

Course Title Big Data – Apache Spark - In memory

distributed processing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Basic knowledge of Object Oriented programming concepts, Java, database concepts and

any of the Linux operating system flavors.

Course Objectives:

1. To understand the concepts of Scala and learn their implementation.

2. To understand the Apache Spark architecture.

3. To understand Spark Resilient Distributed Datasets – Transformation, Action.

Course Outcomes:

The student will gain knowledge of:

1. Concepts of Scala and its implementation.

2. Concepts of Spark and how it is used along with Scala.

Course Contents:

Introduction: Introduction to Scala, History of Scala

Conditional Expressions: If-else, While, do-while

Scala Function: Function declaration, function definition.

Scala Classes and Objects: Object, Class, Singleton Object

Array and Strings: Single dimensional

Scala Collections: Sequence, List

File Input-Output: Reading and Writing of files

Introduction to Apache Spark: Features of Apache Spark

Resilient Distributed Dataset(RDD): Introduction of Resilient Distributed Dataset

Spark RDD operations: RDD Transformation


Learning Resources:

1. Programming Scala by Dean Wampler, Alex Payne

2. Scala Cookbook by Alvin Alexander

3. Scala in depth by Joshua D. Suereth

4. Programming in Scala by Martin Odersky, Lex Spoon, Bill Venners

5. Scala for the Impatient by Cay S. Horstmann

6. Learning Spark by Matei Zaharia, Patrick Wendell, Andy Konwinski, Holden Karau

7. Advanced Analytics with Spark by Sandy Ryza, Uri Laserson, Sean Owen and Josh Wills

8. Mastering Apache Spark by Mike Frampton

9. Apache Spark Graph Processing by Rindra Ramamonjison

Pedagogy: Participative learning, discussions, algorithm, Flowchart & Program writing,

experiential learning through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

10 10 - - 10 10 10

Term End Examination : 50 Marks


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1 Introduction: Introduction to Scala, History of Scala, Features,

Basic Syntax, Scala Comments, Variables, Data types, Operators. 3 - -

2 Conditional Expressions: If-else, While, do-while, for, Pattern

matching, break statement. 5 - -

3

Scala Function: Function declaration, function definition,

Function calling, Functions-Call by name, Functions with named

arguments, Functions with variable arguments, Default parameter

values, Nested functions, Recursion, Higher order functions,

Scala Closures.

7 - -

4

Scala Classes and Objects: Object, Class, Singleton Object,

Companion Object, access modifiers, constructors, method

overloading, this keyword, inheritance, method overriding,

field overriding, final, Scala Abstract class, Scala Trait,

Apply and Unapply.

4 - -

5 Array and Strings: Single dimensional, Passing array into the

function, Multidimensional Array, Strings, String methods, String

Interpolation

5 - -

6 Scala Collections: Sequence, List, Set, Map, Tuples, Options,

Iterators 5 - -

7 File Input-Output: Reading and Writing of files 1 - -

8

Introduction to Apache Spark: Features of Apache Spark,

Apache Spark Architecture, Spark Applications, Apache Spark

Components, Data Sources and Formats in

Spark.

5 - -

9 Resilient Distributed Dataset (RDD): Introduction of RDD,

Features of RDD in Spark, RDD operations. 5 - -

10 Spark RDD operations: RDD Transformation, RDD Action. 5 - -
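The difference between an RDD transformation and an RDD action (Modules 9 and 10) is that transformations are lazy and only record what to do, while actions trigger the computation. The sketch below imitates that behaviour in plain Python. `MiniRDD` is a hypothetical stand-in written for illustration; it is not the Spark API, although its method names mirror real RDD operations.

```python
import functools


class MiniRDD:
    """Hypothetical stand-in for an RDD: records a lineage of lazy steps."""

    def __init__(self, data, steps=()):
        self._data = data      # source collection, never modified
        self._steps = steps    # lineage: tuple of (kind, function) pairs

    # Transformations: return a new MiniRDD and compute nothing.
    def map(self, fn):
        return MiniRDD(self._data, self._steps + (("map", fn),))

    def filter(self, pred):
        return MiniRDD(self._data, self._steps + (("filter", pred),))

    def _compute(self):
        # Replay the recorded lineage over the source data.
        for item in self._data:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                yield item

    # Actions: run the lineage and return a result to the caller.
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())

    def reduce(self, fn):
        return functools.reduce(fn, self._compute())


squares = MiniRDD([1, 2, 3, 4]).map(lambda x: x * x)   # nothing runs yet
evens = squares.filter(lambda x: x % 2 == 0)           # still nothing runs
print(evens.collect(), evens.count(), evens.reduce(lambda a, b: a + b))
```

Each action replays the whole lineage from the source data. This mirrors how Spark recomputes an unpersisted RDD, and it is why caching matters when the same RDD feeds several actions.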


Prepared By

Ms. Devyani B Kamble

Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1304

Course Category Core Big Data Analytics

Course Title Machine Learning Algorithm -I

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 -- -- 3

Pre-requisites:

1. The main prerequisite for machine learning is data analysis.

2. Familiarity with probability theory

3. Familiarity with linear algebra

Course Objectives:

1. To introduce the basic concepts and techniques of Machine Learning.

2. To develop the skills in using recent machine learning software for solving practical

problems.

3. To be familiar with a set of well-known supervised, semi-supervised and unsupervised

learning algorithms

Course Outcomes:

1. Select real-world applications that need machine learning based solutions.

2. Implement and apply machine learning algorithms.

3. Select appropriate algorithms for solving a particular group of real-world problems.

4. Recognize the characteristics of machine learning techniques that are useful to solve

real-world problems.

Course Contents

Introduction to learning

What are supervised, unsupervised and reinforcement learning? Geometry (lines, curves and 3D spaces) and visualization of algebraic concepts.

Linear Regression

What is regression? What is a simple one-variable regression line, and what are its coefficients? What are the assumptions of linear regression? What is the gradient descent algorithm, and how is the cost function used to find the 'beta' values? What are local and global minima, and what is the learning rate?

Gradient Descent

How is a problem represented in matrix form? How is gradient descent used with multiple features, and why is feature scaling applied? What are the types of feature scaling? How can the coefficients be found analytically (normal equation)?
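The linear regression and gradient descent topics above can be illustrated with a small sketch. It is only one possible classroom example, not part of the official syllabus: the data points, the learning rate, the iteration count and all function names are assumptions chosen for illustration. It fits a one-variable line by batch gradient descent on a standardized feature and compares the result with the analytical least-squares solution.

```python
# Minimal sketch: fit y = beta0 + beta1 * x by batch gradient descent.
# Data, learning rate and iteration count are illustrative assumptions.

def standardize(values):
    """Z-score scaling: subtract the mean, divide by the standard deviation."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values], mean, std

def cost(beta0, beta1, xs, ys):
    """Cost J = (1 / 2m) * sum((prediction - y)^2)."""
    m = len(xs)
    return sum((beta0 + beta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient_descent(xs, ys, learning_rate=0.1, iterations=1000):
    """Repeatedly step both coefficients against the gradient of the cost."""
    beta0, beta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [beta0 + beta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        beta0 -= learning_rate * grad0
        beta1 -= learning_rate * grad1
    return beta0, beta1

def closed_form(xs, ys):
    """Analytical least-squares solution for a single feature."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    return mean_y - beta1 * mean_x, beta1

xs_raw = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
xs_scaled, x_mean, x_std = standardize(xs_raw)

gd_beta0, gd_beta1 = gradient_descent(xs_scaled, ys)
cf_beta0, cf_beta1 = closed_form(xs_scaled, ys)
raw_slope = gd_beta1 / x_std  # slope expressed in the original units of x
```

Because the cost of linear regression is convex, gradient descent here has a single global minimum and should agree with the analytical solution; the learning rate mainly controls how quickly it gets there.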


Logistic Regression

What is the logistic regression model? What is the sigmoid function, and what does its graph look like? What is the receiver operating characteristic (ROC) curve, and how is it used to find the optimum decision boundary?
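As a hedged illustration of the sigmoid function and the ROC curve named above, the following sketch turns linear scores into probabilities and traces ROC points by sweeping the decision threshold. The scores and labels are made-up values, and the helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Map any real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical linear scores z = beta0 + beta1 * x for six examples.
linear_scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.5]
labels = [0, 0, 1, 0, 1, 1]

probabilities = [sigmoid(z) for z in linear_scores]
curve = roc_points(probabilities, labels)
area = auc(curve)
```

Each point on the curve corresponds to one candidate threshold, so choosing a decision boundary amounts to picking the point whose trade-off between sensitivity and false positives suits the problem.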

Optimization and Classifications

How does the optimization objective change from logistic regression to support vector machines (SVM)? What is a large margin classifier? What are the concepts behind large margin classification using SVM?
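One common way to explain the step from logistic regression to a large margin classifier is to compare the per-example losses. The sketch below is an illustrative assumption about how that comparison might be shown, with labels coded as -1 and +1 and the margin defined as label times score; it is not taken from the syllabus.

```python
import math

def logistic_loss(margin):
    """Logistic regression loss for one example, with margin = y * score, y in {-1, +1}."""
    return math.log(1.0 + math.exp(-margin))

def hinge_loss(margin):
    """SVM hinge loss: zero once the example is beyond the margin of 1."""
    return max(0.0, 1.0 - margin)

# Compare the two losses at a few hypothetical margins.
margins = [-1.0, 0.0, 0.5, 1.0, 2.0]
comparison = [(m, logistic_loss(m), hinge_loss(m)) for m in margins]
```

The hinge loss is exactly zero for examples classified correctly with a margin of at least 1, whereas the logistic loss only approaches zero; this is the sense in which an SVM is said to care only about examples near the boundary.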

Learning Resources:

1. T. Hastie, R. Tibshirani and J. Friedman, “The Elements of Statistical Learning”, Springer, 2009.

2. E. Alpaydin, “Introduction to Machine Learning”, MIT Press, 2010.

3. K. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.

4. C. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006.

5. Shai Shalev-Shwartz, Shai Ben-David, “Understanding Machine Learning: From Theory to Algorithms”, Cambridge University Press, 2014.

6. John Mueller and Luca Massaron, “Machine Learning For Dummies”, John Wiley & Sons, 2016.

Pedagogy:

Participative learning, discussions, algorithm, Program writing, experiential learning through

practical problem-solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments: 10; Test: 10; Problem solving: 10; Attendance: 10; Case study: 10; Any other: -

Term End Examination: 50 marks (External)


Syllabus:

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | Introduction to learning: Supervised, Unsupervised and Reinforcement Learning, geometry (lines, curves and 3D spaces) and visualization of algebraic concepts | 5 | - | -
2 | Linear Regression: Regression as a concept, simple one-variable regression line, coefficients of the line, assumptions of linear regression, gradient descent algorithm, cost function to find 'beta' values, local and global minima, concept of learning rate | 8 | - | -
3 | Gradient Descent: Matrix representation of the problem, gradient descent for multiple features, use of feature scaling techniques in gradient descent, types of feature scaling, finding coefficients analytically, normal equation, (matrix) non-invertibility | 7 | - | -
4 | Logistic Regression: Logistic regression model, matrix representation, general sigmoid function and graphical representation, decision boundary (linear and non-linear), metrics for logistic regression (accuracy, sensitivity, specificity and related concepts), receiver operating characteristic (ROC) curve, use of the ROC curve to find the optimum decision boundary, convexity and non-convexity of a group of points | 13 | - | -
5 | Optimization and Classifications: Optimization objective from logistic regression to support vector machines, large margin classifier, concepts behind large margin classification, kernels (concept, types and graphical explanations), using SVM | 12 | - | -

Prepared By

Archana Varade

Assistant Professor

Checked By

Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1305

Course Category Core Big Data Analytics

Course Title Lab on Statistical Computing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

-- -- 3 3

Course Objectives:

1. To provide an understanding of the concepts and techniques of Business Statistics.

2. To show how to use Excel to solve Business Statistics problems.

3. To provide hands-on training in Python and R.

Course Outcomes:

1. The student should be able to formulate and solve problems related to topics covered in this course.

2. The student should be able to solve the problems using Python or R.

3. The student should be able to perform statistical analysis on a variety of data.

Course Contents:

Laboratory Exercises / Practical:

1. Data and Statistics

2. Descriptive Statistics: Tabular and Graphical Presentations

3. Descriptive Statistics: Numerical Measures

4. Probability

5. Discrete Probability Distributions

6. Continuous Probability Distribution

7. Sampling and Sampling Distributions

8. Interval Estimation

9. Fundamentals of Hypothesis Testing

10. Two-Sample Tests

11. Inferences about Population Variances

12. Tests of Goodness of Fit and Independence

13. Experimental Design and ANOVA

14. Simple Linear Regression
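Several of the exercises listed above (numerical measures, interval estimation, two-sample tests) can be tried with Python's standard `statistics` module. The sketch below is one possible starting point under stated assumptions: the two samples are invented, the interval uses a normal approximation rather than a t critical value, and the function names are hypothetical.

```python
import statistics
from statistics import NormalDist

def describe(sample):
    """Basic numerical measures for one sample."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation (n - 1)
    }

def normal_interval(sample, confidence=0.95):
    """Interval estimate for the mean using a normal approximation.

    For small samples a t critical value would give a wider interval.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    mean = statistics.mean(sample)
    margin = z * statistics.stdev(sample) / len(sample) ** 0.5
    return mean - margin, mean + margin

def welch_t_statistic(a, b):
    """Two-sample t statistic that does not assume equal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

# Invented measurements from two hypothetical production lines.
sample_a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 12.0]
sample_b = [11.5, 11.9, 11.6, 11.7, 11.4, 11.8, 11.6, 11.7]

summary_a = describe(sample_a)
low, high = normal_interval(sample_a)
t_stat = welch_t_statistic(sample_a, sample_b)
```

Turning the t statistic into a p-value needs the t distribution, which the standard library does not provide; in the lab that step would typically be done with R or a Python statistics package.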


Learning Resources:

Reference Books:

Textbook: David R. Anderson, Dennis J. Sweeney, Thomas A. Williams, Jeffrey D. Camm and James J. Cochran, Statistics for Business and Economics, 12th Edition, Cengage Learning, 2014. (A newer edition, 13e, has recently come out but is mostly unavailable.)

Pedagogy: Participative learning, discussions, demonstrations, practical, assignment

Assessment Scheme:

Laboratory Continuous Assessment (LCA)

Practical: 20%; Oral based on practical: 10%; Problem-based learning: 10%; Attendance: 10%

Term End Examination : 50%


Syllabus:

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | Data and Statistics | - | 3 | -
2 | Descriptive Statistics: Tabular and Graphical Presentations | - | 3 | -
3 | Descriptive Statistics: Numerical Measures | - | 3 | -
4 | Probability | - | 3 | -
5 | Discrete Probability Distributions | - | 3 | -
6 | Continuous Probability Distribution | - | 3 | -
7 | Sampling and Sampling Distributions | - | 3 | -
8 | Interval Estimation | - | 3 | -
9 | Fundamentals of Hypothesis Testing | - | 3 | -
10 | Two-Sample Tests | - | 3 | -
11 | Inferences about Population Variances | - | 3 | -
12 | Tests of Goodness of Fit and Independence | - | 3 | -
13 | Experimental Design and ANOVA | - | 3 | -
14 | Simple Linear Regression | - | 3 | -

Prepared by Ms. Pradnya Mahadik Assistant Professor

Checked By Ms. Pradnya Mahadik BOS Chairman

Approved by Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-1306

Course Category Core Big Data Analytics

Course Title Lab on Machine Learning Algorithms - I

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

-- -- 3 3

Pre-requisites:

1. Basic Linear Algebra

2. Programming Experience

3. Statistics and Probability

Course Objectives:

1. To introduce basic machine learning techniques.

2. To develop the skills in using recent machine learning software for solving practical problems in a high-performance computing environment.

3. To develop the skills in applying appropriate supervised, semi-supervised or unsupervised

learning algorithms for solving practical problems.

Course Outcomes:

Students will be able to:

1. Implement and apply machine learning algorithms to solve problems.

2. Select appropriate algorithms for solving a particular group of real-world problems.

3. Use machine learning techniques in a high-performance computing environment to solve real-world problems.

Course Contents

Laboratory Exercises / Practical:

1. Exercises to solve the real-world problems using the following machine learning

methods:

Linear Regression

Logistic Regression

Multi-Class Classification

Neural Networks

Support Vector Machines

K-Means Clustering & PCA

2. Develop programs to implement Anomaly Detection & Recommendation Systems.

3. Implement GPU computing models to solve some of the problems mentioned in Problem 1.
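For the anomaly detection exercise listed above, one simple baseline is to fit a Gaussian to normal readings and flag values whose density falls below a threshold. The sketch below assumes a single feature, invented sensor readings and an arbitrary threshold `epsilon`; it is an illustration rather than a prescribed lab solution.

```python
import math

def fit_gaussian(values):
    """Estimate mean and variance from readings assumed to be normal behaviour."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance

def density(x, mean, variance):
    """Gaussian probability density at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Invented sensor readings treated as normal operation.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]
mean, variance = fit_gaussian(readings)

epsilon = 0.05  # hypothetical threshold; in practice tuned on labelled examples
candidates = [10.05, 13.0]
flags = [density(x, mean, variance) < epsilon for x in candidates]
```

With several features the same idea is usually applied per feature and the densities multiplied, or a multivariate Gaussian is fitted instead.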


Reference Books

1. Peter Flach: Machine Learning: The Art and Science of Algorithms that Make

Sense of Data, Cambridge University Press, Edition 2012.

2. Hastie, Tibshirani, Friedman: Introduction to Statistical Machine Learning with

Applications in R, Springer, 2nd Edition-2012.

3. C. M. Bishop : Pattern Recognition and Machine Learning, Springer 1st Edition-

2013.

4. Ethem Alpaydin : Introduction to Machine Learning, PHI 2nd Edition-2013.

5. Parag Kulkarni : Reinforcement and Systematic Machine Learning for Decision

Making, Wiley-IEEE Press, Edition July 2012.

Supplementary Reading:

Web Resources:

Weblinks: -

MOOCs: -

Pedagogy:

Mini Project development, Problem solving approach, Participative learning, discussions, algorithm,

Program writing, experiential learning through practical problem-solving, assignment, PowerPoint

presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments: 10; Test: 10; Presentations: 10; Attendance: 10; Viva: 10; Any other: -

Term End Examination: 50 marks (External)


Syllabus:

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | Exercises to solve real-world problems using the following machine learning methods: Linear Regression, Logistic Regression | - | 3 | -
2 | Exercises to solve real-world problems using the following machine learning methods: Multi-Class Classification, Neural Networks | - | 3 | -
3 | Exercises to solve real-world problems using the following machine learning methods: Support Vector Machines, K-Means Clustering & PCA | - | 3 | -
4 | Develop programs to implement Anomaly Detection & Recommendation Systems | - | 3 | -
5 | Implement GPU computing models to solve some of the problems mentioned in Problem 1 | - | 3 | -
6 | Implement GPU computing models to solve some of the problems mentioned in Problem 2 | - | 3 | -
7 | Implement GPU computing models to solve some of the problems mentioned in Problem 3 | - | 3 | -

Prepared By

Dr. C. H. Patil Assistant Professor

Checked By

Pradnya Mahadik Course Coordinator

Approved By

Dr. Sudhir Gavhane Dean


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2101

Course Category Core Big Data Analytics

Course Title Principles Of Deep Learning

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

3 -- -- 3

Pre-requisites:

This is an upper-level undergraduate/graduate course. All students should have the following skills:

1. Calculus, Linear Algebra

2. Probability & Statistics

3. Ability to code in Python.

Course Objectives:

Learning in neural networks output vs hidden layers; linear vs nonlinear networks

Course Outcomes: Understand Deep Learning

Course Contents

Course overview: What is deep learning? DL successes; syllabus and course logistics.

Intro to neural networks: cost functions, hypotheses and tasks; training data; maximum-likelihood-based cost, cross entropy, MSE cost; feed-forward networks; MLP, sigmoid units; neuroscience inspiration.

Learning in neural networks: output vs hidden layers; linear vs nonlinear networks.

Backpropagation: learning via gradient descent; recursive chain rule (backpropagation); if time: bias-variance tradeoff, regularization; output units: linear, softmax; hidden units: tanh.

Deep learning strategies I (e.g., GPU training, regularization, etc.); project proposals.

Deep learning strategies II (e.g., ReLUs, dropout, etc.).

SCC/TensorFlow overview: how to use the SCC cluster; introduction to TensorFlow.

CNNs I: convolutional neural networks.

Deep Belief Nets I: probabilistic methods.

RNNs I: recurrent neural networks.

Other DNN variants (e.g., attention, memory networks, etc.).

Neural Turing Machines.

Unsupervised deep learning I (e.g., autoencoders, etc.).

Unsupervised deep learning II (e.g., deep generative models, etc.).

Deep reinforcement learning.

Vision applications I.

NLP applications I.

Laboratory Exercises / Practical: NA
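The cross-entropy cost, softmax output units and chain rule listed in the course contents can be connected with a short, hedged sketch. It assumes a single example with three hypothetical logits and checks the well-known analytic gradient of softmax followed by cross entropy (probabilities minus the one-hot target) against a numerical estimate; all names are illustrative.

```python
import math

def softmax(logits):
    """Convert logits to probabilities; subtracting the max keeps exp() stable."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss_from_logits(logits, target_index):
    """Cross-entropy loss of the softmax output for one labelled example."""
    return -math.log(softmax(logits)[target_index])

def analytic_gradient(logits, target_index):
    """Chain-rule result for softmax + cross entropy: probs - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]

def numeric_gradient(logits, target_index, step=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += step
        down[i] -= step
        diff = loss_from_logits(up, target_index) - loss_from_logits(down, target_index)
        grads.append(diff / (2 * step))
    return grads

logits = [2.0, 1.0, 0.1]  # hypothetical output-layer activations
target = 0
analytic = analytic_gradient(logits, target)
numeric = numeric_gradient(logits, target)
```

In a full network this output-layer gradient is the starting point that backpropagation passes backwards through the hidden layers, one application of the chain rule per layer.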


Reference Books

1. Ian Goodfellow, Yoshua Bengio, Aaron Courville. Deep Learning.

Supplementary Reading:

1. Duda, R.O., Hart, P.E., and Stork, D.G. Pattern Classification. Wiley-Interscience. 2nd Edition. 2001.

2. Theodoridis, S. and Koutroumbas, K. Pattern Recognition. 4th Edition. Academic Press, 2008.

3. Russell, S. and Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. 2003.

4. Bishop, C. M. Neural Networks for Pattern Recognition. Oxford University Press. 1995.

5. Hastie, T., Tibshirani, R. and Friedman, J. The Elements of Statistical Learning. Springer. 2001.

6. Koller, D. and Friedman, N. Probabilistic Graphical Models. MIT Press. 2009.

Web Resources:

Weblinks: -

MOOCs: -

Pedagogy:

Mini Project development, Problem solving approach, Participative learning, discussions, algorithm,

Program writing, experiential learning through practical problem-solving, assignment, PowerPoint

presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments: 10; Test: 10; Presentations: 10; Attendance: 10; Viva: 10; Any other: -

Term End Examination: 50 marks (External)


Syllabus:

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | Course overview: What is deep learning? DL successes; syllabus and course logistics | 2 | - | -
2 | Intro to neural networks: cost functions, hypotheses and tasks; training data; maximum-likelihood-based cost, cross entropy, MSE cost; feed-forward networks; MLP, sigmoid units; neuroscience inspiration | 4 | - | -
3 | Learning in neural networks: output vs hidden layers; linear vs nonlinear networks | 4 | - | -
4 | Backpropagation: learning via gradient descent; recursive chain rule (backpropagation); if time: bias-variance tradeoff, regularization; output units: linear, softmax; hidden units: tanh | 4 | - | -
5 | Deep learning strategies I (e.g., GPU training, regularization, etc.); project proposals | 2 | - | -
6 | Deep learning strategies II (e.g., ReLUs, dropout, etc.) | 2 | - | -
7 | SCC/TensorFlow overview: how to use the SCC cluster; introduction to TensorFlow | 2 | - | -
8 | CNNs I: convolutional neural networks | 2 | - | -
9 | Deep Belief Nets I: probabilistic methods | 2 | - | -
10 | RNNs I: recurrent neural networks | 2 | - | -
11 | Other DNN variants (e.g., attention, memory networks, etc.) | 2 | - | -
12 | Neural Turing Machines | 2 | - | -
13 | Unsupervised deep learning I (e.g., autoencoders, etc.) | 2 | - | -
14 | Unsupervised deep learning II (e.g., deep generative models, etc.) | 2 | - | -
15 | Deep reinforcement learning | 2 | - | -
16 | Vision applications I | 2 | - | -
17 | NLP applications I | 2 | - | -

Prepared By

Dr. C. H. Patil Assistant Professor

Checked By

Pradnya Mahadik Course Coordinator

Approved By

Dr. Sudhir Gavhane Dean LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2102

Course Category Core Big Data Analytics

Course Title Machine Learning Algorithm -II

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 -- -- 3

Pre-requisites:

1. The main prerequisite for machine learning is data analysis.

2. Familiarity with probability theory

3. Familiarity with linear algebra

Course Objectives:

1. To introduce the basic concepts and techniques of Machine Learning.

2. To develop the skills in using recent machine learning software for solving practical problems.

3. To be familiar with a set of well-known supervised, semi-supervised and unsupervised learning algorithms.

Course Outcomes:

1. Select real-world applications that need machine-learning-based solutions.

2. Implement and apply machine learning algorithms.

3. Select appropriate algorithms for solving a particular group of real-world problems.

4. Recognize the characteristics of machine learning techniques that are useful to solve

real-world problems.

Course Contents

Decision trees and random forests

Concept, diagrammatic representation, random forest as a voting committee of decision trees,

parameter meaning and explanation.
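To make the decision tree and "voting committee" ideas above concrete, the following sketch scores candidate splits with Gini impurity and combines tree outputs by majority vote. Gini impurity is one common splitting criterion and is an assumption here, as are the toy data and function names; the syllabus does not prescribe a specific criterion.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for mixed nodes."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def split_gini(values, labels, threshold):
    """Weighted impurity of the two child nodes produced by value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(values, labels):
    """Try every observed value except the largest, so both children are non-empty."""
    candidates = sorted(set(values))[:-1]
    return min(candidates, key=lambda t: split_gini(values, labels, t))

def majority_vote(predictions):
    """Random-forest style aggregation: the most common tree prediction wins."""
    return Counter(predictions).most_common(1)[0][0]

# Toy data: does a customer of a given age buy the product?
ages = [22, 25, 30, 35, 40, 50]
bought = ["no", "no", "no", "yes", "yes", "yes"]

root_impurity = gini(bought)
threshold = best_threshold(ages, bought)
forest_prediction = majority_vote(["yes", "no", "yes"])  # outputs of three hypothetical trees
```

A random forest additionally trains each tree on a bootstrap sample and a random subset of features, which is what makes the voting members differ from one another.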

Naive Bayes:

Venn diagrams, Naive Bayes algorithm, application and problems, Naive Bayes learning, Bayesian

inference, Retail basket analysis; Concept of boosting and bagging
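The Naive Bayes algorithm named above can be sketched for categorical features as follows. This is a minimal illustration, assuming add-one (Laplace) smoothing and a small invented weather data set; the function names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and, per class, the value frequencies of each feature."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    vocabulary = defaultdict(set)          # feature index -> set of observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
            vocabulary[i].add(value)
    return class_counts, feature_counts, vocabulary

def predict(model, row):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one smoothing avoids zero probabilities for unseen values.
            numerator = feature_counts[(label, i)][value] + 1
            denominator = count + len(vocabulary[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented data: (outlook, temperature) -> play outside?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes", "yes", "no"]

model = train(rows, labels)
prediction_cool = predict(model, ("rainy", "cool"))
prediction_hot = predict(model, ("sunny", "hot"))
```

The "naive" part is the assumption that features are independent given the class, which is why the per-feature log likelihoods are simply added.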

Unsupervised learning methods/Clustering:

K-means algorithm, optimization objective, graphical representation, random initialization,

choosing number of clusters
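The K-means topics above (the algorithm, its optimization objective, random initialization and choosing the number of clusters) can be explored with a small sketch. The points, the seed and the helper names are assumptions for illustration; the main run uses fixed starting centroids so that the result is reproducible.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points]

def update(points, assignments, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == j]
        if not members:  # keep an empty cluster's centroid where it was
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids

def kmeans(points, centroids, max_iterations=100):
    """Alternate the two steps until the centroids stop moving."""
    for _ in range(max_iterations):
        new_centroids = update(points, assign(points, centroids), centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

def distortion(points, centroids, assignments):
    """Optimization objective: mean squared distance to the assigned centroid."""
    return sum(squared_distance(p, centroids[a])
               for p, a in zip(points, assignments)) / len(points)

def random_init(points, k, seed):
    """Random initialization: pick k distinct data points as starting centroids."""
    return random.Random(seed).sample(points, k)

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

final_centroids, final_assignments = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
single_centroids, single_assignments = kmeans(points, [(1.0, 1.0)])
```

Comparing the distortion for different values of k (here 1 and 2) is the basis of the elbow heuristic for choosing the number of clusters; running `kmeans` from several `random_init` starts and keeping the lowest distortion guards against poor local optima.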

Association Rules

Association rule mining, K-nearest neighbours algorithm.
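Association rule mining is usually introduced through three measures: support, confidence and lift. The sketch below computes them for a handful of invented shopping baskets; the transactions and function names are assumptions for illustration only.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence relative to how common the consequent is on its own."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)

# Invented retail baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
rule_lift = lift(transactions, {"bread"}, {"milk"})
```

A lift below 1, as for bread and milk in this toy data, suggests the two items appear together slightly less often than independence would predict, so the rule is not very informative despite its high confidence.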


Learning Resources:

1. T. Hastie, R. Tibshirani and J. Friedman, “The Elements of Statistical Learning”, Springer, 2009.

2. E. Alpaydin, “Introduction to Machine Learning”, MIT Press, 2010.

3. K. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.

4. C. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006.

5. Shai Shalev-Shwartz, Shai Ben-David, “Understanding Machine Learning: From Theory to Algorithms”, Cambridge University Press, 2014.

6. John Mueller and Luca Massaron, “Machine Learning For Dummies”, John Wiley & Sons, 2016.

Pedagogy:

Participative learning, discussions, algorithm, Program writing, experiential learning through

practical problem-solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments: 10; Test: 10; Problem solving: 10; Attendance: 10; Case study: 10; Any other: -

Term End Examination: 50 marks (External)


Syllabus:

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | Decision trees and random forests: Concept, diagrammatic representation, random forest as a voting committee of decision trees, parameter meaning and explanation | 12 | - | -
2 | Naive Bayes: Venn diagrams, Naive Bayes algorithm, application and problems, Naive Bayes learning, Bayesian inference, retail basket analysis; concept of boosting and bagging | 12 | - | -
3 | Unsupervised learning methods/Clustering: K-means algorithm, optimization objective, graphical representation, random initialization, choosing number of clusters | 12 | - | -
4 | Association Rules: Association rule mining, K-nearest neighbours algorithm | 9 | - | -

Prepared By

Sameer Kakade Asst.Prof.

Checked By

Pradnya Mahadik Course Coordinator

Approved By

Dr. Sudhir Gavhane Dean LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2103

Course Category Core Big Data Analytics

Course Title Data Science life cycle & Visualization

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

3 -- -- 3

Pre-requisites:

Computing: The Structure and Interpretation of Computer Programs

Math: basic concepts from linear algebra and calculus, such as linear operators, eigenvectors, derivatives, and integrals, to enable statistical inference and derive new prediction algorithms.

Course Objectives:

To describe Data Science Life cycle.

To describe Data Visualization

Course Outcomes:

Students will be able to understand Data Science Life cycle & Data Visualization

Course Contents:

1. What is Data Science?

What does Data Science involve?

Era of Data Science

Business Intelligence vs Data Science

Life cycle of Data Science including Extract Transform and Load

Data Preprocessing

Data Imputation

Data Cleaning

Data Transformation

Data Visualization

Data Analysis

Data Engineering - Big Data

Tools of Data Science

2. Data Extraction Wrangling & Exploration

Data Analysis Pipeline

What is Data Extraction

Types of Data

Raw and Processed Data

Data Wrangling

Exploratory Data Analysis

3. Visualization of Data


Introduction to Visualization.

Human Perception and Information Processing

Data types

Graphical perception (the ability of viewers to interpret visual (graphical) encodings of information and thereby decode information in graphs)

Color for information display

Color management systems

Picture visualization and fruition

Data Transformation into sources of knowledge through visual representation.

Requirements and heuristics for high-quality visualizations.

Charts and standard views: relevance and appropriateness.

Advanced and innovative tools for data visualization and advanced quantitative analysis.

The evaluation of the quality of visualizations and infographics.
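The data preprocessing steps named in the life cycle (imputation, cleaning, transformation) usually come before any visualization. The following sketch shows one simple version of two of them, median imputation and min-max scaling, on invented records; the field names, values and helper functions are assumptions for illustration.

```python
import statistics

def impute_median(records, field):
    """Data imputation: replace missing values (None) with the median of observed values."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = statistics.median(observed)
    return [dict(r, **{field: fill if r[field] is None else r[field]}) for r in records]

def min_max_scale(records, field):
    """Data transformation: rescale a numeric field to the range [0, 1]."""
    values = [r[field] for r in records]
    low, high = min(values), max(values)
    return [dict(r, **{field: (r[field] - low) / (high - low)}) for r in records]

# Invented raw records with missing entries.
raw = [
    {"age": 25, "income": 30000},
    {"age": None, "income": 42000},
    {"age": 31, "income": None},
    {"age": 40, "income": 52000},
]

cleaned = impute_median(impute_median(raw, "age"), "income")
scaled = min_max_scale(cleaned, "age")
scaled_ages = [r["age"] for r in scaled]
```

Both helpers return new records instead of modifying the input, which keeps the raw and processed data separate, in line with the "Raw and Processed Data" topic.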

Learning Resources:

Reference Books:

1. Foundations of Data Science By Avrim Blum, John Hopcroft, and Ravindran Kannan

Pedagogy:

Participative learning, discussions, algorithm, experiential learning through practical problem

solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments: 10; Test: 10; Problem solving: 10; Attendance: 10; Case study: 10; Any other: -

Term End Examination: 50 marks (External)


Syllabus :

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | What is Data Science? What does Data Science involve? Era of Data Science; Business Intelligence vs Data Science; Life cycle of Data Science; Tools of Data Science | 12 | - | -
2 | Data Extraction, Wrangling & Exploration: Data Analysis Pipeline; What is Data Extraction; Types of Data; Raw and Processed Data; Data Wrangling; Exploratory Data Analysis | 12 | - | -
3 | Visualization of Data: Introduction to Visualization; Human Perception and Information Processing; Data types; Graphical perception (the ability of viewers to interpret visual (graphical) encodings of information and thereby decode information in graphs); Color for information display; Color management systems; Picture visualization and fruition; Data Transformation into sources of knowledge through visual representation; Requirements and heuristics for high-quality visualizations; Charts and standard views: relevance and appropriateness; Advanced and innovative tools for data visualization and advanced quantitative analysis; The evaluation of the quality of visualizations and infographics | 12 | - | -

Prepared By

Preeti Adhav Asst.Prof.

Checked By

Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2104

Course Category Core Big Data Analytics

Course Title Machine Learning Algorithm -I

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 -- -- 3

Pre-requisites:

1. The main prerequisite for machine learning is data analysis.

2. Familiarity with probability theory

3. Familiarity with linear algebra

Course Objectives:

1. To introduce the basic concepts and techniques of Machine Learning.

2. To develop the skills in using recent machine learning software for solving practical problems.

3. To be familiar with a set of well-known supervised, semi-supervised and unsupervised learning algorithms.

Course Outcomes:

1. Select real-world applications that need machine-learning-based solutions.

2. Implement and apply machine learning algorithms.

3. Select appropriate algorithms for solving a particular group of real-world problems.

4. Recognize the characteristics of machine learning techniques that are useful to solve

real-world problems.

Course Contents

Introduction to learning

What are supervised, unsupervised and reinforcement learning? Geometry (lines, curves and 3D spaces) and visualization of algebraic concepts.

Linear Regression

What is regression? What is a simple one-variable regression line, and what are its coefficients? What are the assumptions of linear regression? What is the gradient descent algorithm, and how is the cost function used to find the 'beta' values? What are local and global minima, and what is the learning rate?

Gradient Descent

How is a problem represented in matrix form? How is gradient descent used with multiple features, and why is feature scaling applied? What are the types of feature scaling? How can the coefficients be found analytically (normal equation)?


Logistic Regression

What is the logistic regression model? What is the sigmoid function, and what does its graph look like? What is the receiver operating characteristic (ROC) curve, and how is it used to find the optimum decision boundary?

Optimization and Classifications

How does the optimization objective change from logistic regression to support vector machines (SVM)? What is a large margin classifier? What are the concepts behind large margin classification using SVM?

Learning Resources:

1. T. Hastie, R. Tibshirani and J. Friedman, “The Elements of Statistical Learning”, Springer, 2009.

2. E. Alpaydin, “Introduction to Machine Learning”, MIT Press, 2010.

3. K. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.

4. C. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006.

5. Shai Shalev-Shwartz, Shai Ben-David, “Understanding Machine Learning: From Theory to Algorithms”, Cambridge University Press, 2014.

6. John Mueller and Luca Massaron, “Machine Learning For Dummies”, John Wiley & Sons, 2016.

Pedagogy:

Participative learning, discussions, algorithm, Program writing, experiential learning through

practical problem-solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments: 10; Test: 10; Problem solving: 10; Attendance: 10; Case study: 10; Any other: -

Term End Examination: 50 marks (External)


Syllabus:

Module No. | Contents | Workload in Hrs: Theory | Lab | Assess
1 | Introduction to learning: Supervised, Unsupervised and Reinforcement Learning, geometry (lines, curves and 3D spaces) and visualization of algebraic concepts | 5 | - | -
2 | Linear Regression: Regression as a concept, simple one-variable regression line, coefficients of the line, assumptions of linear regression, gradient descent algorithm, cost function to find 'beta' values, local and global minima, concept of learning rate | 8 | - | -
3 | Gradient Descent: Matrix representation of the problem, gradient descent for multiple features, use of feature scaling techniques in gradient descent, types of feature scaling, finding coefficients analytically, normal equation, (matrix) non-invertibility | 7 | - | -
4 | Logistic Regression: Logistic regression model, matrix representation, general sigmoid function and graphical representation, decision boundary (linear and non-linear), metrics for logistic regression (accuracy, sensitivity, specificity and related concepts), receiver operating characteristic (ROC) curve, use of the ROC curve to find the optimum decision boundary, convexity and non-convexity of a group of points | 13 | - | -
5 | Optimization and Classifications: Optimization objective from logistic regression to support vector machines, large margin classifier, concepts behind large margin classification, kernels (concept, types and graphical explanations), using SVM | 12 | - | -

Prepared By

Archana Varade

Assistant Professor

Checked By

Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2105

Course Category Core Big Data Analytics

Course Title Lab on R Programming

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

- - 3 3

Pre-requisites

Computing: The Structure and Interpretation of Computer Programs

Math: Linear Algebra: some basic concepts like linear operators, eigenvectors, derivatives, and

integrals to enable statistical inference and derive new prediction algorithms.

Course Objectives:

To describe Data Science Life cycle.

To describe Data Visualization

Course Outcomes:

Students will be able to understand Data Science Life cycle & Data Visualization

Course Contents:

Data Cleaning

Data Transformation

Data Visualization

Data Analysis

Data Engineering - Big Data

Tableau Desktop

Getting Started

Connecting to Data

Visual Analytics

Dashboards and Stories

Mapping

Calculations

Why is Tableau Doing That?

How To cleanse & represent

Learning Resources:


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1 Assignment on Data Cleansing

2 Assignment on Transformation

3 Assignment on

Reference Books:

Foundations of Data Science By Avrim Blum, John Hopcroft, and Ravindran Kannan

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA): 50

Assignments: 10; Test: 10; Presentations: -; Case study: -; MCQ: 10; Oral: 10; Attendance: 10

Term End Examination: 50 marks (External)

Laboratory Continuous Assessment (LCA): 50

Practical: 10; Oral based on practical: 10; Site visit: -; Mini project: 10; Problem-based learning: 10; Attendance: 10

Term End Examination: 50 marks (External)


4

Basics of Tableau:

i. Tableau interface:

Menus and Toolbar

Data Pane

Analytics Pane

Sheet Tabs

Shelves and Cards

Marks Card

Legends

Layout for Dashboards & Stories

Distributing and Publishing

ii. Distributing & publishing:

Ways to share

Exporting images and PDFs

Workbook file types

Opening workbook files

Sharing securely

- -

5

Connecting with Data:

Getting Started with Data

Managing Metadata

Managing Extracts

Saving and Publishing Data Sources

Data Prep with Text and Excel Files

Join Types with Union

Cross-database Joins

Data Blending

Additional Data Blending Topics

Connecting to Cubes

Connecting to PDFs

- -

6

Visual Analytics:

Getting Started with Visual Analytics

Drill Down and Hierarchies

Sorting

Grouping

Additional Ways to Group

Creating Sets

Working with Sets

Ways to Filter

Using the Filter Shelf

- -

Interactive Filters

Where Tableau Filters

Additional Filtering Topics

Parameters

Formatting

The Formatting Pane

Basic Tooltips

Viz in Tooltip

Trend Lines

Reference Lines

Forecasting

Clustering

Analysis with Cubes and MDX

7

Dashboards and Stories:

Getting Started with Dashboards and Stories

Building a Dashboard

Dashboard Objects

Dashboard Formatting

Device Designer

Dashboard Interactivity Using Actions

Story Points

- -

8

Mapping:

Getting Started with Mapping

Maps in Tableau

Editing Unrecognized Locations

Spatial Files

Expanding Tableau's Mapping Capabilities

Custom Geocoding

Polygon Maps

Mapbox Integration

WMS: Web Mapping Services

Background Images

- -

9

Calculations:

Calculation Syntax

Introduction to LOD Expressions

Modifying Table Calculations

Aggregate Calculations

Logic Calculations

String Calculations

Number Calculations

Type Calculations

- -

Conceptual Topics with LOD Expressions

Aggregation and Replication with LOD Expressions

Nested LOD Expressions

How to Integrate R and Tableau

Using R within Tableau

Date Calculations

Getting Started with Calculations

Intro to Table Calculations

10

Why is Tableau Doing That?

Understanding Pill Types

Measure Names and Measure Values

Aggregation, Granularity, and Ratio Calculations

When to Blend and When to Join

Fixing "Incorrect" Sorts

Filtering for Top Across Panes

- -

11

How To

Finding the Second Purchase Date with LOD Expressions

Using a Parameter to Change Fields

Cleaning Data by Bulk Re-aliasing

Bollinger Bands

Bump Charts

Control Charts

Funnel Charts

Pareto Charts

Waterfall Charts

- -
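As an illustration of the Bollinger Bands topic in this module, the calculation behind the chart can be sketched outside Tableau. The following is a minimal Python sketch, assuming the common convention of a moving average plus or minus k population standard deviations. The function name `bollinger_bands`, the window size, and the sample prices are illustrative assumptions and are not taken from the syllabus.

```python
from statistics import mean, pstdev


def bollinger_bands(values, window=20, k=2.0):
    """Return one (middle, upper, lower) tuple per full window of values."""
    bands = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        middle = mean(chunk)            # moving average of the window
        spread = k * pstdev(chunk)      # k population standard deviations
        bands.append((middle, middle + spread, middle - spread))
    return bands


# Hypothetical closing prices, used only to exercise the function.
prices = [10, 12, 11, 13, 12, 14, 13, 15]
bands = bollinger_bands(prices, window=4, k=2.0)
for middle, upper, lower in bands:
    print(f"mid={middle:.2f} upper={upper:.2f} lower={lower:.2f}")
```

A tool may use the sample rather than the population standard deviation; swapping `pstdev` for `stdev` changes the band width slightly but not the idea.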

Prepared By

Preeti Adhav Lecturer

Checked By

Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean LASC

COURSE STRUCTURE

Course Code MIT-WPU-MBD-2106

Course Category Elective Big Data Analytics

Course Title Internet Of Things

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

1. Knowledge of networking, sensing, databases, programming, and related technology.

2. Familiarity with business concepts and marketing.

Course Objectives:

1. Vision and Introduction to IoT.

2. Understand IoT Market perspective.

3. Data and Knowledge Management and use of Devices in IoT Technology.

4. Understand State of the Art – IoT Architecture.

5. Real World IoT Design Constraints, Industrial Automation and Commercial Building

Automation in IoT.

Course Outcomes:

1. Students will understand IoT Market perspective.

2. Students will get Data and Knowledge Management and use of Devices in IoT

Technology.

3. Students will understand State of the Art – IoT Architecture.

4. Students will get Real World IoT Design Constraints, Industrial Automation and

Commercial Building Automation in IoT.

Course Contents:

M2M to IoT

Introduction of M2M to IoT

M2M to IoT – A Market Perspective

Introduce basic concepts of IoT. Emerging industrial structure for IoT and development of IoT

architecture.

M2M and IoT Technology Fundamentals

Fundamental concepts of technology required for M2M and IoT

IoT Architecture-State of the Art

Includes study of IoT reference model.

IoT Reference Architecture

Study of different views of reference architecture. Introduction to Industrial Automation - Service-oriented architecture-based device integration

Commercial Building Automation

Case study for Commercial Building Automation.

Learning Resources:

Reference Books:

1. Jan Holler, Vlasios Tsiatsis, Catherine Mulligan, Stefan Avesand, Stamatis Karnouskos, David Boyle, “From Machine-to-Machine to the Internet of Things: Introduction to a New Age of Intelligence”, 1st Edition, Academic Press, 2014.

Data Warehousing in the Real World, Anahory, Murray, Pearson Education

2. Vijay Madisetti and Arshdeep Bahga, “Internet of Things (A Hands-on-Approach)”, 1st Edition, VPT, 2014.

3. Francis daCosta, “Rethinking the Internet of Things: A Scalable Approach to Connecting Everything”, 1st Edition, Apress Publications, 2013

Supplementary Reading:

1. Fawzi Behmann, Kwok Wu, “Collaborative Internet of Things (C-IoT): For Future Smart Connected Life and Business”

Weblinks:

www.tutorialspoint.com

Pedagogy:

Participative learning, discussions, Problem Solving, experiential learning through practical

problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1 M2M to IoT

The Vision-Introduction, From M2M to IoT, M2M towards IoT-

the global context, A use case example, Differing Characteristics

5 - -

2

M2M to IoT – A Market Perspective

Introduction, Some Definitions, M2M Value Chains, IoT Value

Chains, An emerging industrial structure for IoT, The

international driven global value chain and global information

monopolies. M2M to IoT-An Architectural Overview– Building

an architecture, Main design principles and needed capabilities,

An IoT architecture outline, standards considerations

7 - 1

3

M2M and IoT Technology Fundamentals

Devices and gateways, Local and wide area networking, Data

management, Business processes in IoT, Everything as a

Service(XaaS), M2M and IoT Analytics, Knowledge

Management

7 - 1

4

IoT Architecture-State of the Art

Introduction, State of the art, Architecture Reference Model-

Introduction, Reference Model and architecture, IoT reference

Model

6 - 1

5

IoT Reference Architecture

Introduction, Functional View, Information View, Deployment

and Operational View, Other Relevant architectural views. Real-

World Design Constraints- Introduction, Technical Design

constraints-hardware is popular again, Data representation and

visualization, Interaction and remote control. Industrial

Automation- Service-oriented architecture-based device

integration, SOCRADES: realizing the enterprise integrated Web

of Things, IMC-AESOP: from the Web of Things to the Cloud of

Things

8 - 1

6

Commercial Building Automation

Introduction, Case study: phase one-commercial building

automation today, Case study: phase two- commercial building

automation in the future

7 - 1

Prepared By

Ms. Smita Patil

Assistant Professor

Checked By

Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC

COURSE STRUCTURE

Course Code MIT-WPU-MBD-2107

Course Category Elective Big Data Analytics

Course Title Introduction to image processing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

-- 04 -- 03

Pre-requisites:

Basic knowledge of Core Java programming

Course Objectives:

1. To learn the fundamental concepts of Digital Image Processing.

2. To study basic image processing operations.

3. To understand image analysis algorithms.

4. To expose students to current applications in the field of digital image processing.

Course Outcomes:

1. Understand image formation and the role the human visual system plays in the perception of gray and color image data.

2. Get broad exposure to and understanding of various applications of image processing in

industry, medicine, and defense.

3. Learn the signal processing algorithms and techniques in image enhancement and image

restoration.

4. Acquire an appreciation for the image processing issues and techniques and be able to apply

these techniques to real world problems.

5. Be able to conduct independent study and analysis of image processing problems and

techniques

Course Contents

Introduction

What is Image Processing?, The origins of Image Processing, Examples of Fields that use Image

Processing, Gamma-Ray Imaging, X-Ray Imaging, Imaging in the Ultraviolet Band, Imaging in

the Visible and Infrared Bands, Imaging in the Microwave Band, Imaging in the Radio Band,

Fundamental steps in Digital Image Processing, Components of an Image Processing System

Digital Image Fundamentals

Elements of Visual Perception, Light and the Electromagnetic Spectrum, Image sensing and

Acquisition, Image Sampling and Quantization, Some Basic Relationships between Pixels, An

Introduction to the Mathematical Tools Used in Digital Image Processing, Array versus Matrix

Operations, Linear versus Nonlinear Operations, Arithmetic Operations, Set and Logical

Operations

Intensity Transformation and Spatial Filtering

Background, Some Basic Intensity Transformation Functions, Histogram Processing, Histogram

Equalization, Histogram Matching (Specification), Local Histogram Processing, Fundamentals of

Spatial Filtering, Smoothing Spatial Filters, Sharpening Spatial Filters, Combining Spatial

Enhancement Methods
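To make the Histogram Equalization topic concrete, the following is a minimal Python sketch, assuming a grayscale image stored as a list of rows of integer intensities. The function name `equalize`, the 8-level toy image, and the use of plain lists instead of an imaging library are illustrative assumptions, not course material.

```python
def equalize(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    # Count how often each intensity level occurs.
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    total = sum(hist)

    # Cumulative distribution of the intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Map each level r to s = round((levels - 1) * cdf(r) / total).
    mapping = [round((levels - 1) * c / total) for c in cdf]
    return [[mapping[p] for p in row] for row in image]


# A dark toy image that uses only the lowest 4 of 8 levels.
dark = [[0, 1, 1],
        [2, 2, 2],
        [3, 3, 3]]
result = equalize(dark, levels=8)
print(result)
```

After the mapping, the intensities are spread over the full range of levels, which is the contrast stretch that equalization aims for.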

Filtering in the Frequency Domain

Background, Preliminary Concepts, Sampling and the Fourier Transform of Sampled Functions,

The Discrete Fourier Transform (DFT) of One variable, Extension to Functions of Two Variables.

Image Restoration and Reconstruction

A Model of the Image Degradation / Restoration Process, Noise Models, Restoration in the

Presence of Noise Only- Spatial Filtering, Periodic Noise Reduction by Frequency Domain

Filtering, Bandreject Filters, Bandpass Filters, Notch Filters, Estimating the Degradation Function,

Inverse Filtering, Minimum Mean Square Error(Wiener) Filtering, Geometric Mean Filter

Morphological Image Processing

Preliminaries, Erosion and Dilation, Opening and Closing, The Hit-or-Miss Transformation, Some

Basic Morphological Algorithms, Boundary Extraction, Hole Filling, Extraction of Connected

Components, Convex Hull, Thinning, Thickening, Skeletons, Pruning, Morphological

Reconstruction
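Erosion, dilation, and opening from the list above can be sketched in a few lines. This is a minimal Python sketch, assuming a binary image stored as a list of rows of 0/1 values, a 3x3 square structuring element, and zero padding outside the image. The function names and the toy image are illustrative assumptions.

```python
def window(image, r, c):
    """Return the 3x3 neighbourhood of (r, c); cells outside the image count as 0."""
    rows, cols = len(image), len(image[0])
    return [
        image[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(image):
    """A pixel survives only if its whole 3x3 neighbourhood is set."""
    return [[int(all(window(image, r, c))) for c in range(len(image[0]))]
            for r in range(len(image))]


def dilate(image):
    """A pixel is set if any pixel in its 3x3 neighbourhood is set."""
    return [[int(any(window(image, r, c))) for c in range(len(image[0]))]
            for r in range(len(image))]


def opening(image):
    """Erosion followed by dilation; removes small isolated specks."""
    return dilate(erode(image))


def closing(image):
    """Dilation followed by erosion; fills small holes and gaps."""
    return erode(dilate(image))


square = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]

# The same square with one stray pixel in the top-right corner.
noisy = [row[:] for row in square]
noisy[0][4] = 1

print(opening(noisy) == square)
```

Opening removes the stray pixel while restoring the square, because the square is large enough to survive the erosion step.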

Image Segmentation

Fundamentals, Point, Line, and Edge Detection, Background, Detection of Isolated Points, Line

Detection

Edge Models, Basic Edge Detection, Edge Linking and Boundary Detection, Thresholding,

Foundation, Basic Global Thresholding, Optimum Global Thresholding Using Otsu's Method.
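Otsu's method, the last topic above, picks the global threshold that maximises the between-class variance of the two pixel groups. The following is a minimal Python sketch under that definition, assuming a flat list of integer intensities. The function name `otsu_threshold` and the sample pixel values are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return a threshold t that maximises between-class variance.

    Pixels with value <= t form one class and the remaining pixels the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue                    # no pixels at or below t yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                       # every pixel is at or below t
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# Two well-separated groups of intensities: dark (10, 12) and bright (200, 202).
pixels = [10] * 5 + [12] * 5 + [200] * 5 + [202] * 5
t = otsu_threshold(pixels)
print(t)
```

Any threshold between the two groups separates them equally well here; the sketch returns the first such value it meets.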

Learning Resources:

Reference Books

B1: Cay S. Horstmann and Gary Cornell, Core Java, Volume 1 and Volume 2.

B2: Herbert Schildt (TMH) The complete reference JAVA-2 Fifth Edition.

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Any other

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction [3]

What is Image Processing?, The origins of Image Processing,

Examples of Fields that use Image Processing, Gamma-Ray

Imaging, X-Ray Imaging, Imaging in the Ultraviolet Band,

Imaging in the Visible and Infrared Bands, Imaging in the

Microwave Band, Imaging in the Radio Band, Fundamental steps

in Digital Image Processing, Components of an Image

Processing System

4 - -

2

Digital Image Fundamentals [6]

Elements of Visual Perception, Light and the Electromagnetic

Spectrum, Image sensing and Acquisition, Image Sampling and

Quantization, Some Basic Relationships between Pixels, An

Introduction to the Mathematical Tools Used in Digital Image

Processing, Array versus Matrix Operations, Linear versus

Nonlinear Operations, Arithmetic Operations, Set and Logical

Operations

10 - -

3

Intensity Transformation and Spatial Filtering [7]

Background, Some Basic Intensity Transformation Functions,

Histogram Processing, Histogram Equalization, Histogram

Matching (Specification), Local Histogram Processing,

Fundamentals of Spatial Filtering, Smoothing Spatial Filters,

Sharpening Spatial Filters, Combining Spatial Enhancement

Methods

9 - -

4

Filtering in the Frequency Domain [10]

Background, Preliminary Concepts, Sampling and the Fourier

Transform of Sampled Functions, The Discrete Fourier

Transform (DFT) of One variable, Extension to Functions of Two

Variables.

7 - -

5

Image Restoration and Reconstruction [6]

A Model of the Image Degradation / Restoration Process, Noise

Models, Restoration in the Presence of Noise Only- Spatial

Filtering, Periodic Noise Reduction by Frequency Domain

Filtering, Bandreject Filters, Bandpass Filters, Notch Filters,

Estimating the Degradation Function, Inverse Filtering,

Minimum Mean Square Error(Wiener) Filtering, Geometric

Mean Filter

7 - -

6

Morphological Image Processing [5]

Preliminaries, Erosion and Dilation, Opening and Closing, The Hit-or-Miss Transformation, Some Basic Morphological Algorithms, Boundary Extraction, Hole Filling, Extraction of Connected Components, Convex Hull, Thinning, Thickening, Skeletons, Pruning, Morphological Reconstruction

8 - -

Image Segmentation [7]

Fundamentals, Point, Line, and Edge Detection, Background,

Detection of Isolated Points, Line Detection

Edge Models, Basic Edge Detection, Edge Linking and Boundary

Detection, Thresholding, Foundation, Basic Global Thresholding,

Optimum Global Thresholding Using Otsu's Method.

7 - -

Prepared By

Nilesh Magar

Assistant professor

Checked By

Pradnya Mahadik

Course Coordinator

Approved By

Dr. Sudhir Gavhane Dean LASC

COURSE STRUCTURE

Course Code MIT-WPU-MBD-2201

Course Category Core Big Data Analytics

Course Title Natural Language Processing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

3 -- -- 3

Pre-requisites:

1. Linear algebra

2. Probability & Statistics

3. Artificial Intelligence and Neural Networks

Course Objectives: To understand natural language processing, algorithms, structures and

meanings

Course Outcomes: 1. Students will understand Word forms.

2. Students will understand structures.

3. Students will understand meaning processing.

Course Contents

Introduction to Natural Language Processing

Brief history of and introduction to Natural Language Processing

ML basics

Algorithms, Naïve Bayes, Bayesian Statistics, HMM, CRF
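As an illustration of the Naïve Bayes topic, the following is a minimal Python sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. The function names `train` and `predict`, the label names, and the four toy documents are illustrative assumptions, not course data.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs is a list of (tokens, label) pairs; return the counts needed to predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)            # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:                             # skip unseen words
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training documents.
docs = [
    ("good great film".split(), "pos"),
    ("great acting".split(), "pos"),
    ("bad boring film".split(), "neg"),
    ("boring plot".split(), "neg"),
]
model = train(docs)
print(predict(model, "great film".split()))
print(predict(model, "boring plot".split()))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.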

Word Forms

POS tagging and Chunking: Morphology fundamentals; Morphological Diversity of Indian

Languages; Morphology Paradigms; Finite State Machine Based Morphology; Automatic

Morphology Learning; Shallow Parsing; Named Entities; Maximum Entropy Models; Random

Fields, POS tagging techniques, Chunking techniques: CRF.
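One classic POS tagging technique combines a hidden Markov model with Viterbi decoding. The following is a minimal Python sketch, assuming a first-order HMM with two tags. The tag set, the probabilities, and the function name `viterbi` are made-up illustrative assumptions, not values estimated from any corpus.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for words under a first-order HMM."""
    # best[i][tag] = (probability of the best path ending in tag at word i, previous tag)
    best = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, 0.0), p)
                for p in tags
            )
        best.append(column)

    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical model: N = noun, V = verb.
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {
    "N": {"dogs": 0.5, "cats": 0.4, "bark": 0.1},
    "V": {"bark": 0.8, "dogs": 0.1, "cats": 0.1},
}
print(viterbi(["dogs", "bark"], tags, start_p, trans_p, emit_p))
```

A practical tagger would estimate these probabilities from a tagged corpus and use log probabilities to avoid underflow on long sentences.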

Structures

Theories of Parsing, Parsing Algorithms; Robust and Scalable Parsing on Noisy Text as in Web

documents; dependency parsing; Hybrid of Rule Based and Probabilistic Parsing: MST, MALT

parser; Scope Ambiguity and Attachment Ambiguity resolution.

Meaning

Lexical Knowledge Networks, Wordnet Theory; Indian Language Wordnets and Multilingual

Dictionaries; Semantic Roles; Word Sense Disambiguation; WSD and Multilinguality; Metaphors;

Co-references.

Learning Resources:

Reference Books:

1. Allen, James, “Natural Language Understanding”, Second Edition, Benjamin/Cummings, 1995.

2. Charniak, Eugene, “Statistical Language Learning”, MIT Press, 1993.

3. Jurafsky, Dan and Martin, James, “Speech and Language Processing”, Second Edition, Prentice Hall, 2008.

4. Manning, Christopher and Schütze, Hinrich, “Foundations of Statistical Natural Language Processing”, MIT Press, 1999.

5. Akshar Bharati, Vineet Chaitanya, Rajeev Sangal, “Natural Language Processing: A Paninian Perspective”

Web Resources:

Weblinks: -

MOOCs:-

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments Test Presentations Attendance Viva Any other

10 10 10 10 10 -

Term End Examination: 50 marks External

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to Natural Language Processing

Brief History, Applications: Speech to text, story understanding,

QA system, Machine Translation, Text summarization, text

classification, sentiment analysis, chatbots, challenges/Open

Problems, Natural Language (NL) Characteristics and NL

computing techniques, NL tasks: Segmentation, Chunking,

tagging, NER, Parsing, Word Sense Disambiguation, NL

Generation, Web 2.0 Applications : Sentiment Analysis; Text

Entailment; Cross Lingual Information Retrieval (CLIR).

10 - -

2 ML basics

Algorithms, Naïve Bayes, Bayesian Statistics, HMM, CRF 5 - -

3

Word Forms

POS tagging and Chunking: Morphology fundamentals;

Morphological Diversity of Indian Languages; Morphology

Paradigms; Finite State Machine Based Morphology; Automatic

Morphology Learning; Shallow Parsing; Named Entities;

Maximum Entropy Models; Random Fields, POS tagging

techniques, Chunking techniques: CRF.

10 - -

4

Structures

Theories of Parsing, Parsing Algorithms; Robust and Scalable

Parsing on Noisy Text as in Web documents; dependency

parsing; Hybrid of Rule Based and Probabilistic Parsing: MST,

MALT parser; Scope Ambiguity and Attachment Ambiguity

resolution.

10 - -

5

Meaning

Lexical Knowledge Networks, Wordnet Theory; Indian

Language Wordnets and Multilingual Dictionaries; Semantic

Roles; Word Sense Disambiguation; WSD and Multilinguality;

Metaphors; Coreferences.

10

Prepared By

Mr. Sameer Kakade Asst. Professor

Checked By

Ms. Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC

COURSE STRUCTURE

Course Code MIT-WPU-MBD-2202

Course Category Core Big Data Analytics

Course Title Web & Social Intelligence

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Knowledge of any scripting language, XML and cloud

Course Objectives:

Organizations worldwide are waking up to the opportunity offered by the web and social media to fulfill various business objectives ranging from Sales, Marketing, and CRM to Product Development and Research. This has created an ever-increasing demand for skilled Web Analytics professionals. The objective of this course is to help meet that demand.

Course Outcomes:

After taking this course, you will be able to:

- Utilize various Application Programming Interface (API) services to collect data from different social media sources such as YouTube, Twitter, and Flickr.

- Process the collected data (primarily structured) using methods involving correlation, regression, and classification to derive insights about the sources and people who generated that data.

- Analyze unstructured data (primarily textual comments) for sentiments expressed in them.

- Use different tools for collecting, analyzing, and exploring social media data for research and development purposes.
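The outcome on analyzing textual comments for sentiment can be illustrated with a very small lexicon-based scorer. This is a minimal Python sketch, assuming two tiny hand-made word lists. The word lists, the function names, and the sample comments are illustrative assumptions; practical sentiment analysis would use a much larger lexicon or a trained classifier.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}


def sentiment_score(comment):
    """Count positive words minus negative words in a comment."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def sentiment_label(comment):
    """Turn the numeric score into a coarse label."""
    score = sentiment_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "I love this video, great editing!",
    "Terrible audio and poor lighting.",
    "Uploaded on Monday.",
]
for comment in comments:
    print(sentiment_label(comment), "-", comment)
```

This approach ignores negation and sarcasm, which is one reason the course also lists classification methods.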

Course Contents:

Introduction to web analytics

What’s analysis?

Getting started with Google Analytics

Google Analytics

Getting Started With Google Analytics

How Google Analytics works?

Accounts, profiles, and users

Navigating Google Analytics

Content performance analysis

Pages and Landing Pages

Event Tracking and AdSense

Site Search

Visitor analysis

Unique visitors

Geographic and language information

Technical reports

Benchmarking

Social media analytics

Facebook Insights

Twitter analytics

YouTube analytics

Social Ad analytics / ROI measurement

Social & CRM Analysis

Radian6

Sentiment analysis

Workflow management

Text analytics

Learning Resources:

Reference Books:

Avinash Kaushik (Digital Marketing Evangelist for Google; Co-Founder and Chief Education Officer for Market Motive):

1. “Web Analytics 2.0”

2. “Web Analytics: An Hour a Day”

Supplementary Reading:

Weblinks:

Pedagogy:

Participative learning, discussions, Problem Solving, experiential learning through practical

problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA): 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to web analytics

What’s analysis?

Is analysis worth the effort?

• Small businesses

• Medium and large scale businesses

Analysis vs intuition

What is web analytics?

Getting started with Google Analytics

• How Google Analytics works

• Accounts, profiles, and users

5 - -

2

Google Analytics

Getting Started With Google Analytics

How Google Analytics works?

Accounts, profiles, and users

Navigating Google Analytics

Basic metrics

The main sections of Google Analytics reports

Traffic Sources

Direct, referring, and search traffic

Campaigns

AdWords, Adsense

7 - 1

3

Content performance analysis

Pages and Landing Pages

Event Tracking and AdSense

Site Search

7 - 1

4

Visitor analysis

Unique visitors

Geographic and language information

Technical reports

Benchmarking

6 - 1

5

Social media analytics

Facebook Insights

Twitter analytics

YouTube analytics

Social Ad analytics / ROI measurement

8 - 1

6

Social & CRM Analysis

Radian6

Sentiment analysis

Workflow management

Text analytics

7 - 1

Prepared By

Ms. Smita Patil

Assistant Professor

Checked By

Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC

COURSE STRUCTURE

Course Code MIT-WPU-MBD-2203

Course Category Core Big Data Analytics

Course Title Cloud Computing

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

-- 04 -- 03

Pre-requisites:

1. Basic understanding of distributed computing

2. Basic understanding of networking: VLAN, IP addressing (Class A, B, C), VNET, subnets, an introduction to RFC 1918, and how DNS systems work in general

3. Cloud Storage Systems
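The networking prerequisite on IP address classes and RFC 1918 can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch. The helper names `is_rfc1918` and `legacy_class` are illustrative assumptions; the three private ranges are the ones RFC 1918 reserves, and the class boundaries follow the historical classful scheme based on the first octet.

```python
import ipaddress

# The three private IPv4 ranges reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address falls inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def legacy_class(address):
    """Return the historical address class (A, B, C) from the first octet."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    return "D/E"


for addr in ["10.1.2.3", "172.31.255.255", "172.32.0.1", "192.168.1.10", "8.8.8.8"]:
    print(addr, legacy_class(addr), "private" if is_rfc1918(addr) else "public")
```

The explicit network list is used instead of the module's `is_private` attribute because that attribute also covers other reserved ranges, such as loopback and link-local addresses.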

Course Objectives:

This course introduces the evolution of cloud computing and the services available today, which may lead to the design and development of a simple cloud service. It also focuses on some key challenges and issues around cloud computing.

Course Outcomes:

After successfully completing the course, students should be able to:

Articulate the main concepts, key technologies, strengths, and limitations of cloud

computing and the possible applications for state-of-the-art cloud computing

Identify the architecture and infrastructure of cloud computing, including SaaS, PaaS, IaaS,

public cloud, private cloud, hybrid cloud, etc.

Explain the core issues of cloud computing such as security, privacy, and interoperability.

Choose the appropriate technologies, algorithms, and approaches for the related issues.

Identify problems, and explain, analyze, and evaluate various cloud computing solutions.

Provide the appropriate cloud computing solutions and recommendations according to the

applications used.

Attempt to generate new ideas and innovations in cloud computing.

Collaboratively research and write a research paper, and present the research online.

Course Contents:

INTRODUCTION

Introduction of Cloud

CLOUD SERVICES

Types of Cloud services

Service providers - Google, Amazon, Microsoft Azure, IBM, Salesforce

COLLABORATING USING CLOUD SERVICES

Email Communication over the Cloud - CRM Management - Project Management - Event Management - Task Management – Calendar - Schedules - Word Processing – Presentation – Spreadsheet - Databases – Desktop - Social Networks and Groupware

VIRTUALIZATION FOR CLOUD

Need for Virtualization – Pros and cons of Virtualization – Types of Virtualization – System VM, Process VM, Virtual Machine monitor – Virtual machine properties - Interpretation and Binary translation, HLL VM - Hypervisors – Xen, KVM, VMware, VirtualBox, Hyper-V.

SECURITY, STANDARDS AND APPLICATIONS

Security in Clouds: Cloud security challenges – Software as a Service Security, Common

Standards: The Open Cloud Consortium – The Distributed management Task Force –

Standards for application Developers – Standards for Messaging – Standards for Security,

End user access to cloud computing, Mobile Internet devices and the cloud.

Learning Resources:

TEXT BOOKS:

1. John Rittinghouse & James Ransome, Cloud Computing, Implementation, Management

and Strategy, CRC Press, 2010.

2. Michael Miller, Cloud Computing: Web-Based Applications That Change the Way You

Work and Collaborate Que Publishing, August 2008.

3. James E Smith, Ravi Nair, Virtual Machines, Morgan Kaufmann Publishers, 2006.

REFERENCES:

1. David E.Y. Sarna Implementing and Developing Cloud Application, CRC press 2011.

2. Lee Badger, Tim Grance, Robert Patt-Corner, Jeff Voas, NIST, Draft cloud computing

synopsis and recommendation, May 2011.

3. Anthony T Velte, Toby J Velte, Robert Elsenpeter, Cloud Computing : A Practical

Approach, Tata McGraw-Hill 2010.

4. Haley Beard, Best Practices for Managing and Measuring Processes for On-demand

Computing, Applications and Data Centers in the Cloud with SLAs, Emereo Pty Limited,

July 2008.

5. G.J.Popek, R.P. Goldberg, Formal requirements for virtualizable third generation

Architectures, Communications of the ACM, No.7 Vol.17, July 1974.

Pedagogy:

Participative learning, discussions, algorithm, Flowchart & Program writing, experiential learning

through practical problem solving, assignment, PowerPoint presentation.

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study Attendance Oral Any other

10 10 10 10 10 - -

Term End Examination : 50 Marks of external Examination

Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

INTRODUCTION

Cloud-definition, benefits, usage scenarios, History

of Cloud Computing - Cloud Architecture

Types of Clouds - Business models around Clouds –

Major Players in Cloud Computing -

issues in Clouds - Eucalyptus - Nimbus - OpenNebula, CloudSim.

9 - -

2

CLOUD SERVICES

Types of Cloud services: Software as a Service -

Platform as a Service – Infrastructure as

a Service - Database as a Service - Monitoring as a

Service –Communication as services.

Service providers- Google, Amazon, Microsoft

Azure, IBM, Salesforce

9 - -

3

UNIT III COLLABORATING USING CLOUD

SERVICES

Email Communication over the Cloud - CRM

Management - Project Management-Event

Management - Task Management – Calendar -

Schedules - Word Processing – Presentation

Spreadsheet - Databases – Desktop - Social

Networks and Groupware

9 - -

4

UNIT IV VIRTUALIZATION FOR CLOUD

Need for Virtualization – Pros and cons of

Virtualization – Types of Virtualization – System

VM, Process VM, Virtual Machine monitor – Virtual

machine properties - Interpretation and

Binary translation, HLL VM - Hypervisors – Xen,

KVM, VMware, VirtualBox, Hyper-V.

9 - -

5

UNIT V SECURITY, STANDARDS AND

APPLICATIONS

Security in Clouds: Cloud security challenges –

Software as a Service Security, Common

Standards: The Open C loud Consortium – The

Distributed management Task Force –

9 -

Dr. Sudhir Gavhane

Dean LASC

Page 112: DR VISHWANATH KARAD MIT - WORLD PEACE UNIVERSITYmitwpu.edu.in/wp-content/uploads/2018/03/M.Sc_.-BigData-Syllabus.pdf · BCA, BE-IT, Comp., E&TC with 50% of Marks (45% marks aggregate

Standards for Application Developers - Standards for Messaging - Standards for Security

End-user access to cloud computing, mobile Internet devices and the cloud.

Prepared By

Nilesh Magar

Assistant professor

Checked By

Pradnya Mahadik

Course Coordinator

Approved By

Dr. Sudhir Gavhane Dean LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2204

Course Category Lab Big Data Analytics

Course Title Web & Social Intelligence

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Course Objectives:

Organizations worldwide are waking up to the opportunity of using this revolutionary medium to fulfill business objectives in Sales, Marketing, CRM, Product Development and Research. This has created an ever-increasing demand for skilled Web Analytics professionals. The objective of this course is to help meet that demand.

Course Outcomes:

After taking this course, you will be able to:
1. Utilize various Application Programming Interface (API) services to collect data from different social media sources such as YouTube, Twitter, and Flickr.
2. Process the collected data, primarily structured, using methods involving correlation, regression, and classification to derive insights about the sources and people who generated that data.
3. Analyze unstructured data, primarily textual comments, for the sentiments expressed in them.
4. Use different tools for collecting, analyzing, and exploring social media data for research and development purposes.

Course Contents:

Introduction to web analytics

What’s analysis?

Getting started with Google Analytics

Google Analytics

Getting Started With Google Analytics

How Google Analytics works

Accounts, profiles, and users

Navigating Google Analytics


Content performance analysis

Pages and Landing Pages

Event Tracking and AdSense

Site Search

Visitor analysis

Unique visitors

Geographic and language information

Technical reports

Benchmarking

Social media analytics

Facebook Insights

Twitter analytics

YouTube analytics

Social Ad analytics / ROI measurement

Social & CRM Analysis

Radian6

Sentiment analysis

Workflow management

Text analytics

Learning Resources:

Reference Books:

1. Web Analytics 2.0 By Avinash Kaushik (Digital Marketing Evangelist for Google, Co-Founder and Chief Education Officer for Market Motive)

2. Web Analytics: An Hour a Day By Avinash Kaushik

Supplementary Reading:

Weblinks:

Pedagogy:

Participative learning, discussions, Problem Solving, experiential learning through practical

problem solving, assignment, PowerPoint presentation


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

Introduction to web analytics

What’s analysis?

Is analysis worth the effort?

• Small businesses

• Medium and large scale businesses

Analysis vs intuition

What is web analytics?

Getting started with Google Analytics

• How Google Analytics works

• Accounts, profiles, and users

5 - -

2

Google Analytics

Getting Started With Google Analytics

How Google Analytics works

Accounts, profiles, and users

Navigating Google Analytics

Basic metrics

The main sections of Google Analytics reports

Traffic Sources

Direct, referring, and search traffic

Campaigns

AdWords, Adsense

7 - -

Assessment Scheme:

Class Continuous Assessment (CCA): 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks


3

Content performance analysis

Pages and Landing Pages

Event Tracking and AdSense

Site Search

7 - 1

4

Visitor analysis

Unique visitors

Geographic and language information

Technical reports

Benchmarking

6 - 1

5

Social media analytics

Facebook Insights

Twitter analytics

YouTube analytics

Social Ad analytics / ROI measurement

8 - 1

6

Social & CRM Analysis

Radian6

Sentiment analysis

Workflow management

Text analytics

7 - 1

Prepared By

Ms. Smita Patil

Assistant Professor

Checked By

Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2206

Course Category Elective Big Data Analytics

Course Title Marketing Analytics

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Course Objectives:

1. This course focuses on developing marketing strategies and making resource allocation decisions driven by quantitative analysis.

2. This course covers basic concepts in the marketing process and in measuring brand assets.

3. This course includes Customer Lifetime Value, regression analysis, and spreadsheets with formulas.

Course Outcomes:

1. Students will know the basic marketing strategies.

2. Students will learn the core concepts and tools in marketing.

3. Students will know how to measure and calculate brand value.

4. Students will understand the marketing models.

Course Contents

The Marketing Process

What is the marketing process, and what are its strategic challenges? How are marketing strategies built with data and text analytics? How can data be used to improve marketing strategies?

Metrics for Measuring Brand Assets

What are the metrics for measuring brand assets? How does Snapple illustrate brand value? How to develop brand personality, develop brand architecture and the brand pyramid, and measure and calculate brand value?

Customer Lifetime Value

What is Customer Lifetime Value (CLV)? How to calculate CLV, understand the CLV Formula,

apply the CLV Formula, extend the CLV Formula, use CLV to make decisions?
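The syllabus does not state which CLV formula is used. As a minimal sketch, the Python below assumes one common textbook form: a constant margin per period, a constant retention rate r, a constant discount rate d, and the first margin arriving one period from now, which gives CLV = margin * r / (1 + d - r). The function names and sample figures are hypothetical and only for illustration.

```python
def clv_infinite(margin, retention, discount):
    """CLV over an infinite horizon: margin * r / (1 + d - r).

    Assumes a constant margin per period, a constant retention rate r,
    a constant discount rate d, and the first margin arriving after one period.
    """
    if not 0 <= retention < 1:
        raise ValueError("retention must be in [0, 1)")
    if discount < 0:
        raise ValueError("discount must be non-negative")
    return margin * retention / (1 + discount - retention)


def clv_finite(margin, retention, discount, periods):
    """Same assumptions, summed period by period over a finite horizon."""
    return sum(
        margin * retention ** t / (1 + discount) ** t
        for t in range(1, periods + 1)
    )


if __name__ == "__main__":
    # Hypothetical customer: 100 margin per year, 80% retention, 10% discount rate.
    print(round(clv_infinite(100.0, 0.8, 0.1), 2))
    print(round(clv_finite(100.0, 0.8, 0.1, 5), 2))
```

The finite version is useful for checking the closed form: as the number of periods grows, its sum approaches the infinite-horizon value.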


Marketing Experiments

How to use a spreadsheet with formulas? How to determine cause and effect through experiments? How to design basic experiments, before-and-after experiments, and full factorial web experiments? How to calculate projected lift?
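The syllabus does not define projected lift. One plausible reading is sketched below in Python: lift is taken as the relative change in conversion rate between a control group and a treatment group, and the projection applies the absolute rate difference to an expected audience size. The function names and the sample figures are hypothetical.

```python
def conversion_rate(conversions, visitors):
    """Share of visitors who converted."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def relative_lift(control_rate, treatment_rate):
    """Relative improvement of the treatment over the control."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate


def projected_incremental_conversions(control_rate, treatment_rate, projected_visitors):
    """Extra conversions expected if the treatment is rolled out to projected_visitors."""
    return (treatment_rate - control_rate) * projected_visitors


if __name__ == "__main__":
    # Hypothetical experiment: 1000 visitors per group.
    control = conversion_rate(50, 1000)
    treatment = conversion_rate(60, 1000)
    print(relative_lift(control, treatment))
    print(projected_incremental_conversions(control, treatment, 20000))
```

This sketch ignores sampling error; in practice the difference between the two rates would also be tested for statistical significance before projecting it.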

Regression Basics

What is regression analysis? How to interpret regression outputs? What are multivariable regressions and omitted variable bias? How to use price elasticity to evaluate marketing? What are log-log models and marketing mix models?
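To illustrate how a log-log model relates to price elasticity, the Python sketch below fits ln(quantity) = a + b * ln(price) by ordinary least squares, so that the slope b can be read as the price elasticity of demand. The data is synthetic and the function name is hypothetical; it is an illustration under these assumptions, not the course's prescribed method.

```python
import math


def fit_log_log(prices, quantities):
    """Fit ln(q) = a + b * ln(p) by ordinary least squares.

    Returns (a, b); b is the estimated price elasticity.
    """
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Synthetic demand curve with a known elasticity of -1.5.
    prices = [1.0, 2.0, 4.0, 8.0]
    quantities = [1000.0 * p ** -1.5 for p in prices]
    a, b = fit_log_log(prices, quantities)
    print("elasticity:", round(b, 3))
    print("scale:", round(math.exp(a), 1))
```

An elasticity of -1.5 means that a 1% price increase is associated with roughly a 1.5% drop in quantity sold, which is why the log-log form is convenient for evaluating pricing decisions.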

Learning Resources:

1. Marketing Analytics: A Practitioner's Guide to Marketing Analytics and Research Methods By Ashok Charan (NUS, Singapore)

2. Managing Customer Value: One Stage at a Time By Dilip Soman (University of Toronto, Canada), Sara N-Marandi (University of Toronto, Canada)

3. Worldwide Casebook in Marketing Management By Luiz Moutinho (Dublin City University, Ireland)

4. Data-Driven Marketing: The 15 Metrics Everyone in Marketing Should Know By Mark Jeffery (2010)

5. Lean Analytics: Use Data to Build a Better Startup Faster (Lean Series) By Alistair Croll, Benjamin Yoskovitz (2013)

6. Digital Marketing Analytics: Making Sense of Consumer Data in a Digital World (Que Biz-Tech) By Chuck Hemann, Ken Burbary (2013)

Pedagogy:

Participative learning, discussions, algorithm, Program writing, experiential learning through

practical problem-solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA)

Assignments Test Case study-1 Attendance Case study-2 Any other

10 10 10 10 10 -

Term End Examination : 50 marks External


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

The Marketing Process

Introduction to the Marketing Process, Marketing Process,

Strategic Challenge, Marketing Strategy with Data, Using Text

Analytics, Utilizing Data to Improve Marketing Strategy,

Improving the Marketing Process with Analytics, case study

7 - -

2

Metrics for Measuring Brand Assets

Intro to Metrics for Measuring Brand Assets, Snapple and

Brand Value, Developing Brand Personality, Developing Brand

Architecture, Brand Pyramid, Measuring Brand Value, Revenue

Premium as a Measure of Brand Equity, Calculating Brand

Value, case study

10 - -

3

Customer Lifetime Value

Customer Lifetime Value (CLV),Calculating CLV,

Understanding the CLV Formula, Applying the CLV Formula,

Extending the CLV Formula, Using CLV to Make Decisions,

CLV: A Forward Looking Measure, case study

10 - -

4

Marketing Experiments

Spreadsheet with Formulas, Determining Cause and Effect

through Experiments, Designing Basic Experiments, Designing

Before - After Experiments, Designing Full Factorial Web

Experiments, Designing an Experiment, Analyzing an

Experiment, Projecting Lift, Calculating Projected Lift, Pitfalls

of Marketing Experiments, Maximizing Effectiveness, case

study

10 - -

5

Regression Basics

Using Regression Analysis, What Regressions Reveal,

Interpreting Regression Outputs, Multivariable Regressions,

Omitted Variable Bias, Using Price Elasticity to Evaluate

Marketing, Understanding Log-Log Models, Marketing Mix

Models

8

Prepared By

Archana Varade

Assistant Professor

Checked By

Pradnya Mahadik BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean


COURSE STRUCTURE

Course Code MIT-WPU-MBD-2207

Course Category Elective Big Data Analytics

Course Title Human Resource Analytics

Teaching Scheme and Credits

Weekly load hrs

L T Laboratory Credits

4 - - 3

Pre-requisites:

Knowledge of any scripting language, XML.

Course Objectives:

1. To introduce the use of data analytics techniques in HR.

Course Outcomes:

1. Students will be able to use data analytics techniques in HR.

Course Contents:

HR Analytics in perspective

Introduction to the role of data analytics in HR

A day in the life of HR

Introduction to the daily activities of HR through a case study

An analytics method

Challenges in HR and their solutions using data analytics

Hands-on introduction to HRA

A practical approach to collect and clean data required for HRA.

Toolkits

Introduction to various toolkits required for HRA.

Data challenges

Introduction to statistical methods for processing of data.


Making HR data operational

Use of HR data for analysis

Predictive analytics

Introduction to the use of predictive analysis for HR data.

Learning Resources:

Reference Books:

1. The New HR Analytics: Predicting the Economic Value of Your Company's Human By Jac Fitz-enz

2. Predictive HR Analytics: Mastering the HR Metric By Dr Martin R. Edwards, Kirsten Edwards

3. Predictive Analytics for Human Resources By Jac Fitz-enz, John Mattox II

Supplementary Reading:

1. Applying Advanced Analytics to HR Management Decisions : Methods for

Selection, Developing Incentives and Improving Collaboration First Edition

(English, Paperback, James C. Sesil)

Web Resources:

Weblinks:

1. MOOCs:

Pedagogy: Participative learning, discussions, Problem Solving, experiential learning through

practical problem solving, assignment, PowerPoint presentation

Assessment Scheme:

Class Continuous Assessment (CCA) 50 Marks

Assignments Test Presentations Case study MCQ Oral Attendance

20 10 10 - - 10

Term End Examination : 50 Marks


Syllabus:

Module

No. Contents

Workload in Hrs

Theory Lab Assess

1

HR Analytics in perspective

Analytics roles

Defining HR Analytics

Typical problems (working session)

4 - -

2

A day in the life of HR

Case Examples

3 - -

3

An analytics method

Understanding the organizational system (Lean)

Locating the HR challenge in the system

Valuing HR Analytics (working session)

Understanding the organizational system

5 - -

4

Hands-on introduction to HRA

Typical data sources

Typical questions faced (survey)

Typical data issues

Connecting HR Analytics to business benefit (3 x case studies)

Techniques for establishing questions

Building support and interest

Obtaining data

Cleaning data (exercise)

Supplementing data

9 - -

5

Toolkits

Options, advantages and disadvantages

Common toolkits: OrgVue, Tableau, Excel, Alteryx, QlikView

Practical exercises

6 - -

6

Data challenges

Correlation (R², ecological fallacy, 10 simple stats)

Causation

6 - -

7

Making HR data operational

Case examples

4 - -

8

Predictive analytics

When to use predictive analysis

Importance of innovation

What is “the organization as a system”?

Organization design

8


Process led design

Workforce planning

Transition management

Impact analysis

Communication

Real time HR Analytics


Prepared By

Ms. Punam Nikam Assistant Professor

Checked By

Ms. Pradnya Mahadik

BOS Chairman

Approved By

Dr. Sudhir Gavhane Dean, LASC