data analytics workshop - analysts · 2019-01-03 · by various datasets of different types...

BIG DATA ANALYTICS WORKSHOP FOR ANALYSTS Endorsed by 12 - 14 March, 2019

Upload: nguyendang

Post on 09-Apr-2019


Topics include:

WHAT ARE THE MAIN STATISTICAL METHODS AND APPROACHES USED IN EXAMINING BIG DATA?

WHAT IS SUPERVISED LEARNING?

WHAT IS UNSUPERVISED LEARNING?

Note: Participants are expected to have an adequate knowledge of statistics. RIT Dubai will therefore conduct evening statistics classes a week prior to the workshop. These classes will run over 3 days, with the schedule coordinated with participants as the dates approach.

PROGRAM OVERVIEW

The Big Data Analytics Workshop for Data Scientists & Analysts offered by RIT Dubai provides training in data science to professionals in diverse fields. The workshop covers statistical methods for data analytics, machine learning and artificial intelligence.

3-Day Workshop

DAY 01

Introduction to the R environment for statistical machine learning and data mining.
Understanding http://www.r-project.org and installing the latest R version.
Installing RStudio as the main dashboard for hands-on data science exploration.
Installing Rattle as a convenient and efficient graphical user interface frontend for R.
Discovering R Commander, another GUI for data science functions in R.
Installing R packages and the CRAN Task View for Machine Learning and Data Science.
Getting familiar with scikit-learn, the Python gateway to machine learning.
Among other things: descriptive and inferential statistics, statistical tests, regression, time series, statistical modeling and fitting, predictive analytics.

WHAT ARE THE MAIN STATISTICAL METHODS AND APPROACHES USED IN EXAMINING BIG DATA?

This module provides a hands-on introduction to computing environments for data science with a compelling overview of most of the state-of-the-art techniques and methods of machine learning. The exploration is driven by various datasets of different types, carefully chosen to help motivate methods of analysis, from basic statistics to regularization to ensemble learning, with interpretation and optimal prediction covered.

DAY 02

Fundamentals of supervised learning as function estimation in low, medium and high dimensional spaces: regression learning and classification learning.
Installation and exploration of the Machine Learning CRAN Task View.
Exploration of methods of regression learning with R via RStudio, Rattle and R Commander, then with Python via scikit-learn: k-Nearest Neighbors regression, linear regression models, regression tree models, Support Vector Regression.
Exploration of methods of classification learning with R via RStudio, Rattle and R Commander, then with Python via scikit-learn: k-Nearest Neighbors, discriminant analysis models, logistic regression, boosting, random forest, support vector machines, neural networks.
Special techniques for high dimensional data.
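As an illustration of the first regression method listed above, the following is a minimal sketch of k-Nearest Neighbors regression written in plain Python rather than with the R or scikit-learn tools used in the workshop. The function name `knn_regress` and the toy data are hypothetical and are not taken from the workshop material.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict a numeric target as the mean of the k nearest training targets."""
    # Euclidean distance from the query point to every training point
    distances = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # Keep the k closest points (ties are resolved by the stable sort order)
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Toy data (made up for illustration): one feature, target roughly 2 * x
train_x = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
train_y = [2.1, 3.9, 6.2, 8.1, 9.8]

# The two nearest neighbors of 3.5 are x = 3 and x = 4
prediction = knn_regress(train_x, train_y, query=(3.5,), k=2)
print(prediction)
```

A small k follows the training data closely, while a large k averages over more points and gives smoother predictions.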

WHAT IS SUPERVISED LEARNING?

This module introduces techniques and algorithms used in supervised machine learning, along with the fundamental concepts of statistical machine learning such as function space, loss functions, risk functionals, empirical risk, likelihood, score function, Bayes risk, misclassification rate, accuracy, precision, confusion matrix, training error, test error, cross validation, cross validation error, prediction error, decision boundary, overfitting, underfitting, bias-variance dilemma, bias-variance trade-off, receiver operating characteristic (ROC) curve, optimism of the training error, optimization.
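Several of the evaluation concepts named above (confusion matrix, accuracy, misclassification rate, precision) can be computed in a few lines. The sketch below is a plain-Python illustration for a binary classifier. The helper `confusion_counts` and the label vectors are hypothetical examples, not workshop data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Made-up true labels and predictions for eight observations
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / len(y_true)      # share of correct predictions
misclassification_rate = 1 - accuracy   # share of wrong predictions
precision = tp / (tp + fp)              # share of predicted positives that are correct

print(tp, fp, fn, tn)
print(accuracy, misclassification_rate, precision)
```

Computing these quantities on held-out data rather than on the training data gives the test error, which is the quantity that cross validation estimates.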

DAY 03

WHAT IS UNSUPERVISED LEARNING?

This module introduces methods and algorithms used in unsupervised machine learning and presents applications of artificial intelligence. This module uses some key R packages contained in CRAN Task Views, such as clustering and machine learning, to explore some of the most common unsupervised learning techniques.

Fundamentals of unsupervised learning as latent variable modelling, with continuous and discrete latent spaces.
Definition and methods of unsupervised machine learning, clustering.
Ubiquitous Principal Component Analysis (PCA), Singular Value Decomposition (SVD).
Dimensionality reduction, feature extraction, recommender systems, text mining, K-Means, fuzzy clustering, hierarchical clustering, Partitioning Around Medoids (PAM).
Gaussian Mixture Models for clustering and density estimation.
Clustering validation and clustering visualization in R.
Exploring basics of heatmaps for visualizing high dimensional data.
Applications of artificial intelligence in machine vision, natural language processing, expert systems, gaming, self-teaching systems, intelligent robots.
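To make one of the clustering methods listed above concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. It is an illustration only, not the R-based material used in the workshop. The function name `kmeans`, the fixed iteration count, the choice of the first k points as starting centroids, and the toy points are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=10):
    """Cluster points with Lloyd's algorithm; returns (centroids, labels)."""
    # Assumption for reproducibility: the first k points seed the centroids.
    # Practical implementations usually use random or k-means++ seeding.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its members
        for i, members in enumerate(clusters):
            if members:  # leave the centroid in place if its cluster is empty
                centroids[i] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    labels = [
        min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
    ]
    return centroids, labels


# Made-up 2-D points forming two visibly separate groups
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]

centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

Partitioning Around Medoids follows the same assign-and-update pattern but restricts each cluster center to be one of the observed points.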

DR. ERNEST FOKOUE

SUBJECT MATTER EXPERT

Associate Professor, School of Mathematical Sciences – RIT New York

Dr. Ernest Fokoue earned his Ph.D. in Statistics from the University of Glasgow, United Kingdom. He is an Associate Professor in the School of Mathematical Sciences at RIT, and prior to joining RIT he was a faculty member in the Mathematics department at Kettering University in Flint,