Predicting the future with machine learning & big data
Data Driven Innovation
Codemotion
Antimo Musone
IT Manager
20 May 2016
Page 3
About Me
► Antimo Musone, IT Manager / Architect at EY
Co-Founder, Fifth Ingenum Srls.
Degree in Computer Engineering, II Università degli Studi di Napoli
email: [email protected]
Page 4
Agenda
►What is Machine Learning ?
►Predictive Analytics
►Machine Learning Overview
►Defining Predictive Analytics
►Supervised Learning
►Unsupervised Learning
►Watson Service
► Cortana Analytics Suite
►Demo
Page 5
What is Machine Learning ?
Page 6
Machine Learning / Predictive Analytics
Vision Analytics
Recommendation engines
Advertising analysis
Weather forecasting for business planning
Social network analysis
Legal discovery and document archiving
Pricing analysis
Fraud detection
Churn analysis
Equipment monitoring
Location-based tracking and services
Personalized Insurance
Machine learning & predictive analytics are core capabilities that are needed throughout your business
Page 7
Machine Learning Overview
► Formal definition: “The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience” - Tom M. Mitchell
► Another definition: “The goal of machine learning is to program computers to use example data or past experience to solve a given problem.” – Introduction to Machine Learning, 2nd Edition, MIT Press
► ML often involves two primary techniques:
► Supervised Learning: Finding the mapping between inputs and outputs using correct values to “train” a model
► Unsupervised Learning: Finding patterns in the input data (similar to Density Estimates in Statistics)
Page 8
Machine Learning
Data: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Rules, or Algorithms:
– Spelling and sounding build words: about, Learning, language
– Words build sentences: Learning about language.
Learning, or Abstraction: Any new understanding proceeds from previous knowledge.
Data + Rules/Algorithms = Machine Learning
Page 9
Traditional programming VS Machine Learning
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program/Algorithms
Program can predict the output!
Page 10
ML: No, more like gardening
Gardener = You
Seeds = Algorithms
Nutrients = Data
Plants = Programs
Page 11
ML Sample Application
► Web search
► Computational biology
► Finance
► E-commerce
► Space exploration
► Robotics
► Information extraction
► Social networks
► Debugging
► [Your favorite area]
Page 12
What is Predictive Analytics?
Wikipedia Definition: (http://en.wikipedia.org/wiki/Predictive_analytics)
“Predictive analytics encompasses a variety of techniques from statistics, modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future, or otherwise unknown, events.”
Facts → Predictive Analytics Techniques → Predictions
Page 13
Breaking it Down
“Predictive analytics encompasses a variety of techniques from statistics, modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future, or otherwise unknown, events.”
Machine Learning: Use of computer algorithms to derive complex formulations based on objectives and constraints
  Tools and Techniques: Data visualization, segmentation, correlations
  Use in Predictive Analytics: Predictive analytics is often applied in the context of datasets that are too large for manual analysis, so data mining techniques are required
Statistics: Focus on learning population characteristics based on samples of data
  Tools and Techniques: p-values, confidence intervals, sampling, ANOVA
  Use in Predictive Analytics: Underlying theory behind many parametric models – observed facts are a sample from a population including both known/historic and unknown/future events
Modeling: Representations of systems used to understand the underlying dynamics of the system
  Tools and Techniques: Symbolic logic, proxies
  Use in Predictive Analytics: Complex relationships can be simplified through modeling – these models can then be used to analyze relationships between factors
Page 14
What is a Model?
A model is a simplified representation of observed effects
Key terms:
► Dependent or target variable – the variable of interest
► Independent or predictor variable(s) – variable(s) used for explanation/prediction
► Effect – the (quantitative) impact of an independent variable or combination of independent variables on the dependent variable
  ► Main Effect – the direct effect of a single independent variable on the dependent variable
  ► Interaction Effect – the effect of a combination of multiple independent variables on the dependent variable
Page 15
Two types of model
A model is a simplified representation of observed effects
Statistical: Parametric Models
► Effects are well-quantified and can be examined
► An equation can be used to represent the model
► Emphasis on explanation: “What causes the dependent variable to change?”
► Test hypotheses: p-values, confidence intervals
Machine Learning: Non-parametric models
► Effects may be unquantified (“black box”)
► No representative equation
► Model may be stochastic, so results may vary
► Emphasis on prediction: “What will the value of the next observation be?”
► Generate hypotheses
Page 16
Types of Learning
► Supervised (inductive) learning
  ► Training data includes desired outputs
  ► Dependent variable is known
  ► May be statistical or non-statistical
► Unsupervised learning
  ► Training data does not include desired outputs
  ► No dependent variable
  ► Non-statistical
► Semi-supervised learning
  ► Training data includes a few desired outputs
Page 17
Machine Learning Problem
              Supervised Learning                 Unsupervised Learning
Discrete      Classification or Categorization    Clustering
Continuous    Regression                          Dimensionality reduction
Page 18
What is Logistic Regression?
Regression Models are a form of supervised learning that attempt to fit “linear” functions to training data. The most common type of regression, linear regression, should be familiar to most of you as a “best fit line”
Logistic Regression is closely related to linear regression, but fits an S-shaped (logistic) curve by applying a logit link function to a binomial dependent variable
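A minimal sketch of the idea in plain Python, assuming a single feature, gradient descent on the log-loss, and made-up toy data (hours studied vs. pass/fail). The function names and the data are illustrative assumptions, not material from the talk.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # prediction minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Toy data (illustrative only): hours studied -> passed (1) or failed (0)
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 1.0 + b)
p_high = sigmoid(w * 4.0 + b)
print(f"P(pass | 1h) = {p_low:.2f}, P(pass | 4h) = {p_high:.2f}")
```

Unlike a best-fit line, the output is always a probability between 0 and 1, which is why the model suits yes/no targets.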
Page 19
Machine Learning Example
Given examples of a function (X, F(X)), predict function F(X) for new examples X
Discrete F(X): Classification
Continuous F(X): Regression
F(X) = Probability(X): Probability estimation
The probability of an event X, denoted P(X), represents the proportion of all events that have X as their outcome, and is typically represented as a decimal 0 ≤ P(X) ≤ 1
Page 20
Machine Learning Example
Apply a prediction function to a feature representation of the image to get the desired output:
• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never before seen test example x and output the predicted value y = f(x)
y = f(x), where y is the output, f is the prediction function and x is the image feature
f([image]) = «apple»
f([image]) = «tomato»
f([image]) = «dog»
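The training and testing steps can be sketched with a deliberately tiny classifier. The sketch assumes each image has already been reduced to one hypothetical numeric feature (called a "redness" score here, purely for illustration) and that f is a simple threshold rule. Training picks the threshold with the fewest errors on the labeled examples.

```python
def predict(threshold, x):
    """Prediction function f: label an example from its feature value."""
    return "tomato" if x >= threshold else "apple"

def train_threshold(examples):
    """Training: choose the threshold that minimizes errors on the training set."""
    best_t, best_errors = None, len(examples) + 1
    for t, _ in examples:  # candidate thresholds: the observed feature values
        errors = sum(1 for x, y in examples if predict(t, x) != y)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

# Hypothetical labeled examples (feature value, label)
training_set = [(0.20, "apple"), (0.35, "apple"), (0.40, "apple"),
                (0.70, "tomato"), (0.80, "tomato"), (0.90, "tomato")]

t = train_threshold(training_set)

# Testing: apply f to examples that were not in the training set
print(predict(t, 0.85), predict(t, 0.30))
```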
Page 21
Supervised Learning
Used when you want to predict unknown answers from answers you already have
Data is divided into two parts: the data you will use to “teach” the system (training set), and the data to test the algorithm (test set)
After you select and clean the data, you select data points that show the right relationships in the data. The answers are “labels”, the categories/columns/attributes are “features” and the values are…values.
Then you select an algorithm to compute the outcome. (Often you choose more than one)
You run the program on the training set, and check to see if you got the right answer on the test set.
Once you perform the experiment, you select the best model. This is the final output – the model is then used against more data to get the answers you need
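The workflow above could look roughly like this in plain Python. The data, the two candidate algorithms (a majority-class baseline and a 1-nearest-neighbor rule) and all function names are assumptions chosen for illustration.

```python
import random

def split(data, test_fraction=0.25, seed=0):
    """Shuffle and split labeled rows into a training set and a test set."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]

def train_majority(train):
    """Baseline: always predict the most common label in the training set."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def train_nearest_neighbor(train):
    """1-nearest-neighbor: predict the label of the closest training point."""
    return lambda x: min(train, key=lambda row: abs(row[0] - x))[1]

def accuracy(model, rows):
    """Share of rows for which the model predicts the known label."""
    return sum(1 for x, y in rows if model(x) == y) / len(rows)

# Hypothetical data: (feature, label)
data = [(v / 10, "low" if v < 50 else "high") for v in range(0, 100, 5)]

train, test = split(data)
candidates = {"majority": train_majority(train),
              "nearest_neighbor": train_nearest_neighbor(train)}
scores = {name: accuracy(model, test) for name, model in candidates.items()}
best = max(scores, key=scores.get)  # the model you would keep
print(scores, "->", best)
```

The test set is never shown to the models during training, so the scores reflect how each candidate handles data it has not seen.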
Page 22
Supervised Learning
Car
Not Car
Page 23
Unsupervised Learning
Used when you want to find unknown answers – mostly groupings – directly from data
No simple way to evaluate accuracy of what you learn
Evaluates more vectors, groups into sets or classifications
Start with the data
Apply algorithm
Evaluate groups
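As one concrete instance of these three steps, here is a small k-means sketch. The choice of k-means, the naive initialization and the sample points are assumptions for illustration; the talk does not name a specific algorithm.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, iterations=20):
    # Naive initialization: k points spread evenly through the input list
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assign every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty)
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

def within_cluster_ss(centroids, clusters):
    """One way to evaluate groups: total squared distance to the centroids."""
    return sum(dist2(p, c) for c, pts in zip(centroids, clusters) for p in pts)

# Start with the data (unlabeled, made up for this example)
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, clusters = kmeans(points, k=2)       # apply algorithm
score = within_cluster_ss(centroids, clusters)  # evaluate groups
print(centroids, score)
```

A lower within-cluster sum of squares means tighter groups, but there are no labels to say whether the groups are "right", which is the evaluation difficulty the slide mentions.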
Page 24
Unsupervised Learning
[Figure: clustering of points labeled Example 1–3 and example A–C]
Some clustering strategies tend to group points transitively, even when the points are not close to each other in feature space
Page 25
Cross-Validation and Model Evaluation
Cross-validation is a method of ensuring that models generalize to data they have not been trained to fit
Given any collection of data points, a model can be developed that fits the data exactly; however, such a model typically has little or no predictive power on new data
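A minimal k-fold cross-validation sketch, assuming a one-variable least-squares line as the model and synthetic data. Every point is held out exactly once, and the error is measured only on points the model was not fitted to.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx); each index is held out exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, folds[i]

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def cross_val_mse(xs, ys, k=5):
    """Average mean squared error over the k held-out folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        mse = sum((ys[i] - (a * xs[i] + b)) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)

# Synthetic data: y = 2x + 1 plus a small alternating disturbance
xs = [float(x) for x in range(10)]
ys = [2 * x + 1 + (0.5 if x % 2 else -0.5) for x in xs]

cv_mse = cross_val_mse(xs, ys, k=5)
print(f"cross-validated MSE: {cv_mse:.3f}")
```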
Page 26
Evaluating Predictive Models
Model evaluation involves a combination of objective criteria and subjective judgment
Objective Measures
Gain or Lift
Sensitivity
Accuracy
Others
Subjective Considerations
Business intuition
Explainability
Simplicity
Usefulness
Page 27
Gain or Lift
Lift is a measure of the effectiveness of a predictive model calculated as the ratio between the results obtained with and without the predictive model.
Cumulative gains and lift charts are visual aids for measuring model performance
Both charts consist of a lift curve and a baseline
The greater the area between the lift curve and the baseline, the better
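The ratio can be computed directly. This sketch assumes a binary outcome (1 = responded) and a model score per case. The scores and outcomes are invented for the example.

```python
def lift_at(scores, outcomes, fraction):
    """Lift in the top `fraction` of cases ranked by model score:
    response rate among the targeted cases / overall response rate."""
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    base_rate = sum(outcomes) / len(outcomes)
    return top_rate / base_rate

# Invented example: 10 customers, 3 of whom responded
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_top20 = lift_at(scores, outcomes, 0.2)
lift_all = lift_at(scores, outcomes, 1.0)
print(f"lift in top 20%: {lift_top20:.2f}, lift over everyone: {lift_all:.2f}")
```

Targeting everyone gives a lift of exactly 1 (the baseline); a useful model gives a lift well above 1 in its top-ranked segment.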
Page 28
Sensitivity
A Receiver Operating Characteristic (ROC) curve is a plot of test sensitivity as a function of (1 - specificity) for several possible (arbitrary) cut-off values. The curve illustrates the trade-off between type I and type II errors in a given test.
The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test, and the area under the curve is a measure of accuracy.
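A short sketch of how the curve and its area can be computed, assuming binary labels and no tied scores. The data is made up.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) after each score cut-off, highest first.
    Assumes labels are 0/1 and that no two scores are equal."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, y in sorted(zip(scores, labels), key=lambda p: p[0], reverse=True):
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up scores and true labels
scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
print(f"AUC = {area:.3f}")
```

An area of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of positives above negatives.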
Page 29
IBM Watson
Page 30
Cognitive Services
Page 31
Cortana Suite
Page 32
Cortana Analytics Suite
DATA: Business apps, Custom apps, Sensors and devices
INTELLIGENCE: Cortana Analytics
ACTION: People, Automated Systems
Page 33
Data Flow and Architecture
Stages: Event & data producers → Ingest → Transform → DW / Long-term storage → Predictive analytics → Present & decide
Event & data producers: Web logs; IoT, Mobile Devices etc.; Social Data
Azure services: Event Hubs, Stream Analytics, HDInsight, Azure Data Factory, Azure SQL DB, Azure SQL DW, Azure Data Lake, Azure Machine Learning (Fraud detection etc.)
Present & decide: Power BI, Web dashboards, Mobile devices
Page 34
Process real-time data in Azure using a simple SQL language
Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure, and applications
Performs time-sensitive analysis using SQL-like language against multiple real-time streams and reference data
Outputs to persistent stores, dashboards or back to devices
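Azure Stream Analytics has its own SQL-like query language, which is not reproduced here. The plain-Python sketch below only illustrates one common kind of time-sensitive analysis, counting events per device in fixed, non-overlapping ("tumbling") time windows. The event layout and the device names are hypothetical.

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, device) in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_in_seconds, device_id) pairs.
    """
    counts = Counter()
    for timestamp, device_id in events:
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, device_id)] += 1
    return counts

# Hypothetical event stream: (seconds since start, device id)
events = [(0, "kiosk-1"), (3, "kiosk-1"), (4, "atm-7"),
          (11, "kiosk-1"), (12, "atm-7"), (19, "atm-7")]

counts = tumbling_window_counts(events, window_seconds=10)
for (window_start, device_id), n in sorted(counts.items()):
    print(f"window starting at {window_start}s, {device_id}: {n} events")
```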
Point of Service Devices
Self Checkout Stations
Kiosks
Smart Phones
Slates/Tablets
PCs/Laptops
Servers
Digital Signs
Diagnostic Equipment
Remote Medical Monitors
Logic Controllers
Specialized Devices
Thin Clients
Handhelds
Security
POS Terminals
Automation Devices
Vending Machines
Kinect
ATM
Stream Analytics
Azure Stream Analytics
Page 35
Fully managed service to support orchestration of data movement and processing
Connect to relational or non-relational data that is on-premises or in the cloud
Single pane of glass to monitor and manage data processing pipelines.
Publish to Power BI
Compose and orchestrate data services at scale
No SQL
DB
Blob
C#
MapReduce
Trusted data
BI & analytics
Hive
Pig
Stored Procedures
VM
Machine Learning
Azure Data Factory
Page 36
ML Algorithms are best of breed and embrace OSS
• MS + R + Python + BYOA
ML Studio for productive development
• Faster experiments result in faster improvements
• Visual Workflows & ML Experiments
ML Operationalization to remove deployment friction
• Build entire ML Apps & Deploy as Cloud APIs
ML Gallery
• Provide ML applications like apps in an ‘app store’
• Publish/consume APIs in a 2-sided market
Help organizations eliminate undifferentiated heavy lifting
Powerful predictive analytics in Azure
Azure Machine Learning
Page 37
Power BI investments
New data visualizations and touch-optimized exploration in HTML5
Power BI mobile apps across devices including iPad and iPhone
Support for new data sources including SalesForce.com, Dynamics CRM online and SQL Server Analysis Services
Dashboard
Tree Map
Power BI dashboards and KPIs for monitoring the health of your business
Page 38
Demo Cognitive
Page 39
Demo Cortana Suite
Page 40
Vehicle Telemetry Architecture
Event Hubs for ingesting millions of vehicle telemetry events into Azure.
Stream Analytics for gaining real-time insights on vehicle health and persisting that data into long-term storage for richer batch analytics.
Machine Learning for anomaly detection in real-time and batch processing to gain predictive insights.
HDInsight is leveraged to transform data at scale
Data Factory handles orchestration, scheduling, resource management and monitoring of the batch processing pipeline.
Power BI gives this solution a rich dashboard for real-time data and predictive analytics visualizations.
Page 41
Microsoft Azure Machine Learning
Data: It’s all about the data. Here’s where you will acquire, compile, and analyze testing and training data sets for use in creating Azure Machine Learning predictive models.
Create the model: Use various machine learning algorithms to create new models that are capable of making predictions based on inferences about the data sets.
Evaluate the model: Examine the accuracy of new predictive models based on their ability to predict the correct outcome when both the input and output values are known in advance. Accuracy is reported as a confidence factor; the closer it is to 1, the better.
Refine and evaluate the model: Compare, contrast, and combine alternate predictive models to find the right combination(s) that can consistently produce the most accurate results.
Deploy the model: Expose the new predictive model as a scalable cloud web service, one that is easily accessible over the Internet by any web browser or mobile client.
Test and use the model: Implement the new predictive model web service in a test or production application scenario.
Page 42
Azure Machine Learning algorithms
Classification algorithms: These are used to classify data into different categories that can then be used to predict one or more discrete variables, based on the other attributes in the dataset.
Regression algorithms: These are used to predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset.
Clustering algorithms: These determine natural groupings and patterns in datasets and are used to predict grouping classifications for a given variable.
Page 43
Thanks
► Questions?