01 machine learning introduction

181
Machine Learning for Data Mining Introduction Andres Mendez-Vazquez May 13, 2015 1 / 56

Upload: andres-mendez-vazquez

Post on 16-Apr-2017

145 views

Category:

Engineering


5 download

TRANSCRIPT

Page 1: 01 Machine Learning Introduction

Machine Learning for Data MiningIntroduction

Andres Mendez-Vazquez

May 13, 2015

1 / 56

Page 2: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

2 / 56

Page 3: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

3 / 56

Page 4: 01 Machine Learning Introduction

Intuitive Definition: Volume

When looking at the Volumes of Information, we have:Volumes of it: Terabyte(1012), Petabyte(1015) and UP!!!

Examples of these Volumes are1 Records2 Transactions3 Web Searches4 etc

4 / 56

Page 5: 01 Machine Learning Introduction

Intuitive Definition: Volume

When looking at the Volumes of Information, we have:Volumes of it: Terabyte(1012), Petabyte(1015) and UP!!!

Examples of these Volumes are1 Records2 Transactions3 Web Searches4 etc

4 / 56

Page 6: 01 Machine Learning Introduction

Intuitive Definition: Volume

When looking at the Volumes of Information, we have:Volumes of it: Terabyte(1012), Petabyte(1015) and UP!!!

Examples of these Volumes are1 Records2 Transactions3 Web Searches4 etc

4 / 56

Page 7: 01 Machine Learning Introduction

Intuitive Definition: Volume

When looking at the Volumes of Information, we have:Volumes of it: Terabyte(1012), Petabyte(1015) and UP!!!

Examples of these Volumes are1 Records2 Transactions3 Web Searches4 etc

4 / 56

Page 8: 01 Machine Learning Introduction

Intuitive Definition: Volume

When looking at the Volumes of Information, we have:Volumes of it: Terabyte(1012), Petabyte(1015) and UP!!!

Examples of these Volumes are1 Records2 Transactions3 Web Searches4 etc

4 / 56

Page 9: 01 Machine Learning Introduction

However

Something NotableWhat constitutes truly “high” volume varies by industry and evengeography!!!

Simply look at the DNA data for a cellular cycle.

Example

5 / 56

Page 10: 01 Machine Learning Introduction

HoweverSomething NotableWhat constitutes truly “high” volume varies by industry and evengeography!!!

Simply look at the DNA data for a cellular cycle.

Example

5 / 56

Page 11: 01 Machine Learning Introduction

Intuitive Definition: Variety

When looking at the Structure of the Information, we have:Variety like there is not tomorrow:

I It is structured, semi-structured, unstructured

SoDo you have some examples of structures in Information?

6 / 56

Page 12: 01 Machine Learning Introduction

Intuitive Definition: Variety

When looking at the Structure of the Information, we have:Variety like there is not tomorrow:

I It is structured, semi-structured, unstructured

SoDo you have some examples of structures in Information?

6 / 56

Page 13: 01 Machine Learning Introduction

Intuitive Definition: Variety

When looking at the Structure of the Information, we have:Variety like there is not tomorrow:

I It is structured, semi-structured, unstructured

SoDo you have some examples of structures in Information?

6 / 56

Page 14: 01 Machine Learning Introduction

Intuitive Definition: Volume

When Looking at the Velocity of this Information?Data in Motion!!!Velocity:

I Dynamic GenerationI Real Time Generation

Problems with that: LatencyLag time between capture or generation and when it is available!!!

7 / 56

Page 15: 01 Machine Learning Introduction

Intuitive Definition: Volume

When Looking at the Velocity of this Information?Data in Motion!!!Velocity:

I Dynamic GenerationI Real Time Generation

Problems with that: LatencyLag time between capture or generation and when it is available!!!

7 / 56

Page 16: 01 Machine Learning Introduction

Intuitive Definition: Volume

When Looking at the Velocity of this Information?Data in Motion!!!Velocity:

I Dynamic GenerationI Real Time Generation

Problems with that: LatencyLag time between capture or generation and when it is available!!!

7 / 56

Page 17: 01 Machine Learning Introduction

Intuitive Definition: Volume

When Looking at the Velocity of this Information?Data in Motion!!!Velocity:

I Dynamic GenerationI Real Time Generation

Problems with that: LatencyLag time between capture or generation and when it is available!!!

7 / 56

Page 18: 01 Machine Learning Introduction

Intuitive Definition: Volume

When Looking at the Velocity of this Information?Data in Motion!!!Velocity:

I Dynamic GenerationI Real Time Generation

Problems with that: LatencyLag time between capture or generation and when it is available!!!

7 / 56

Page 19: 01 Machine Learning Introduction

For example

Imagine that I have a stream of m = 1025 integers with Ranges from[a1, ..., an] with n = 10, 000, 000Now, somebody ask you to find the most frequent item!!!

A naive algorithm1 Take hash table with a counter.2 Then, put numbers in the hash table.

ProblemsWhich problems we have?

8 / 56

Page 20: 01 Machine Learning Introduction

For example

Imagine that I have a stream of m = 1025 integers with Ranges from[a1, ..., an] with n = 10, 000, 000Now, somebody ask you to find the most frequent item!!!

A naive algorithm1 Take hash table with a counter.2 Then, put numbers in the hash table.

ProblemsWhich problems we have?

8 / 56

Page 21: 01 Machine Learning Introduction

For example

Imagine that I have a stream of m = 1025 integers with Ranges from[a1, ..., an] with n = 10, 000, 000Now, somebody ask you to find the most frequent item!!!

A naive algorithm1 Take hash table with a counter.2 Then, put numbers in the hash table.

ProblemsWhich problems we have?

8 / 56

Page 22: 01 Machine Learning Introduction

However

There is theCount-Min Sketch Algorithm

Invented byCharikar, Chen and Farch-Colton in 2004

With PropertiesSpace Used Error Probability Error

O(

1ε log

(1δ

)· (logm + log n)

)δ ε

9 / 56

Page 23: 01 Machine Learning Introduction

However

There is theCount-Min Sketch Algorithm

Invented byCharikar, Chen and Farch-Colton in 2004

With PropertiesSpace Used Error Probability Error

O(

1ε log

(1δ

)· (logm + log n)

)δ ε

9 / 56

Page 24: 01 Machine Learning Introduction

However

There is theCount-Min Sketch Algorithm

Invented byCharikar, Chen and Farch-Colton in 2004

With PropertiesSpace Used Error Probability Error

O(

1ε log

(1δ

)· (logm + log n)

)δ ε

9 / 56

Page 25: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

10 / 56

Page 26: 01 Machine Learning Introduction

Complexity

Given all these thingsIt is necessary to correlate and share data across entities.It is necessary to link, match and transform data across businessentities and systems.

With this...Complexity goes through the roof!!!

11 / 56

Page 27: 01 Machine Learning Introduction

Complexity

Given all these thingsIt is necessary to correlate and share data across entities.It is necessary to link, match and transform data across businessentities and systems.

With this...Complexity goes through the roof!!!

11 / 56

Page 28: 01 Machine Learning Introduction

Complexity

Given all these thingsIt is necessary to correlate and share data across entities.It is necessary to link, match and transform data across businessentities and systems.

With this...Complexity goes through the roof!!!

11 / 56

Page 29: 01 Machine Learning Introduction

And it is through the roof!!! Linking open-data communityproject

12 / 56

Page 30: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 31: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 32: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 33: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 34: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 35: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 36: 01 Machine Learning Introduction

Cautionary Tale

Something NotableIn 1880 the USA made a Census of the Population in different aspects:

I PopulationI MortalityI AgricultureI Manufacturing

HoweverOnce data was collected it took 7 years to say something!!!

13 / 56

Page 37: 01 Machine Learning Introduction

Ahhh...

Thus, Hollering came with the following machine (Circa 1890)!!!

14 / 56

Page 38: 01 Machine Learning Introduction

Hollering Tabulating Machine

It was basically a sorter and counterUsing punching cards as memories.And Mercury Sensors.

Example

15 / 56

Page 39: 01 Machine Learning Introduction

Hollering Tabulating Machine

It was basically a sorter and counterUsing punching cards as memories.And Mercury Sensors.

Example

15 / 56

Page 40: 01 Machine Learning Introduction

It was FAST!!!

It took only!!!

2 years!!!Nevertheless in 1837Babbage’s Difference engine was

The First General Computer!!!Turing-complete!!!Way more complex than the tabulator!!! 53 years earlier!!!

16 / 56

Page 41: 01 Machine Learning Introduction

It was FAST!!!

It took only!!!

2 years!!!Nevertheless in 1837Babbage’s Difference engine was

The First General Computer!!!Turing-complete!!!Way more complex than the tabulator!!! 53 years earlier!!!

16 / 56

Page 42: 01 Machine Learning Introduction

It was FAST!!!

It took only!!!

2 years!!!Nevertheless in 1837Babbage’s Difference engine was

The First General Computer!!!Turing-complete!!!Way more complex than the tabulator!!! 53 years earlier!!!

16 / 56

Page 43: 01 Machine Learning Introduction

It was FAST!!!

It took only!!!

2 years!!!Nevertheless in 1837Babbage’s Difference engine was

The First General Computer!!!Turing-complete!!!Way more complex than the tabulator!!! 53 years earlier!!!

16 / 56

Page 44: 01 Machine Learning Introduction

It was FAST!!!

It took only!!!

2 years!!!Nevertheless in 1837Babbage’s Difference engine was

The First General Computer!!!Turing-complete!!!Way more complex than the tabulator!!! 53 years earlier!!!

16 / 56

Page 45: 01 Machine Learning Introduction

Funny!!!

Funny!!!

17 / 56

Page 46: 01 Machine Learning Introduction

The Problem

Actually, it never reached completion because

Babbage was actually a yucky project manager!!!

18 / 56

Page 47: 01 Machine Learning Introduction

The Problem

Actually, it never reached completion because

Babbage was actually a yucky project manager!!!

18 / 56

Page 48: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

19 / 56

Page 49: 01 Machine Learning Introduction

Data is Everywhere!

Lots of data is being collected and warehousedWeb data, e-commercePurchases at department/ grocery storesBank/Credit Card transactionsSocial Network

Many Places

20 / 56

Page 50: 01 Machine Learning Introduction

Data is Everywhere!

Lots of data is being collected and warehousedWeb data, e-commercePurchases at department/ grocery storesBank/Credit Card transactionsSocial Network

Many Places

20 / 56

Page 51: 01 Machine Learning Introduction

Data is Everywhere!

Lots of data is being collected and warehousedWeb data, e-commercePurchases at department/ grocery storesBank/Credit Card transactionsSocial Network

Many Places

20 / 56

Page 52: 01 Machine Learning Introduction

Data is Everywhere!

Lots of data is being collected and warehousedWeb data, e-commercePurchases at department/ grocery storesBank/Credit Card transactionsSocial Network

Many Places

20 / 56

Page 53: 01 Machine Learning Introduction

Data is Everywhere!

Lots of data is being collected and warehousedWeb data, e-commercePurchases at department/ grocery storesBank/Credit Card transactionsSocial Network

Many Places

20 / 56

Page 54: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 55: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 56: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 57: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 58: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 59: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 60: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 61: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 62: 01 Machine Learning Introduction

The Staggering Numbers

A Ocean of DataHow many data in the world?

I 800 Terabytes, 2000I 160 Exabytes, 2006I 500 Exabytes (Internet), 2009I 2.7 Zettabytes, 2012I 35 Zettabytes by 2020

GenerationHow many data generated ONEday?

I 7 TB, TwitterI 10 TB, Facebook

Source: “Big data: The next frontier for innovation, competition, and pro-ductivity”McKinsey Global Institute 2011 21 / 56

Page 63: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 64: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 65: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 66: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 67: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 68: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 69: 01 Machine Learning Introduction

Type of Data

ThusRelational Data (Tables/Transaction/Legacy Data)Text Data (Web)Semi-structured Data (XML)

And more...Graph Data

I Social Network, Semantic Web (RDF), . . .

Streaming DataI You can only scan the data once

22 / 56

Page 70: 01 Machine Learning Introduction

The Ever Growing Landscape

23 / 56

Page 71: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 72: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 73: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 74: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 75: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 76: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 77: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 78: 01 Machine Learning Introduction

Machine LearningDefinitionAlgorithms or techniques that enable computer (machine) to “learn” fromdata. Related with many areas such as data mining, statistics, informationtheory, etc.

Algorithm Types:Unsupervised LearningSupervised learningReinforcement learning

ExamplesArtificial Neural Network (ANN)Support Vector Machine (SVM)Expectation-Maximization (EM)Deterministic Annealing (DA)

24 / 56

Page 79: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

25 / 56

Page 80: 01 Machine Learning Introduction

Machine Learning Process

Process1 Feature Extraction/Feature Generation2 Clustering ≈ Class Identification ≈ Unsupervised Learning3 Classification ≈ Supervised Learning

Then...We start thinking: We need to process a lot of data...

Or...LARGE SCALE MACHINE LEARNING

26 / 56

Page 81: 01 Machine Learning Introduction

Machine Learning Process

Process1 Feature Extraction/Feature Generation2 Clustering ≈ Class Identification ≈ Unsupervised Learning3 Classification ≈ Supervised Learning

Then...We start thinking: We need to process a lot of data...

Or...LARGE SCALE MACHINE LEARNING

26 / 56

Page 82: 01 Machine Learning Introduction

Machine Learning Process

Process1 Feature Extraction/Feature Generation2 Clustering ≈ Class Identification ≈ Unsupervised Learning3 Classification ≈ Supervised Learning

Then...We start thinking: We need to process a lot of data...

Or...LARGE SCALE MACHINE LEARNING

26 / 56

Page 83: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

27 / 56

Page 84: 01 Machine Learning Introduction

Feature Generation/Dimensionality Reduction

Feature GenerationGiven a set of measurements, the goal is to discover compact andinformative representations of the obtained data.

Examples1 The Karhunen–Loève transform ≈ Principal Component Analysis

1 Popular for feature generation and Dimensionality Reduction2 The Singular Value Decomposition

1 Used for Dimensionality Reduction

28 / 56

Page 85: 01 Machine Learning Introduction

Feature Generation/Dimensionality Reduction

Feature GenerationGiven a set of measurements, the goal is to discover compact andinformative representations of the obtained data.

Examples1 The Karhunen–Loève transform ≈ Principal Component Analysis

1 Popular for feature generation and Dimensionality Reduction2 The Singular Value Decomposition

1 Used for Dimensionality Reduction

28 / 56

Page 86: 01 Machine Learning Introduction

Feature Generation/Dimensionality Reduction

Feature GenerationGiven a set of measurements, the goal is to discover compact andinformative representations of the obtained data.

Examples1 The Karhunen–Loève transform ≈ Principal Component Analysis

1 Popular for feature generation and Dimensionality Reduction2 The Singular Value Decomposition

1 Used for Dimensionality Reduction

28 / 56

Page 87: 01 Machine Learning Introduction

Feature Generation/Dimensionality Reduction

Feature GenerationGiven a set of measurements, the goal is to discover compact andinformative representations of the obtained data.

Examples1 The Karhunen–Loève transform ≈ Principal Component Analysis

1 Popular for feature generation and Dimensionality Reduction2 The Singular Value Decomposition

1 Used for Dimensionality Reduction

28 / 56

Page 88: 01 Machine Learning Introduction

Dimension Reduction/Feature Extraction

DefinitionProcess to transform high-dimensional data into low-dimensional ones forimproving accuracy, understanding, or removing noises.

Why?Curse of dimensionality: Complexity grows exponentially in volume byadding extra dimensions.

29 / 56

Page 89: 01 Machine Learning Introduction

Dimension Reduction/Feature Extraction

DefinitionProcess to transform high-dimensional data into low-dimensional ones forimproving accuracy, understanding, or removing noises.

Why?Curse of dimensionality: Complexity grows exponentially in volume byadding extra dimensions.

29 / 56

Page 90: 01 Machine Learning Introduction

Feature SelectionFeature SelectionWhich features should be used for the classifier?

Why? The Curse of Dimensionality!!!

Hypothesis Testing to discriminate good features

30 / 56

Page 91: 01 Machine Learning Introduction

Feature SelectionFeature SelectionWhich features should be used for the classifier?

Why? The Curse of Dimensionality!!!

Hypothesis Testing to discriminate good features30 / 56

Page 92: 01 Machine Learning Introduction

What can be done?

Measures for Class SeparabilityExample: Between-class scatter matrix:

Sb =M∑

i=1Pi (µi − µ0) (µi − µ0)T (1)

Where:µ0 is the global mean vector, µ0 =

∑Mi=1 Piµi .

µi the median of class ωi .Pi ∼= ni

N .

31 / 56

Page 93: 01 Machine Learning Introduction

What can be done?

Measures for Class SeparabilityExample: Between-class scatter matrix:

Sb =M∑

i=1Pi (µi − µ0) (µi − µ0)T (1)

Where:µ0 is the global mean vector, µ0 =

∑Mi=1 Piµi .

µi the median of class ωi .Pi ∼= ni

N .

31 / 56

Page 94: 01 Machine Learning Introduction

What can be done?

Measures for Class SeparabilityExample: Between-class scatter matrix:

Sb =M∑

i=1Pi (µi − µ0) (µi − µ0)T (1)

Where:µ0 is the global mean vector, µ0 =

∑Mi=1 Piµi .

µi the median of class ωi .Pi ∼= ni

N .

31 / 56

Page 95: 01 Machine Learning Introduction

What can be done?

Measures for Class SeparabilityExample: Between-class scatter matrix:

Sb =M∑

i=1Pi (µi − µ0) (µi − µ0)T (1)

Where:µ0 is the global mean vector, µ0 =

∑Mi=1 Piµi .

µi the median of class ωi .Pi ∼= ni

N .

31 / 56

Page 96: 01 Machine Learning Introduction

What can be done?

Feature Subset SelectionExamples:

I Filter ApproachF All combinations of features are used together with a separability

measure.I Wrapper Approach:

F Use the decided classifier itself to find the best set.

32 / 56

Page 97: 01 Machine Learning Introduction

What can be done?

Feature Subset SelectionExamples:

I Filter ApproachF All combinations of features are used together with a separability

measure.I Wrapper Approach:

F Use the decided classifier itself to find the best set.

32 / 56

Page 98: 01 Machine Learning Introduction

What can be done?

Feature Subset SelectionExamples:

I Filter ApproachF All combinations of features are used together with a separability

measure.I Wrapper Approach:

F Use the decided classifier itself to find the best set.

32 / 56

Page 99: 01 Machine Learning Introduction

What can be done?

Feature Subset SelectionExamples:

I Filter ApproachF All combinations of features are used together with a separability

measure.I Wrapper Approach:

F Use the decided classifier itself to find the best set.

32 / 56

Page 100: 01 Machine Learning Introduction

What can be done?

Feature Subset SelectionExamples:

I Filter ApproachF All combinations of features are used together with a separability

measure.I Wrapper Approach:

F Use the decided classifier itself to find the best set.

32 / 56

Page 101: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

33 / 56

Page 102: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 103: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 104: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 105: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 106: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 107: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 108: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 109: 01 Machine Learning Introduction

Classification

DefinitionA procedure dividing data into the given set of categories based on thetraining set in a supervised way.

What we want from classification?Generalization Vs. Specification

I Hard to achieve both

Avoid - overfitting/overtrainingI Early stoppingI Holdout validationI K-fold cross validationI Leave-one-out cross-validation

34 / 56

Page 110: 01 Machine Learning Introduction

Avoid - overfitting/overtraining

Validation and Training ErrorUnderfitting Overfitting

Validation Error

Training Error

35 / 56

Page 111: 01 Machine Learning Introduction

Examples of Classification Algorithms

Many Possible AlgorithmsLinear Classifiers: PerceptronProbability Classifiers: Naive BayesKernel Methods Classifiers : Support Vector MachinesNon-Linear Classifiers: Artificial Neural NetworksGraph Model Classifiers:. . .

36 / 56

Page 112: 01 Machine Learning Introduction

Examples of Classification Algorithms

Many Possible AlgorithmsLinear Classifiers: PerceptronProbability Classifiers: Naive BayesKernel Methods Classifiers : Support Vector MachinesNon-Linear Classifiers: Artificial Neural NetworksGraph Model Classifiers:. . .

36 / 56

Page 113: 01 Machine Learning Introduction

Examples of Classification Algorithms

Many Possible AlgorithmsLinear Classifiers: PerceptronProbability Classifiers: Naive BayesKernel Methods Classifiers : Support Vector MachinesNon-Linear Classifiers: Artificial Neural NetworksGraph Model Classifiers:. . .

36 / 56

Page 114: 01 Machine Learning Introduction

Examples of Classification Algorithms

Many Possible AlgorithmsLinear Classifiers: PerceptronProbability Classifiers: Naive BayesKernel Methods Classifiers : Support Vector MachinesNon-Linear Classifiers: Artificial Neural NetworksGraph Model Classifiers:. . .

36 / 56

Page 115: 01 Machine Learning Introduction

Examples of Classification Algorithms

Many Possible AlgorithmsLinear Classifiers: PerceptronProbability Classifiers: Naive BayesKernel Methods Classifiers : Support Vector MachinesNon-Linear Classifiers: Artificial Neural NetworksGraph Model Classifiers:. . .

36 / 56

Page 116: 01 Machine Learning Introduction

Examples of Classification Algorithms

Many Possible AlgorithmsLinear Classifiers: PerceptronProbability Classifiers: Naive BayesKernel Methods Classifiers : Support Vector MachinesNon-Linear Classifiers: Artificial Neural NetworksGraph Model Classifiers:. . .

36 / 56

Page 117: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

37 / 56

Page 118: 01 Machine Learning Introduction

Clustering Analysis

DefinitionGrouping unlabeled data into clusters, for the purpose of inference ofhidden structures or information.

Using, for exampleDissimilarity measurement

Angle : Inner product, . . .Non-metric : Rank, Intensity, . . .Distance : Euclidean (l2), Manhattan(l1), . . .

38 / 56

Page 119: 01 Machine Learning Introduction

Clustering Analysis

DefinitionGrouping unlabeled data into clusters, for the purpose of inference ofhidden structures or information.

Using, for exampleDissimilarity measurement

Angle : Inner product, . . .Non-metric : Rank, Intensity, . . .Distance : Euclidean (l2), Manhattan(l1), . . .

38 / 56

Page 120: 01 Machine Learning Introduction

Clustering Analysis

DefinitionGrouping unlabeled data into clusters, for the purpose of inference ofhidden structures or information.

Using, for exampleDissimilarity measurement

Angle : Inner product, . . .Non-metric : Rank, Intensity, . . .Distance : Euclidean (l2), Manhattan(l1), . . .

38 / 56

Page 121: 01 Machine Learning Introduction

Clustering Analysis

DefinitionGrouping unlabeled data into clusters, for the purpose of inference ofhidden structures or information.

Using, for exampleDissimilarity measurement

Angle : Inner product, . . .Non-metric : Rank, Intensity, . . .Distance : Euclidean (l2), Manhattan(l1), . . .

38 / 56

Page 122: 01 Machine Learning Introduction

Clustering Analysis

DefinitionGrouping unlabeled data into clusters, for the purpose of inference ofhidden structures or information.

Using, for exampleDissimilarity measurement

Angle : Inner product, . . .Non-metric : Rank, Intensity, . . .Distance : Euclidean (l2), Manhattan(l1), . . .

38 / 56

Page 123: 01 Machine Learning Introduction

Example

39 / 56

Page 124: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 125: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 126: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 127: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 128: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 129: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 130: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 131: 01 Machine Learning Introduction

Examples of Clustering Algorithms

Clustering1 Basic Clustering Algorithms

1 K-means2 Clustering Based in Cost Functions

1 Fuzzy C-means2 Possibilistic

3 Hierarchical Clustering1 Entropy based

4 Clustering Based in Graph Theory

40 / 56

Page 132: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

41 / 56

Page 133: 01 Machine Learning Introduction

What Is Data Mining?

Data mining (knowledge discovery in databases):Extraction of interesting information or patterns from data in largedatabases.

Alternative names and their “inside stories”:Knowledge discovery(mining) in databases (KDD)Knowledge extractionData/pattern analysisData archeologyBusiness intelligenceetc.

42 / 56

Page 134: 01 Machine Learning Introduction

What Is Data Mining?

Data mining (knowledge discovery in databases):Extraction of interesting information or patterns from data in largedatabases.

Alternative names and their “inside stories”:Knowledge discovery(mining) in databases (KDD)Knowledge extractionData/pattern analysisData archeologyBusiness intelligenceetc.

42 / 56

Page 135: 01 Machine Learning Introduction

What Is Data Mining?

Data mining (knowledge discovery in databases):Extraction of interesting information or patterns from data in largedatabases.

Alternative names and their “inside stories”:Knowledge discovery(mining) in databases (KDD)Knowledge extractionData/pattern analysisData archeologyBusiness intelligenceetc.

42 / 56

Page 136: 01 Machine Learning Introduction

What Is Data Mining?

Data mining (knowledge discovery in databases):Extraction of interesting information or patterns from data in largedatabases.

Alternative names and their “inside stories”:Knowledge discovery(mining) in databases (KDD)Knowledge extractionData/pattern analysisData archeologyBusiness intelligenceetc.

42 / 56

Page 137: 01 Machine Learning Introduction

What Is Data Mining?

Data mining (knowledge discovery in databases):Extraction of interesting information or patterns from data in largedatabases.

Alternative names and their “inside stories”:Knowledge discovery(mining) in databases (KDD)Knowledge extractionData/pattern analysisData archeologyBusiness intelligenceetc.

42 / 56

Page 138: 01 Machine Learning Introduction

What Is Data Mining?

Data mining (knowledge discovery in databases):Extraction of interesting information or patterns from data in largedatabases.

Alternative names and their “inside stories”:Knowledge discovery(mining) in databases (KDD)Knowledge extractionData/pattern analysisData archeologyBusiness intelligenceetc.

42 / 56

Page 139: 01 Machine Learning Introduction

Examples: What is (not) Data Mining?

What is not Data Mining?1 Look up phone number in phone directory2 Query a Web search engine for information about “Amazon”

What is Data Mining?1 Certain names are more prevalent in certain US locations (O’Brien,

O’Rurke, O’Reilly. . . in Boston area)2 Group together similar documents returned by search engine

according to their context (e.g. Amazon rainforest, Amazon.com)

43 / 56

Page 140: 01 Machine Learning Introduction

Examples: What is (not) Data Mining?

What is not Data Mining?1 Look up phone number in phone directory2 Query a Web search engine for information about “Amazon”

What is Data Mining?1 Certain names are more prevalent in certain US locations (O’Brien,

O’Rurke, O’Reilly. . . in Boston area)2 Group together similar documents returned by search engine

according to their context (e.g. Amazon rainforest, Amazon.com)

43 / 56

Page 141: 01 Machine Learning Introduction

Examples: What is (not) Data Mining?

What is not Data Mining?1 Look up phone number in phone directory2 Query a Web search engine for information about “Amazon”

What is Data Mining?1 Certain names are more prevalent in certain US locations (O’Brien,

O’Rurke, O’Reilly. . . in Boston area)2 Group together similar documents returned by search engine

according to their context (e.g. Amazon rainforest, Amazon.com)

43 / 56

Page 142: 01 Machine Learning Introduction

Examples: What is (not) Data Mining?

What is not Data Mining?1 Look up phone number in phone directory2 Query a Web search engine for information about “Amazon”

What is Data Mining?1 Certain names are more prevalent in certain US locations (O’Brien,

O’Rurke, O’Reilly. . . in Boston area)2 Group together similar documents returned by search engine

according to their context (e.g. Amazon rainforest, Amazon.com)

43 / 56

Page 143: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

44 / 56

Page 144: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 145: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 146: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 147: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 148: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 149: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 150: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 151: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 152: 01 Machine Learning Introduction

Data mining Applications

ApplicationsMining the Web for Structured DataNear Neighbor Search in High Dimensional Data.

I Frequent itemsets and Association Rules

Structure of the webgraphI PageRankI Link Analysis

Proximity on GraphsMining data streams.Large scale supervised machine learning techniques.

45 / 56

Page 153: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

46 / 56

Page 154: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 155: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 156: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 157: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 158: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 159: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 160: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 161: 01 Machine Learning Introduction

Example: Frequent Itemsets

Based in the Market-Basket Model1 On the one hand, we have items.2 On the other we have baskets, sometimes called “transactions.”

1 Each basket consists of a set of items (an itemset)2 They are small.

Examples1 {Cat, and, dog, bites}2 {Yahoo, news, claims, cat, dog, and, produced, viable, offspring}3 {Cat, killer, likely, is, a, big, dog}4 {Professional, free, advice, on, dog, training, puppy}

47 / 56

Page 162: 01 Machine Learning Introduction

Example: Frequent Itemsets

Then, we do the followingTransaction ID Cat Dog and a mated

1 1 1 1 0 02 1 1 1 1 13 1 1 0 1 04 0 1 0 0 0

48 / 56

Page 163: 01 Machine Learning Introduction

Combinatorial Problem

ProblemHow many subsets we have?

But we can do the followingGiven the itemset x in a database D and a set of transactions {ti}i∈I

supp(x ,D) = |{ti ∈ D|x ∈ ti}| (2)

Then, setting a threshold εHow many frequent (supp(x ,D) > ε) itemsets?

49 / 56

Page 164: 01 Machine Learning Introduction

Combinatorial Problem

ProblemHow many subsets we have?

But we can do the followingGiven the itemset x in a database D and a set of transactions {ti}i∈I

supp(x ,D) = |{ti ∈ D|x ∈ ti}| (2)

Then, setting a threshold εHow many frequent (supp(x ,D) > ε) itemsets?

49 / 56

Page 165: 01 Machine Learning Introduction

Combinatorial Problem

ProblemHow many subsets we have?

But we can do the followingGiven the itemset x in a database D and a set of transactions {ti}i∈I

supp(x ,D) = |{ti ∈ D|x ∈ ti}| (2)

Then, setting a threshold εHow many frequent (supp(x ,D) > ε) itemsets?

49 / 56

Page 166: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

50 / 56

Page 167: 01 Machine Learning Introduction

Hardware Solutions: ASICS

Application-Specific Integrated Circuit (ASIC)An ASIC is an integrated circuit customized for a particular use, ratherthan intended for general-purpose use.

It allows for1 Lower Power Consumption.2 Better Colling Approaches.

Example: From Microsoft Research

51 / 56

Page 168: 01 Machine Learning Introduction

Hardware Solutions: ASICS

Application-Specific Integrated Circuit (ASIC)An ASIC is an integrated circuit customized for a particular use, ratherthan intended for general-purpose use.

It allows for1 Lower Power Consumption.2 Better Colling Approaches.

Example: From Microsoft Research

51 / 56

Page 169: 01 Machine Learning Introduction

Hardware Solutions: ASICS

Application-Specific Integrated Circuit (ASIC)An ASIC is an integrated circuit customized for a particular use, ratherthan intended for general-purpose use.

It allows for1 Lower Power Consumption.2 Better Colling Approaches.

Example: From Microsoft Research

51 / 56

Page 170: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

52 / 56

Page 171: 01 Machine Learning Introduction

Hardware Solutions: GPU’sIDEAS

Based on CUDA parallel computing architecture from NvidiaEmphasis on executing many concurrent LIGHT threads instead ofone HEAVY thread as in CPUs

Hardware for 8800

53 / 56

Page 172: 01 Machine Learning Introduction

AdvantagesI Massively parallelI Hundreds of cores, millions of threadsI High throughput

LimitationsI May not be applicable for all tasksI Generic hardware (CPUs) closing the gap

54 / 56

Page 173: 01 Machine Learning Introduction

Outline1 Why are we interested in Analyzing Data?

Intuitive Definition: The 3V’sComplexityData Everywhere

2 Machine LearningMachine Learning ProcessFeaturesClassificationClustering Analysis

3 Data MiningDefinitionApplicationsExample: Frequent Itemsets

4 Hardware SupportASICSGPU’s

5 ProjectsWhat projects can you do?

55 / 56

Page 174: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 175: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 176: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 177: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 178: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 179: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 180: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56

Page 181: 01 Machine Learning Introduction

Projects

Possible topic are:Oil exploration detection.Association Rule Preprocessing Project.Neural Network-Based Financial Market Forecasting Project.Page Ranking - Improving over the Google MatrixInfluence Maximization in Social Networks.Web Word Relevance Measures.Recommendation Systems.

There are more possibilities at https://www.kaggle.com/competitions

56 / 56