Artificial Intelligence: Data Mining and its Applications
Artificial Intelligence Artificial Intelligence | Chung-Ang University | Narration: Prof. Jaesung Lee Data Mining and its Applications My name is Joohyung Jeon and today's topic of the presentation is "Data Mining".

Upload: others

Post on 28-May-2020


TRANSCRIPT

Page 1: Artificial Intelligence, Data Mining and its Applications · Ami.cau.ac.kr/teaching/lecture_aai/DM.pdf · 2019-04-05

Artificial Intelligence

Artificial Intelligence | Chung-Ang University | Narration: Prof. Jaesung Lee

Data Mining and its Applications

My name is Joohyung Jeon and today's topic of the presentation is "Data Mining".

Page 2

Introduction

• Rapid advances in Information Technology lead to
  • Explosive growth of Data Generation
  • Increment of Data Collection Capability
• Examples
  • Commercial transactions on very-large databases generated by retailers and e-Commerce
  • Huge amount of scientific data from Human Genome Project and World Wide Web

With rapid advances in information technology, data generation and data collection capabilities have grown explosively across all domains. In the business world, retailers and e-Commerce have generated very large databases of commercial transactions.

Figure: Relation to other fields

Page 3

Introduction

Huge amounts of scientific data have been generated in various fields as well. One case is the Human Genome Project, which has aggregated gigabytes of data on the human genetic code. The World Wide Web provides another example, with billions of web pages of textual and multimedia information used by millions of people.

Figure: Human Genome Project

Figure: Data from World Wide Web

Page 4

Introduction

• Manual inspection of such huge datasets is impossible
• For this case, Data Mining allows
  • Automation of the data analysis
  • Exploration of large and complex data sets

Analyzing huge volumes of data so that they can be understood and used efficiently remains a challenging problem. Data Mining addresses this problem by providing techniques and software to automate the analysis and exploration of large and complex data sets.

Figure: Goal of Data Mining

Page 5

What is Data Mining?

• Data Mining = Process of Knowledge Discovery in Data
• Interdisciplinary subfield of Computer Science
• Goal
  • Extract information with intelligent methods
  • Transform information into a comprehensible and compact structure
  • Eliminate randomness to discover hidden Patterns

Data Mining is the analysis step of the "Knowledge Discovery in Databases" (KDD) process. It is an interdisciplinary subfield of computer science whose overall goal is to extract information from a data set with intelligent methods and transform it into a comprehensible structure for further use.

Figure: KDD Process

Page 6

What is Data Mining?

It includes a set of methodologies and theories, applicable to large and complex databases, that eliminate randomness and reveal the hidden patterns in the data. Several driving forces explain why Data Mining has become such an important area of study.

Figure: Patterns in patch

Page 7

What is Data Mining?

• Supporters of Data Mining
  • Explosive growth of data
  • Varying sources of data
  • Cheaper storage devices (cloud storage)
  • Increment of computing Power
  • Improved database management system
  • Faster communication (network)
  • Lots of open-source software such as R (software)

The explosive growth of data in a great variety of fields is supported by cheaper storage devices with seemingly unlimited capacity, such as cloud storage, by faster communication with higher connection speeds, and by better database management systems and software support. Computing power is also increasing quickly.

Figure: Data mining

Figure: Supporters of Data Mining

Page 8

What is Data Mining?

• Data Mining improves
  • Database and data management techniques
  • Data pre-processing techniques
  • Inference considerations
  • Interestingness metrics
  • Complexity considerations
  • Post-processing of discovered structures
  • Visualization of complex data

With a high volume of varied data available, Data Mining helps to extract information from the data. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, and visualization.

Figure: Goal of Data Mining

Page 9

Stages of Knowledge Discovery in Data

• Stages of Knowledge Discovery in Data
  • Preprocessing
  • Data Exploration
  • Data Mining (Modeling)
  • Results Validation

The KDD process is commonly defined with four stages: (1) Preprocessing, (2) Data Exploration, (3) Data Mining, and (4) Results Validation.

Figure: Main steps of KDD

Page 10

Step 1: Preprocessing

• Stages of Knowledge Discovery in Data
  • Preprocessing
    • Collecting target dataset
      • A large collection of data is preferable
      • A compact collection of data is preferable
    • Trade-off: volume vs. compactness

Before Data Mining algorithms can be used, a target data set must be collected. As Data Mining can only uncover patterns actually present in the data, the target data set must be large enough to contain these patterns while remaining concise enough to be mined within an acceptable time limit.

Figure: Missing, Noisy, Inconsistent Data

Page 11

Step 1: Preprocessing

The target data should therefore be cleansed by removing observations that contain noise and those with missing data. This preprocessing is essential before multivariate data sets are analyzed with Data Mining.
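The cleansing step described here can be sketched in a few lines. This is only a minimal illustration: the records, the attribute names `age` and `wage`, and the plausibility ranges used to flag noise are all hypothetical and not taken from the lecture.

```python
import math

# Hypothetical raw observations; None and NaN mark missing values.
raw = [
    {"age": 34, "wage": 72.5},
    {"age": None, "wage": 80.1},         # missing age
    {"age": 41, "wage": float("nan")},   # missing wage
    {"age": 29, "wage": -5.0},           # noisy: a wage cannot be negative
    {"age": 52, "wage": 110.3},
]

# Assumed plausibility ranges used to flag noisy values (illustrative only).
VALID_RANGES = {"age": (16, 100), "wage": (0.0, 1000.0)}


def is_missing(value):
    """Return True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_clean(row):
    """Keep a row only if every attribute is present and within its range."""
    for name, (low, high) in VALID_RANGES.items():
        value = row.get(name)
        if is_missing(value) or not (low <= value <= high):
            return False
    return True


cleaned = [row for row in raw if is_clean(row)]
print(cleaned)
```

Dropping rows is the simplest policy and matches the narration; in practice one might instead impute missing values, which this sketch does not attempt.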

Figure: Preprocessing step

Page 12

Step 2: Exploratory Data Analysis

• Stages of Knowledge Discovery in Data
  • Preprocessing
  • Data Exploration
    • Which method can be applied to my dataset?
    • Preliminary investigation to get a deeper understanding of instances and features
    • Data summarization
      • Using Summary statistics
      • Using Visualization

Given a complex set of observations, Exploratory Data Analysis (EDA) often provides the initial pointers towards suitable learning techniques, indicating which method can or should be applied. The data is examined for structures that may indicate deeper relationships among cases or variables, which helps in making the right choice of Data Mining method.

Figure: Exploratory Data Analysis

Page 13

Step 2: Exploratory Data Analysis

A vast amount of numbers on a large number of variables needs to be properly organized before information can be extracted from it. Broadly speaking, there are two methods to summarize data: summary statistics and visualization. Both have their advantages and disadvantages, and applied jointly they extract the maximum information from raw data.

Figure: Statistics and Visualization

Page 14

Summary Statistics

• List of frequently-used Summary Statistics
  • Measures of location
  • Measures of dispersion
  • Measures of skewness
  • Similarity and Dissimilarity

Summary statistics are numbers computed from the sample that summarize a variable, attribute or feature. They include (1) Measures of location, (2) Measures of dispersion, (3) Measures of skewness, and (4) Similarity and Dissimilarity.

Figure: An example of a report using Summary Statistics

Page 15

Measures of Location

• List of frequently-used Summary Statistics
  • Measures of location
    • Central tendency: representative values of the set of observations
    • Mean, Median, Mode, and Quartile

Measures of location are single numbers that represent a set of observations. They include the measures of central tendency, which can be taken as the most representative values of the set of observations. The most common measures of location are the Mean, the Median, the Mode and the Quartiles.

Figure: Measures of location (Mean, Median, Mode, Quartile)

Page 16

Measures of Location

• List of frequently-used Summary Statistics
  • Measures of location
    • Mean
      • Arithmetic average of all the observations
    • Median
      • Middle-most value of the ranked observations

The Mean is the arithmetic average of all the observations: the sum of all observations divided by the sample size. The Median is the middle-most value of the ranked set of observations, so that half the observations are greater than the median and the other half are less. The Median is a robust measure of central tendency.
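As a small worked example of these two definitions, the sketch below computes the mean directly from its definition and the median with Python's standard `statistics` module. The sample values are made up for illustration.

```python
import statistics

observations = [2, 4, 4, 5, 7, 9, 11]   # hypothetical sample, n = 7

# Mean: sum of all observations divided by the sample size.
mean = sum(observations) / len(observations)

# Median: middle value of the ranked observations.
median = statistics.median(observations)

print(mean, median)
```

With an even number of observations, `statistics.median` returns the average of the two middle values.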

Figure: Mean, Median, and Mode

Page 17

Measures of Location

• List of frequently-used Summary Statistics
  • Measures of location
    • Mean
    • Median
    • Mode
      • Most frequently occurring value
    • Quartiles (Q1, Q2, and Q3)
      • Division points which split data into four equal parts after rank-ordering

The Mode is the most frequently occurring value in the data set. It makes more sense when attributes are not continuous. Quartiles are division points which split the data into four equal parts after rank-ordering. The division points are called Q1 (the first quartile), Q2 (the second quartile, or median), and Q3 (the third quartile).
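The mode and the quartiles can be illustrated with the standard `statistics` module. The data below are hypothetical. Note that several quartile conventions exist; this sketch uses the default ("exclusive") method of `statistics.quantiles`, which is an assumption and not necessarily the convention intended in the lecture.

```python
import statistics

# Hypothetical categorical attribute: the mode is the most frequent label.
colors = ["red", "blue", "red", "green", "red", "blue"]
mode_color = statistics.mode(colors)

# Hypothetical numeric sample; quantiles() rank-orders the data internally.
values = [7, 1, 5, 3, 6, 2, 4]
q1, q2, q3 = statistics.quantiles(values, n=4)   # the three division points

print(mode_color)
print(q1, q2, q3)
```

Here Q2 coincides with the median of the sample, as stated in the narration.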

Figure: Quartiles and Data Distribution

Page 18

Measures of Location

• List of frequently-used Summary Statistics
  • Measures of location
    • Mean
      • Sensitive to outliers
    • Median
      • Insensitive to outliers
    • Mode
    • Quartiles (Q1, Q2, and Q3)

Note that the mean is very sensitive to outliers, such as extreme or unusual observations, whereas the median is not. The mean is affected if even a single observation is changed. The median, on the other hand, has a 50% breakdown point, which means that it cannot be pushed to an arbitrarily extreme value unless about half of the values in the sample are replaced by extreme ones.
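The difference in outlier sensitivity is easy to see numerically. In this sketch (hypothetical numbers), a single observation is replaced by an extreme value; the mean moves a lot while the median does not move at all.

```python
import statistics

sample = [10, 12, 13, 15, 20]
with_outlier = [10, 12, 13, 15, 2000]   # one observation replaced by an extreme value

mean_before = statistics.mean(sample)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(sample)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)       # the mean is dragged towards the outlier
print(median_before, median_after)   # the median is unchanged
```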

Figure: Measures of location (Mean, Median, Mode, Quartile)

Page 19

Measures of Dispersion

• List of frequently-used Summary Statistics
  • Measures of location
  • Measures of dispersion (spread)
    • Variability of data
    • Variance, Standard deviation, Inter-quartile Range, and Range

Measures of location are not enough to capture all aspects of the features. Measures of dispersion are necessary to understand the variability of the data. The most common measures of dispersion are the Variance, the Standard Deviation, the Inter-quartile Range (IQR) and the Range.

Figure: Measures of dispersion (Variance, Standard deviation, IQR, Range)

Page 20

Measures of Dispersion

• List of frequently-used Summary Statistics
  • Measures of location
  • Measures of dispersion (spread)
    • Variance
      • How far data values lie from the mean
    • Standard deviation
      • Square root of the variance
      • Typical (root-mean-square) distance from the mean

Variance measures how far data values lie from the mean. It is defined as the average of the squared differences between the mean and the individual data values. Standard Deviation is the square root of the variance. It can be read as a typical distance between the mean and the individual data values, expressed in the same units as the data.
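A minimal sketch of these definitions, on made-up data, is given below. It follows the narration literally and divides by the sample size n (the population form); the sample variance that divides by n - 1 is shown only for comparison and is not mentioned in the lecture.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample

mean = sum(data) / len(data)

# Variance: average of the squared differences from the mean (divides by n).
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

# The statistics module offers both conventions.
population_variance = statistics.pvariance(data)   # divides by n
sample_variance = statistics.variance(data)        # divides by n - 1

print(variance, std_dev, sample_variance)
```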

Figure: An Illustration of Variance

Page 21

Measures of Dispersion

• List of frequently-used Summary Statistics
  • Measures of location
  • Measures of dispersion (spread)
    • Variance
    • Standard deviation
    • Interquartile range (IQR)
      • Difference between Q3 and Q1 (middle 50%)
    • Range
      • Difference between max. and min. values

The interquartile range, or IQR for short, is the difference between Q3 and Q1. The interval from Q1 to Q3 contains the middle 50% of the data. The Range is the difference between the maximum and minimum values in the sample.
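Both quantities follow directly from the quartiles and the extremes. The sketch below uses hypothetical values and, as an assumption, the default quartile convention of Python's `statistics.quantiles`; other conventions can give slightly different Q1 and Q3.

```python
import statistics

values = [3, 7, 8, 5, 12, 14, 21]   # hypothetical sample

q1, _, q3 = statistics.quantiles(values, n=4)

iqr = q3 - q1                              # spread of the middle 50% of the data
value_range = max(values) - min(values)    # spread of the whole sample

print(iqr, value_range)
```

The range depends only on the two most extreme observations, so it reacts to a single outlier, whereas the IQR does not.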

Figure: Inter-Quartile Range

Page 22

Measures of Skewness

• List of frequently-used Summary Statistics
  • Measures of location
  • Measures of dispersion (spread)
  • Measure of Skewness
    • Symmetric distribution
    • Right-skewed distribution
    • Left-skewed distribution

The shape of the data distribution is also of considerable interest. The most 'well-behaved' distribution is a symmetric distribution, where the mean and the median coincide. The symmetry is lost if there is a tail in either direction. Skewness measures whether or not a distribution has a single long tail.

$$\text{skewness} = \frac{\sqrt{n}\,\sum_{i=1}^{n} (x_i - \bar{x})^3}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^2\right)^{3/2}}$$

Pearson’s moment coefficient of skewness

Page 23

Measures of Skewness

Skewness is measured based on this equation, where $n$ is the number of samples, $x_i$ is the observed value of the $i$th sample, and $\bar{x}$ is the mean of all samples.
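The computation can be sketched as follows, assuming the slide's formula is the standard moment coefficient of skewness, i.e. sqrt(n) times the sum of cubed deviations, divided by the sum of squared deviations raised to the power 3/2. The function name and the sample data are illustrative.

```python
import math


def moment_skewness(xs):
    """Moment coefficient of skewness (assumed reading of the slide's formula)."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum((x - mean) ** 2 for x in xs)     # sum of squared deviations
    sum_cube = sum((x - mean) ** 3 for x in xs)   # sum of cubed deviations
    if sum_sq == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return math.sqrt(n) * sum_cube / sum_sq ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 10]       # one large value creates a right tail
left_tail = [-10, -4, -3, -2, -1]   # mirror image: a left tail

print(moment_skewness(symmetric))    # zero for a symmetric sample
print(moment_skewness(right_tail))   # positive: right-skewed
print(moment_skewness(left_tail))    # negative: left-skewed
```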

Page 24

Measures of Skewness

The figure gives examples of skewed distributions. Note that these diagrams are generated from theoretical distributions and in practice one is likely to see only approximations.

Page 25

Similarity and Dissimilarity

• List of frequently-used Summary Statistics
  • Measures of location
  • Measures of dispersion
  • Measures of skewness
  • Similarity and Dissimilarity
    • Essential measure for Pattern Recognition
    • Measure how close two distributions are
    • Similarity vs. Dissimilarity

Distance or similarity measures are essential to solve many Pattern Recognition problems such as Classification and Clustering. Various measures are available in the literature to compare two data distributions. As the name suggests, a similarity measure quantifies how close two distributions are.

Figure: Similar Images

Page 26

Similarity and Dissimilarity

A Similarity measure is a numerical measure of how alike two data objects are. It often falls between 0, meaning no similarity, and 1, meaning complete similarity. On the other hand, a Dissimilarity measure is a numerical measure of how different two data objects are, ranging from 0, meaning that the objects are alike, to ∞, meaning that they are completely different.

Figure: 3-Class Classification

Page 27

Common Properties of Dissimilarity Measures

• Conditions for being Metric
  • $d(x, y) \ge 0$ for all $x$ and $y$, and $d(x, y) = 0$ if and only if $x = y$
  • $d(x, y) = d(y, x)$ for all $x$ and $y$
  • $d(x, z) \le d(x, y) + d(y, z)$ for all $x$, $y$, and $z$,
  where $d(x, y)$ is the distance or dissimilarity between data points $x$ and $y$

Distance, such as the Euclidean distance, is a dissimilarity measure and has the well-known properties shown here. A distance that satisfies these properties is called a metric. What follows is a list of several common distance measures for comparing multivariate data. We will assume that the attributes are all continuous.
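The three conditions can be checked numerically on a handful of points. The helper below is a hypothetical illustration, not part of the lecture; it only tests a finite sample, so passing it is evidence rather than proof that a function is a metric. It uses `math.dist` (Python 3.8+) as the Euclidean distance.

```python
import itertools
import math


def is_metric(dist, points, tol=1e-12):
    """Check the three metric conditions on every pair and triple of points."""
    for x, y in itertools.product(points, repeat=2):
        d = dist(x, y)
        if d < 0:                          # non-negativity
            return False
        if (d == 0) != (x == y):           # d(x, y) = 0 if and only if x = y
            return False
        if abs(d - dist(y, x)) > tol:      # symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z) + tol:   # triangle inequality
            return False
    return True


def squared_euclidean(x, y):
    """Squared Euclidean distance: a dissimilarity that is not a metric."""
    return math.dist(x, y) ** 2


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]

print(is_metric(math.dist, points))
print(is_metric(squared_euclidean, points))
```

The squared Euclidean distance fails because of the triangle inequality: for the collinear points (0, 0), (1, 0) and (2, 0), the direct squared distance is 4 while the two legs sum to 2.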

Figure: Triangular Inequality

Page 28

Dissimilarity Measure: Euclidean Distance

• Euclidean Distance

• Weighted Euclidean Distance

Assume that we have measurements $x_{ik}$, where $i = 1, \dots, N$ and $k = 1, \dots, p$. The Euclidean distance between the $i$th and $j$th objects is defined by the first equation on this slide for every pair $(i, j)$ of observations. The weighted Euclidean distance is defined by the second equation. If the scales of the attributes differ substantially, standardization is necessary.
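Both distances can be written in a few lines, assuming the usual definitions: the square root of the sum of squared attribute differences, with an optional per-attribute weight. The function names, the weights and the two objects below are hypothetical.

```python
import math


def euclidean(xi, xj):
    """Euclidean distance between two objects with p attributes each."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


def weighted_euclidean(xi, xj, weights):
    """Weighted Euclidean distance; weights holds one weight per attribute."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, xi, xj)))


x_i = (1.0, 2.0)
x_j = (4.0, 6.0)

print(euclidean(x_i, x_j))
print(weighted_euclidean(x_i, x_j, weights=(1.0, 0.25)))
```

One common way to standardize is to set each weight to the reciprocal of that attribute's variance, which gives the same result as computing the plain Euclidean distance on standardized attributes.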

Euclidean distance:

$$d_E(i, j) = \left( \sum_{k=1}^{p} (x_{ik} - x_{jk})^2 \right)^{1/2}$$

Weighted Euclidean distance:

$$d_{WE}(i, j) = \left( \sum_{k=1}^{p} W_k (x_{ik} - x_{jk})^2 \right)^{1/2}$$

Figure: Euclidean Distance

Page 29

Dissimilarity Measure: Minkowski Distance

• Minkowski Distance

• Supremum Distance

The Minkowski distance is a generalization of the Euclidean distance. With measurements $x_{ik}$, where $i = 1, \dots, N$ and $k = 1, \dots, p$, the Minkowski distance is defined by the first equation on this slide, where $\lambda \ge 1$. Because many variations are possible depending on the value of $\lambda$, it is also called the $L_\lambda$ metric.

Minkowski distance:

$$d_M(i, j) = \left( \sum_{k=1}^{p} |x_{ik} - x_{jk}|^{\lambda} \right)^{1/\lambda}$$

Supremum distance:

$$\lim_{\lambda \to \infty} \left( \sum_{k=1}^{p} |x_{ik} - x_{jk}|^{\lambda} \right)^{1/\lambda} = \max\left( |x_{i1} - x_{j1}|, \dots, |x_{ip} - x_{jp}| \right)$$

Figure: Points at a distance 1 from the center using the Minkowski distance

Page 30

Dissimilarity Measure: Minkowski Distance

When $\lambda = 1$, it is called the $L_1$ metric, Manhattan distance, or City-block distance. When $\lambda = 2$, it is called the $L_2$ metric or Euclidean distance. When $\lambda \to \infty$, it is called the $L_\infty$ metric or Supremum distance, which simplifies to the maximum absolute attribute difference. Note that $\lambda$ and $p$ are two different parameters. The dimension $p$ of the data matrix remains finite.
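The three special cases can be reproduced with one small function. This is a sketch under the usual definition of the Minkowski distance; the parameter name `lam` stands for λ, and passing `math.inf` to select the supremum distance is a convention of this sketch.

```python
import math


def minkowski(xi, xj, lam):
    """Minkowski (L_lambda) distance for lam >= 1; lam = math.inf gives the supremum distance."""
    if lam < 1:
        raise ValueError("lam must be at least 1")
    diffs = [abs(a - b) for a, b in zip(xi, xj)]
    if math.isinf(lam):
        return max(diffs)                        # limit as lam grows without bound
    return sum(d ** lam for d in diffs) ** (1.0 / lam)


x_i = (0.0, 0.0)
x_j = (3.0, 4.0)

print(minkowski(x_i, x_j, 1))          # Manhattan / City-block distance
print(minkowski(x_i, x_j, 2))          # Euclidean distance
print(minkowski(x_i, x_j, 50))         # already very close to the supremum distance
print(minkowski(x_i, x_j, math.inf))   # Supremum distance
```

As `lam` grows, the largest attribute difference dominates the sum, which is why the value for `lam = 50` is already close to the maximum difference.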

Page 31

Dissimilarity Measure: Mahalanobis Distance

• Mahalanobis Distance
• Distance metric (or measure) usually determines the similarity among patterns that is very important for applications based on Classification and Clustering

Let $X$ be an $N \times p$ matrix whose $i$th row is $x_i^T = (x_{i1}, \dots, x_{ip})$. Then the Mahalanobis distance is defined by the equation on this slide, where $\Sigma$ is the $p \times p$ sample covariance matrix and $\Sigma^{-1}$ is its inverse.
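For intuition, the sketch below computes the Mahalanobis distance in the special case of two attributes (p = 2), where the covariance matrix can be inverted by hand. It assumes the usual definition, the square root of (x_i - x_j)^T Σ^{-1} (x_i - x_j), and the sample covariance with divisor N - 1; the data and function names are hypothetical.

```python
import math


def covariance_2d(rows):
    """Sample covariance entries (s_xx, s_xy, s_yy) of an N x 2 data matrix."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    s_xx = sum((r[0] - mean_x) ** 2 for r in rows) / (n - 1)
    s_yy = sum((r[1] - mean_y) ** 2 for r in rows) / (n - 1)
    s_xy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows) / (n - 1)
    return s_xx, s_xy, s_yy


def mahalanobis_2d(xi, xj, cov):
    """Mahalanobis distance between two rows, using the closed-form 2 x 2 inverse."""
    s_xx, s_xy, s_yy = cov
    det = s_xx * s_yy - s_xy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    squared = (s_yy * dx * dx - 2 * s_xy * dx * dy + s_xx * dy * dy) / det
    return math.sqrt(squared)


X = [(2.0, 2.0), (4.0, 6.0), (6.0, 4.0), (8.0, 8.0)]   # positively correlated attributes
cov = covariance_2d(X)

along = mahalanobis_2d(X[0], X[3], cov)    # pair separated along the correlation
across = mahalanobis_2d(X[1], X[2], cov)   # pair separated across the correlation

print(along, across)
print(math.dist(X[0], X[3]), math.dist(X[1], X[2]))   # Euclidean distances for comparison
```

In this toy data set the two pairs have very different Euclidean distances but the same Mahalanobis distance, because a separation along the direction of strong correlation is discounted by the covariance structure.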

$$d_{MH}(i, j) = \left( (x_i - x_j)^T \Sigma^{-1} (x_i - x_j) \right)^{1/2}$$

Figure: Mahalanobis Distance

Page 32

Visualization

• Benefits of data visualization
  • Revealing hidden information through simple chart and diagrams
  • Helping data exploration and formulation of analytical relationship among variables
  • Providing important insight about data in a quick and intuitive way
• Consideration: the number of and types of variables

To understand thousands of rows of data in a limited time, there is hardly an alternative to visual representation. The objective of visualization is to reveal hidden information through simple charts and diagrams. Visual representation of data is the first step towards data exploration and the formulation of analytical relationships among the variables.

Figure: Visualization of Data

Page 33

Visualization

In complex and huge data sets, visualization in one, two and three dimensions helps data analysts sift through the data in a logical manner and understand the data dynamics. It is instrumental in identifying patterns and relationships among groups of variables.

Figure: Goal of Data Mining

Page 34

Visualization

Visualization techniques depend on the type of variables. Techniques available to represent nominal variables are generally not suitable for visualizing continuous variables, and vice versa. Graphs, charts and other visual representations provide quick and focused summarization.

Figure: Types of Variables

Page 35

Visualization of Single Variable: Histogram

• Histogram is
  • Representation of the distribution of numerical data
  • Estimation of probability distribution of a variable
• Wage dataset
  • 3,000 instances and 11 variables
  • A small bimodality in the right tail of the distribution
  • Separating based on “Race” categorical variable
  • Overlaid histogram is also possible

Histograms are the most common graphical tool to represent continuous data. The range of the sample is plotted on the horizontal axis, and the frequencies or relative frequencies of each class are plotted on the vertical axis. The class width has an impact on the shape of the histogram.
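The effect of the class width can be seen without any plotting by counting how many observations fall into each class. The sketch below is illustrative only: the function name `class_counts` and the wage-like numbers are hypothetical and are not taken from the Wage dataset.

```python
def class_counts(values, width, start):
    """Return (lower, upper, frequency) for consecutive classes of equal width."""
    counts = {}
    for v in values:
        k = int((v - start) // width)   # index of the class that contains v
        counts[k] = counts.get(k, 0) + 1
    return [
        (start + k * width, start + (k + 1) * width, counts.get(k, 0))
        for k in range(max(counts) + 1)
    ]


wages = [20, 22, 25, 31, 38, 40, 41, 47, 55, 90]   # hypothetical values

narrow = class_counts(wages, width=10, start=20)
wide = class_counts(wages, width=40, start=20)

for lower, upper, freq in narrow:
    print(f"[{lower}, {upper}): {'#' * freq}")
```

With a width of 10 the isolated high value shows up as a separate bar after a gap, while with a width of 40 almost everything collapses into a single class, so the same data produce very different shapes.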

Figure: An Example of Histogram

Page 36

Visualization of Single Variable: Histogram

We consider the Wage dataset, which contains wage and other data for a group of 3,000 male workers in the Mid-Atlantic region. It is composed of 3,000 instances and 11 variables such as year, age, marital status, race, education, region, and so on. The figure on the right illustrates the distribution of wage for all 3,000 workers.

Figure: Distribution of Wage

Page 37

Visualization of Single Variable: Histogram

The data is mostly symmetrically distributed, but there is a small bimodality, indicated by a small hump towards the right tail of the distribution. The Wage data set contains a number of categorical variables, one of which is Race.

Distribution of Wage

A natural question is whether the wage distribution is the same across races. These histograms are drawn for each race separately. Because of the huge disparity among the counts of the different races, the separate histograms may not be very informative. A visual display of the same information using overlaid histograms may be more informative.
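One common way to compare groups of very different sizes is to plot relative frequencies instead of raw counts. The sketch below assumes two made-up groups and a hypothetical helper `relative_frequencies`; it is not the procedure used to draw the lecture figures.

```python
from collections import Counter


def relative_frequencies(values, class_width):
    """Share of the values that falls into each class [start, start + class_width)."""
    counts = Counter(int(v // class_width) * class_width for v in values)
    total = len(values)
    return {start: count / total for start, count in sorted(counts.items())}


# Hypothetical groups of very different size (illustration only).
large_group = [60, 65, 70, 72, 75, 78, 80, 85, 90, 110]
small_group = [62, 74]

large_rel = relative_frequencies(large_group, class_width=20)
small_rel = relative_frequencies(small_group, class_width=20)
print(large_rel)
print(small_rel)
```

Because each group sums to 1, the shapes can be overlaid on the same axes without the larger group dominating the picture.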

Histogram of Wage by Race

The second type of histogram also may not be the best way of presenting all the information. However, it shows the small concentration at the right tail more clearly.

Overlaid Histogram of Wage by Race

Visualization of Single Variable: Boxplot

• Boxplot is
  • Depicting groups of numerical data based on quartiles
  • Describing data distribution and identifying outliers

• Range of outliers
  • Strictly: outside [Q1 − 1.5 × IQR, Q3 + 1.5 × IQR]
  • Roughly: outside [Q1 − 3 × IQR, Q3 + 3 × IQR]
  • Separation is also possible

A boxplot is used to describe the shape of a data distribution and especially to identify outliers. Typically, an observation is an outlier if it is either less than Q1 - 1.5×IQR or greater than Q3 + 1.5×IQR, where IQR is the interquartile range defined as Q3 - Q1.
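The outlier rule can be sketched in a few lines of Python. The wage values below are hypothetical, and the quartiles come from the standard library's `statistics.quantiles` with its default method; other quartile conventions can give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical wages (illustration only, not the Wage dataset).
wages = [60, 72, 75, 80, 84, 88, 90, 95, 101, 110, 160, 280]

strict = iqr_outliers(wages, k=1.5)  # the usual 1.5 x IQR rule
rough = iqr_outliers(wages, k=3.0)   # the wider 3 x IQR rule
print(strict, rough)
```

The wider fences flag fewer points, which is why the 3 × IQR variant is sometimes preferred when the 1.5 × IQR rule marks too many observations.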

Distribution of Wage

This rule is strict, and often too many points are identified as outliers. Hence, sometimes only those points outside of [Q1 - 3×IQR, Q3 + 3×IQR] are identified as outliers. The resulting boxplot is shown on the right.

Distribution of Wage

The boxplot of the Wage distribution clearly identifies many outliers, which reflects the histogram of the Wage distribution. The story is clearer from the boxplots drawn on the wage distribution for individual races. The resulting boxplot is shown in the figure on the right.

Boxplot of Wage by Race

Visualization of Two Variable: Scatterplot

• Scatterplot is
  • Showing the direction and strength of association between two variables

• Possible observations from Wage data scatterplot
  • The association between Wage and Age is not so strong because the points are mostly well-spread
  • Hypothesis: Whites are rich?
    • Not really. There are just too many “White” instances

The most standard way to visualize the relation between two variables is a scatterplot. It shows the direction and strength of association between two variables, but does not quantify it. Scatterplots also help to identify unusual observations. The figure on the right shows the scatterplot of the variables Age and Wage for the Wage data.

Relationship between Age and Wage

It is clear from the scatterplot that Wage does not seem to depend on Age very strongly. However, a set of points towards the top is very different from the rest. A natural follow-up question is whether Race has any impact on the Age-Wage dependency, or the lack of it.

Relationship between Age and Wage

In the Wage dataset, there is a large number of instances for Whites, which masks the effects of the other races. There does not seem to be any association between Age and Wage, controlling for Race. This situation is frequently encountered in exploratory analysis of an unfamiliar dataset.

Relationship between Age and Wage

Visualization of Two Variable: Contour Plot

• Contour plot is
  • Representing contour lines that form the boundaries of regions connecting points with equal values
  • Composed of contour lines that can be viewed as slices of a bivariate density

• Perfect circles
  • Two random variables are independent
  • Weak association

A contour plot is useful when a continuous attribute is measured on a spatial grid. It partitions the plane into regions of similar values. The contour lines that form the boundaries of these regions connect points with equal values. Contour plots have many applications in spatial statistics.

An Example of Contour Plot

Contour plots join points of equal probability density. Along each contour line, the concentration of the bivariate distribution is the same. One may think of the contour lines as slices of a bivariate density, sliced horizontally.

Slice of Bivariate Density

The contour lines are concentric. If they are perfect circles, then the two random variables are independent, at least when the density is bivariate normal. The more oval-shaped they are, the farther they are from independence. In the plot on the right, the two disjoint shapes in the interior-most part indicate that a small part of the data is very different from the rest.

Contour Plot of Age and Wage

Visualization of more than Two Variables: Scatterplot Matrix

• Scatterplot Matrix is
  • Displaying all the pairwise scatterplots of the variables on a single view with multiple scatterplots

Displaying more than two variables on a single scatterplot is not possible. A scatterplot matrix is one possible visualization of three or more continuous variables, taken two at a time. The data set used here to display a scatterplot matrix is the College data, which contains statistics for a large number of US colleges from the 1995 issue of US News and World Report.

Scatterplot Matrix of College Attributes

Step 3: Data Mining

• Methodologies for Data Mining
  • Association rule mining
  • Anomaly detection
  • Clustering
  • Classification and Regression
  • Summarization

Data Mining involves roughly five classes of tasks: 1) Association rule mining, 2) Anomaly detection, 3) Clustering, 4) Classification and Regression, and 5) Summarization.

Methodologies for Data Mining

Step 3: Data Mining

• Methodologies for Data Mining
  • Association rule mining
    • Relationships between variables
  • Anomaly detection
    • Identification of unusual data records
  • Clustering
    • Discovering structures hidden in the data
  • Classification and Regression
  • Summarization

Association Rule Mining searches for relationships between variables. Anomaly detection refers to the identification of unusual data records that might be interesting, or data errors that require further investigation. Clustering is the task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.

Association Rule Mining

Anomaly Detection

Clustering

Step 3: Data Mining

• Methodologies for Data Mining
  • Association rule mining
  • Anomaly detection
  • Clustering
  • Classification and Regression
    • Generalizing known structure to apply to new data
    • Categorical vs. Numerical
  • Summarization
    • Finding a compact representation of the data

Classification is the task of generalizing known structure to apply to new data, whereas Regression attempts to find a function that models the data with the least error, that is, it estimates the relationships among the data. Classification deals with categorical outputs and Regression with numerical ones. Summarization can provide a more compact representation of the data set, including visualization and report generation.

Classification

Regression

Summarization

In this class, Association Rule Mining, one of the most representative Data Mining techniques, will be discussed briefly.

Association Rule Mining in Descriptive Data Analysis

Association Rule Mining

• Goal
  • Discover interesting relations among variables
  • Discover strong rules from the viewpoint of a measure

• Market basket analysis
  • Rule example: {onions, potatoes} ⇒ {burger}

• Other examples
  • Web usage mining, Intrusion detection, Bioinformatics

Association Rule Mining is a popular and well-studied method for discovering interesting relations among variables. It can also be described as the task of analyzing and presenting strong rules discovered in databases using different measures of interestingness.

Association Rule Mining

Based on the concept of strong rules, Agrawal, Imieliński, and Swami introduced association rules in 1993 for discovering regularities between products in large-scale transaction data recorded by point-of-sale (POS) systems in supermarkets.

Market Basket Analysis

For example, the rule {onions, potatoes} ⇒ {burger} found in the sales data of a supermarket would indicate that a customer who buys onions and potatoes together is likely to also buy hamburger meat. Such information can be used as the basis for decisions about marketing activities such as promotional pricing or product placements.

Market Basket Analysis

In addition to the above example from market basket analysis, association rules are employed today in many application areas, including Web usage mining, intrusion detection, and bioinformatics. As opposed to sequence mining, association rule learning typically does not consider the order of items either within a transaction or across transactions.

Web Mining

Definition

• Formal definition of ARM
  • A set of items: I = {i₁, i₂, …, iₙ} where each element is a binary attribute
  • A set of transactions: D = {t₁, t₂, …, tₘ} where each transaction contains a subset of I
  • An implication rule X ⇒ Y where X, Y ⊆ I and X ∩ Y = ∅

The problem of association rule mining is defined as follows. Let I = {i₁, i₂, …, iₙ} be a set of n binary attributes called items. Let D = {t₁, t₂, …, tₘ} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I.

TID  milk  bread  butter  beer
1    1     1      0       0
2    0     0      1       0
3    0     0      0       1
4    1     1      1       0
5    0     1      0       0

A rule is defined as an implication of the form X ⇒ Y, where X, Y ⊆ I and X ∩ Y = ∅. The sets of items (itemsets for short) X and Y are called the antecedent and the consequent of the rule, respectively.

Definition

• An example of each notation
  • A set of items: I = {milk, bread, butter, beer}
  • A set of transactions: D = {t₁, t₂, …, t₅} where
    • t₁ = {milk, bread}, t₂ = {butter}, t₃ = {beer}, t₄ = {milk, bread, butter}, t₅ = {bread}
  • An implication rule: {butter, bread} ⇒ {milk} where
    • {butter, bread}: Antecedent
    • {milk}: Consequent

To illustrate the concepts, we use a small example from the supermarket domain. The set of items is I = {milk, bread, butter, beer}, and a small database containing the items is shown in the table to the right, where 1 codes the presence and 0 the absence of an item in a transaction.

An example rule for the supermarket could be {butter, bread} ⇒ {milk}, meaning that if butter and bread are bought, customers also buy milk.

Note that this example is extremely small. In practical applications, a rule needs a support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.

Useful Concepts

• Goal
  • Discover interesting relations among variables
  • Discover strong rules from the viewpoint of a measure

• Interestingness measure
  • Support supp(X)
    • Proportion of transactions containing itemset X
  • Confidence conf(X ⇒ Y)
    • Proportion of transactions containing X that also contain Y

To select interesting rules from the set of all possible rules, constraints on various measures of significance and interest can be used. The best-known constraints are minimum thresholds on Support and Confidence.

The Support supp(X) of an itemset X is defined as the proportion of transactions in the data set which contain the itemset. In the example database, the itemset {milk, bread, butter} has a support of 1/5 = 0.2 since it occurs in 20% of all transactions (1 out of 5 transactions).

The Confidence of a rule is defined as conf(X ⇒ Y) = supp(X ∪ Y) / supp(X), where supp(X ∪ Y) is the support of the transactions that contain every item of both X and Y. For example, the rule {milk, bread} ⇒ {butter} has a confidence of 0.2/0.4 = 0.5 in the database, which means that the rule is correct for 50% of the transactions containing milk and bread: 50% of the times a customer buys milk and bread, butter is bought as well.
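Both measures can be checked on the five-transaction example database with a short Python sketch. The function names `support` and `confidence` are chosen here for illustration; the confidence is computed from the support of the itemset that combines all items of the antecedent and the consequent.

```python
# The example database from the table: one set of items per transaction.
transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


print(support({"milk", "bread", "butter"}, transactions))       # 1 of 5 transactions
print(confidence({"milk", "bread"}, {"butter"}, transactions))  # 0.2 / 0.4
```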

Process

• Steps based on user-defined parameter values
  • Minimum Support is used to find all frequent itemsets
  • Minimum Confidence is used to find useful rules

• Issues
  • Size of search space: there can be O(2ⁿ) itemsets
  • Downward-closure property
    • Support of parent itemset is never smaller than Support of child itemset (monotonicity)

Association rules are usually required to satisfy a user-specified minimum Support and a user-specified minimum Confidence at the same time. Based on these two measures, ARM can be split into two steps: 1) the minimum Support is applied to find all frequent itemsets, and 2) the frequent itemsets found and the minimum Confidence constraint are used to form rules.
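The two steps can be sketched by brute force on the small example database. This is only an illustration under the assumption that enumerating every itemset is affordable; the function name `mine_rules` and the threshold values are hypothetical.

```python
from itertools import combinations

transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread"},
]
items = sorted(set().union(*transactions))


def support(itemset):
    """Proportion of transactions that contain every item of the itemset."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)


def mine_rules(min_support, min_confidence):
    # Step 1: keep every itemset whose support reaches the minimum Support.
    frequent = [
        frozenset(c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
        if support(c) >= min_support
    ]
    # Step 2: split each frequent itemset into antecedent => consequent
    # and keep the rules that reach the minimum Confidence.
    rules = []
    for itemset in frequent:
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                conf = support(itemset) / support(antecedent)
                if conf >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), conf))
    return rules


for antecedent, consequent, conf in mine_rules(min_support=0.4, min_confidence=0.8):
    print(antecedent, "=>", consequent, conf)
```

On this database the only rule that survives both thresholds is {milk} ⇒ {bread}, since every transaction containing milk also contains bread.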

Finding all frequent itemsets in a database is difficult since it involves searching all possible itemsets or item combinations. The set of possible itemsets is the power set over I and has size 2ⁿ − 1, excluding the empty set, which is not a valid itemset.
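The size of the search space is easy to verify by enumeration. The following sketch lists every non-empty itemset over a given set of items; the helper name `nonempty_itemsets` is an assumption made for this example.

```python
from itertools import combinations


def nonempty_itemsets(items):
    """All non-empty subsets of the items, smallest itemsets first."""
    items = sorted(items)
    return [
        set(c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
    ]


three_items = nonempty_itemsets({"bread", "butter", "beer"})
print(len(three_items))                   # 2**3 - 1 itemsets
print(len(nonempty_itemsets(range(20))))  # 2**20 - 1 itemsets
```

Going from 3 to 20 items already raises the count from 7 to more than a million, which is why exhaustive search does not scale.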

TID  bread  butter  beer
1    0      0       1
2    0      1       0
3    0      1       1
4    1      0       0
5    1      0       1
6    1      1       0
7    1      1       1

All Possible Rules When n = 3

Although the size of the power set grows exponentially in the number of items n in I, efficient search is possible using the downward-closure property of Support, which guarantees that for a frequent itemset, all its subsets are also frequent, and thus for an infrequent itemset, all its supersets must also be infrequent.
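A level-wise search in the style of the Apriori algorithm is one well-known way to exploit this property; the lecture does not name a specific algorithm, so the sketch below is an illustration rather than the method used in the slides. Candidates of size k are built only from frequent itemsets of size k − 1, and a candidate is dropped as soon as one of its subsets is infrequent.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, searching level by level."""
    n = len(transactions)

    def supp(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if supp(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = supp(itemset)
        k += 1
        previous = set(level)
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Downward closure: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
        }
        level = [c for c in candidates if supp(c) >= min_support]
    return frequent


transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread"},
]
for itemset, s in apriori(transactions, min_support=0.4).items():
    print(sorted(itemset), s)
```

With a minimum Support of 0.4, beer is discarded at the first level, so no itemset containing beer is ever counted.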

Downward-closure Property

The figure shows a frequent itemset lattice, where the color of each box indicates how many transactions contain that combination of items. Note that an itemset at a lower level of the lattice can be contained in at most as many transactions as the least frequent of its parents. For example, {a, c} can occur in at most min(count(a), count(c)) transactions. This is called the downward-closure property.
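The bound can be checked directly on the supermarket example. This is a small sketch with a hypothetical helper `count`, using milk and bread in place of the items a and c.

```python
transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread"},
]


def count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(set(itemset) <= t for t in transactions)


pair = count({"milk", "bread"})
bound = min(count({"milk"}), count({"bread"}))
print(pair, bound)  # the pair can never occur more often than its rarest item
```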

Downward-closure Property

Step 4: Results Evaluation

• Validation
  • Monitoring that the patterns mined by algorithms occur in larger dataset
  • Not all patterns found are necessarily valid!
  • Splitting data into Training set and Test set

• If valuable patterns are mined, then interpret them; if not, go back to the pre-processing step and try again

The final step of KDD is to verify that the patterns produced by the Data Mining algorithms occur in the wider dataset. Not all patterns found by the Data Mining algorithms are necessarily valid. It is common for Data Mining algorithms to find patterns in the training set which are not present in the general data set. This is called overfitting.

An Example of Overfitting

To overcome this, the evaluation uses a test set of data on which the Data Mining algorithm was not trained. The learned patterns are applied to this test set, and the resulting output is compared to the desired output.
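As a rough sketch of this idea, the data can be split at random into a training set and a test set, and a pattern found on the training set can then be checked on the test set. The baskets, the split fraction, and the helper names below are assumptions for illustration only.

```python
import random


def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and hold out a fraction as the test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        return None  # the rule cannot be evaluated on this set
    return sum(consequent <= t for t in covered) / len(covered)


# Hypothetical baskets (illustration only).
baskets = [
    {"milk", "bread"}, {"milk", "bread"}, {"milk"}, {"bread"}, {"beer"},
    {"milk", "bread", "butter"}, {"butter"}, {"milk", "bread"}, {"bread"}, {"milk"},
]

train, test = train_test_split(baskets, test_fraction=0.3, seed=0)
print(confidence({"milk"}, {"bread"}, train))  # measured on the training set
print(confidence({"milk"}, {"bread"}, test))   # measured on unseen data
```

A large gap between the two numbers would suggest that the rule reflects the training sample rather than the general data.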

Cross-validation

If the learned patterns do not meet the desired standards, it is necessary to re-evaluate and change the pre-processing and Data Mining steps. If they do meet the desired standards, the final step is to interpret the learned patterns and turn them into knowledge.

Data Mining Cycle

Examples of Data Mining

• Impact of Data Mining on practice
  • Reveal hidden patterns and trends from historical business activities written in statistical summary
• Tools for combating large amounts of data
  • Pattern Recognition algorithms
• Representative examples
  • Market basket analysis based on association rule mining (ARM)

Data Mining, the process of discovering patterns in large data sets, has been used in many applications. For example, in the business domain, Data Mining is the analysis of historical business activities, stored as static data in data warehouse databases, with the goal of revealing hidden patterns and trends.

Data Mining software uses advanced Pattern Recognition algorithms to sift through large amounts of data and assist in discovering previously unknown strategic business information. These include well-known Machine Learning algorithms such as naïve Bayes, Nearest Neighbor, Support Vector Machines, Artificial Neural Networks, k-Means, and Apriori.

Top Data Mining Algorithms

Businesses use Data Mining for purposes such as performing market analysis to identify new product bundles, finding the root cause of manufacturing problems, preventing customer attrition and acquiring new customers, cross-selling to existing customers, and profiling customers more accurately. Here we will review some popular applications.

Applications of Data Mining

• Market Basket Analysis
  • Walmart collects 20 million POS transactions every day
  • By applying Data Mining techniques, we can
    • Develop marketing campaigns (Diapers and Beer)
    • Predict customer loyalty
• Product Recommendation, User Behavior Mining, Customer Relationship Management, Human Resource Management, Human Genome Analysis, Electrical Power Engineering, Educational Data Mining, Music Data Mining, Terrorism Prevention

The first one is Market Basket Analysis. In today’s world, raw data is being collected by companies at an exploding rate. For example, Walmart processes over 20 million point-of-sale transactions every day. This information is stored in a centralized database, but it would be useless without some type of Data Mining software to analyze it.

Displaying Items According to Data Mining on POS Transactions

If Walmart analyzed its POS data with Data Mining techniques, it would be able to determine sales trends, develop marketing campaigns, and more accurately predict customer loyalty. One such example for Walmart would be the link between diaper and beer sales, discovered through Data Mining.
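
To make the diapers-and-beer idea concrete, here is a small sketch of the three standard association-rule measures (support, confidence, lift) for the rule {diapers} → {beer}. The baskets are made up for illustration and are not Walmart data; the function names are likewise assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's baseline support."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))

# Made-up POS baskets, for illustration only.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]
antecedent, consequent = {"diapers"}, {"beer"}
print("support:", support(baskets, antecedent | consequent))
print("confidence:", confidence(baskets, antecedent, consequent))
print("lift:", lift(baskets, antecedent, consequent))
```

A lift above 1 suggests the two items appear together more often than they would if bought independently; algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds.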

• Product Recommendation
  • Categorization and grouping similar items
  • Improving user experience
  • Formulated as Classification task
    • Input: words for textual description of the items
    • Output: item categories

The second one is Product Recommendation. Categorization of the items available on an e-Commerce site is a fundamental problem. An accurate item categorization system is essential for user experience, as it helps users find the items relevant to them when searching and browsing.

An Example of Product Categorization

Item categorization can be formulated as a supervised classification problem in Data Mining where the categories are the target classes and the features are the words composing some textual description of the items.
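
One way to sketch this formulation is a tiny multinomial naïve Bayes classifier over the words of each description. The lecture does not say which classifier is used, so naïve Bayes is an assumption here, as are the item descriptions, category names, and function names.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (description, category) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, category in examples:
        words = text.lower().split()
        class_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def classify(model, text):
    """Return the category with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    scores = {}
    for category, count in class_counts.items():
        score = math.log(count / total)  # log prior of the category
        n_words = sum(word_counts[category].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids log(0).
            p = (word_counts[category][word] + 1) / (n_words + len(vocabulary))
            score += math.log(p)
        scores[category] = score
    return max(scores, key=scores.get)

# Hypothetical item descriptions and categories.
examples = [
    ("wireless bluetooth headphones", "electronics"),
    ("usb charging cable", "electronics"),
    ("cotton crew neck shirt", "clothing"),
    ("slim fit denim jeans", "clothing"),
]
model = train_naive_bayes(examples)
print(classify(model, "bluetooth usb speaker"))  # -> electronics
```

The words are the features and the categories are the target classes, exactly as in the formulation above; a production system would use far more data and a richer text representation.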

One approach is to first find groups that are similar and place them together in a latent group. Given a new item, first classify it into a latent group; this is called coarse-level classification. Then perform a second round of classification, known as fine-level classification, to find the category to which the item belongs.
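
The two-stage idea can be sketched as follows. Simple keyword overlap stands in for the trained coarse-level and fine-level classifiers, and the taxonomy, keywords, and function names are all hypothetical.

```python
def overlap(words, keywords):
    """Number of distinct description words that appear in a keyword set."""
    return len(set(words) & keywords)

# Hypothetical two-level taxonomy: latent group -> category -> keywords.
TAXONOMY = {
    "audio": {
        "headphones": {"headphones", "earbuds", "wireless"},
        "speakers": {"speaker", "soundbar", "subwoofer"},
    },
    "apparel": {
        "shirts": {"shirt", "cotton", "collar"},
        "jeans": {"jeans", "denim", "slim"},
    },
}

def classify_coarse(words):
    """Coarse level: pick the latent group whose pooled keywords match best."""
    def group_score(group):
        pooled = set().union(*TAXONOMY[group].values())
        return overlap(words, pooled)
    return max(TAXONOMY, key=group_score)

def classify_fine(words, group):
    """Fine level: pick a category, considering only the chosen group."""
    categories = TAXONOMY[group]
    return max(categories, key=lambda c: overlap(words, categories[c]))

def categorize(description):
    """Run coarse-level, then fine-level classification."""
    words = description.lower().split()
    group = classify_coarse(words)
    return group, classify_fine(words, group)

print(categorize("slim denim jeans"))  # -> ('apparel', 'jeans')
```

The fine-level step only has to distinguish between the few categories inside one latent group, which is the main appeal of splitting the problem in two.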

• User Behavior Mining
  • Predict user’s behavior based on card usage data
  • Warn drivers based on customer driving patterns

The third one is User Behavior Mining. Every time a credit card or store loyalty card is used, or a warranty card is filled out, data is collected about the user's behavior. Many people find the amount of information that companies such as Google, Facebook, and Amazon store about us disturbing and are concerned about privacy.

User Behavior Prediction

Although our personal data could be used in harmful or unwanted ways, it is also being used to make our lives better. For example, Ford and Audi hope to one day collect information about customer driving patterns so they can recommend safer routes and warn drivers about dangerous road conditions.

Driving Pattern Analysis

• Customer Relationship Management
  • Resource optimization: contact customers who are predicted to have a high likelihood of accepting
  • Automated mailing: recommending new products
  • Uplift modeling for passive customers

Data Mining in Customer Relationship Management applications can contribute significantly to the bottom line. Rather than randomly contacting a prospect or customer through a call center or by mail, a company can concentrate its efforts on prospects that are predicted to have a high likelihood of responding to an offer.

Customer Relationship Management

More sophisticated methods may be used to optimize resources across campaigns so that one may predict to which channel and to which offer an individual is most likely to respond across all potential offers.
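
As a rough sketch of this selection step, suppose a model has already produced a response probability for every (channel, offer) pair per customer. The probabilities, customer names, and the 0.2 contact threshold below are invented for illustration.

```python
# Hypothetical predicted response probabilities, e.g. from a trained model:
# customer -> {(channel, offer): probability of responding}.
predicted = {
    "alice": {("email", "discount"): 0.12, ("phone", "discount"): 0.30,
              ("email", "free_trial"): 0.18},
    "bob": {("email", "discount"): 0.05, ("phone", "discount"): 0.04,
            ("email", "free_trial"): 0.22},
}

def best_action(scores):
    """Return the (channel, offer) pair with the highest predicted response."""
    return max(scores, key=scores.get)

def plan_campaign(predicted, min_probability=0.2):
    """Choose one action per customer; skip customers unlikely to respond."""
    plan = {}
    for customer, scores in predicted.items():
        action = best_action(scores)
        if scores[action] >= min_probability:
            plan[customer] = action
    return plan

print(plan_campaign(predicted))
```

Raising `min_probability` concentrates effort on fewer, more promising prospects, which is the resource trade-off described above.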

Personalized Marketing

Additionally, sophisticated applications could be used to automate mailing. Once the potential prospect/customer and channel/offer are determined, such an application can automatically send either an e-mail or regular mail.

Automated Mailing

Finally, in cases where many people will take an action without an offer, "uplift modeling" can be used to determine which people have the greatest increase in response if given an offer. Uplift modeling thereby enables marketers to focus mailings and offers on persuadable people, and not to send offers to people who will buy the product without an offer.
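
In its simplest form, uplift is the difference between the response rate with an offer and the response rate without one. The sketch below estimates it per customer segment from a past campaign; the segments and outcomes are made up, and real uplift models work at a much finer granularity.

```python
def response_rate(outcomes):
    """Fraction of customers who bought (outcomes are 0 or 1)."""
    return sum(outcomes) / len(outcomes)

def uplift_by_segment(history):
    """Estimate P(buy | offer) - P(buy | no offer) for each segment.

    history: list of (segment, got_offer, bought) from a past campaign.
    Assumes every segment has both treated and untreated customers.
    """
    segments = {}
    for segment, got_offer, bought in history:
        groups = segments.setdefault(segment, {True: [], False: []})
        groups[got_offer].append(bought)
    return {
        segment: response_rate(groups[True]) - response_rate(groups[False])
        for segment, groups in segments.items()
    }

# Made-up campaign history: (segment, got_offer, bought).
history = [
    ("loyal", True, 1), ("loyal", True, 1),
    ("loyal", False, 1), ("loyal", False, 1),
    ("occasional", True, 1), ("occasional", True, 1),
    ("occasional", False, 0), ("occasional", False, 1),
]
uplift = uplift_by_segment(history)
print(uplift)  # -> {'loyal': 0.0, 'occasional': 0.5}
```

Here the "loyal" segment buys regardless of the offer (zero uplift), so the offer budget is better spent on the persuadable "occasional" segment.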

Uplift Modeling

• Human Resource Management
  • Identifying talented candidates based on university activities

Data Mining can help Human Resources (HR) Management identify the characteristics of their most successful employees. The information obtained, such as the universities attended by highly successful employees, can help HR focus recruiting efforts accordingly.

Human-Resource Management

• Human Genome Analysis
  • Identifying the mapping relationship between genotype and phenotype such as disease susceptibility

Human Genome Analysis. In the study of human genetics, sequence mining helps address the important goal of understanding the mapping relationship between inter-individual variations in human DNA sequence and the variability in disease susceptibility.

Flow of DNA Sequencing

In simple terms, it aims to find out how changes in an individual's DNA sequence affect the risk of developing common diseases such as cancer, which is of great importance for improving methods of diagnosing, preventing, and treating these diseases. One Data Mining method used to perform this task is known as multifactor dimensionality reduction.
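
The core pooling step of multifactor dimensionality reduction can be sketched as below: each combination of genotypes at the chosen loci is labeled high-risk or low-risk according to its case/control ratio, collapsing several dimensions into one binary attribute. This is a simplified illustration only; the full method also searches over locus combinations with cross-validation, which is omitted, and the locus names, genotypes, and threshold are assumptions.

```python
from collections import defaultdict

def mdr_reduce(samples, loci, threshold=1.0):
    """Label each genotype combination at the given loci as high or low risk.

    samples: list of (genotypes, is_case), where genotypes maps locus -> value.
    A combination is "high" when its case/control ratio is >= threshold.
    """
    counts = defaultdict(lambda: [0, 0])  # combination -> [cases, controls]
    for genotypes, is_case in samples:
        combo = tuple(genotypes[locus] for locus in loci)
        counts[combo][0 if is_case else 1] += 1
    risk = {}
    for combo, (cases, controls) in counts.items():
        ratio = cases / controls if controls else float("inf")
        risk[combo] = "high" if ratio >= threshold else "low"
    return risk

# Made-up samples: two loci, case (True) or control (False).
samples = [
    ({"snp1": "AA", "snp2": "BB"}, True),
    ({"snp1": "AA", "snp2": "BB"}, True),
    ({"snp1": "AA", "snp2": "BB"}, False),
    ({"snp1": "Aa", "snp2": "Bb"}, False),
    ({"snp1": "Aa", "snp2": "Bb"}, False),
    ({"snp1": "Aa", "snp2": "Bb"}, True),
    ({"snp1": "aa", "snp2": "BB"}, True),
]
risk = mdr_reduce(samples, ["snp1", "snp2"])
print(risk)
```

The resulting high/low label can then be used as a single feature for predicting disease status, in place of the original multi-locus genotype.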

Multifactor Dimensionality Reduction

• Electrical Power Engineering
  • Condition monitoring based on voltage data

In the area of Electrical Power Engineering, Data Mining methods have been widely used for condition monitoring of high voltage electrical equipment. The purpose of condition monitoring is to obtain valuable information on, for example, the status of the insulation or other important safety-related parameters.

Diagnosis for Complex Equipment

• Educational Data Mining
  • Identify important factors for learning

Educational Data Mining. In educational research, Data Mining has been used to study the factors that lead students to engage in behaviors that reduce their learning, and to understand the factors influencing university student retention.

Usage of Educational Data Mining

Examples of Data Mining

• Market Basket Analysis
• Product Recommendation
• User Behavior Mining
• Customer Relationship Management
• Human Resource Management
• Human Genome Analysis
• Electrical Power Engineering
• Educational Data Mining
• Music Data Mining
• Terrorism Prevention

Music Data Mining techniques, and in particular co-occurrence analysis, have been used to discover relevant similarities among music corpora such as radio lists and CD databases, for purposes including classifying music into emotions in a more objective manner.
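
Co-occurrence analysis can be sketched by counting how often two artists appear in the same playlist and treating a higher count as greater similarity. The playlists, artist names, and function names below are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(playlists):
    """Count how often each unordered pair of artists shares a playlist."""
    counts = Counter()
    for playlist in playlists:
        # sorted(set(...)) gives each pair once, in a canonical order.
        for a, b in combinations(sorted(set(playlist)), 2):
            counts[(a, b)] += 1
    return counts

def most_similar(counts, artist):
    """Rank the other artists by co-occurrence count with the given artist."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == artist:
            related[b] += n
        elif b == artist:
            related[a] += n
    return related.most_common()

# Hypothetical radio playlists.
playlists = [
    ["artist_a", "artist_b", "artist_c"],
    ["artist_a", "artist_b"],
    ["artist_b", "artist_d"],
]
counts = cooccurrence_counts(playlists)
print(most_similar(counts, "artist_a"))  # -> [('artist_b', 2), ('artist_c', 1)]
```

Artists that frequently co-occur can then be grouped together, which is one way such corpora support more objective classification.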

Plutchik’s Wheel of Emotions

Pattern Mining includes new areas such as Music Information Retrieval (MIR), where patterns seen in both the temporal and non-temporal domains are imported to classical knowledge discovery search methods.

Music Recommendation System

In the context of combating terrorism, the National Research Council provides the following definition: “Subject-based Data Mining uses an initiating individual or other datum that is considered, based on other information, to be of high interest.”

Subject-based Data Mining

“The goal is to determine what other persons or financial transactions or movements, are related to that initiating datum.” In this case, Subject-based Data Mining, which searches for associations between individuals in data, can be applied.

An Example of Big Data Policing

This is the end of my presentation. Thanks for listening.