Expert Systems, 4 th NF Association Rules CS 157B Prof. Sin-Min Lee

Upload: derek-garrett

Post on 03-Jan-2016



Expert Systems, 4th NF, Association Rules

CS 157B
Prof. Sin-Min Lee


COMPONENTS OF AN EXPERT SYSTEM

KNOWLEDGE BASE

KNOWLEDGE REPRESENTATION

. PREDICATE CALCULUS

. LISTS

. FRAMES

. SCRIPTS

. SEMANTIC NETWORKS

. PRODUCTION RULES


Expert system architecture

[Figure: knowledge base, dynamic database, and inference engine]


People and Computers

WHAT COMPUTERS CAN DO BETTER THAN PEOPLE

. Numerical Computation

. Information storage

. Repetitive Operations

. Computers are “Just Machines”


The major application areas of AI

Artificial Intelligence

-GENERAL PROBLEM SOLVING

-EXPERT SYSTEMS

-NATURAL LANGUAGE PROCESSING

-COMPUTER VISION

-ROBOTICS

-COMPUTER AIDED INSTRUCTION

-AUTOMATIC PROGRAMMING

- PLANNING AND DECISION SUPPORT


Heuristics

- Subconscious Heuristic

- Conscious Heuristic


COMPONENTS OF AN EXPERT SYSTEM

KNOWLEDGE BASE - PRODUCTION RULES

ADVANTAGES OF USING A PRODUCTION SYSTEM TO REPRESENT KNOWLEDGE

. Explanation

. Modification

. Understanding


COMPONENTS OF AN EXPERT SYSTEM

INFERENCE ENGINE

BLIND SEARCH TECHNIQUES

. BREADTH FIRST OR DEPTH FIRST SEARCH

. FORWARD OR BACKWARD CHAINING


COMPONENTS OF AN EXPERT SYSTEM

INFERENCE ENGINE

SEARCH TECHNIQUES

. IN PRACTICE A COMBINATION OF THE TWO CHAINING TECHNIQUES IS USED

. BIDIRECTIONAL SEARCH

. PRUNING THE SEARCH TREE

. USE HEURISTIC SEARCH TECHNIQUES


COMPONENTS OF AN EXPERT SYSTEM

INFERENCE ENGINE

HEURISTIC SEARCH TECHNIQUES

LIMIT THE SEARCH PROCESS IN AN EFFORT TO REACH THE SOLUTION FASTER


COMPONENTS OF AN EXPERT SYSTEM

INFERENCE ENGINE

HEURISTIC SEARCH TECHNIQUES

. BACKTRACKING

. MINIMAX

. STATIC EVALUATION


An Expert system is a program which has a wide base of knowledge in a restricted domain, and uses complex inferential reasoning to perform tasks which a human expert could do.

Welbank 1983


BASIC ARCHITECTURE OF EXPERT SYSTEM

[Figure: block diagram with the components DATA, FACTS, KNOWLEDGE BASE, INTERPRETER, and SOLUTION]


PHYSIOLOGY OF EXPERT SYSTEM

Two methods for triggering (i.e., putting into operation) rules

. Forward Chaining (fact-directed reasoning)
A process of examining the left part of each rule in turn and applying the rule whenever the conditions in this part are found to hold. The process ends when it ceases to give any new fact.

. Backward Chaining (goal-directed reasoning)
The goal to be attained is given, and the right parts of the rules are examined to find which of them include this goal. The conditions of those rules become new goals (subgoals of the original goal), and so on, until known facts are reached.


FEATURES OF AN EXPERT SYSTEM

THE PROGRAM SHOULD BE

. DEVELOPED TO MEET A SPECIFIC NEED

. EASY TO USE

. EDUCATIONAL, WHEN APPROPRIATE

. ABLE TO EXPLAIN ITS ADVICE

. ABLE TO RESPOND TO SIMPLE QUESTIONS

. ABLE TO LEARN NEW KNOWLEDGE

KNOWLEDGE IN THE PROGRAM SHOULD BE EASILY MODIFIED

. CORRECT ERRORS

. ADD NEW INFORMATION


EXAMPLE

R5 IF Z AND L THEN S
R1 IF A AND N THEN E
R3 IF D OR M THEN Z
R2 IF A THEN M
R4 IF Q AND (NOT W) AND (NOT Z) THEN N
R6 IF L AND M THEN E
R7 IF B AND C THEN Q

KNOWN FACTS (FACTS BASE) - (A, L)
GOAL TO BE ESTABLISHED = E


FORWARD CHAINING

Regard the process as a sequence of iterations through the rules

First iteration: R2 and R6 are triggered (A, L, M, E)
Second iteration: R3 is triggered (A, L, M, E, Z)
Third iteration: R5 is triggered (A, L, M, E, Z, S)

The goal E is reached in the first iteration, when R6 fires; after the third iteration no rule yields a new fact, so the process stops there.
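The iterations can be reproduced with a minimal Python sketch. It rests on assumptions that the slides do not state: rules are scanned in the order they are listed in the example (R5, R1, R3, R2, R4, R6, R7), a newly derived fact is visible to later rules in the same pass, and NOT X simply means that X is absent from the facts base. The name forward_chain and the lambda encoding of conditions are illustrative only.

```python
# Each rule is (name, condition over the fact set, conclusion), in slide order.
RULES = [
    ("R5", lambda f: "Z" in f and "L" in f, "S"),
    ("R1", lambda f: "A" in f and "N" in f, "E"),
    ("R3", lambda f: "D" in f or "M" in f, "Z"),
    ("R2", lambda f: "A" in f, "M"),
    ("R4", lambda f: "Q" in f and "W" not in f and "Z" not in f, "N"),
    ("R6", lambda f: "L" in f and "M" in f, "E"),
    ("R7", lambda f: "B" in f and "C" in f, "Q"),
]


def forward_chain(initial_facts, rules):
    """Sweep the rules repeatedly until a full pass adds no new fact."""
    facts = set(initial_facts)
    trace = []  # one (fired rule names, sorted facts) entry per iteration
    while True:
        fired = []
        for name, condition, conclusion in rules:
            if conclusion not in facts and condition(facts):
                facts.add(conclusion)
                fired.append(name)
        if not fired:
            return facts, trace
        trace.append((fired, sorted(facts)))


facts, trace = forward_chain({"A", "L"}, RULES)
for number, (fired, known) in enumerate(trace, start=1):
    print(number, fired, known)
# 1 ['R2', 'R6'] ['A', 'E', 'L', 'M']
# 2 ['R3'] ['A', 'E', 'L', 'M', 'Z']
# 3 ['R5'] ['A', 'E', 'L', 'M', 'S', 'Z']
```

Under a different scan order the same facts are derived, but they may be grouped into different iterations.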


BACKWARD CHAINING

[Figure: goal tree for E]

E (GOAL)
  R1: A (TRUE) AND N
    N from R4: Q AND NOT W AND NOT Z
      Q from R7: B AND C
  R6: L (TRUE) AND M
    M from R2: A (TRUE)
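A goal-directed search over the same rule set might be sketched as follows. The encoding is an assumption made for illustration: rules are indexed by their conclusion, R3 (D OR M) is split into two alternatives named R3a and R3b, a premise written "~X" stands for NOT X, and NOT is treated as failure to prove X. The function name prove is hypothetical.

```python
# conclusion: list of (rule name, premises); "~X" means NOT X.
RULES = {
    "E": [("R1", ["A", "N"]), ("R6", ["L", "M"])],
    "M": [("R2", ["A"])],
    "Z": [("R3a", ["D"]), ("R3b", ["M"])],  # R3: D OR M, split in two
    "N": [("R4", ["Q", "~W", "~Z"])],
    "S": [("R5", ["Z", "L"])],
    "Q": [("R7", ["B", "C"])],
}
FACTS = {"A", "L"}


def prove(goal, facts, rules, pending=frozenset()):
    """Return True if goal follows from facts by chaining backward."""
    if goal.startswith("~"):
        # Assumption: NOT X holds when X cannot be established.
        return not prove(goal[1:], facts, rules, pending)
    if goal in facts:
        return True
    if goal in pending:  # already being proved higher up: avoid looping
        return False
    pending = pending | {goal}
    # The goal holds if every premise of at least one of its rules holds.
    return any(
        all(prove(premise, facts, rules, pending) for premise in premises)
        for _name, premises in rules.get(goal, [])
    )


print(prove("E", FACTS, RULES))  # True, via R6 (L is known, M follows from R2)
print(prove("N", FACTS, RULES))  # False, because Q needs B and C
```

The R1 branch fails because N depends on Q, and Q on the unknown facts B and C, so E is established through R6.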


Knowledge Representation

1- Semantic Networks
2- Object - Attribute - Value triplets
3- Rules
4- Frames


Object - Attribute - Value Triplets

-- Static knowledge vs. Instances
-- Objects are ordered and related
-- Handles uncertainty with a certainty factor

O - A - V


Frames

-- a description of an object that contains slots for all the information associated with the object
-- slots may contain default values, pointers to other frames, sets of rules, or procedures by which values may be obtained.

Figure 4.12 Frame for Wilson’s coat

COAT

Slots:               Entries:
Owner                Wilson
Condition            Rumpled
Condition of cuffs   Worn, shiny
Condition of elbow   Worn, shiny
Number of arms       Default: 2
Fabrics              Default: wool
Pockets?             Default: yes
Size                 If needed, find owner’s height and weight, and compare to Table X.
Style                If needed, find out collar; pockets; pockets and length; then look in Table Y
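The slot mechanism described for frames (explicit entries, defaults, and if-needed procedures) can be sketched in Python. The Frame class and size_from_table_x are hypothetical names. Since the slides do not reproduce Table X, the if-needed procedure here is only a stand-in that reports what would be looked up.

```python
class Frame:
    """A frame: named slots with entries, defaults, and if-needed procedures."""

    def __init__(self, name, entries=None, defaults=None, if_needed=None):
        self.name = name
        self.entries = dict(entries or {})
        self.defaults = dict(defaults or {})
        self.if_needed = dict(if_needed or {})

    def get(self, slot):
        # An explicit entry wins, then a default, then an if-needed procedure.
        if slot in self.entries:
            return self.entries[slot]
        if slot in self.defaults:
            return self.defaults[slot]
        if slot in self.if_needed:
            value = self.if_needed[slot](self)
            self.entries[slot] = value  # remember the computed value
            return value
        raise KeyError(f"{self.name} has no slot {slot!r}")


def size_from_table_x(frame):
    # Hypothetical stand-in: Table X is not given in the slides.
    return f"look up {frame.get('owner')}'s height and weight in Table X"


coat = Frame(
    "COAT",
    entries={"owner": "Wilson", "condition": "rumpled"},
    defaults={"number of arms": 2, "fabric": "wool", "pockets?": "yes"},
    if_needed={"size": size_from_table_x},
)

print(coat.get("owner"))           # explicit entry
print(coat.get("number of arms"))  # default value
print(coat.get("size"))            # obtained on demand, then stored
```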


Inference Engine Functions:

1- Inference:
Examines existing facts and rules, and adds new facts when possible.

2- Control:
Decides the order in which inferences are made.


Limitations

-- confined to well-defined problems, unable to reason over a field of expertise.

-- cannot reason from axioms or general theory.

-- do not learn, limited to use specific facts and heuristics that were “taught” by a human expert.

-- lack common sense, cannot reason by analogy

-- performance deteriorates rapidly when problems extend beyond the narrow task they were designed to perform.


Control

Depth-First vs. Breadth-First Search:

-- In a depth-first search, the inference engine takes every opportunity to produce a subgoal, searching for detail first.

-- A breadth-first search sweeps across all premises in a rule before digging for greater detail.
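The difference in visiting order can be seen on a small goal tree. The tree below is a hypothetical simplification that maps each goal to its subgoals and ignores the AND/OR distinction between rules. The names SUBGOALS, depth_first and breadth_first are illustrative.

```python
from collections import deque

# Hypothetical goal tree: each goal maps to the subgoals it opens up.
SUBGOALS = {
    "E": ["N", "M"],
    "N": ["Q"],
    "Q": ["B", "C"],
    "M": ["A"],
}


def depth_first(tree, root):
    """Follow each subgoal down to its details before trying its siblings."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is expanded first.
        stack.extend(reversed(tree.get(node, [])))
    return order


def breadth_first(tree, root):
    """Visit all goals on one level before moving to the next level."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order


print(depth_first(SUBGOALS, "E"))    # ['E', 'N', 'Q', 'B', 'C', 'M', 'A']
print(breadth_first(SUBGOALS, "E"))  # ['E', 'N', 'M', 'Q', 'A', 'B', 'C']
```

The only structural difference is the container: a stack gives depth-first order, a queue gives breadth-first order.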


Advantages

-- do not display biased judgment

-- do not jump to conclusions and then seek to maintain those conclusions in the face of disconfirming evidence.

-- do not have “bad days”

-- always attend to details, always systematically consider all of the possible alternatives

-- equipped with thousands of heuristic rules, able to perform their specialized task better than a human expert.


Multivalued Dependencies

• There are database schemas in BCNF that do not seem to be sufficiently normalized

• Consider a database classes(course, teacher, book) such that (c,t,b) ∈ classes means that t is qualified to teach c, and b is a required textbook for c

• The database is supposed to list for each course the set of teachers any one of which can be the course’s instructor, and the set of books, all of which are required for the course (no matter who teaches it).


• There are no non-trivial functional dependencies and therefore the relation is in BCNF

• Insertion anomalies – i.e., if Sara is a new teacher that can teach database, two tuples need to be inserted

(database, Sara, DB Concepts)
(database, Sara, Ullman)

classes

course              teacher     book
database            Avi         DB Concepts
database            Avi         Ullman
database            Hank        DB Concepts
database            Hank        Ullman
database            Sudarshan   DB Concepts
database            Sudarshan   Ullman
operating systems   Avi         OS Concepts
operating systems   Avi         Shaw
operating systems   Jim         OS Concepts
operating systems   Jim         Shaw


• Therefore, it is better to decompose classes into:

teaches

course              teacher
database            Avi
database            Hank
database            Sudarshan
operating systems   Avi
operating systems   Jim

text

course              book
database            DB Concepts
database            Ullman
operating systems   OS Concepts
operating systems   Shaw

We shall see that these two relations are in Fourth Normal Form (4NF)
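A short Python sketch can check that this decomposition loses nothing. It models each relation as a set of tuples, an encoding chosen for illustration, and natural_join is a hypothetical helper. Joining teaches and text on course rebuilds all ten classes tuples, and a new teacher needs only one insertion into teaches.

```python
teaches = {
    ("database", "Avi"),
    ("database", "Hank"),
    ("database", "Sudarshan"),
    ("operating systems", "Avi"),
    ("operating systems", "Jim"),
}
text = {
    ("database", "DB Concepts"),
    ("database", "Ullman"),
    ("operating systems", "OS Concepts"),
    ("operating systems", "Shaw"),
}


def natural_join(teaches_rel, text_rel):
    """Join (course, teacher) with (course, book) on the course attribute."""
    return {
        (course, teacher, book)
        for (course, teacher) in teaches_rel
        for (other_course, book) in text_rel
        if course == other_course
    }


classes = natural_join(teaches, text)
print(len(classes))  # 10

# Projecting classes back gives exactly the two smaller relations.
print({(c, t) for c, t, _ in classes} == teaches)  # True
print({(c, b) for c, _, b in classes} == text)     # True

# Adding Sara is a single insertion; both classes tuples follow from the join.
with_sara = natural_join(teaches | {("database", "Sara")}, text)
print(sorted(with_sara - classes))
# [('database', 'Sara', 'DB Concepts'), ('database', 'Sara', 'Ullman')]
```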


What are Association Rules?

- Techniques used to detect associations or relationships between elements in large data sets.

- Show value conditions that occur frequently in a data set.

- Association Rules are basically if-then rules supported by data

- Application of this is called Market Basket Analysis


What are Association Rules?

• Finding frequent patterns, correlations, or associations among a set of items or objects in transaction databases, relational databases, or other information repositories


Applications of Association Rules

• Market Basket Analysis
• Classification
• Clustering
• Cross-marketing
• Loss-leader Analysis


Characteristics of Association Rules

• Consists of a set of items, the rule body, leading to another item, the rule head

X => Y

• Association Rules relate the rule body X to the rule head Y


Itemsets

• itemset = a set of items
X = {apple, orange} is an itemset

• k-itemset = a set of k items
X = {apple, orange, banana} = 3-itemset


Rules

• Let I = {i1, i2, …, im} be a set of items

• Let transaction t be a set of items where t ⊆ I

• Let T be the Transaction Database or set of transactions where T = {t1, t2, …, tn}.


Rules

• Transaction t contains itemset X, which is a subset of I, the set of items

• Formal association rule is defined as:

X => Y, where X, Y ⊆ I, and X ∩ Y = ∅


Support

• Support is the percentage of transactions that support an association rule.

• It is the percentage of transactions that contain product X and product Y

Support = Probability(X and Y)


Confidence

• Confidence is the strength or reliability of an association rule

• It is the probability that if there is X in a transaction, then Y will also be present.

Confidence = Support (X,Y) / Support (X) or

Confidence = Probability (Y | X)


Thresholds

• Minimum Support Threshold and Minimum Confidence Threshold

• Association rules are more valuable if they satisfy these minimum values

• The higher the support and confidence percentages, the more useful the rule is


Uses of Association Rules

• Retail Shopping
• Credit Card transactions
• Online purchases
• Medical patient histories
• Banking services
• Insurance claims


Market Basket Analysis

• Modeling technique based on the theory that if you buy one group of items, you are also likely to buy another group of items

• In consumer behavior, many purchases are made on impulse

• Market Basket Analysis seeks to find a relationship between purchases


Uses of Market Basket Analysis

• Identify unexpected shopping patterns

• Targeted marketing towards specific types of people

• Predicting customer response rates to marketing campaigns

• Distinguish between profitable and unprofitable customers


Maximizing Profitability

- Arrangement of items in a store

- Planning of specific sales during times of the year

- Pricing Policy of certain goods


Example

- A market has 100 transactions
- 15 transactions contain product X
- Of the 15 transactions, 5 transactions contain product Y

Support = 5/100 or 5%
Confidence = 5/15 or 33.3%
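The two figures can be checked with a small Python sketch. The functions support and confidence are illustrative names, and the transaction list is a synthetic reconstruction of the counts above. The item "other" is a made-up filler for the 85 transactions without X, since the example does not say what they contain.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset.issubset(t))
    return hits / len(transactions)


def confidence(transactions, body, head):
    """Support(body and head) / Support(body), i.e. P(head | body)."""
    both = set(body) | set(head)
    return support(transactions, both) / support(transactions, body)


# 100 transactions: 5 with X and Y, 10 with X only, 85 without X.
transactions = [{"X", "Y"}] * 5 + [{"X"}] * 10 + [{"other"}] * 85

print(f"support    = {support(transactions, {'X', 'Y'}):.1%}")       # 5.0%
print(f"confidence = {confidence(transactions, {'X'}, {'Y'}):.1%}")  # 33.3%
```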


Example

• Milk + Bread => Butter

X = Milk + Bread

Y = Butter

“Milk + Bread” is the Rule Body and

“Butter” is the Rule Head


Example

- 5% of all transactions will contain the combination of Milk, Bread, and Butter.

- If customers buy Milk and Bread, there is a 33.3% probability that they will buy Butter

Support = 5%
Confidence = 33.3%


Mining Algorithms

• Analyze a set of data, looking for patterns or trends

• They differ in the strategy and data structure used

• Their efficiency and memory requirements also differ


Apriori Algorithm

• Most classic and well-known algorithm for association rules

• Useful in databases that contain transactions

• Principle: Any subset of a frequent itemset must be frequent

if {A,B} is a frequent itemset, {A} and {B} must also be frequent itemsets


Apriori Algorithm Method

• Count 1-itemsets, then the 2-itemsets, then the 3-itemsets and so on

• When counting k-itemsets, only consider those itemsets where all subsets of length k-1 have been determined to be frequent in the previous step

• Non-frequent itemsets are pruned or discarded
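The level-by-level method can be sketched in Python as follows. This is a simplified illustration rather than a tuned implementation: the function name apriori, the five baskets, and the 0.5 minimum support are all made up for the example, and candidates are counted with a plain scan of every transaction.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, built one itemset size at a time."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate k-itemset and keep the frequent ones.
        level = {}
        for candidate in candidates:
            count = sum(1 for t in transactions if candidate.issubset(t))
            if count / n >= min_support:
                level[candidate] = count / n
        frequent.update(level)
        k += 1
        # Join frequent itemsets into candidates of size k ...
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # ... and prune any candidate with an infrequent (k-1)-subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical baskets, made up for illustration.
baskets = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

result = apriori(baskets, min_support=0.5)
for itemset, value in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

With these baskets every single item and every pair is frequent, while the triple {milk, bread, butter} appears in only 2 of 5 baskets and is discarded.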


Apriori Algorithm Diagram

null
A B C D E
AB AC AD AE BC BD BE CD CE DE
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCDE

[Figure: itemset lattice over items A to E, with a region labeled "Pruned supersets"]