2010
Shikha Sharma
RCET,Bhilai
3/11/2010
ARTIFICIAL INTELLIGENCE AND EXPERT SYSTEMS
Shikha Sharma RCET Bhilai Page 2
What is Artificial intelligence?
• It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence.
• "Intelligence implies that a machine must be able to adapt to new situations"
– Ability to learn
– Ability to think abstractly
– To solve problems
– To perceive relationships
– To adjust to one's environment
– To profit by experience
• Woodworth: intelligence is a way of acting.
• Woodrow: intelligence is an acquiring capacity.
• Binet: comprehension, invention, direction and criticism; intelligence is contained in these four words.
• Ryburn: intelligence is the power which enables us to solve problems and to achieve our purpose.
Intelligence is not a single power, capacity or ability which operates equally well in all situations. It is rather a composite of several different abilities.
What is the objective of “AI”
One term is
• "the ability to reason, to trigger new thoughts, to perceive and learn is intelligence".
Second term is
"thought"
A thought is a mechanism which
1. Stimulates
a. action
b. further thought
c. information generation
d. knowledge generation
2. Is triggered by
a. External stimulus or
b. internal stimulus
3. Acts through
a. Present environment
b. past memory
4. Is stored as
a. charged /discharged state of neurons.
b. electromagnetic thought waves
Definition of AI
• John McCarthy gave this definition in 1956: "Developing computer programs to solve complex problems by applications of processes that are analogous to human reasoning processes."
• "AI is the branch of computer science that is concerned with the automation of intelligent behavior."
• AI is the study of how to make computers do things which, at the moment, people do better.
• Intelligence is a behavior. When we call a person intelligent, we mean that he or she has the ability to think, understand, learn and make decisions. So when we combine this word with "system" to form Intelligent System (IS), we mean that the system is able to think, understand, learn and make decisions.
• It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.
Que. Explain areas of AI.
Ans. Areas of Artificial Intelligence
Perception
o Machine vision
o Speech understanding
o Touch (tactile or haptic) sensation
Robotics
Natural Language Processing
o Natural Language Understanding
o Speech Understanding
o Language Generation
o Machine Translation
Planning
Expert Systems
Machine Learning
Theorem Proving
Symbolic Mathematics
Game Playing
Perception
Machine Vision:
It is easy to interface a TV camera to a computer and get an image into memory; the problem is understanding what the image represents. Vision takes lots of computation; in humans, roughly 10% of all calories consumed are burned in vision computation.
Speech Understanding:
Speech understanding is available now. Some systems must be trained for the individual user and require pauses between words. Understanding continuous speech with a larger vocabulary is harder.
Touch (tactile or haptic) Sensation:
Important for robot assembly tasks.
Robotics
Although industrial robots have been expensive, robot hardware can be cheap: Radio Shack has sold a working robot arm and hand for $15. The limiting factor in application of robotics is not the cost of the robot hardware itself.
What is needed is perception and intelligence to tell the robot what to do; "blind" robots are limited to very well-structured tasks (like spray painting car bodies).
Natural Language Understanding:
Natural languages are human languages such as English. Making computers understand English allows non-programmers to use them with little training. Applications in limited areas (such as access to databases) are easy.
(askr '(where can i get ice cream in berkeley))
Natural Language Generation:
Easier than NL understanding. Can be an inexpensive output device.
Machine Translation:
Usable translation of text is available now. Important for organizations that operate in many countries.
In the not-too-distant future, a research lab develops David, the first intelligent robot with human feelings, in the shape of an eleven-year-old boy. But his "foster parents" are overwhelmed by the artificial substitute child and abandon him. Left on his own, David tries to fathom his origin and the secret of his existence.
Planning
Planning attempts to order actions to achieve goals. Planning applications include logistics, manufacturing scheduling, and planning manufacturing steps to construct a desired product. There are huge amounts of money to be saved through better planning.
Expert Systems
Expert Systems attempt to capture the knowledge of a human expert and make it
available through a computer program. There have been many successful and
economically valuable applications of expert systems.
Benefits:
Reducing skill level needed to operate complex devices.
Diagnostic advice for device repair.
Interpretation of complex data.
"Cloning" of scarce expertise.
Capturing knowledge of an expert who is about to retire.
Combining knowledge of multiple experts.
Intelligent training.
Theorem Proving
Proving mathematical theorems might seem to be mainly of academic interest. However, many practical problems can be cast in terms of theorems. A general theorem prover can
therefore be widely applicable.
Examples:
Automatic construction of compiler code generators from a description of a CPU's instruction set.
J Moore and colleagues proved the correctness of the floating-point division algorithm on an AMD CPU chip.
Symbolic Mathematics
Symbolic mathematics refers to manipulation of formulas, rather than arithmetic on numeric values.
Algebra
Differential and Integral Calculus
Symbolic manipulation is often used in conjunction with ordinary scientific computation as a generator of programs used to actually do the calculations. Symbolic manipulation programs are an important component of scientific and engineering workstations.
> (solvefor
'(= v (* v0 (- 1 (exp (- (/ t (* r c)))))))
't)
(= T (* (- (LOG (- 1 (/ V V0)))) (* R C)))
Game Playing
Games are good vehicles for research because they are well formalized, small, and self-contained. They are therefore easily programmed. Games can be good models of
competitive situations, so principles discovered in game-playing programs may be applicable to practical problems.
AI Tree
Fruits: Applications
Branches: Expert Systems, Natural Language processing, Speech Understanding,
Robotics and Sensory Systems, Computer Vision, Neural Computing, Fuzzy Logic, GA
Roots: Psychology, Philosophy, Electrical Engg, Management Science, Computer science, Linguistics
Difference between AI & conventional S/W
Features          AI programs            Conventional s/w
Processing type   Symbolic               Numeric
Technique used    Heuristic search       Algorithmic search
Solution steps    Indefinite             Definite
Answers sought    Satisfactory           Optimal
Knowledge         Imprecise              Precise
Modification      Frequent               Rare
Involves          Large knowledge base   Large database
Process           Inferential            Repetitive
How problems can be represented in AI
Before a solution can be found, the prime condition is that the problem must be very precisely defined. So to build a system to solve a particular problem, we need to do four things.
1. Define the problem precisely: what is the initial situation, and what will be the final, acceptable solutions.
2. Analyze the problem: consider the various possible techniques for solving it.
3. Isolate and represent the task knowledge that is necessary to solve the problem.
4. Choose the best problem solving technique and apply it.
The most common methods of problem representation in AI
State space representation
"A set of all possible states for a given problem is known as the state space of the problem."
or
"A state space represents a problem in terms of states and operators that change states."
A problem space consists of
1. Precondition / an initial state
2. Postcondition / final states
3. Actions
4. Total cost
Water jug problem
• States: amount of water in both jugs.
• Actions: empty large/small, pour from large/small.
• Goal: specified amount of water in both jugs.
• Path cost: total number of actions applied.
State Space Search: Playing Chess
• State space is a set of legal positions.
• Starting at the initial state.
• Using the set of rules to move from one state to another.
• Attempting to end up in a goal state.
State Space Search: Water Jug Problem
"You are given two jugs, a 4-litre one and a 3-litre one. Neither has any measuring markers on it. There is a pump that can be used to fill the jugs with water. How can you get exactly 2 litres of water into the 4-litre jug?"
• State: (x, y), where x = 0, 1, 2, 3 or 4 is the amount in the 4-litre jug and y = 0, 1, 2 or 3 is the amount in the 3-litre jug.
• Start state: (0, 0).
• Goal state: (2, n) for any n.
• Attempting to end up in a goal state:
1. current state = (0, 0)
2. Loop until reaching a goal state (2, n):
apply a rule whose left side matches the current state;
set the new current state to be the resulting state.
One such sequence of states:
(0, 0) → (0, 3) → (3, 0) → (3, 3) → (4, 2) → (0, 2) → (2, 0)
Find a driving route from city A to city B
• States: location specified by city.
• Actions: driving along the roads between cities.
• Goal: city B.
• Path cost: total distance or expected travel time.
Explain State space search. Solve Tic-Tac-Toe using state space search.
Ans. A state space represents a problem in terms of states and operators that change
states. A state space consists of:
A representation of the states the system can be in. In a board game, for example, the board represents the current state of the game.
A set of operators that can change one state into another state. In a board game, the operators are the legal moves from any given state. Often the operators are represented as programs that change a state representation to represent the new state.
An initial state.
A set of final states; some of these may be desirable, others undesirable. This set is often represented implicitly by a program that detects terminal states.
Tic-Tac-Toe as a State Space
State spaces are good representations for board games such as Tic-Tac-Toe. The state of
a game can be described by the contents of the board and the player whose turn is next.
The board can be represented as an array of 9 cells, each of which may contain an X or O
or be empty.
State:
o Player to move next: X or O.
o Board configuration:
X O
O
X X
Operators: Change an empty cell to X or O.
Start State: Board empty; X's turn.
Terminal States: Three X's in a row; three O's in a row; all cells full.
Search Tree
The sequence of states formed by possible moves is called a search tree. Each level of the tree is called a ply.
Since the same state may be reachable by different sequences of moves, the state space may in general be a graph. It may be treated as a tree for simplicity, at the cost of duplicating states.
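The Tic-Tac-Toe state space described above can be sketched in a few lines. This is only an illustration under assumptions: the names `start_state`, `successors`, `winner` and `is_terminal` are hypothetical and not from the notes, and the board is stored as a tuple of 9 cells as the text suggests.

```python
from typing import List, Optional, Tuple

# A state is (board, player): board is a tuple of 9 cells ("X", "O" or " "),
# player is whose turn is next. All names here are illustrative assumptions.
State = Tuple[Tuple[str, ...], str]

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals


def start_state() -> State:
    """Start state: board empty, X's turn."""
    return (tuple(" " * 9), "X")


def winner(board: Tuple[str, ...]) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(state: State) -> bool:
    """Terminal states: three in a row for either player, or all cells full."""
    board, _ = state
    return winner(board) is not None or " " not in board


def successors(state: State) -> List[State]:
    """Operator: the player to move changes one empty cell to their mark."""
    board, player = state
    if is_terminal(state):
        return []
    next_player = "O" if player == "X" else "X"
    return [(board[:i] + (player,) + board[i + 1:], next_player)
            for i, cell in enumerate(board) if cell == " "]
```

Calling `successors(start_state())` gives the nine states of the first ply; applying `successors` again to each of them gives the second ply of the search tree, with duplicated states kept as separate tree nodes.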
Production system
A production system (or production rule system) is a computer program typically used to provide some form of artificial intelligence, which consists primarily of a set of rules about behavior. These rules, termed productions, are a basic representation found useful in automated planning, expert systems and action selection. A production system provides the mechanism necessary to execute productions in order to achieve some goal for the system.
A production system consists of four basic components:
1. A set of rules of the form Ci → Ai, or more generally
C1, C2, … Cn => A1, A2, … Am
where the left hand side (LHS) holds the conditions/antecedents Ci and the right hand side (RHS) holds the actions/conclusions/consequences Ai. The condition determines when a given rule is applied, and the action determines what happens when it is applied.
2. Knowledge databases / working memory that contain whatever information is relevant for the given problem and also maintain data about the current state or knowledge. Some parts of the database may be permanent, while others may be temporary and only exist during the solution of the current problem. The information in the databases may be structured in any appropriate manner.
3. A control strategy that determines the order in which the rules are applied to the database, and provides a way of resolving any conflicts that can arise when several rules match at once.
4. A rule applier, which is the computational system that implements the control strategy and applies the rules.
Productions consist of two parts: a sensory precondition (or "IF" statement) and an action (or "THEN"). If a production's precondition matches the current state of the world, then the production is said to be triggered. If a production's action is executed, it is said to have fired. A production system also contains a database, sometimes called working memory, which maintains data about the current state or knowledge, and a rule interpreter. The rule interpreter must provide a mechanism for prioritizing productions when more than one is triggered.
Production systems are used especially within the applied AI domain known as expert systems, where they are the basis for many rule-based expert systems. Such a system consists of a database of rules, a working memory, a matcher, and a procedure that resolves conflicts between rules.
Production rules for the water jug problem
1. (x, y) → (4, y), if x < 4: fill the 4-gallon jug.
2. (x, y) → (x, 3), if y < 3: fill the 3-gallon jug.
3. (x, y) → (x − d, y), if x > 0: pour some water out of the 4-gallon jug.
4. (x, y) → (x, y − d), if y > 0: pour some water out of the 3-gallon jug.
5. (x, y) → (0, y), if x > 0: empty the 4-gallon jug.
6. (x, y) → (x, 0), if y > 0: empty the 3-gallon jug.
7. (x, y) → (4, y − (4 − x)), if x + y >= 4 and y > 0: pour water from the 3-gallon jug into the 4-gallon jug until the 4-gallon jug is full.
8. (x, y) → (x − (3 − y), 3), if x + y >= 3 and x > 0: pour water from the 4-gallon jug into the 3-gallon jug until the 3-gallon jug is full.
9. (x, y) → (x + y, 0), if x + y <= 4 and y > 0: pour all the water from the 3-gallon jug into the 4-gallon jug.
10. (x, y) → (0, x + y), if x + y <= 3 and x > 0: pour all the water from the 4-gallon jug into the 3-gallon jug.
11. (0, 2) → (2, 0): pour the 2 gallons from the 3-gallon jug into the 4-gallon jug.
12. (2, y) → (0, y): empty the 2 gallons in the 4-gallon jug.
One solution of water jug problem
Rule applied     4-Gallon   3-Gallon
Initial state    0          0
Rule 2           0          3
Rule 9           3          0
Rule 2           3          3
Rule 7           4          2
Rule 5 or 12     0          2
Rule 9 or 11     2          0
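As a rough illustration, the production rules can be written as condition/action pairs and handed to a simple rule applier. The sketch below makes several assumptions that are not in the notes: the control strategy is breadth-first search, rules 3 and 4 are left out because the amount d is unspecified, rules 11 and 12 are left out because they are special cases of rules 9 and 5, and rule 8 uses the condition x + y >= 3 and x > 0. The names `RULES` and `solve` are hypothetical.

```python
from collections import deque

# Each entry is (rule number, condition, action) on a state (x, y), where
# x is the content of the 4-gallon jug and y that of the 3-gallon jug.
RULES = [
    (1, lambda x, y: x < 4, lambda x, y: (4, y)),
    (2, lambda x, y: y < 3, lambda x, y: (x, 3)),
    (5, lambda x, y: x > 0, lambda x, y: (0, y)),
    (6, lambda x, y: y > 0, lambda x, y: (x, 0)),
    (7, lambda x, y: x + y >= 4 and y > 0, lambda x, y: (4, y - (4 - x))),
    (8, lambda x, y: x + y >= 3 and x > 0, lambda x, y: (x - (3 - y), 3)),
    (9, lambda x, y: x + y <= 4 and y > 0, lambda x, y: (x + y, 0)),
    (10, lambda x, y: x + y <= 3 and x > 0, lambda x, y: (0, x + y)),
]


def solve(start=(0, 0), goal_x=2):
    """Return a list of (rule number, resulting state) steps, or None."""
    frontier = deque([start])
    parent = {start: None}          # state -> (previous state, rule number)
    while frontier:
        state = frontier.popleft()
        if state[0] == goal_x:      # goal: (2, n) for any n
            steps = []
            while parent[state] is not None:
                previous, rule = parent[state]
                steps.append((rule, state))
                state = previous
            return list(reversed(steps))
        for number, condition, action in RULES:
            if condition(*state):   # the rule is triggered
                new_state = action(*state)
                if new_state not in parent:
                    parent[new_state] = (state, number)
                    frontier.append(new_state)
    return None
```

With these rules `solve()` returns a six-step sequence that leaves 2 gallons in the 4-gallon jug. The particular sequence may differ from the table, since breadth-first search simply returns the first shortest path it reaches.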
Problem of Conflict Resolution
• When more than one rule can be fired in a situation, the rule interpreter must decide which rule to fire, in what order to trigger the rules, and whether to apply a rule at all.
Some Resolution Strategies
• Perform the first: the system chooses the first rule that matches.
• Sequencing techniques: apply the rules in the sequence in which they are listed.
• Perform the most specific: if there are two matching rules and one rule is more specific than the other, activate the more specific one.
• Most recent policy: choose the most recently added rule.
Search
• Search: the process of locating a solution to a problem by any method in a search tree or search space until a goal node is found.
• Search Space: a set of possible permutations that can be examined by any search method in order to find a solution.
• Search Tree: a tree that is used to represent a search problem and is examined by a search method to search for a solution.
To do a search process the following are needed:
The initial state description.
A set of legal operators.
The final or goal state.
Search Tree – Terminology
• Root Node: The node from which the search starts.
• Leaf Node: A node in the search tree having no children.
• Ancestor/Descendant: X is an ancestor of Y if either X is Y's parent or X is an ancestor of the parent of Y. If X is an ancestor of Y, Y is said to be a descendant of X.
• Branching factor: the maximum number of children of a non-leaf node in the search tree.
• Path: A path in the search tree is a complete path if it begins with the start node and ends with a goal node. Otherwise it is a partial path.
• We also need to introduce some data structures that will be used in the search algorithms.
Evaluating Search strategies
• We will look at various search strategies and evaluate their problem solving performance. What are the characteristics of the different search algorithms and what is their efficiency? We will look at the following factors to measure this.
Completeness: We will say a search method is "complete" if it has both the following properties:
– if a goal exists then the search will always find it
– if no goal exists then the search will eventually finish and be able to say that no goal exists
Time complexity: how long does it take? (number of nodes expanded)
Space complexity: how much memory is needed?
Optimality: is a high-quality solution found? Does the solution have low cost or the minimal cost? What is the search cost associated with the time and memory required to find a solution?
Which path to find?
The objective of a search problem is to find a path from the initial state to a goal state. If there are several paths, which path should be chosen? Our objective could be to find any path, or we may need to find the shortest path or least cost path.
The different search strategies that we will consider include the following:
1. Blind search strategies or uninformed search
a. Depth first search
b. Breadth first search
c. Iterative deepening search
d. Iterative broadening search
2. Informed Search
3. Constraint Satisfaction Search
4. Adversary Search
• Uninformed, blind or brute force search
– No information about the number of steps
– No information about the path cost
– Does not use any extra information about the problem domain
• Informed or heuristic search
– Information about possible path costs or number of steps is used
Uninformed Search
Breadth-first search
• Root node is expanded first
• All nodes at depth d in the search tree are expanded before the nodes at depth d+1
• Implemented by putting all the newly generated nodes at the end of the queue
Algorithm of BFS
Step 1: put the initial node on a list S.
Step 2: if S is empty, terminate the search with failure.
Step 3: remove the first node from S. Call this node a.
Step 4: if (a = goal), terminate the search with success.
Step 5: else if node a has successors, generate all of them and add them at the tail of S.
Step 6: go to step 2.
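The steps above map closely onto a queue-based loop. The following is a minimal sketch, not the notes' own code: the function name, the example `GRAPH`, and the `visited` set (added so that repeated states are not queued twice) are assumptions made for illustration.

```python
from collections import deque


def breadth_first_search(start, is_goal, successors):
    """Return the path from start to the first goal found, or None."""
    queue = deque([[start]])            # Step 1: list S, holding whole paths
    visited = {start}                   # assumption: skip repeated states
    while queue:                        # Step 2: stop when S is empty
        path = queue.popleft()          # Step 3: remove the first node
        node = path[-1]
        if is_goal(node):               # Step 4: success
            return path
        for child in successors(node):  # Step 5: add successors at the tail
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])
    return None                         # no goal exists


# Hypothetical example graph, given as node -> list of children.
GRAPH = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
         "D": [], "E": ["G"], "F": ["G"], "G": []}

path = breadth_first_search("A", lambda n: n == "G", lambda n: GRAPH[n])
```

Storing whole paths in the queue keeps the sketch short; a parent table would use less memory on larger problems.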
Breadth-first search merits
Complete: if there is a solution, it will be found.
Optimal: finds the nearest goal state.
Breadth-first search problems:
– Time complexity
– Memory intensive
– Remembers all unwanted nodes
Breadth first search is:
• Complete.
• Optimal (i.e., admissible) if all operators have the same cost. Otherwise, breadth first search finds a solution with the shortest path length, which need not be the cheapest.
• Exponential in time and space. The time and space complexity of the algorithm is $O(b^d)$, where d is the depth of the solution and b is the branching factor (i.e., number of children) at each node.
• A complete search tree of depth d where each non-leaf node has b children has a total of $1 + b + b^2 + \dots + b^d = (b^{d+1} - 1)/(b - 1)$ nodes.
• Consider a complete search tree of depth 15, where every node at depths 0 to 14 has 10 children and every node at depth 15 is a leaf node. The complete search tree in this case will have $O(10^{15})$ nodes. If BFS expands 10000 nodes per second and each node uses 100 bytes of storage, then BFS will take about 3500 years to run in the worst case, and it will use about 111,000 terabytes of memory. So you can see that the breadth first search algorithm cannot be effectively used unless the search space is quite small. You may also observe that even if you have all the time at your disposal, the search algorithm cannot run because it will run out of memory very soon.
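The worst-case figures for the depth-15 tree can be checked with a few lines of arithmetic. This sketch assumes a 365-day year and 1 terabyte = 10^12 bytes; the function name `total_nodes` is hypothetical.

```python
def total_nodes(b: int, d: int) -> int:
    """Nodes in a complete tree: 1 + b + b^2 + ... + b^d."""
    return (b ** (d + 1) - 1) // (b - 1)


nodes = total_nodes(10, 15)              # branching factor 10, depth 15
seconds = nodes / 10_000                 # 10000 nodes expanded per second
years = seconds / (365 * 24 * 3600)      # assumption: 365-day year
terabytes = nodes * 100 / 1e12           # 100 bytes per node, 1 TB = 10^12 bytes
```

Under these assumptions `years` comes out at a little over 3,500 and `terabytes` at a little over 111,000.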
Depth-first search
• Always expands one of the nodes at the deepest level of the tree
• Only backs up when the search hits a dead end
• Implemented by putting the newly generated nodes at the front of the queue
Algorithm of DFS
Step 1: put the initial node on a list S.
Step 2: if S is empty, terminate the search with failure.
Step 3: remove the first node from S. Call this node a.
Step 4: if (a = goal), terminate the search with success.
Step 5: else if node a has successors, generate all of them and add them at the beginning of S.
Step 6: go to step 2.
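The same loop with new nodes placed at the front of the list gives depth-first search. The sketch below is an illustration under assumptions: the function name and the example `GRAPH` are hypothetical, and the check `child not in path` is an added cycle check along the current path, which the listed steps do not include.

```python
def depth_first_search(start, is_goal, successors):
    """Return a path from start to a goal, or None if none is found."""
    stack = [[start]]                    # Step 1: list S, used as a stack
    while stack:                         # Step 2: stop when S is empty
        path = stack.pop()               # Step 3: take the newest entry
        node = path[-1]
        if is_goal(node):                # Step 4: success
            return path
        # Step 5: push children in reverse so the first child is tried first.
        for child in reversed(successors(node)):
            if child not in path:        # assumption: avoid cycles on this path
                stack.append(path + [child])
    return None


# Hypothetical example graph with a cycle between A and B.
GRAPH = {"A": ["B", "C"], "B": ["A", "D"], "C": ["G"], "D": [], "G": []}

path = depth_first_search("A", lambda n: n == "G", lambda n: GRAPH[n])
```

Here the search first follows A, B, D to a dead end, then backs up and finds the goal through C.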
Time Complexity:
$1 + b + b^2 + b^3 + \dots + b^d$
Hence time complexity = $O(b^d)$
Space Complexity:
space complexity = $O(d)$ for the current path alone, or $O(bd)$ when the unexpanded siblings along the path are also kept on the list
Depth-first search merits
Modest memory requirements: only the current path from the root to the leaf node needs to be stored.
Time complexity: with many solutions, depth-first search is often faster than breadth-first search, but the worst case is still $O(b^m)$, where m is the maximum depth of the tree.
Properties of Depth First Search
Let us now examine some properties of the DFS algorithm. The algorithm takes exponential time. If N is the maximum depth of a node in the search space, in the worst case the algorithm will take time $O(b^N)$. However the space taken is linear in the depth of the search tree, $O(bN)$.
Note that the time taken by the algorithm is related to the maximum depth of the search tree. If the search tree has infinite depth, the algorithm may not terminate. This can happen if the search space is infinite. It can also happen if the search space contains cycles. The latter case can be handled by checking for cycles in the algorithm. Thus Depth First Search is not complete.
Constraint Satisfaction
• A constraint problem is a task where you have to
– Arrange objects
– Schedule tasks
– Assign values
– …
subject to a number of constraints
Example of constraint problems
  S E N D
+ M O R E
---------
M O N E Y
Each letter stands for a different digit. Assign digits to the letters so that the sum is correct.
A constraint problem consists of
– A set of variables x1, x2,… xn
– For each variable xi a finite set Di of its possible values (its domain)
– A set of constraints restricting the values that the variables can take
– Goal: find an assignment of values to the variables which satisfies all
the constraints
Cryptarithmetic problems:
Constraint: when the values are assigned, the sum must add up correctly.
Some easy examples
• AS + A = MOM
• I + DID = TOO
• A + FAT = ASS
• SO + SO = TOO
• US + AS = ALL
• ED + DI = DID
• DI + IS = ILL
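Small cryptarithmetic puzzles like these can be solved by plain generate-and-test: try every assignment of distinct digits and keep those that satisfy the sum constraint. The sketch below is a brute-force illustration only, not the combined search and constraint propagation method; the function names are hypothetical, and it assumes that a multi-letter word may not start with the digit 0.

```python
from itertools import permutations


def word_value(word, assign):
    """Read a word as a decimal number under a letter -> digit assignment."""
    return int("".join(str(assign[ch]) for ch in word))


def solve_cryptarithm(addends, total):
    """Yield every assignment of distinct digits with sum(addends) == total."""
    words = list(addends) + [total]
    letters = sorted(set("".join(words)))           # the variables
    leading = {w[0] for w in words if len(w) > 1}   # assumption: no leading 0
    for digits in permutations(range(10), len(letters)):
        assign = dict(zip(letters, digits))         # distinct digits by design
        if any(assign[ch] == 0 for ch in leading):
            continue
        if sum(word_value(w, assign) for w in addends) == word_value(total, assign):
            yield assign


solutions = list(solve_cryptarithm(["SO", "SO"], "TOO"))
```

For SO + SO = TOO this finds the single assignment S = 5, O = 0, T = 1 (50 + 50 = 100). The same call works for SEND + MORE = MONEY, but it then has to test up to about 1.8 million assignments, which is why constraint propagation is preferred on larger problems.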
Another example
The 8 Queens puzzle. Place 8 queens on a chessboard so that no two queens are attacking one another.
Constraints: no two queens may be on the same row, the same column, or the same diagonal.
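The queens constraints can be checked incrementally while placing one queen per row, backing up as soon as a constraint is violated. This is a minimal backtracking sketch; the names `solve_queens`, `safe` and `place` are assumptions for illustration.

```python
def solve_queens(n=8):
    """Return all placements as tuples: entry r is the column of row r's queen."""
    solutions = []
    cols = []                      # cols[r] = column of the queen in row r

    def safe(row, col):
        """True if a queen at (row, col) attacks none of the placed queens."""
        for r, c in enumerate(cols):
            if c == col or abs(c - col) == abs(r - row):
                return False       # same column or same diagonal
        return True

    def place(row):
        if row == n:               # all rows filled: a valid assignment
            solutions.append(tuple(cols))
            return
        for col in range(n):
            if safe(row, col):
                cols.append(col)
                place(row + 1)
                cols.pop()         # backtrack and try the next column

    place(0)
    return solutions
```

Placing one queen per row satisfies the row constraint by construction, so only columns and diagonals need testing. `solve_queens(8)` returns the 92 solutions of the puzzle, and `solve_queens(4)` returns the two solutions of the 4×4 version.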
A more practical example
• Timetabling/scheduling
– Assign classes to rooms so that
• Students aren’t required to be in two different rooms at the same
time
• Similarly for lecturers
• Two classes aren’t booked into the same room at the same time
• Rooms are sufficiently large to hold classes assigned to them
• Labs have enough computers for the classes assigned to them
Summary
• Constraint problem-solving can be applied to a wide variety of real-world
problems
• Formally, a constraint problem consists of
– A set of variables and their domains
– A set of constraints
• The goal
– Find a valid set of values
– Find all sets of values
– Find the best set of values
• The method
– Combine search and constraint propagation
AI – Game Playing
Why has game playing been a focus of AI?
• Games have well-defined rules, which can be implemented in programs.
• The interfaces required are usually simple.
• Many human experts exist to assist in the development of the programs.
• Games provide a structured task wherein success or failure can be measured with least effort.
Classification of Games
1. Single Person playing
2. Two player or Multi person playing
Formal Description of Game :
Initial state: from where the game starts.
Successor function: for each state, the list of legal moves and consequent states.
Terminal test: determines if a state is a terminal state, i.e. the end of the game.
Utility function: computes a single numeric value for a terminal state.
● Games are represented by game trees in which
● Each node represents a position
● Each link represents a legal move
● Leaf nodes are final positions
The aim is to reach the goal node from the root node.
Components of Game Playing
Plausible move generator: generates the necessary states for further expansion.
Static evaluation function generator: ranks each of the positions.
The basic methods
Minimax Strategy
Minimax Strategy with alpha-beta cutoffs
Minimax Strategy
• "brute force"; does not recognize abstract patterns
• an optimal strategy
• 1st player "MAX" tries to maximize the utility function
• 2nd player "MIN" tries to minimize the utility function
• assumes the opponent always makes the best possible move
– this is not always assumed by a human player
• under such conditions it gives the best possible outcome: it maximizes the worst-case outcome
• when a leaf node is evaluated, a large value is good for player "MAX"; a small value is good for player "MIN"
• which player is making the move alternates between adjacent levels (level 0 MAX, level 1 MIN, level 2 MAX, etc.)
Minimax Algorithm
minimaxValue(n) =
• utility(n), if n is a terminal state
• max of minimaxValue(s) over all successors s, if n is a MAX node
• min of minimaxValue(s) over all successors s, if n is a MIN node
Alpha-Beta Pruning
• improves the minimax algorithm by pruning needless evaluations
• computes the same result without searching the entire tree
• does not explore a move which is inferior to a known alternative
• if the search cannot reach a terminal state, uses a heuristic to approximate the eventual terminal state
Alpha: the minimal score that player MAX is guaranteed to attain (the best known so far, with the possibility of improvement; the minimum attainable).
Beta: the best score that player MIN can attain so far (the lowest score known so far, a lower score may yet be found; the maximum attainable).
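A compact sketch of minimax with alpha-beta cutoffs is given below. It assumes, for illustration only, that a game tree is written as nested lists with numbers as terminal utilities; the optional `visited` list is a hypothetical addition that records which leaves were actually evaluated.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of node, skipping branches that cannot change the result."""
    if isinstance(node, (int, float)):          # terminal state
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)           # best score MAX is sure of
            if alpha >= beta:                   # MIN will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)                 # best score MIN is sure of
        if alpha >= beta:                       # MAX already has a better option
            break
    return value


TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]      # hypothetical example tree
seen = []
root_value = alphabeta(TREE, True, visited=seen)
```

On this tree the result is 3, the same as plain minimax, but the leaves 4 and 6 are never evaluated: once the second subtree yields 2, it is already worse for MAX than the 3 guaranteed by the first.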
Knowledge representation
Knowledge representation is the study of how knowledge is actually pictured and how effectively that picture resembles the representation of knowledge in the human brain.
Some widely known representation schemes:
Semantic nets
Frames
Scripts
Conceptual dependency
Semantic networks
A semantic network is a structure for representing knowledge as a pattern of interconnected nodes and arcs. It is a graphical representation of knowledge. Nodes in the semantic net represent either
-- Entities
-- attributes
-- states or
-- Events
Arcs in the net give the relationship between the nodes, and labels on the arcs specify the type of relationship.
"A sparrow is a bird"
• Two concepts: "sparrow" and "bird"
• A sparrow is a kind of bird, so connect the two concepts with an IS-A relation
[Figure: Sparrow --IS-A--> Bird]
"A bird has feathers"
• This is a different relation: the part-whole relation
• Represented by a HAS-A link or PART-OF link
• The link is from whole to part, so the direction is the opposite of the IS-A link
[Figure: Sparrow --IS-A--> Bird --HAS-A--> Feather]
Draw a semantic net for the following:
• Tweety and Sweety are birds
• Tweety has a red beak
• Sweety is Tweety's child
• A crow is a bird
• Birds can fly
Semantic networks can answer queries
• Query: "Which birds have red beaks?"
• Answer: Tweety
• Method: direct match of subgraph
• Query: "Can Tweety fly?"
• Answer: Yes
• Method: following the IS-A link from "Tweety" to "bird" and the property link of "bird" to "fly"
• This process is called inheritance
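The two query methods can be illustrated with a tiny network stored as labelled arcs. This is a sketch under assumptions: the relation labels `CAN` and `CHILD-OF` and the function names are invented for the example, and only the facts listed above are encoded.

```python
# Each fact is one labelled arc: (source node, relation, target node).
FACTS = {
    ("Tweety", "IS-A", "bird"),
    ("Sweety", "IS-A", "bird"),
    ("crow", "IS-A", "bird"),
    ("Tweety", "HAS-A", "red beak"),
    ("Sweety", "CHILD-OF", "Tweety"),     # assumed label
    ("bird", "CAN", "fly"),               # assumed label
    ("bird", "HAS-A", "feather"),
}


def with_ancestors(node):
    """Return node followed by everything reachable over IS-A links."""
    seen, todo = [], [node]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.append(current)
            todo.extend(t for s, r, t in FACTS if s == current and r == "IS-A")
    return seen


def direct_match(relation, target):
    """Direct match: which nodes have exactly this arc?"""
    return sorted(s for s, r, t in FACTS if r == relation and t == target)


def holds(node, relation, target):
    """Inheritance: does node, or any node it IS-A, have this arc?"""
    return any((n, relation, target) in FACTS for n in with_ancestors(node))
```

`direct_match("HAS-A", "red beak")` answers the first query by matching a single arc, while `holds("Tweety", "CAN", "fly")` answers the second by following the IS-A link from Tweety to bird.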
Convert the following into a semantic net
1. Motor-bike is a two wheeler.
2. Two wheeler is a moving vehicle.
3. Moving vehicle has a brake.
4. Moving vehicle has an engine.
5. Moving vehicle has an electrical system.
6. Moving vehicle has a fuel system.
Hierarchical Structure
[Figure: hierarchy linked by Is_a arcs. vehicle has the subclasses Land-vehicle, Water-vehicle and Air-vehicle; below them are road and rail, river and sea, and aircraft and space.]
Represent the following information in a semantic net (SN)
(is_a circus-elephant elephant)
(has elephant head)
(has elephant trunk)
(has head mouth)
(is_a elephant animal)
(has animal heart)
(is_a circus-elephant performer)
(has performer costumes)
[Figure: semantic net with the nodes circus-elephant, elephant, head, trunk, mouth, performer, costumes, animal and heart, connected by the is_a and has links listed above.]
Semantic networks
Advantages of semantic networks
• Simple representation, easy to read
• Associations possible
• Inheritance possible
Disadvantages of semantic networks
• A separate inference procedure (interpreter) must be built
• The validity of the inferences is not guaranteed
• For large networks the processing is inefficient
Frame systems
Frame theory
• When humans encounter something new, a basic structure called a frame is selected from memory
• A frame is a fixed framework in which all kinds of information is stored
• For more details about the information in a frame, a different frame is selected
• A frame is connected to other frames, so this is a network of frames
Frame
The term frame was introduced in Minsky's paper "A Framework for Representing Knowledge".
A basic idea of frames is that people make use of stereotyped information about typical features of objects, images, and situations; such information is assumed to be structured in large units representing the stereotypes, and these units are referred to as "frames".
Typical Features of Frames
A frame can represent an individual object or a class of similar objects.
Instead of properties, a frame has slots. A slot is like a property, but can contain
more kinds of information
Data type information; constraints on possible slot fillers, Documentation.
Frames can inherit slots from parent frames. For example, FIDO (an individual
dog) might inherit properties from DOG (its parent class) or MAMMAL (a parent
class of DOG).
A sample frame of a computer center
Name of the frame: Computer Center
Slots in the frame: Air-conditioner, Stationery cupboard, Computer, Dumb-terminal, Printer
Frame systems
Frame
• Frame name: represents an object or a concept, so similar to a node in the semantic network
• Frame type: shows if this is a concept (class) or an object (instance)
Slot
• Consists of slot name and facets
• Slot name: property or relation name
Facet
• A facet gives information about the slot, i.e. the value and name
• Value: the value of the property
• Default: connecting frames can have a different value for this property
Demon
• Performs a certain action if a condition is satisfied
Example (frame name: bird, frame type: class)
Slot      Facet      Value
IS-A      value      animal
HAS-A     default    feather
          default    leg
#Leg      default    2
Weight    If-Needed  calc-weight
Frame systems
bird (class)
Slot      Facet      Value
IS-A      value      animal
HAS-A     default    feather
          default    leg
#Leg      default    2
Weight    If-Needed  calc-weight
Tweety (instance)
Slot      Facet      Value
IS-A      value      bird
HAS-A     value      beak
Beakcol   value      red
Child     value      Sweety
Birthday  value      1990.1.1
          If-Added   calc-age
crow (class)
Slot      Facet      Value
IS-A      value      bird
Color     default    black
beak (class)
Slot      Facet      Value
Beakcol   default    yellow
Inference in frame systems
• Query: "How many legs does a crow have?"
• Answer: 2
• Inference
– No information about this in the "crow" frame
– Try to find it in the "bird" frame
– Default value is 2
– Also called inheritance
• As soon as the birthday of Tweety is added, the "calc-age" procedure is invoked
• Query: "What is the weight of Tweety?"
• The answer is obtained by the procedure "calc-weight" in bird
Frame interpreter
• Each frame system needs an inference mechanism
• Takes care of inheritance, the invoking of demons and the message
passing
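The lookup described above (own value, then inherited default, then demons) can be sketched in Python. This is a minimal illustration under stated assumptions, not the interpreter of any particular frame system: the `Frame` class, the lowercase facet names, the stand-in weight returned by `calc_weight` and the reference year 2010 in `calc_age` are all invented for the example.

```python
class Frame:
    """A frame with slots; each slot maps facet names to values."""

    def __init__(self, name, kind, parent=None):
        self.name = name
        self.kind = kind        # "class" or "instance"
        self.parent = parent    # frame reached through the IS-A slot
        self.slots = {}         # slot name -> {facet name: value}

    def find_facet(self, slot, facet):
        """Look for a facet here, then up the IS-A chain (inheritance)."""
        frame = self
        while frame is not None:
            if facet in frame.slots.get(slot, {}):
                return frame.slots[slot][facet]
            frame = frame.parent
        return None

    def put(self, slot, facet, value):
        self.slots.setdefault(slot, {})[facet] = value
        # If-Added demon: runs as soon as a value is stored in the slot.
        demon = self.find_facet(slot, "if-added")
        if facet == "value" and demon is not None:
            demon(self, value)

    def get(self, slot):
        for facet in ("value", "default"):
            found = self.find_facet(slot, facet)
            if found is not None:
                return found
        # If-Needed demon: compute the value on demand.
        demon = self.find_facet(slot, "if-needed")
        return demon(self) if demon is not None else None


ages = {}

def calc_weight(frame):
    return 0.1  # hypothetical stand-in; the notes do not define calc-weight

def calc_age(frame, birth_year):
    ages[frame.name] = 2010 - birth_year  # assumed reference year

bird = Frame("bird", "class")
bird.put("#Leg", "default", 2)
bird.put("Weight", "if-needed", calc_weight)

crow = Frame("crow", "class", parent=bird)
crow.put("Color", "default", "black")

tweety = Frame("Tweety", "instance", parent=bird)
tweety.put("Birthday", "if-added", calc_age)
tweety.put("Birthday", "value", 1990)

print(crow.get("#Leg"))      # inherited default from bird
print(tweety.get("Weight"))  # computed by the If-Needed demon
print(ages)                  # filled in by the If-Added demon
```

Keeping the demons as ordinary facets means they are inherited in exactly the same way as defaults, which is one simple way to get the behaviour the frame interpreter is said to provide.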
Advantages of frame systems
• The knowledge can be structured
• Flexible inference by using procedural knowledge
• Layered representation and inheritance is possible
Disadvantages of frame systems
• The design of the interpreter is not easy
• The validity of the inferences is not guaranteed
• Hard to maintain consistency across the stored knowledge
Construct semantic network representations for the following:
a. Richard Nixon is a Quaker and a Republican. Quakers and Republicans are persons. Every Quaker follows the doctrine of pacifism.
b. Mary gave the green flowered vase to her cousin.
Scripts
A script is a knowledge representation structure that is extensively used for describing stereotyped sequences of actions.
It is a special case of the frame structure.
It represents events that take place in day-to-day activities.
Scripts have slots, and with each slot we associate information about the slot.
Components of scripts
1. Entry conditions
• Preconditions:
• facts that must be true to call the script
• Eg.: an open restaurant, a hungry customer that has some money
2. Results
• Postconditions:
• facts that will be true after the script has terminated
• Eg.: customer is full and has less money; restaurant owner has more money
3. Props
• Typical things that support the content of the script
• Eg.: waiters, tables, menus
4. Roles
• Actions that participants perform
• Represented using conceptual dependency
• Eg.: waiter takes orders, delivers food, presents bill
5. Scenes
• A temporal aspect of the script
• Eg.: entering the restaurant, ordering, eating, …
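The five components listed above map naturally onto a record type. The following Python sketch is only an illustration: the `Script` class, its field names and the fact strings are assumptions made here, and the roles are stored as plain text rather than in conceptual dependency form.

```python
from dataclasses import dataclass


@dataclass
class Script:
    name: str
    entry_conditions: set   # facts that must hold before the script is called
    results: set            # facts that hold after the script has terminated
    props: list             # typical things that support the script
    roles: dict             # participant -> actions that participant performs
    scenes: list            # temporal order of the script

    def applicable(self, facts):
        """A script can be called only if all entry conditions are satisfied."""
        return self.entry_conditions <= set(facts)


restaurant = Script(
    name="restaurant",
    entry_conditions={"restaurant is open", "customer is hungry", "customer has money"},
    results={"customer is full", "customer has less money", "owner has more money"},
    props=["tables", "menus"],
    roles={"waiter": ["takes orders", "delivers food", "presents bill"]},
    scenes=["entering the restaurant", "ordering", "eating"],
)

print(restaurant.applicable({"restaurant is open", "customer is hungry"}))
```

The call at the end reports that the script does not apply, because the "customer has money" precondition is missing from the given facts.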
LOGIC
One of the prime activities of human intelligence is reasoning. The activity of reasoning involves the construction, organization and manipulation of statements to arrive at new conclusions.
Thus logic can be defined as the scientific study of the process of reasoning and of the system of rules and procedures that help in the reasoning process. Basically, the logic process takes some inputs, called premises, and produces some outputs, called conclusions.
Classification of Logic
1. Propositional Logic
2. Predicate Logic
Propositional Logic: This is the simplest form of logic. A proposition takes only two values, i.e. it is either true or false.
Examples
Kinds of proposition
• Atomic or simple propositions, which consist of a single simple (atomic) sentence.
• Molecular or compound propositions, which combine one or more atomic propositions using a set of logical connectives.
Commonly used Propositional Logical Connectives
NAME                    CONNECTIVE
CONJUNCTION             AND
DISJUNCTION             OR
NEGATION                NOT
MATERIAL CONDITIONAL    IMPLIES
ALTERNATIVE DENIAL      NAND
JOINT DENIAL            NOR
Properties of statements
• A sentence is
– satisfiable if it is true under some interpretation
– valid if it is true under all possible interpretations
– inconsistent (a contradiction) if there does not exist any interpretation under which the sentence is true
• Logical consequence: S |= X if all models of S are also models of X. Equivalently, a sentence is a logical consequence (LC) of another if it is satisfied by all interpretations which satisfy the first.
– Example: P is a LC of (P & Q), since under any interpretation for which (P & Q) is true, P is also true.
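For propositional sentences these properties can be checked by enumerating every interpretation (a truth table). The Python sketch below is a minimal illustration; representing a sentence as a function from an interpretation (a dict of truth values) to a boolean is an assumption made here for brevity.

```python
from itertools import product


def interpretations(symbols):
    """Yield every assignment of True/False to the given symbols."""
    for values in product([True, False], repeat=len(symbols)):
        yield dict(zip(symbols, values))


def satisfiable(sentence, symbols):
    return any(sentence(i) for i in interpretations(symbols))


def valid(sentence, symbols):
    return all(sentence(i) for i in interpretations(symbols))


def inconsistent(sentence, symbols):
    return not satisfiable(sentence, symbols)


def logical_consequence(premise, conclusion, symbols):
    """premise |= conclusion: every model of the premise satisfies the conclusion."""
    return all(conclusion(i) for i in interpretations(symbols) if premise(i))


p = lambda i: i["P"]
p_and_q = lambda i: i["P"] and i["Q"]
p_or_not_p = lambda i: i["P"] or not i["P"]
p_and_not_p = lambda i: i["P"] and not i["P"]

print(satisfiable(p_and_q, ["P", "Q"]), valid(p_and_q, ["P", "Q"]))
print(valid(p_or_not_p, ["P"]), inconsistent(p_and_not_p, ["P"]))
print(logical_consequence(p_and_q, p, ["P", "Q"]))
```

Enumeration takes time exponential in the number of symbols, so it is only practical for small examples such as these.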
First Order Predicate Logic (FOPL)
OR
First Order Predicate Calculus
Predicate logic is a logical extension of propositional logic. FOPL was developed by logicians as a means for formal reasoning, primarily in the areas of mathematics. It is used in representing different kinds of knowledge.
FOPL is flexible enough to permit the accurate representation of natural language.
It is commonly used in program design. It provides a way of deducing new statements from old ones.
The predicate calculus includes a wider range of entities. It permits the description of relations and the use of variables. It also requires an understanding of quantification.
• First-order logic is used to model the world in terms of
– objects which are things with individual identities
e.g., individual students, lecturers, companies, cars ...
– properties of objects that distinguish them from other objects
e.g., mortal, blue, oval, even, large, ...
– classes of objects (often defined by properties)
e.g., human, mammal, machine, ...
– functions, which are a subset of the relations in which there is only one "value" for any given "input".
e.g., father of, best friend, second half, one more than ...
The language of predicate calculus requires:
Variables
Constants: these include the logical constants V (or), ^ (and), ~ (not), → (implies), ↔ (is equivalent to), ∀ (for all) and ∃ (there exists).
The last two logical constants are additions to the logical connectives of propositional calculus; they are known as quantifiers. The non-logical constants include both the "names" of entities that are related and the "names" of the relations. For example, the constant dog might be a relation and the constant fido an entity.
Predicates: these relate a number of entities. This number is usually greater than one. A predicate with one argument is often used to express a property, e.g. sun(hot) may represent the statement that "the sun has the property of being hot".
Predicates: P(x[1], ..., x[n])
– P: predicate name; (x[1], ..., x[n]): argument list
– Examples: human(x), father(x, y)
A predicate, like a membership function, defines a set (or a class) of objects
Terms (arguments of predicates must be terms)
– Constants are terms (e.g., Fred, a, Z, "red", etc.)
– Variables are terms (e.g., x, y, z, etc.); a variable is instantiated when it is assigned a constant as its value
Formulae: these are constructed from predicates and formulae. The logical constants are used to create new formulae from old ones. Here are some examples:
Note that the word "and" used in the left hand column is used to suggest that we have more than one formula for combination, and not necessarily a conjunction.
In the last two examples, "dog(X)" contains a variable which is said to be free while the "X" in "∀X.dog(X)" is bound.
Sentence: a formula with no free variables is a sentence.
Two informal examples to illustrate quantification follow:
We can now represent the problem we initially raised: ∀X.(dog(X) → smelly(X)) ^ dog(fido) → smelly(fido)
A well-formed formula (wff) is a sentence containing no "free" variables, i.e., all variables are "bound" by universal or existential quantifiers.
(∀x)P(x,y) has x bound as a universally quantified variable, but y is free.
Quantifiers
Universal quantification (or for all)
– (∀x)P(x) means that P holds for all values of x in the domain associated with that variable.
– E.g., (∀x) dolphin(x) => mammal(x); (∀x) human(x) => mortal(x)
– Universal quantifiers are often used with implication (=>): (∀x) student(x) => smart(x) (All students are smart)
– Often associated with English words "all", "everyone", "always", etc.
Existential quantification
– (∃x)P(x) means that P holds for some value(s) of x in the domain associated with that variable.
– E.g., (∃x) mammal(x) ^ lays-eggs(x); (∃x) taller(x, Fred)
– Existential quantifiers are usually used with "^ (and)" to specify a list of properties about an individual: (∃x) student(x) ^ smart(x) (There is a student who is smart.)
– Often associated with English words "someone", "sometimes", "at least", etc.
Scopes of quantifiers
• Each quantified variable has its scope
– (∀x)[human(x) => (∃y)[human(y) ^ father(y, x)]]
– All occurrences of x within the scope of the quantified x refer to the same thing.
– Use different variables for different things
• Switching the order of universal quantifiers does not change the meaning:
– (∀x)(∀y)P(x,y) <=> (∀y)(∀x)P(x,y), can write as (∀x,y)P(x,y)
• Similarly, you can switch the order of existential quantifiers.
– (∃x)(∃y)P(x,y) <=> (∃y)(∃x)P(x,y)
• Switching the order of universals and existentials does change meaning:
– Everyone likes someone: (∀x)(∃y)likes(x,y)
– Someone is liked by everyone: (∃y)(∀x) likes(x,y)
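Over a finite domain, a universal quantifier behaves like Python's `all` and an existential quantifier like `any`, which makes the effect of quantifier order easy to check. The domain and the `likes` relation below are invented purely for illustration.

```python
# Hypothetical finite domain and relation, chosen so the two readings differ.
people = ["Ann", "Bob", "Cy"]
likes = {("Ann", "Bob"), ("Bob", "Cy"), ("Cy", "Ann")}

# Everyone likes someone: for every x there exists a y with likes(x, y).
everyone_likes_someone = all(
    any((x, y) in likes for y in people) for x in people
)

# Someone is liked by everyone: there exists a y such that every x likes y.
someone_liked_by_everyone = any(
    all((x, y) in likes for x in people) for y in people
)

print(everyone_likes_someone, someone_liked_by_everyone)
```

Here each person likes exactly one other person, so the first sentence holds, but no single person is liked by all three, so the second does not.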
Translating English to FOPL
1. Bhaskar likes aeroplanes.
2. Ravi's father is Rani's father.
3. Plato is a man.
4. Ram likes mango.
5. Sima is a girl.
6. Rose is red.
7. John owns gold.
8. Ram is taller than Mohan.
9. My name is Khan.
10. Apple is a fruit.
11. Ram is male.
12. Tuna is a fish.
13. Dashrath is Ram's father.
14. Kush is the son of Ram.
15. Kaushaliya is the wife of Dashrath.
16. Clinton is tall.
17. There is a white alligator.
18. All kings are persons.
19. Nobody loves Jane.
20. Everybody has a father.
INFERENCE RULE
If we want to prove something, we apply some manipulation procedures to the given statements to deduce new statements. If we are sure that the given statements are true, then the newly derived statements are also true. Some rules are:
1. Modus Ponens
2. Chain Rule
3. Substitution
4. Simplification
5. Transposition
6. Resolution
7. Unification
Modus Ponens
If a has property P and all objects that have property P also have property Q, we conclude that a has property Q.
P(a)
(∀x) P(x) → Q(x)
∴ Q(a)
Example:
• Assertion: Leo is a lion.
• Implication: All lions are ferocious.
• Conclusion: Leo is ferocious.
Lion(Leo)
(∀x) Lion(x) → ferocious(x)
∴ ferocious(Leo)
Chain Rule
If P→Q and Q→R then P →R
Example
Given : (programmer likes LISP)→(programmer hates COBOL)
And : (programmer hates COBOL) → (programmer likes Prolog)
Conclude: (programmer likes LISP)→ (programmer likes Prolog)
Substitution
If S is a valid statement, then any statement S' derived from it by substitution is also valid.
Example: if P V ~P is valid, then Q V ~Q is also valid.
Finding General Substitutions:
• Given: Hate(x,y)
Hate(Marcus,z)
We Could Produce:
(Marcus/x, z/y)
(Marcus/x, y/z)
(Marcus/x, Caesar/y, Caesar/z)
(Marcus/x, Polonius/y, Polonius/z)
Simplification: from P and Q, infer P
Transposition: from P → Q, infer ~Q → ~P
UNIFICATION
• A unification of two terms is a join with respect to a specialization order.
Full Unification (variables on both sides)
Variable-Variable Matching
P(x)         P(y)
Q(x,x)       Q(y,z)
P(f(x),z)    P(y,Fido)
Unifiers: Variable Substitutions
P(x)         P(y)         {y/x}
Q(x,x)       Q(y,z)       {y/x, z/x}
P(f(x),z)    P(y,Fido)    {y/f(x), z/Fido}
Consistent Variable Assignments (# marks a pair that has no unifier)
P(Mary,John)    P(y,y)            #
R(x,y,y)        R(y,y,z)          {y\z, x\y}
W(P(x),y,z)     W(Q(x),y,Fido)    #
Advantages of Full Unification
• Query and data both fully allow variables
• Permits full FOL Resolution (next)
Unification
Q(x) and P(y): FAIL
P(x) and P(y): x/y
P(Marcus) and P(y): Marcus/y
P(Marcus) and P(Julius): FAIL
P(x,x) and P(y,y): (y/x)
P(x,x) and P(y,z): (z/y, y/x)
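The examples above can be reproduced with a short unification routine. This Python sketch makes several assumptions that are not in the notes: a compound term such as P(f(x), z) is written as the tuple `("P", ("f", "x"), "z")`, variables are strings starting with a lowercase letter, constants start with an uppercase letter, and the result is a dict mapping each bound variable to a term (the notes themselves write substitutions both as term/variable and variable/term). An occurs check is included so that x is never bound to a term containing x.

```python
def is_var(t):
    return isinstance(t, str) and t[0].islower()


def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    t = walk(t, subst)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a, subst) for a in t[1:])


def unify(a, b, subst=None):
    """Return a substitution {variable: term} unifying a and b, or None on failure."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple):
        # Predicate/function names and arities must match exactly.
        if a[0] != b[0] or len(a) != len(b):
            return None
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # two different constants


print(unify(("P", "x"), ("P", "y")))
print(unify(("P", "Marcus"), ("P", "y")))
print(unify(("P", "Marcus"), ("P", "Julius")))
print(unify(("P", ("f", "x"), "z"), ("P", "y", "Fido")))
```

Bindings are chained rather than fully applied (for P(x,x) and P(y,z) the result binds x to y and y to z), which corresponds to the composed substitution (z/y, y/x) listed above.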
RESOLUTION
The resolution rule in propositional logic is a single valid inference rule that produces, from two clauses, a new clause implied by them.
Robinson introduced the resolution principle in 1965; it can be applied directly to any set of clauses.
In other words, to prove a statement, resolution attempts to show that the negation of the statement produces a contradiction with the known statements.
Example :- Perform resolution on the set of clauses.
A: P V Q V R
B: P V Q
C: R
D: Q
Resolution in Propositional Logic
(Figure: a resolution proof. ~P V ~Q V R resolved with ~R gives ~P V ~Q; resolved with P gives ~Q; resolved with ~T V Q gives ~T; resolved with T gives the empty clause.)
The Basis of Resolution and Herbrand's Theorem
Given: winter V summer
~winter V cold
We can conclude:
summer V cold
Herbrand's Theorem:
To show that a set of clauses S is unsatisfiable, it is necessary to consider only interpretations over a particular set, called the Herbrand universe of S. A set of clauses S is unsatisfiable if and only if a finite subset of ground instances (in which all bound variables have had a value substituted for them) of S is unsatisfiable.
Algorithm: Propositional Resolution
To prove a proposition P from a set of axioms F:
1. Convert all the propositions of F to clause form.
2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in step 1.
3. Repeat until either a contradiction is found or no progress can be made:
a) Select two clauses. Call these the parent clauses.
b) Resolve them together. The resolvent will be the disjunction of all of the literals of both of the parent clauses with the following exception: if there are any pairs of literals L and ~L such that one of the parent clauses contains L and the other contains ~L, then select one such pair and eliminate both L and ~L from the resolvent.
c) If the resolvent is the empty clause, then a contradiction has been found. If it is not, then add it to the set of clauses available to the procedure.
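The procedure above can be sketched in a few lines of Python. The representation is an assumption made for this illustration: a literal is a string, negation is written with a leading "~", a clause is a set of literals, and the goal is restricted to a single literal so that negating it needs no conversion to clause form. The clause set in the example is a small illustrative one.

```python
from itertools import combinations


def negate(literal):
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolvents(c1, c2):
    """All clauses obtained by cancelling one complementary pair L / ~L."""
    return [(c1 - {lit}) | (c2 - {negate(lit)}) for lit in c1 if negate(lit) in c2]


def refutes(clauses):
    """True if the empty clause can be derived from the clause set."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):   # step 3a: pick parent clauses
            for r in resolvents(c1, c2):        # step 3b: resolve them
                if not r:                       # step 3c: empty clause found
                    return True
                new.add(r)
        if new <= known:                        # no progress can be made
            return False
        known |= new


def proves(axioms, goal):
    """Steps 1 and 2: axioms are already clauses; add the negated goal literal."""
    return refutes(list(axioms) + [[negate(goal)]])


axioms = [["P"], ["~P", "~Q", "R"], ["~S", "Q"], ["~T", "Q"], ["T"]]
print(proves(axioms, "R"))
print(proves(axioms, "S"))
print(resolvents(frozenset({"winter", "summer"}), frozenset({"~winter", "cold"})))
```

Because only finitely many clauses can be built from a finite set of literals, the loop always terminates; it makes no attempt to choose parent clauses cleverly, which a practical prover would do.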
Skolem Functions in FOL
Objective
• Want all variables universally quantified
• Notational variant of FOL without existentials
• Retain implicitly full FOL expressiveness
Skolem's Theorem
Every existentially quantified variable can be replaced by a unique Skolem function whose arguments are all the universally quantified variables on which the existential depends, without changing the satisfiability of the formula.
Examples
"Everybody likes something"
(∀x)(∃y)[Person(x) & Likes(x,y)]
(∀x)[Person(x) & Likes(x, S1(x))]
where S1(x) = "that which x likes"
"Every philosopher writes at least one book"
(∀x)(∃y)[(Philosopher(x) & Book(y)) => Write(x,y)]
(∀x)[(Philosopher(x) & Book(S2(x))) => Write(x,S2(x))]
Examples of Conversion to Clause Form
Example: ∀x: [Roman(x) ^ know(x, Marcus)] → [hate(x, Caesar) V (∀y: ∃z: hate(y,z) → thinkcrazy(x,y))]
Eliminate →.
∀x: ~[Roman(x) ^ know(x, Marcus)] V [hate(x, Caesar) V (∀y: ~(∃z: hate(y,z)) V thinkcrazy(x,y))]
Reduce scope of ~.
∀x: [~Roman(x) V ~know(x, Marcus)] V [hate(x, Caesar) V (∀y: ∀z: ~hate(y,z) V thinkcrazy(x,y))]
"Standardize" variables:
∀x: P(x) V ∀x: Q(x) converts to ∀x: P(x) V ∀y: Q(y)
Move quantifiers.
∀x: ∀y: ∀z: [~Roman(x) V ~know(x, Marcus)] V [hate(x, Caesar) V (~hate(y,z) V thinkcrazy(x,y))]
Eliminate existential quantifiers.
∃y: President(y) will be converted to President(S1)
∀x: ∃y: father-of(y,x) will be converted to ∀x: father-of(S2(x),x)
Drop the prefix.
[~Roman(x) V ~know(x, Marcus)] V [hate(x, Caesar) V (~hate(y,z) V thinkcrazy(x,y))]
Convert to a conjunction of disjuncts.
~Roman(x) V ~know(x, Marcus) V hate(x, Caesar) V ~hate(y,z) V thinkcrazy(x,y)
A Predicate Logic Example
1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Romans were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they aren't loyal to.
8. Marcus tried to assassinate Caesar.
9. All men are people.
Prove that "Marcus hates Caesar".
A Predicate Logic form
1. Marcus was a man.
man(Marcus)
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. All Pompeians were Romans.
∀x: Pompeian(x) → Roman(x)
4. Caesar was a ruler.
ruler(Caesar)
5. All Romans were either loyal to Caesar or hated him.
∀x: Roman(x) → loyalto(x, Caesar) V hate(x, Caesar)
6. Everyone is loyal to someone.
∀x: ∃y: loyalto(x,y)
7. People only try to assassinate rulers they aren't loyal to.
∀x: ∀y: person(x) ^ ruler(y) ^ tryassassinate(x,y) → ~loyalto(x,y)
8. Marcus tried to assassinate Caesar.
tryassassinate(Marcus, Caesar)
9. All men are people.
∀x: man(x) → person(x)
A Resolution Proof
Axioms in clause form:
1. man(Marcus)
2. Pompeian(Marcus)
3. ~Pompeian(x1) V Roman(x1)
4. ruler(Caesar)
5. ~Roman(x2) V loyalto(x2, Caesar) V hate(x2, Caesar)
6. loyalto(x3, f1(x3))
7. ~man(x4) V ~ruler(y1) V ~tryassassinate(x4, y1) V ~loyalto(x4, y1)
8. tryassassinate(Marcus, Caesar)
9. ~man(x5) V person(x5)
Prove: hate(Marcus, Caesar)
Start from the negated goal ~hate(Marcus, Caesar) and resolve:
with 5 (Marcus/x2): ~Roman(Marcus) V loyalto(Marcus, Caesar)
with 3 (Marcus/x1): ~Pompeian(Marcus) V loyalto(Marcus, Caesar)
with 2: loyalto(Marcus, Caesar)
with 7 (Marcus/x4, Caesar/y1): ~man(Marcus) V ~ruler(Caesar) V ~tryassassinate(Marcus, Caesar)
with 1: ~ruler(Caesar) V ~tryassassinate(Marcus, Caesar)
with 4: ~tryassassinate(Marcus, Caesar)
with 8: the empty clause, so a contradiction has been found and hate(Marcus, Caesar) is proved.
An Unsuccessful Attempt at Resolution
Prove: loyalto(Marcus, Caesar)
Start from the negated goal ~loyalto(Marcus, Caesar) and resolve:
with 5 (Marcus/x2): ~Roman(Marcus) V hate(Marcus, Caesar)
with 3 (Marcus/x1): ~Pompeian(Marcus) V hate(Marcus, Caesar)
with 2: hate(Marcus, Caesar)
No listed clause resolves with hate(Marcus, Caesar), so no contradiction is found.
The figure continues with two steps that use clauses numbered 9 and 10 about persecute, which are not listed in these notes:
(a) hate(Marcus, Caesar) resolved with 10 (Marcus/x6, Caesar/y3) gives persecute(Caesar, Marcus)
(b) persecute(Caesar, Marcus) resolved with 9 (Marcus/x5, Caesar/y2) gives hate(Marcus, Caesar)
Horn clause
A Horn clause is a clause (a disjunction of literals) with at most one positive literal. They are named for the logician Alfred Horn, who first pointed out the significance of such clauses in 1951, in the article "On sentences which are true of direct unions of algebras", Journal of Symbolic Logic, 16, 14-21.
A Horn clause with exactly one positive literal is a definite clause; a Horn clause with no positive literals is sometimes called a goal clause, especially in logic programming. A Horn formula is a conjunctive normal form formula whose clauses are all Horn; in other words, it is a conjunction of Horn clauses. A dual-Horn clause is a clause with at most one negative literal. Horn clauses play a basic role in logic programming and are important for constructive logic.
The following is an example of a (definite) Horn clause:
~p V ~q V ... V ~t V u
Such a formula can also be written equivalently in the form of an implication:
(p ^ q ^ ... ^ t) → u
Horn clauses can be propositional or first order, depending on whether we consider propositional or first-order literals.
The relevance of Horn clauses to theorem proving by first-order resolution is that the resolution of two Horn clauses is a Horn clause. Moreover, the resolution of a goal clause and a definite clause is again a goal clause. In automated theorem proving, this can lead to greater efficiencies in proving a theorem (represented as a goal clause).
In fact, the resolution of a goal clause with a definite clause to produce a new goal clause is the basis of the SLD resolution inference rule, used to implement logic programming and the programming language Prolog. In logic programming a definite clause behaves as a goal-reduction procedure. For example, the Horn clause written above behaves as the procedure:
to show u, show p and show q and ... and show t.
To emphasize this backwards use of the clause, it is often written using the consequence operator:
u ← p ^ q ^ ... ^ t
Propositional Horn clauses are also of interest in computational complexity, where the problem of finding a set of variable assignments to make a conjunction of Horn clauses true is a P-complete problem, sometimes called HORNSAT. This is P's version of the boolean satisfiability problem, a central NP-complete problem. Satisfiability of first-order Horn clauses is undecidable.
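Satisfiability of propositional Horn clauses can be decided by repeatedly making true the head of any definite clause whose body is already true. The Python sketch below is an illustration only; writing each clause as a `(body, head)` pair, with `head` set to `None` for a goal clause and an empty body for a fact, is an encoding assumed here. This simple loop runs in polynomial time, though not in the linear time that more careful implementations achieve.

```python
def horn_sat(clauses):
    """Return the minimal model (set of true atoms), or None if unsatisfiable.

    Each clause is (body, head): body is a tuple of atoms that must all hold,
    head is the single positive literal, or None for a goal clause.
    """
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if set(body) <= true_atoms:
                if head is None:
                    return None          # a goal clause is violated
                if head not in true_atoms:
                    true_atoms.add(head)
                    changed = True
    return true_atoms


program = [((), "p"), ((), "q"), (("p", "q"), "u"), (("u", "w"), "v")]
print(horn_sat(program))
print(horn_sat(program + [(("u",), None)]))
```

The first call returns the atoms forced to be true by the facts and rules; the second adds the goal clause ~u, which contradicts them, so the clause set is reported as unsatisfiable.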
Using Resolution with Equality and Reduce
Axioms in clause form:
1. man(Marcus)
2. Pompeian(Marcus)
3. born(Marcus, 40)
4. ~man(x1) V mortal(x1)
5. ~Pompeian(x2) V died(x2, 79)
6. erupted(volcano, 79)
7. ~mortal(x3) V ~born(x3, t1) V ~gt(t2 - t1, 150) V dead(x3, t2)
8. now = 2002
9. ~alive(x4, t3) V ~dead(x4, t3)
10. dead(x5, t4) V alive(x5, t4)
11. ~died(x6, t5) V ~gt(t6, t5) V dead(x6, t6)
Prove: ~alive(Marcus, now)
Translating English to FOL
1. Every gardener likes the sun.
2. Not every gardener likes the sun.
3. You can fool some of the people all of the time.
4. You can fool all of the people some of the time.
5. You can fool all of the people at the same time.
6. You cannot fool all of the people all of the time.
7. Everyone is younger than his father.
8. All purple mushrooms are poisonous.
9. No purple mushroom is poisonous.
10. There are exactly two purple mushrooms.
11. Clinton is not tall.
12. X is above Y if X is directly on top of Y or there is a pile of one or more other objects directly on top of one another starting with X and ending with Y.
13. No one likes everyone.
Rule-based Systems
A rule based system is also called a production system.
A production rule has one of the following forms:
IF situation THEN action
IF premise THEN conclusion
IF antecedent THEN consequent
Rule-based systems are the most popular type of expert systems.
Two inference methods are used in rule-based systems:
• Forward reasoning (forward chaining, data driven reasoning): start with known data and progress to a conclusion.
• Backward reasoning (backward chaining, goal driven reasoning): start with a possible conclusion and try to prove its validity by searching for evidence.
Why are rule-based systems more popular?
1. Modular nature (easy to expand)
2. Explanation facilities easily implemented (by keeping track of the rules that fire)
3. Similarity to human cognitive process (work of Newell and Simon)
Forward chaining: starts with the data available and uses the inference rules to conclude more data until a desired goal is reached.
An inference engine using forward chaining searches the inference rules until it finds one in which the if-clause is known to be true.
It then concludes the then-clause and adds this information to its data. It continues to do this until a goal is reached. Because the data available determines which inference rules are used, this method is also called data driven.
A Simple Example
R1: IF hot AND smoky THEN ADD fire
R2: IF alarm_beeps THEN ADD smoky
R3: If fire THEN ADD switch_on_sprinklers
F1: alarm_beeps [Given]
F2: hot [Given]
F3: smoky [from F1 by R2]
F4: fire [from F2, F3 by R1]
F5: switch_on_sprinklers [from F4 by R3]
A typical Forward Chaining example
Forward Chaining Algorithm
Start from a set of facts (data available) and check to see if the premises of any rules are satisfied.
If there is a match then the rule fires (is executed).
The steps followed in forward chaining are:
1. Matching: Compare rules with known facts and find rules that are satisfied.
2. Conflict Resolution: More than one rule may be satisfied. Conflict resolution is the process of selecting the one with highest priority for execution.
3. Execution: The rule selected is executed (fired). This may result in new facts being added, and the process continues forward.
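The match, conflict resolution and execution cycle can be sketched in Python using the sprinkler rules from the simple example. The encoding of a rule as a `(name, premises, conclusion)` tuple and the conflict-resolution policy (take the first satisfied rule in listing order) are assumptions made for this illustration; real systems use richer priority schemes.

```python
RULES = [
    ("R1", {"hot", "smoky"}, "fire"),
    ("R2", {"alarm_beeps"}, "smoky"),
    ("R3", {"fire"}, "switch_on_sprinklers"),
]


def forward_chain(rules, facts):
    """Fire rules until no rule can add a new fact; return facts and firing order."""
    facts = set(facts)
    trace = []
    while True:
        # Matching: rules whose premises all hold and whose conclusion is new.
        conflict_set = [r for r in rules if r[1] <= facts and r[2] not in facts]
        if not conflict_set:
            return facts, trace
        # Conflict resolution: here simply the first rule in listing order.
        name, _, conclusion = conflict_set[0]
        # Execution: fire the rule and add the new fact.
        facts.add(conclusion)
        trace.append(name)


facts, trace = forward_chain(RULES, {"alarm_beeps", "hot"})
print(trace)
print(sorted(facts))
```

Starting from the two given facts, the rules fire in the order R2, R1, R3, matching the derivation F3, F4, F5 in the example.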
Backward chaining: starts with a list of goals and works backwards to see if there is data which will allow it to conclude any of these goals. An inference engine using backward chaining searches the inference rules until it finds one which has a then-clause that matches a desired goal. If the if-clause of that inference rule is not known to be true, then it is added to the list of goals.
Backward Chaining
The same rules/facts may be processed differently, using a backward chaining interpreter.
Backward chaining means reasoning from goals back to facts.
The idea is that this focuses the search.
Checking hypothesis
Should I switch the sprinklers on?
Example
Rules:
R1: IF hot AND smoky THEN fire
R2: IF alarm_beeps THEN smoky
R3: IF fire THEN switch_on_sprinklers
Facts:
F1: hot
F2: alarm_beeps
Goal:
Should I switch sprinklers on? (switch_on_sprinklers)
(Figure: goal tree. switch_on_sprinklers is reduced to fire; fire is reduced to hot and smoky; smoky is reduced to alarm_beeps.)
Backward Chaining Algorithm
To prove goal G:
If G is in the initial facts, it is proven.
Otherwise, find a rule which can be used to conclude G, and try to prove each of that rule's conditions.
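The algorithm translates almost directly into a recursive function. In this Python sketch the rule encoding as `(name, premises, conclusion)` tuples is an assumption, and the `pending` set of goals currently being proved is an addition that is not part of the algorithm as stated; it guards against infinite chaining when rules refer to each other in a cycle.

```python
RULES = [
    ("R1", {"hot", "smoky"}, "fire"),
    ("R2", {"alarm_beeps"}, "smoky"),
    ("R3", {"fire"}, "switch_on_sprinklers"),
]


def prove(goal, rules, facts, pending=frozenset()):
    """Return True if the goal follows from the facts by backward chaining."""
    if goal in facts:
        return True              # G is in the initial facts
    if goal in pending:
        return False             # already trying to prove this goal: avoid a loop
    pending = pending | {goal}
    # Try every rule that concludes the goal; all of its conditions must be proved.
    return any(
        all(prove(p, rules, facts, pending) for p in premises)
        for _, premises, conclusion in rules
        if conclusion == goal
    )


print(prove("switch_on_sprinklers", RULES, {"hot", "alarm_beeps"}))
print(prove("switch_on_sprinklers", RULES, {"hot"}))
```

Only rules relevant to the goal are ever examined, which is the sense in which backward chaining focuses the search.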
Advantages of Rule Based Systems
• Modularity: Each rule is a separate unit. This makes adding, editing or removing rules easy, giving great flexibility to the system.
• Uniformity: The same format is used for representing all of the knowledge.
• Naturalness: In many domains rules are used to express the knowledge.
Disadvantages of Rule Based Systems
• Infinite chaining
• Addition of new contradictory knowledge
• Modification of existing knowledge
• Inefficiency
• Large number of rules needed to cover some domains (e.g. air traffic control)
Forward vs Backward Chaining
The choice depends on the problem and on properties of the rule set.
If you have clear hypotheses, backward chaining is likely to be better.
• Goal driven
• Diagnostic problems or classification problems
• Medical expert systems
Forward chaining may be better if you have less clear hypotheses and want to see what can be concluded from the current situation.
• Data driven
• Synthesis systems
• Design / configuration
Problem Reduction
Sometimes problems only seem hard to solve. A hard problem may be one that can be reduced to a number of simple problems, and when each of the simple problems is solved, the hard problem has been solved. This is the basic intuition behind the method of problem reduction.
In problem reduction search the problem space consists of an AND/OR graph of (partial) state pairs. These pairs are referred to as (sub)problems. The first element of the pair is the starting state of the (sub)problem and the second element of the pair is the goal state of the (sub)problem.
There are two types of generators: non-terminal rules and terminal rules. Non-terminal rules decompose a problem pair <s0, g0> into an ANDed set of problem pairs {<si, gi>, <sj, gj>, ...}. The assumption is that the subproblems are in some sense simpler than the problem itself. The set is referred to as an ANDed set because the solution of all of the subproblems implies that the problem has been solved. Note that all of the subproblems must be solved in order to solve the parent problem.
Any subproblem may itself be decomposed into subproblems. But, in order for this method to succeed, all subproblems must eventually terminate in primitive subproblems.
A primitive subproblem is one which cannot be decomposed (i.e., there is no non-terminal rule that is applicable to the subproblem) and whose solution is simple or direct. The terminal rules serve as recognizers of primitive subproblems.
Problem reduction methods
• I want to be a famous musician
– Learn to sing
– Learn to play the guitar
– Learn to play the bass
– Learn to play drums
• If I want to play the guitar what do I do?
– Buy a guitar
– Take lessons
– Practice
(Figure: AND/OR tree. Musician branches to Singer, Guitarist, Bass player and Drummer; Guitarist branches to Buy Guitar, Take lessons and Practice.)
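An AND/OR tree of this kind can be searched with a short recursive function. The Python sketch below assumes, for illustration, that the four ways of becoming a musician are alternatives (an OR node) while the three guitar steps must all be completed (an AND node); the notes do not state which branches are AND and which are OR. The dictionary encoding and the names `reductions` and `primitives` are also assumptions, and the tree is assumed to contain no cycles.

```python
# Each problem maps to a list of alternative decompositions (OR);
# each decomposition is a list of subproblems that must all be solved (AND).
reductions = {
    "Musician": [["Singer"], ["Guitarist"], ["Bass player"], ["Drummer"]],
    "Guitarist": [["Buy Guitar", "Take lessons", "Practice"]],
}


def solvable(problem, reductions, primitives):
    """True if the problem is primitive or some decomposition is fully solvable."""
    if problem in primitives:
        return True   # terminal rule: recognised as a primitive subproblem
    return any(
        all(solvable(sub, reductions, primitives) for sub in subproblems)
        for subproblems in reductions.get(problem, [])
    )


print(solvable("Musician", reductions, {"Buy Guitar", "Take lessons", "Practice"}))
print(solvable("Musician", reductions, {"Buy Guitar", "Practice"}))
```

With all three guitar steps available as primitives the top-level problem is solvable through the Guitarist branch; removing "Take lessons" breaks the only AND set that could succeed.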
What is an Expert System?
Jackson (1999) provides us with the following definition:
An expert system is a computer program that represents and reasons with knowledge of some
specialist subject with a view to solving problems or giving advice. To solve expert-level
problems, expert systems will need efficient access to a substantial domain knowledge base,
and a reasoning mechanism to apply the knowledge to the problems they are given. Usually
they will also need to be able to explain, to the users who rely on them, how they have reached
their decisions. They will generally build upon the ideas of knowledge representation,
production rules, search, and so on, that we have already covered. Often we use an expert
system shell which is an existing knowledge independent framework into which domain
knowledge can be inserted to produce a working expert system. We can thus avoid having to
program each new system from scratch.
Typical Tasks for Expert Systems
There are no fundamental limits on what problem domains an expert system can be built
to deal with. Some typical existing expert system tasks include:
1. The interpretation of data, such as sonar data or geophysical measurements
2. Diagnosis of malfunctions, such as equipment faults or human diseases
3. Structural analysis or configuration of complex objects, such as chemical compounds or computer systems
4. Planning sequences of actions, such as might be performed by robots
5. Predicting the future, such as weather, share prices, exchange rates
However, these days, "conventional" computer systems can also do some of these things.
Characteristics of Expert Systems
Expert systems can be distinguished from conventional computer systems in that:
1. They simulate human reasoning about the problem domain, rather than simulating
the domain itself.
2. They perform reasoning over representations of human knowledge, in addition to
doing numerical calculations or data retrieval. They have corresponding distinct modules
referred to as the inference engine and the knowledge base.
3. Problems tend to be solved using heuristics (rules of thumb) or approximate methods or
probabilistic methods which, unlike algorithmic solutions, are not guaranteed to result in a
correct or optimal solution.
4. They usually have to provide explanations and justifications of their solutions or
recommendations in order to convince the user that their reasoning is correct.
Note that the term Intelligent Knowledge Based System (IKBS) is sometimes used as a
synonym for Expert System.
The Architecture of Expert Systems
The process of building expert systems is often called knowledge engineering. The
knowledge engineer is involved with all components of an expert system:
Building expert systems is generally an iterative process. The components and their
interaction will be refined over the course of numerous meetings of the knowledge
engineer with the experts and users. We shall look in turn at the various components.
Expert System Shells
An expert system shell is an expert system with an empty knowledge base, i.e.
• An inference engine
• User interface module
• Tracer/explanation module
• Knowledge base (rule) editor
• Etc.
Examples of shells: EXSYS, KEE, OPS5, KAS, …
EMYCIN is the shell of MYCIN.
It is important to start with a shell with a suitable control strategy.
Recent trends are towards shells that include multiple engines, making them more flexible.
Knowledge Acquisition
The knowledge acquisition component allows the expert to enter their knowledge or
expertise into the expert system, and to refine it later as and when required. Historically, the
knowledge engineer played a major role in this process, but automated systems that allow the
expert to interact directly with the system are becoming increasingly common.
The knowledge acquisition process usually comprises three principal stages:
1. Knowledge elicitation is the interaction between the expert and the knowledge
engineer/program to elicit the expert knowledge in some systematic way.
2. The knowledge thus obtained is usually stored in some form of human friendly intermediate
representation.
3. The intermediate representation of the knowledge is then compiled into an executable form
(e.g. production rules) that the inference engine can process. In practice, many iterations
through these three stages are usually required!
Knowledge Elicitation
The knowledge elicitation process itself usually consists of several stages:
1. Find out as much as possible about the problem and domain from books, manuals, etc. In
particular, become familiar with any specialist terminology and jargon.
2. Try to characterise the types of reasoning and problem solving tasks that the system will be
required to perform.
3. Find an expert (or set of experts) that is willing to collaborate on the project. Sometimes
experts are frightened of being replaced by a computer system!
4. Interview the expert (usually many times during the course of building the system). Find out
how they solve the problems your system will be expected to solve. Have them check and refine
your intermediate knowledge representation.
This is a time intensive process, and automated knowledge elicitation and machine
learning techniques are increasingly common modern alternatives.
Stages of Knowledge Acquisition
The iterative nature of the knowledge acquisition process can be represented in the
following diagram:
Levels of Knowledge Analysis
Knowledge identification: Use in depth interviews in which the knowledge engineer encourages
the expert to talk about how they do what they do. The knowledge engineer should understand
the domain well enough to know which objects and facts need talking about.
Knowledge conceptualization: Find the primitive concepts and conceptual relations of
the problem domain.
Epistemological analysis: Uncover the structural properties of the conceptual knowledge, such
as taxonomic relations (classifications).
Logical analysis: Decide how to perform reasoning in the problem domain. This kind
of knowledge can be particularly hard to acquire.
Implementational analysis: Work out systematic procedures for implementing and testing the
system.
Capturing Tacit/Implicit Knowledge
One problem that knowledge engineers often encounter is that the human experts use
tacit/implicit knowledge (e.g. procedural knowledge) that is difficult to capture.
There are several useful techniques for acquiring this knowledge:
1. Protocol analysis: Tape-record the expert thinking aloud while performing their role and later
analyse this. Break down their protocol/account into the smallest atomic units of thought,
and let these become operators.
2. Participant observation: The knowledge engineer acquires tacit knowledge through practical
domain experience with the expert.
3. Machine induction: This is useful when the experts are able to supply examples of the results
of their decision making, even if they are unable to articulate the underlying knowledge or
reasoning process.
Which is/are best to use will generally depend on the problem domain and the expert.
Representing the Knowledge
We have already looked at various types of knowledge representation. In general, the
knowledge acquired from our expert will be formulated in two ways:
1. Intermediate representation – a structured knowledge representation that the
knowledge engineer and expert can both work with efficiently.
2. Production system – a formulation that the expert system’s inference engine can
process efficiently.
It is important to distinguish between:
1. Domain knowledge – the expert’s knowledge which might be expressed in the
form of rules, general/default values, and so on.
2. Case knowledge – specific facts/knowledge about particular cases, including any
derived knowledge about the particular cases.
The system will have the domain knowledge built in, and will have to integrate this with
the different case knowledge that will become available each time the system is used.
The Inference Engine
We have already looked at production systems, and how they can be used to generate
new information and solve problems.
Recall the steps in the basic Recognize Act Cycle:
1. Match the premise patterns of the rules against elements in the working memory.
Generally the rules will be domain knowledge built into the system, and the
working memory will contain the case based facts entered into the system, plus
any new facts that have been derived from them.
2. If there is more than one rule that can be applied, use a conflict resolution
strategy to choose one to apply. Stop if no further rules are applicable.
3. Activate the chosen rule, which generally means adding/deleting an item to/from
working memory. Stop if a terminating condition is reached, or return to step 1.
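The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming rules with a set of premises and a single conclusion, and assuming "first applicable rule wins" as the conflict resolution strategy; the rule names and facts are hypothetical and not taken from any real expert system.

```python
def recognize_act(rules, facts):
    """Run a forward-chaining recognize-act cycle until no rule can fire.

    rules: list of (name, premises, conclusion); premises is a set of facts.
    facts: initial working memory (the case knowledge).
    Returns the final working memory and the names of the rules fired, in order.
    """
    memory = set(facts)
    fired = []
    while True:
        # Step 1: match the rule premises against working memory.
        conflict_set = [
            (name, conclusion)
            for name, premises, conclusion in rules
            if premises <= memory and conclusion not in memory
        ]
        # Step 2: stop if nothing applies; otherwise resolve the conflict
        # by taking the first applicable rule in rule order (an assumption).
        if not conflict_set:
            return memory, fired
        name, conclusion = conflict_set[0]
        # Step 3: act by adding the conclusion to working memory.
        memory.add(conclusion)
        fired.append(name)


# Hypothetical domain knowledge (rules) and case knowledge (initial facts).
RULES = [
    ("r1", {"has feathers", "lays eggs"}, "is a bird"),
    ("r2", {"is a bird", "cannot fly", "swims"}, "is a penguin"),
]
memory, fired = recognize_act(
    RULES, {"has feathers", "lays eggs", "cannot fly", "swims"}
)
print(fired)  # ['r1', 'r2']
```

The list of fired rules is also what an explanation component could print to show how a conclusion was reached.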
Early production systems spent over 90% of their time doing pattern matching, but there
is now a solution to this efficiency problem:
The Rete-Algorithm
The naïve approach to the recognize-act cycle tries to match all E elements in working
memory against all P premises in all R rules, so it is necessary to check all E × P × R
possible matches in each cycle.
However, the rules will generally have parts of their conditions in common (structural
similarity), and the application of any one rule will generally only slightly change the
working memory (temporal redundancy).
These facts are used by the Rete Algorithm to improve efficiency (‘rete’ is Latin for
‘net’). The condition parts of the rules are structured into a network. For the first cycle,
the match algorithm generates the initial conflict set by processing the network for all
the initial facts. Thereafter, only the changed elements in working memory need to be
fed through the network to determine the changes to the conflict set.
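The idea of feeding only the changed elements through the network can be illustrated with a much simplified sketch. It is not the Rete algorithm itself: it handles only additions of ground facts, has no variables, and does not share condition nodes between rules. It only shows how an index from premises to rules lets a new fact update the conflict set without re-matching everything. The class name and rule names are hypothetical.

```python
from collections import defaultdict


class IncrementalMatcher:
    """Keeps the conflict set up to date as facts are added to working memory."""

    def __init__(self, rules):
        # rules: dict mapping rule name -> set of premise facts.
        self.rules = rules
        self.memory = set()
        self.satisfied = defaultdict(int)     # rule name -> premises matched so far
        self.rules_using = defaultdict(list)  # premise -> rules that mention it
        for name, premises in rules.items():
            for premise in premises:
                self.rules_using[premise].append(name)
        self.conflict_set = set()

    def add_fact(self, fact):
        """Feed one new fact through the index; only affected rules are touched."""
        if fact in self.memory:
            return
        self.memory.add(fact)
        for name in self.rules_using.get(fact, ()):
            self.satisfied[name] += 1
            if self.satisfied[name] == len(self.rules[name]):
                self.conflict_set.add(name)


matcher = IncrementalMatcher({"r1": {"a", "b"}, "r2": {"b", "c"}})
matcher.add_fact("a")
matcher.add_fact("b")
print(matcher.conflict_set)  # {'r1'}
```

Adding the fact "c" afterwards touches only rule r2, which is the temporal-redundancy saving described above.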
The User Interface
The Expert System user interface usually comprises two basic components:
1. The Interviewer Component
This controls the dialog with the user and/or allows any measured data to be read
into the system. For example, it might ask the user a series of questions, or it
might read a file containing a series of test results.
2. The Explanation Component
This gives the system’s solution, and also makes the system’s operation transparent
by providing the user with information about its reasoning process. For example, it
might output the conclusion, and also the sequence of rules that was used to come
to that conclusion. It might instead explain why it could not reach a conclusion.
So that is how we go about building expert systems. In the next two weeks we shall see
how they can handle uncertainty and be improved by incorporating machine learning.
Question:- Explain different types of ES.
Ans. Expert systems appear in many varieties. The following classification of ES is not exclusive, that is, one ES can appear in several categories:
1. Expert Systems and Knowledge-based Systems
An ES is one whose behaviour is so sophisticated that we would call a person who performed in a similar manner an expert. MYCIN and XCON are good examples.
In the business world, however, systems are emerging that can perform tasks effectively and efficiently for whose execution you really do not need an
expert.
Such systems are referred to as Knowledge-based Systems. They are also
known as advisory systems or knowledge systems.
For example, let us look at a system that gives advice on immunizations recommended for travel abroad. The advice depends upon many attributes
such as age, sex and the health of the traveler and the country of destination. Knowledge systems can be constructed more quickly and cheaply than
expert systems.
2. Rule-based expert system
Many commercial ES are rule based, because the technology of rule-based systems is relatively well developed. In such systems the knowledge is
represented as a series of production rules.
MYCIN is a classic example of a rule-based ES.
3. Frame-based system
In these systems, the knowledge is represented as frames, a representation in the style of the object-oriented programming approach.
4. Hybrid Systems
These systems include several knowledge representation approaches, at minimum frames and rules, but usually more.
5. Model-based Systems
Model-based systems are structured around a model that simulates the structure and function of the system under study. The model is used to
compute values, which are compared to observed ones. The comparison triggers action (if needed) or further diagnosis.
6. Ready-made (Off-the-Shelf) Systems
ES can be developed to meet the particular needs of a user (custom made), or they can be purchased as ready-made packages for any user. Ready-made
systems are similar to application packages like an accounting general ledger or project management in operations management. Ready-made systems enjoy the economy of mass production and therefore are considerably
less expensive than customized systems. They also can be used as soon as they are purchased. Unfortunately, ready-made systems are very general in nature, and the advice they render may not be of value to a user involved in a
complex situation.
7. Real-time Expert Systems
Real-time systems are systems in which there is a strict time limit on the system's response time, which must be fast enough for use to control the
process being computerized.
Case Study (MYCIN)
An example Goal-driven Medical Diagnostic Expert System
(taken from Luger and Stubblefield section 8.4)
Purpose:
Diagnose and recommend treatment for meningitis and bacteremia (more quickly
than definitive lab tests).
Explore how human experts reason with missing and incomplete information.
History
mid- to late '70s
50 person years
Stanford medical school
Comprehensively evaluated
Never used clinically
Widely documented ("Rule-based expert systems", Buchanan and Shortliffe, Stanford 1984, a collection of publications on MYCIN).
Representation
Facts: <attribute-object-value-(certainty)>
(ident organism-1 klebsiella .25)
    there is evidence (.25) that the identity of organism-1 is klebsiella
(sensitive organism-1 penicillin -1.0)
    it is known that org-1 is NOT sensitive to penicillin.
Rules: condition-action pairs
Condition is a conjunction (AND) of facts
IF: (AND (same-context infection primary-bacteremia)
         (membf-context site sterilesite)
         (same-context portal GI))
THEN: (conclude context- ident bacteroid tally .7)
If the infection is primary bacteremia and the site of the culture is a sterile one and the suspected portal of entry is GI tract then there is suggestive evidence (.7) that infection is bacteroid.
Consequent (then-part) can:
Add facts to database
Write to terminal
Change a value in a fact, or its certainty
Lookup a table
Execute a LISP procedure
Operation:
Routine questions
Specific questions about symptoms
Depth-first goal driven consideration of each "known" organism
Terminates "depth-search" when certainty measures get too low.
Selection criterion is to maximise certainty - if a rule can prove a goal with certainty 1 then no more rules need be considered.
Goal-driven so that questions appear to be directed - less frustrating, more confidence building for the user.
English-like interaction (see handout).
Answers WHY by printing the rule under consideration.
Exhaustive consideration of possible infections - patient may have more than one.
Uncertainty in MYCIN
If A: stain is gram positive
and B: morphology is coccus
and C: growth conformation is chains
then there is suggestive evidence (0.7) that
H: organism is streptococcus
0.7 is the measure of increase of belief (MB) of H given evidence A and B and C.
MB ranges 0 to 1. Assigned by subjective judgement usually. As a guide:
\[ MB(H|E) = \begin{cases} 1 & \text{if } P(H) = 1 \\ \dfrac{\max[P(H|E), P(H)] - P(H)}{\max[1,0] - P(H)} & \text{otherwise} \end{cases} \]
Measures of disbelief (MD) are also allowed. These also range 0 to 1:
\[ MD(H|E) = \begin{cases} 1 & \text{if } P(H) = 0 \\ \dfrac{\min[P(H|E), P(H)] - P(H)}{\min[1,0] - P(H)} & \text{otherwise} \end{cases} \]
Note if E and H are independent, E does not change the belief in H: P(H|E) = P(H), so MB = MD = 0.
MB(H|E) should only be 1 if E logically implies H. Initially each hypothesis has MB=MD=0.
As evidence is accumulated these are updated. At the end a certainty factor CF = MB - MD is computed for each hypothesis, so CF ranges from -1 to 1.
The largest absolute CF values are used to determine appropriate therapy. Weakly supported hypotheses (|CF| < 0.2) are ignored.
MYCIN's handling of uncertainty is an ad-hoc method (based on probability). But it seems to work as well as more formal approaches.
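As a rough illustration of the guide formulas, the sketch below computes MB, MD and CF from a prior P(H) and a posterior P(H|E). It follows the standard definitions, in which MB is 1 when P(H) = 1 and MD is 1 when P(H) = 0. The function names and the example probabilities are hypothetical; in MYCIN itself the values were usually assigned by subjective judgement rather than computed.

```python
def measure_of_belief(p_h, p_h_given_e):
    """MB(H|E): relative increase in belief in H caused by evidence E."""
    if p_h == 1:
        return 1.0
    # max[1, 0] - P(H) is simply 1 - P(H).
    return (max(p_h_given_e, p_h) - p_h) / (1 - p_h)


def measure_of_disbelief(p_h, p_h_given_e):
    """MD(H|E): relative decrease in belief in H caused by evidence E."""
    if p_h == 0:
        return 1.0
    # (min[P(H|E), P(H)] - P(H)) / (min[1, 0] - P(H)), with both signs flipped.
    return (p_h - min(p_h_given_e, p_h)) / p_h


def certainty_factor(p_h, p_h_given_e):
    """CF = MB - MD, ranging from -1 to 1."""
    return measure_of_belief(p_h, p_h_given_e) - measure_of_disbelief(p_h, p_h_given_e)


# Hypothetical numbers: evidence raises P(H) from 0.25 to 0.625.
print(certainty_factor(0.25, 0.625))  # 0.5
# Independent evidence leaves the belief unchanged, so CF is 0.
print(certainty_factor(0.25, 0.25))   # 0.0
```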
LEARNING
Learning is the study of how to build computer systems that adapt and improve with experience. It is a subfield of Artificial Intelligence and intersects with cognitive science, information theory, and probability theory, among others.
Machine learning is particularly attractive in several real-life problems for the
following reasons:
• Some tasks cannot be defined well except by example
• Working environment of machines may not be known at design time
• Explicit knowledge encoding may be difficult and not available
• Environments change over time
Learning is widely used in a number of application areas, including:
• Data mining and knowledge discovery
• Speech/image/video (pattern) recognition
• Adaptive control
• Autonomous vehicles/robots
• Decision support systems
• Bioinformatics
• WWW
Rote learning
Rote learning is a learning technique which avoids understanding of a subject and
instead focuses on memorization. The major practice involved in rote learning is learning by repetition. The idea is that one will be able to quickly recall the meaning of the material the more one repeats it.
Rote learning is widely used in the mastery of foundational knowledge. Examples include, phonics in reading, the periodic table in chemistry, multiplication tables in mathematics, anatomy in medicine, cases or statutes in law, basic formulas in any
science, etc. Rote learning, by definition, eschews comprehension, however, and consequently, it is an ineffective tool in mastering any complex subject at an advanced level. However, rote learning is still useful in passing exams. If exam papers are not well
designed, it is possible for someone with good memorization techniques to pass the test without any meaningful comprehension of the subject. However, learning the context of a
particular topic can make the subject more memorable.
However, with some material rote learning is the only way to learn it in a timely manner; for example, when learning the Greek alphabet or the vocabulary of a foreign language.
Similarly, when learning the conjugation of foreign irregular verbs, the morphology is often too subtle to be learned explicitly in a short time. However, as in the alphabet example, learning where the alphabet came from helps one to grasp the concept of it and
therefore memorize it. (Native speakers and speakers with a lot of experience usually get
an intuitive grasp of those subtle rules and are able to conjugate even irregular verbs that they have never heard before.)
The source transmission could be auditory or visual, and is usually in the form of short
bits such as rhyming phrases (but rhyming is not a prerequisite), rather than chunks of text large enough to make lengthy paragraphs. Brevity is not always the case with rote
learning. For example, many Americans can recite their National Anthem, or even the much more lengthy Preamble to the United States Constitution. Their ability to do so can be attributed, at least in some part, to having been assimilated by rote learning. The
repeated stimulus of hearing it recited in public, on TV, at a sporting event, etc. has caused the mere sound of the phrasing of the words and inflections to be "written", as if
hammer-to-stone, into the long-term memory.
Inductive learning
Inductive learning is an inherently conjectural process because any knowledge created by generalization from specific facts cannot be proven true; it can only be proven false. Hence, inductive inference is falsity preserving, not truth preserving.
To generalize beyond the specific training examples, we need constraints or biases on which hypothesis f (a candidate function from examples to outputs) is best. That is, learning can be viewed as searching the Hypothesis Space H of possible f functions.
A bias allows us to choose one f over another one.
A completely unbiased inductive algorithm could only memorize the training examples and could not say anything more about other unseen examples.
Two types of biases are commonly used in machine learning:
o Restricted Hypothesis Space Bias: allow only certain types of f functions, not arbitrary ones.
o Preference Bias: define a metric for comparing fs so as to determine whether one is better than another.
Inductive Learning Framework
Raw input data from sensors are preprocessed to obtain a feature vector, x, that adequately describes all of the relevant features for classifying examples.
Each x is a list of (attribute, value) pairs. For example,
x = (Person = Sue, Eye-Color = Brown, Age = Young, Sex = Female)
The number of attributes (also called features) is fixed (positive, finite). Each attribute has a fixed, finite number of possible values.
Each example can be interpreted as a point in an n-dimensional feature space, where n is the number of attributes.
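To make the feature-space view concrete, the sketch below stores an example as (attribute, value) pairs and maps it to a point whose coordinates are the positions of the values in each attribute's list of possible values. The value lists are hypothetical, and the Person attribute is left out here on the assumption that it only identifies the example.

```python
# Hypothetical attributes, each with a fixed, finite list of possible values.
ATTRIBUTES = {
    "Eye-Color": ["Brown", "Blue", "Green"],
    "Age": ["Young", "Old"],
    "Sex": ["Female", "Male"],
}


def to_point(example):
    """Map an example to a point in n-dimensional feature space (n = 3 here)."""
    return tuple(ATTRIBUTES[name].index(example[name]) for name in ATTRIBUTES)


x = {"Eye-Color": "Brown", "Age": "Young", "Sex": "Female"}
print(to_point(x))  # (0, 0, 0)
```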
Decision tree learning
Decision tree learning, used in data mining and machine learning, uses a decision tree as a predictive model
which maps observations about an item to conclusions about the item's target value. More
descriptive names for such tree models are classification trees or regression trees. In these
tree structures, leaves represent classifications and branches represent conjunctions of
features that lead to those classifications.
In decision theory and decision analysis, a decision tree is a graph or model of decisions and
their possible consequences, including chance event outcomes, resource costs, and utility. It
can be used to create a plan to reach a goal. Decision trees are constructed in order to help
with making decisions. A decision tree is a special form of tree structure.
Another use of trees is as a descriptive means for calculating conditional probabilities.
In decision analysis, a decision tree can be used to visually and explicitly represent decisions
and decision making.
In data mining, a decision tree describes data but not decisions; rather the resulting
classification tree can be an input for decision making. This section deals with decision trees in data
mining.
Practical example
David is the manager of a famous golf club. Sadly, he is having some trouble with his
customer attendance. There are days when everyone wants to play golf and the staff are
overworked. On other days, for no apparent reason, no one plays golf and staff have too
much slack time. David’s objective is to optimize staff availability by trying to predict when
people will play golf.
To accomplish that he needs to understand the reason people decide to play and if there is
any explanation for that.
He assumes that weather must be an important underlying factor, so he decides to use the
weather forecast for the upcoming week.
So for two weeks he recorded:
• The outlook, whether it was sunny, overcast or raining.
• The temperature (in degrees Fahrenheit).
• The relative humidity in percent.
• Whether it was windy or not.
• Whether people attended the golf club on that day.
David compiled this dataset into a table containing 14 rows and 5 columns as shown below.
A decision tree is a model of the data that encodes the distribution of the class label (the
dependent variable Y) in terms of the predictor attributes. It is a directed acyclic graph in the form of a tree. The
top node represents all the data.
The classification tree algorithm concludes that the best way to explain the dependent
variable, play, is by using the variable "outlook". Using the categories of the variable
outlook, three different groups were found:
• One that plays golf when the weather is sunny,
• One that plays when the weather is overcast, and
• One that plays when it's raining.
David's first conclusion: if the outlook is overcast people always play golf, and there are
some fanatics who play golf even in the rain. Then he divided the sunny group in two. He
realized that people don't like to play golf if the humidity is higher than seventy percent.
Finally, he divided the rain category in two and found that people will also not play golf if it
is windy.
And lastly, here is the short solution of the problem given by the classification tree:
David dismisses most of the staff on days that are sunny and humid or on rainy days that are
windy, because almost no one is going to play golf on those days.
On days when a lot of people will play golf, he hires extra staff.
The conclusion is that the decision tree helped David turn a complex data representation
into a much easier structure.
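The tree that David arrives at can be written down as a small function. This is a sketch of the rules as described in the text only (overcast: always play; sunny: play unless humidity is above seventy percent; rain: play unless it is windy); the function name and the value labels are assumptions, and the tree is hard-coded here rather than learned from the 14-row table.

```python
def will_play(outlook, humidity, windy):
    """Predict whether people play golf, following the tree described above."""
    if outlook == "overcast":
        return True                # people always play when it is overcast
    if outlook == "sunny":
        return humidity <= 70      # no play when humidity is above 70 percent
    if outlook == "rain":
        return not windy           # fanatics play in the rain unless it is windy
    raise ValueError(f"unknown outlook: {outlook!r}")


print(will_play("sunny", 85, False))    # False
print(will_play("rain", 80, False))     # True
print(will_play("overcast", 90, True))  # True
```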
Explanation based learning
Explanation based learning (EBL) uses an explicit domain theory to construct an explanation or
proof of a training example. By then generalizing from the proof, new knowledge is acquired
that can be applied in non-training situations.
This differs from inductive learning in that the domain theory implies the new knowledge. It is sometimes called deductive learning or analytic learning.
Example: Learning When an Object is a Cup
Target Concept: cup(C) :- premise(C).
Domain Theory:
cup(X) :- liftable(X),
          holds_liquid(X).
holds_liquid(Z) :- part(Z,W),
                   concave(W),
                   points_up(W).
liftable(X) :- light(X), part(X,handle).
light(X) :- small(X).
light(X) :- made_of(X,feathers).
Note that the domain theory includes the knowledge needed to determine when
something is a cup. We want an explicit rule that specifies when something is a cup.
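A minimal sketch of the EBL step for this example follows. It assumes the liftable rule refers to a handle that is part of the same object X, uses a hypothetical training example (cup1 with a bowl and an irrelevant colour), and hard-codes the domain theory instead of using a general theorem prover. The proof collects only the facts that were needed, and generalization replaces the example's constants with variables.

```python
THEORY_CONSTANTS = {"handle", "feathers"}  # constants named in the domain theory


def explain_cup(obj, facts):
    """Prove cup(obj) from the domain theory; return the leaf facts used, or None."""
    used = []

    def holds(fact):
        if fact in facts:
            used.append(fact)
            return True
        return False

    # light(X) :- small(X).   light(X) :- made_of(X, feathers).
    light = holds(("small", obj)) or holds(("made_of", obj, "feathers"))
    # liftable(X) :- light(X), part(X, handle).
    liftable = light and holds(("part", obj, "handle"))
    if not liftable:
        return None
    # holds_liquid(Z) :- part(Z, W), concave(W), points_up(W).
    for fact in sorted(facts):
        if fact[0] == "part" and fact[1] == obj:
            w = fact[2]
            if ("concave", w) in facts and ("points_up", w) in facts:
                return used + [fact, ("concave", w), ("points_up", w)]
    return None


def generalize(explanation, obj):
    """Turn the facts used in the proof into an explicit rule for cup(X)."""
    names = {obj: "X"}
    literals = []
    for pred, *args in explanation:
        for a in args:
            if a not in THEORY_CONSTANTS and a not in names:
                names[a] = "W"  # only one other object occurs in this theory
        literals.append(f"{pred}({', '.join(names.get(a, a) for a in args)})")
    return "cup(X) :- " + ", ".join(literals) + "."


FACTS = {
    ("small", "cup1"), ("part", "cup1", "handle"), ("part", "cup1", "bowl"),
    ("concave", "bowl"), ("points_up", "bowl"), ("color", "cup1", "red"),
}
rule = generalize(explain_cup("cup1", FACTS), "cup1")
print(rule)
# cup(X) :- small(X), part(X, handle), part(X, W), concave(W), points_up(W).
```

Note that the colour fact does not appear in the learned rule, because it played no part in the explanation.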
What is Natural language processing?
Ans. ``Natural'' languages are human languages, such as English, German, or Chinese.
Understanding text (in machine-readable form), e.g. "What customers ordered widgets in May?"
Understanding continuous speech: perception as well as language understanding.
Language generation (written or spoken).
Machine translation, e.g., German to English: [METAL system, University of Texas Linguistics Research Center.]
Vor dem Headerfeld befindet sich eine
Praeambel von 42 Byte Laenge fuer den
Ausgleich aller Toleranzen.
-->
A preamble of 42 byte length for the
adjustment of all tolerances is found
in front of the header field.
Que:- Explain Grammar. Write grammar for English sentences.
Ans. A grammar specifies the legal syntax of a language. The kind of grammar most
often used in computer language processing is a context-free grammar. A grammar specifies a set of productions; non-terminal symbols (phrase names or parts of speech) are enclosed in angle brackets. Each production specifies how a nonterminal symbol may
be replaced by a string of terminal or nonterminal symbols, e.g., a Sentence is composed of a Noun Phrase followed by a Verb Phrase.
<s> --> <np> <vp>
<np> --> <art> <adj> <noun>
<np> --> <art> <noun>
<np> --> <art> <noun> <pp>
<vp> --> <verb> <np>
<vp> --> <verb> <np> <pp>
<pp> --> <prep> <np>
<art> --> a | an | the
<noun> --> boy | dog | leg | porch
<adj> --> big
<verb> --> bit
<prep> --> on
Q:- What is Parsing? Explain each type of parsing.
Ans. Parsing is the inverse of generation: the assignment of structure to a linear string of words according to a grammar; this is much like the ``diagramming'' of a sentence taught
in grammar school.
A parser is a program that converts a linear string of input words into a structured representation that shows how the phrases (substructures) are related and shows how the input could have been derived according to the grammar of the language.
Finding the correct parsing of a sentence is an essential step towards extracting its meaning.
Natural languages are harder to parse than programming languages; the parser will often make a mistake and have to fail and back up: parsing is search. There may be hundreds of
ambiguous parses, most of which are wrong.
Parsers are generally classified as top-down or bottom-up, though real parsers have characteristics of both.
There are several well-known context-free parsers:
Cocke-Kasami-Younger (CKY or CYK) chart parser
Earley algorithm
Augmented transition network
Top-down Parser
A top-down parser begins with the Sentence symbol, <S>, expands a production for <S>, and so on recursively until words (terminal symbols) are reached. If the string of
words matches the input, a parsing has been found. [See the Language Generation slide earlier in this section.]
This approach to parsing might seem hopelessly inefficient. However, top-down filtering,
that is, testing whether the next word in the input string could begin the phrase about to be tried, can prune many failing paths early.
For languages with keywords, such as programming languages or natural language
applications, top-down parsing can work well. It is easy to program.
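The top-down approach can be sketched as a small backtracking parser over the English grammar given earlier in these notes. This is an illustrative sketch only: it starts from the sentence symbol, expands productions left to right, and returns every parse, but it does not implement top-down filtering. The dictionary encoding and function names are assumptions made for the example.

```python
GRAMMAR = {
    "s": [["np", "vp"]],
    "np": [["art", "adj", "noun"], ["art", "noun"], ["art", "noun", "pp"]],
    "vp": [["verb", "np"], ["verb", "np", "pp"]],
    "pp": [["prep", "np"]],
    "art": [["a"], ["an"], ["the"]],
    "noun": [["boy"], ["dog"], ["leg"], ["porch"]],
    "adj": [["big"]],
    "verb": [["bit"]],
    "prep": [["on"]],
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can derive words[start:end]."""
    if symbol not in GRAMMAR:  # terminal: must match the next input word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for production in GRAMMAR[symbol]:
        # Expand the production left to right, keeping all partial matches.
        partial = [([], start)]
        for part in production:
            partial = [
                (children + [subtree], end)
                for children, pos in partial
                for subtree, end in parse(part, words, pos)
            ]
        for children, end in partial:
            yield (symbol, children), end


def parse_sentence(text):
    """Return all parse trees that cover the whole sentence."""
    words = text.lower().split()
    return [tree for tree, end in parse("s", words, 0) if end == len(words)]


print(len(parse_sentence("the big dog bit the boy")))         # 1
print(len(parse_sentence("the dog bit the boy on the leg")))  # 2
```

The second sentence has two parses because "on the leg" can attach either to the verb phrase or to the noun phrase "the boy", which illustrates why parsing natural language involves search.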
Bottom-up Parsing
In bottom-up parsing, words from the input string are reduced to phrases using grammar
productions:
     <NP>
    /    \
 <art>  <noun>
   |      |
  The    man    ate    fish
This process continues until a group of phrases can be reduced to <S>.
Chart Parser
A chart parser is a type of bottom-up parser that produces all parses in a triangular array called the chart; each chart cell contains a set of nonterminals. The bottom level of the
array contains all possible parts of speech for each input word. Successive levels contain reductions that span the items in levels below: cell a_i,k contains nonterminal N iff there
is a parse of N beginning at word i and spanning k words.
The chart parser eliminates the redundant work that would be required to reparse the same phrase for different higher-level grammar rules.
The Cocke-Kasami-Younger (CKY) parser is a chart parser that guarantees to parse any
context-free language in at most O(n^3) time.
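A compact sketch of a CKY-style chart recognizer is given below. CKY needs a grammar in Chomsky normal form, so the example uses a small hypothetical grammar for the sentence "the man ate fish" rather than the grammar given earlier; it also assumes at most one parent per pair of child symbols. Cell chart[i][k] holds the nonterminals that span k words starting at word i, as described above.

```python
# Hypothetical grammar in Chomsky normal form (an assumption for this example).
BINARY = {("NP", "VP"): "S", ("art", "noun"): "NP", ("verb", "NP"): "VP"}
LEXICON = {
    "the": {"art"},
    "man": {"noun"},
    "ate": {"verb"},
    "fish": {"noun", "NP"},  # "fish" can stand alone as a noun phrase
}


def cky(words):
    """Fill the triangular chart bottom-up and return it."""
    n = len(words)
    # chart[i][k] holds the nonterminals that span k words starting at word i.
    chart = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][1] = set(LEXICON.get(word, ()))
    for k in range(2, n + 1):            # span length
        for i in range(n - k + 1):       # start position
            for split in range(1, k):    # number of words in the left part
                for left in chart[i][split]:
                    for right in chart[i + split][k - split]:
                        parent = BINARY.get((left, right))
                        if parent:
                            chart[i][k].add(parent)
    return chart


chart = cky("the man ate fish".split())
print("S" in chart[0][4])  # True
```

The three nested loops over span length, start position and split point are the source of the O(n^3) bound.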
HOW CAN WE REASON?
To a certain extent this will depend on the chosen knowledge representation, although
a good knowledge representation scheme has to allow easy, natural, and plausible reasoning.
Listed below are very broad methods of how we may reason.
1) Deductive Reasoning:
Deductive Reasoning is a process in which general premises are used to obtain a specific
inference. Reasoning moves from a general principle to a specific conclusion.
Example:
Premise : I wash my car when the weather is good on weekends.
Premise: Today is Sunday and the weather is hot
Conclusion: Therefore, I will wash my car today
To use deductive reasoning the problem must generally be formatted in this way. Once the
format has been achieved, the conclusion must be valid if the premises are true.
The whole idea is to develop new knowledge from previously given knowledge.
Starting with such a set of postulates, axioms, and definitions Euclid was able to prove 465
geometric propositions as the logical consequence of the input assumptions.
One of the basic rules of inference of deductive logic is the modus ponens rule.
A formal English statement of this rule is :
If X is true and if X being true implies Y is true then Y is true.
(X ∧ (X → Y)) → Y
Example:
All cats are felines.
Bosty is a cat.
I can deduce that Bosty is a feline.
Abduction
Abduction is a form of reasoning which, unlike deduction, provides only a ‘plausible inference’.
For instance:
If I read that smoking causes lung cancer and Frank died of lung cancer, I may infer that
Frank was a smoker.
Again this may or may not be true. Using statistics and probability theory, abduction
may yield the most probable inference among many possible inferences.
Abduction is heuristic in the sense that it provides a plausible conclusion consistent with
available information, but one which may in fact be wrong.
To illustrate how abduction works, consider the following logical system consisting of a general rule
and a specific proposition:
1) All successful, entrepreneurial industrialists are rich persons.
2) John is a rich person.
If this was the only information available, a plausible inference would be that John was a
successful, entrepreneurial industrialist. This conclusion could also be false since there are
other roads to riches such as inheritance, the lottery... If we had a table of the income
distribution of wealthy persons along with their personal histories, we could refine our
abductive inference with the probability of the inference being true.
2) Inductive Reasoning:
A principle of reasoning to a conclusion about all members of a
class from examination of only a few members of the class; broadly, reasoning from the
particular to the general.
For example:
In 1998, the best model of Turkey was from İzmir.
In 1999, the best model of Turkey was from İzmir.
In 2000, the best model of Turkey was from İzmir.
I would infer that all the girls from İzmir are beautiful. This may or may not be
true, but it provides a useful generalization.
Another example :
1 = 1^2
1 + 3 = 2^2
1 + 3 + 5 = 3^2
1 + 3 + 5 + 7 = 4^2
and, by induction, Σ (n successive odd integers) = n^2
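The pattern can be checked for many more cases with a few lines of Python. Such a check is itself only inductive evidence: it raises confidence in the generalization but does not prove it. (A deductive proof is available here: the next odd integer after the first n is 2n + 1, and n^2 + 2n + 1 = (n + 1)^2.) The function name is an assumption for the example.

```python
def sum_of_first_odds(n):
    """Add up the first n odd integers: 1 + 3 + ... + (2n - 1)."""
    return sum(range(1, 2 * n, 2))


for n in range(1, 5):
    print(n, sum_of_first_odds(n), n ** 2)

# Check the generalization on the first thousand cases.
holds = all(sum_of_first_odds(n) == n ** 2 for n in range(1, 1001))
print(holds)  # True
```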
Another example :
Falcon can fly.
Canary can fly.
Gull can fly.
Conclusion: Birds can fly.
The outcome of the inductive reasoning process will frequently contain some measure
of uncertainty because including all possible facts in the premises is usually impossible.
Deductive or inductive approaches are used in logic, rule-based systems, and frames.
3) Analogical Reasoning
Analogical reasoning assumes that when a question is asked, the answer can be derived by analogy.
Example :
Premise : All football teams get 3 points when they win.
Question : How many points did GS take this weekend?
Conclusion : Because GS is a football team and beat Antep, they took 3 points.
Analogical reasoning is a type of verbalization of an internalized learning process. An individual uses processes that require an ability to recognize previously
encountered experiences.
This approach has not yet been fully exploited in the AI field. However, case-based reasoning is an attempt.
4) Formal Reasoning
Formal reasoning involves syntactic manipulation of data structures to deduce new facts. A typical example is the mathematical logic used in proving
theorems in geometry.
5) Procedural Numeric Reasoning
Procedural numeric reasoning uses mathematical models or simulation to solve problems. Model -based reasoning is an example of this approach.
6) Generalization and Abstraction
Generalization and abstraction can be successfully used with both logical and semantic representation of knowledge.
7) Metalevel Reasoning
Meta level reasoning involves “knowledge about what you know”.
Which approach to use, how successful the inference will be, depends to a great extent on which knowledge representation method is used.
For example; reasoning by analogy can be more successful with semantic networks than with frames.
REASONING WITH LOGIC
We utilize various rules of inference to manipulate the logical expressions to create new expressions.
Modus Ponens
If there is a rule “if A, then B,” and if we know that A is true, then it is valid to conclude that B is also true.
[A AND (A → B)] → B
Example :
A : It is rainy.
B : I will stay at home.
A→B : If it is rainy, I will stay at home.
Modus Tollens
When B is known to be false, and if there is a rule “if A, then B,” it is valid to conclude that A is also false.
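Both rules can be checked mechanically with a truth table. The short sketch below tests whether a formula over A and B is true under every assignment; the helper names are assumptions for the example. It also shows that the superficially similar pattern "B and (A → B), therefore A" is not valid, which is why abduction yields only plausible conclusions.

```python
from itertools import product


def implies(p, q):
    """Material implication: p → q is false only when p is true and q is false."""
    return (not p) or q


def is_tautology(formula):
    """Check a two-variable formula under all four truth assignments."""
    return all(formula(a, b) for a, b in product([False, True], repeat=2))


def modus_ponens(a, b):
    return implies(a and implies(a, b), b)


def modus_tollens(a, b):
    return implies((not b) and implies(a, b), not a)


def affirming_the_consequent(a, b):
    return implies(b and implies(a, b), a)


print(is_tautology(modus_ponens))              # True
print(is_tautology(modus_tollens))             # True
print(is_tautology(affirming_the_consequent))  # False
```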
Resolution
Resolution is a method of discovering whether a new fact is valid, given a set of logical statements. It is a method of “theorem proving”. The resolution
process, which can be computerized because of its well-formed structure, is applied to a pair of parent clauses to produce a derived new clause.
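A single propositional resolution step can be sketched as follows. Clauses are represented as sets of literals, with a leading "~" marking negation; this encoding, the function names and the example clauses are assumptions for illustration, and a full theorem prover would also need unification for predicates with variables.

```python
def negate(literal):
    """Return the complementary literal: p <-> ~p."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(clause_a, clause_b):
    """Return every resolvent of two parent clauses (frozensets of literals)."""
    resolvents = []
    for literal in clause_a:
        if negate(literal) in clause_b:
            # Drop the complementary pair and merge what is left.
            resolvents.append((clause_a - {literal}) | (clause_b - {negate(literal)}))
    return resolvents


rule = frozenset({"~rainy", "stay_home"})  # "rainy → stay_home" in clause form
fact = frozenset({"rainy"})
print(resolve(rule, fact))  # [frozenset({'stay_home'})]

# Resolving a clause with its negation gives the empty clause (a contradiction),
# which is how a proof by refutation terminates.
print(resolve(frozenset({"stay_home"}), frozenset({"~stay_home"})))  # [frozenset()]
```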
Bayesian networks
These are also called Belief Networks or Probabilistic Inference Networks. Initially developed by Pearl (1988).
The basic idea is:
o Knowledge in the world is modular -- most events are conditionally independent of most other events.
o Adopt a model that can use a more local representation to allow interactions between events that only affect each other.
o Some events may only be unidirectional, others may be bidirectional -- make a distinction between these in the model.
o Events may be causal and thus get chained together in a network.
Implementation
A Bayesian Network is a directed acyclic graph:
o The links are directed and indicate dependencies that exist between nodes.
o Nodes represent propositions about events or events themselves.
o Conditional probabilities quantify the strength of dependencies.
Consider the following example:
The example concerns the probability that my car won't start. If my car won't start then it is likely that:
o The battery is flat or
o The starting motor is broken.
In order to decide whether to fix the car myself or send it to the garage I make the following decision:
o If the headlights do not work then the battery is likely to be flat, so I fix it myself.
o If the starting motor is defective then send the car to the garage.
o If the battery and starting motor have both gone, send the car to the garage.
The network to represent this is as follows:
Fig. 21 A simple Bayesian network
Reasoning in Bayesian nets
Probabilities in links obey standard conditional probability axioms. Therefore we follow links in reaching a hypothesis and update beliefs accordingly. A few broad classes of algorithms have been used to help with this:
o Pearl's message passing method.
o Clique triangulation.
o Stochastic methods.
Basically they all take advantage of clusters in the network and use their limits on the influence to constrain the search through the net. They also ensure that probabilities are updated correctly.
Since information is local, it can be readily added and deleted with minimum effect on the whole network. ONLY affected nodes need updating.
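To make belief updating concrete for the car example, here is a minimal Python sketch that computes posteriors by brute-force enumeration of the joint distribution. It is not one of the algorithms listed above (those scale far better), and every number in it is a hypothetical value chosen for illustration; the notes give no probabilities. The assumed structure is: battery state and starter state both influence whether the car starts, and battery state influences whether the headlights work.

```python
from itertools import product

# Hypothetical prior probabilities (assumptions, not from the notes).
P_BATTERY_FLAT = 0.10
P_STARTER_BROKEN = 0.05

# Hypothetical P(car won't start | battery_flat, starter_broken).
P_NO_START = {
    (True, True): 0.99,
    (True, False): 0.95,
    (False, True): 0.90,
    (False, False): 0.01,
}

# Hypothetical P(headlights dead | battery_flat).
P_LIGHTS_DEAD = {True: 0.90, False: 0.02}


def joint(battery_flat, starter_broken, no_start, lights_dead):
    """Joint probability of one complete assignment, factored along the network links."""
    p = P_BATTERY_FLAT if battery_flat else 1 - P_BATTERY_FLAT
    p *= P_STARTER_BROKEN if starter_broken else 1 - P_STARTER_BROKEN
    p_ns = P_NO_START[(battery_flat, starter_broken)]
    p *= p_ns if no_start else 1 - p_ns
    p_ld = P_LIGHTS_DEAD[battery_flat]
    p *= p_ld if lights_dead else 1 - p_ld
    return p


def posterior_battery_flat(**evidence):
    """P(battery_flat | evidence), summing the joint over all consistent worlds."""
    numerator = 0.0
    denominator = 0.0
    for b, s, n, l in product([False, True], repeat=4):
        world = {"battery_flat": b, "starter_broken": s,
                 "no_start": n, "lights_dead": l}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this world contradicts the observed evidence
        p = joint(b, s, n, l)
        denominator += p
        if b:
            numerator += p
    return numerator / denominator


prior = posterior_battery_flat()
after_no_start = posterior_battery_flat(no_start=True)
after_lights = posterior_battery_flat(no_start=True, lights_dead=True)
print(prior, after_no_start, after_lights)
```

With these assumed numbers, the belief that the battery is flat rises once the car fails to start and rises again once the headlights are also seen to be dead, which matches the informal decision rule in the example.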
A Practical Example
Here we describe a practical example from research based here in Cardiff.
We have used Bayesian Nets in a Computer Vision application. Details of the visual processes involved will be discussed later in the course, so the context will become clearer later.
Here we attempt to describe the Bayesian reasoning behind the process.
The goal is to perform a task called data fusion to obtain a segmentation -- a description of an object (viewed from a set of images) detailing its surface properties. In the example given here we deal with a simple cube. So the final description will hopefully list its edges and its faces and how they are connected together.
The input to the fusion process is three preprocessing stages that have extracted out edge information and planar surface information from 2D grey scale (monochrome) images and 3D range data.
So from these three pre-processes we have a list of all lines, curved or straight, a list of all line intersections (two or three line intersections) and a list
of all the surface equations extracted from both image types.
We can now build the network from these lists of features. As mentioned above, we hypothesise about extracted surfaces intersecting. For us to evaluate these hypotheses we need to have evidence to support or contradict them. The evidence that we use is :
o straight lines extracted from the light image.
o curves extracted from the light image.
o `areas of uncertainty' extracted from the depth map.
The two line lists are generated as described above. The areas of uncertainty are found when we are attempting to find the surface equations of each surface type.
Errors are found in the depth map where the mask used to find the general surface shape overlaps two or more surfaces. There the error tends to be enlarged, giving us a clue that a surface intersection exists in that general area. So we are using evidence from more than one source of data.
We proceed by taking each surface in the surface list and generating a node to represent it. We then take a pair of surfaces and attempt to intersect them.
If they are possibly intersecting then a `feature group' node is generated referencing the surfaces and connected to the children surface nodes. This process is repeated for each pair of surfaces that we have extracted. We now want to attach a conditional probability to each of our new nodes. So we now know the surfaces that could possibly interact in the object.
We now attach a probability to these connections. We do this by finding the equation of intersection (a three dimensional line for two planes, or an ellipse for a plane and a sphere) and projecting it onto our focal plane.
Now we have our hypothesised intersections in the same dimension as our extracted lines from the preliminary stage. So we now find, for each
intersecting line a closest match line from our line list.
Once we have found the closest matching line we generate a probability from the error. A line that closely matches our intersection line gives a high probability, whereas two surfaces that don't intersect in the object are unlikely to coincide with a line from the line list, giving a low probability.
The line that is found is also checked to see if it lies in an area of uncertainty. If it does then that is another strong clue that the line that we have found is actually where surfaces are joined.
So once we have generated this network with all the necessary links, any further information that is provided to the system can be added, and the network will propagate this information throughout in the form of probability updating.
So, for example, suppose a new image was provided, say a colour image, and this image increased the likelihood of some edges and corners being present. This would increase the probability of those features that are linked to those edges and corners, and the change would propagate throughout the network.
Figure 22 shows us a simple example of the network that would be generated from the input data of edges and planar faces of the cube. As can be seen, the feature group nodes can represent groups that vary from single features, such as line segments, surfaces or corners, to the whole object, which is represented in the lower nodes and includes three surfaces, three line segments, three crosses and one corner.
Fig. 22 A Bayesian Network for Segmentation of a Cube
Neural Network
A neural network is a computational structure inspired by the study of biological neural processing. There are many different
types of neural networks, from relatively simple to very complex, just as there are many theories on how biological neural
processing works.
We will begin with a discussion of a layered feed-forward type of neural network.
A layered feed-forward neural network has layers, or subgroups of processing elements.
A layer of processing elements makes independent computations on data that it receives and passes the results to another
layer.
The next layer may in turn make its independent computations and pass on the results to yet another layer. Finally, a
subgroup of one or more processing elements determines the output from the network.
Each processing element makes its computation based upon a weighted sum of its inputs.
The first layer is the input layer and the last the output layer. The layers that are placed between the first and the last layers
are the hidden layers.
The processing elements are seen as units that are similar to the neurons in a human brain, and hence, they are referred to as
cells, neuromimes, or artificial neurons.
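The layer-by-layer computation described above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the weights are arbitrary illustrative values, each processing element outputs its raw weighted sum (no threshold or other activation function is applied), and the function names are chosen here rather than taken from the notes.

```python
def weighted_sum(inputs, weights):
    """Computation of one processing element: the weighted sum of its inputs."""
    return sum(x * w for x, w in zip(inputs, weights))


def layer_output(inputs, layer_weights):
    """Each row of layer_weights holds the incoming weights of one processing element."""
    return [weighted_sum(inputs, weights) for weights in layer_weights]


def feed_forward(inputs, layers):
    """Pass the signal through each layer in turn; the last layer gives the output."""
    signal = inputs
    for layer_weights in layers:
        signal = layer_output(signal, layer_weights)
    return signal


# Hypothetical network: 2 inputs -> 3 hidden elements -> 1 output element.
hidden_layer = [[0.5, -1.0], [1.0, 1.0], [0.0, 2.0]]
output_layer = [[1.0, 0.5, -0.5]]

result = feed_forward([1.0, 2.0], [hidden_layer, output_layer])
print(result)  # [-2.0]
```

Here the hidden layer produces -1.5, 3.0 and 4.0, and the output element combines them as 1.0 × (-1.5) + 0.5 × 3.0 - 0.5 × 4.0 = -2.0.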
A threshold function is sometimes used to qualify the output of a neuron in the output layer. Even though our subject matter
deals with artificial neurons, we will simply refer to them as neurons. Synapses between neurons are referred to as
connections, which are represented by edges of a directed graph in which the nodes are the artificial neurons.
Figure 1.1 is a layered feed-forward neural network. The circular nodes represent neurons.
Here there are three layers, an input layer, a hidden layer, and an output layer. The directed graph mentioned shows the
connections from nodes from a given layer to other nodes in other layers. Throughout this book you will see many variations on the number and types of layers.
Figure 1.1 A typical neural network.
Output of a Neuron
Basically, the internal activation or raw output of a neuron in a neural network is a weighted sum of its inputs, but a threshold
function is also used to determine the final value, or the output. When the output is 1, the neuron is said to fire, and when it is
0, the neuron is considered not to have fired. When a threshold function is used, different results
of activations, all in the same interval of values, can cause the same final output value.
This situation helps in the sense that, if precise input causes an activation of 9 and noisy input causes an activation of 10, then
the output works out the same as if noise is filtered out.
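The noise-filtering effect of a threshold can be shown with a short sketch. The weights, inputs and the threshold value of 5 below are hypothetical, picked so that the precise input gives an activation of 9 and the noisy input an activation of 10, as in the text.

```python
def activation(inputs, weights):
    """Raw output of a neuron: the weighted sum of its inputs."""
    return sum(x * w for x, w in zip(inputs, weights))


def threshold_output(activation_value, threshold):
    """Final output: 1 (the neuron fires) at or above the threshold, else 0."""
    return 1 if activation_value >= threshold else 0


weights = [2, 3]            # hypothetical connection weights
threshold = 5               # hypothetical firing threshold

precise = activation([3, 1], weights)    # 2*3 + 3*1 = 9
noisy = activation([3.5, 1], weights)    # 2*3.5 + 3*1 = 10

print(threshold_output(precise, threshold))  # 1
print(threshold_output(noisy, threshold))    # 1
```

Both activations lie above the threshold, so the neuron fires in both cases and the difference introduced by the noise does not reach the output.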
To put the description of a neural network in a simple and familiar setting, let us describe an example about a daytime game
show on television, The Price is Right.
Fuzzy Logic
Logic deals with true and false. A proposition can be true on one occasion and false on another. “Apple is a red fruit” is such a
proposition. If you are holding a Granny Smith apple that is green, the proposition that apple is a red fruit is false. On the
other hand, if your apple is of a red delicious variety, it is a red fruit and the proposition in reference is true.
If a proposition is true, it has a truth value of 1; if it is false, its truth value is 0. These are the only possible truth values.
Propositions can be combined to generate other propositions, by means of logical operations.
When you say it will rain today or that you will have an outdoor picnic today, you are making statements with certainty. Of
course your statements in this case can be either true or false. The truth values of your statements can be only 1, or 0. Your
statements then can be said to be crisp.
On the other hand, there are statements you cannot make with such certainty. You may be saying that you think it will rain
today. If pressed further, you may be able to say with a degree of certainty in your statement that it will rain today. Your level
of certainty, however, is about 0.8, rather than 1. This type of situation is what fuzzy logic was developed to model. Fuzzy logic
deals with propositions that can be true to a certain degree—somewhere from 0 to 1. Therefore, a proposition’s truth value
indicates the degree of certainty about which the proposition is true.
The degree of certainty sounds like a probability (perhaps subjective probability), but it is not quite the same. Probabilities for
mutually exclusive events cannot add up to more than 1, but their fuzzy values may.
Suppose that the probability of a cup of coffee being hot is 0.8 and the probability of the cup of coffee being cold is 0.2. These
probabilities must add up to 1.0. Fuzzy values do not need to add up to 1.0. The truth value of a proposition that a cup of
coffee is hot is 0.8.
The truth value of a proposition that the cup of coffee is cold can be 0.5. There is no restriction on what these truth values
must add up to.
Fuzzy Sets
Fuzzy logic is best understood in the context of set membership.
Suppose you are assembling a set of rainy days. Would you put today in the set? When you deal only with crisp statements
that are either true or false, your inclusion of today in the set of rainy days is based on certainty. When dealing with fuzzy
logic, you would include today in the set of rainy days via an ordered pair, such as (today, 0.8). The first member in such an
ordered pair is a candidate for inclusion in the set, and the second member is a value between 0 and 1, inclusive, called the
degree of membership in the set. The inclusion of the degree of membership in the set makes it convenient for developers to
come up with a set theory based on fuzzy logic, just as regular set theory is developed. Fuzzy sets are sets in which members
are presented as ordered pairs that include information on degree of membership.
A traditional set of, say, k elements, is a special case of a fuzzy set, where each of those k elements has 1 for the degree of
membership, and every other element in the universal set has a degree of membership 0, for which reason you don’t bother
to list it.
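One simple way to hold such ordered pairs in a program is a mapping from element to degree of membership. The sketch below is illustrative only; the dictionary representation, the extra membership values and the helper name `as_fuzzy` are assumptions made here. It also shows a traditional set viewed as the special case in which every degree is either 1 or 0.

```python
# A fuzzy set stored as {element: degree of membership}.
# (today, 0.8) comes from the text; the other entry is a made-up value.
rainy_days = {"today": 0.8, "yesterday": 0.3}


def as_fuzzy(crisp_set, universe):
    """View a traditional set as a fuzzy set over the given universal set.

    Members get degree 1.0; every other element of the universe gets 0.0.
    """
    return {element: (1.0 if element in crisp_set else 0.0)
            for element in universe}


universe = ["a", "b", "c", "d"]
crisp = {"a", "b"}
print(as_fuzzy(crisp, universe))  # {'a': 1.0, 'b': 1.0, 'c': 0.0, 'd': 0.0}
```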
Fuzzy Set Operations
The usual operations you can perform on ordinary sets are union, in which you take all the elements that are in one set or the
other; and intersection, in which you take the elements that are in both sets. In the case of fuzzy sets, taking a union is finding
the degree of membership that an element should have in the new fuzzy set, which is the union of two fuzzy sets.
If a, b, c, and d are such that their degrees of membership in the fuzzy set A are 0.9, 0.4, 0.5, and 0, respectively, then the
fuzzy set A is given by the fit vector (0.9, 0.4, 0.5, 0). The components of this fit vector are called fit values of a, b, c, and d.
Union of Fuzzy Sets
Consider a union of two traditional sets and an element that belongs to only one of those sets. Earlier you saw that if you treat
these sets as fuzzy sets, this element has a degree of membership of 1 in one case and 0 in the other since it belongs to one
set and not the other.
Yet you are going to put this element in the union. The criterion you use in this action has to do with degrees of membership.
You need to look at the two degrees of membership, namely, 0 and 1, and pick the higher value of the two, namely, 1. In other
words, what you want for the degree of membership of an element when listed in the union of two fuzzy sets, is the
maximum value of its degrees of membership within the two fuzzy sets forming a union.
If a, b, c, and d have the respective degrees of membership in fuzzy sets A,B as A = (0.9, 0.4, 0.5, 0) and B = (0.7, 0.6, 0.3, 0.8),
then A [cup] B = (0.9, 0.6, 0.5, 0.8).
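The union rule is a component-wise maximum over the two fit vectors. A minimal sketch, with the function name `fuzzy_union` chosen here for illustration:

```python
def fuzzy_union(a, b):
    """Union of two fuzzy sets given as fit vectors: take the larger degree per element."""
    return tuple(max(x, y) for x, y in zip(a, b))


A = (0.9, 0.4, 0.5, 0.0)
B = (0.7, 0.6, 0.3, 0.8)

print(fuzzy_union(A, B))  # (0.9, 0.6, 0.5, 0.8)
```

Applied to traditional sets encoded with degrees 0 and 1, the same function reproduces the ordinary union, since max(0, 1) = 1.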
Intersection and Complement of Two Fuzzy Sets
Analogously, the degree of membership of an element in the intersection of two fuzzy sets is the minimum, or the smaller
value of its degree of membership individually in the two sets forming the intersection. For example, if today has 0.8 for
degree of membership in the set of rainy days and 0.5 for degree of membership in the set of days of work completion, then
today belongs to the set of rainy days on which work is completed to a degree of 0.5, the smaller of 0.5 and 0.8.
Recall the fuzzy sets A and B in the previous example. A = (0.9, 0.4, 0.5, 0) and B = (0.7, 0.6, 0.3, 0.8). A[cap]B, which is the
intersection of the fuzzy sets A and B, is obtained by taking, in each component, the smaller of the values found in that
component in A and in B.
Thus A[cap]B = (0.7, 0.4, 0.3, 0).
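Intersection is the component-wise minimum. The sketch below, with an illustrative function name, reproduces both the fit-vector result and the rainy-day example.

```python
def fuzzy_intersection(a, b):
    """Intersection of two fuzzy sets given as fit vectors: take the smaller degree per element."""
    return tuple(min(x, y) for x, y in zip(a, b))


A = (0.9, 0.4, 0.5, 0.0)
B = (0.7, 0.6, 0.3, 0.8)
print(fuzzy_intersection(A, B))  # (0.7, 0.4, 0.3, 0.0)

# "today" in rainy days (0.8) and in days of work completion (0.5)
print(min(0.8, 0.5))  # 0.5
```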
The idea of a universal set is implicit in dealing with traditional sets. For example, if you talk of the set of married persons, the
universal set is the set of all persons. Every other set you consider in that context is a subset of the universal set. We bring up
this matter of universal set because when you make the complement of a traditional set A, you need to put in every element
in the universal set that is not in A. The complement of a fuzzy set, however, is obtained as follows. In the case of fuzzy sets, if the degree of membership is 0.8 for a member, then that member
is not in that set to a degree of 1.0 – 0.8 = 0.2. So you can set the degree of membership in the complement fuzzy set to the
complement with respect to 1. If we return to the scenario of having a degree of 0.8 in the set of rainy days, then today has to
have 0.2 membership degree in the set of nonrainy or clear days.
Continuing with our example of fuzzy sets A and B, and denoting the complement of A by A’, we have A’ = (0.1, 0.6, 0.5, 1) and B’ = (0.3, 0.4, 0.7, 0.2).
Note that A’ [cup] B’ = (0.3, 0.6, 0.7, 1), which is also the complement of A [cap] B. You can similarly verify that the complement of A [cup] B is the same as A’ [cap] B’.
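The two identities can be verified directly. In this sketch the function names are illustrative, and the complement is rounded to ten decimal places, an implementation choice made here so that binary floating-point error (for example 1 - 0.9 not being exactly 0.1) does not disturb the comparison.

```python
def fuzzy_union(a, b):
    """Component-wise maximum of two fit vectors."""
    return tuple(max(x, y) for x, y in zip(a, b))


def fuzzy_intersection(a, b):
    """Component-wise minimum of two fit vectors."""
    return tuple(min(x, y) for x, y in zip(a, b))


def fuzzy_complement(a):
    """Complement with respect to 1, rounded to suppress floating-point noise."""
    return tuple(round(1 - x, 10) for x in a)


A = (0.9, 0.4, 0.5, 0.0)
B = (0.7, 0.6, 0.3, 0.8)

A_c = fuzzy_complement(A)   # (0.1, 0.6, 0.5, 1.0)
B_c = fuzzy_complement(B)   # (0.3, 0.4, 0.7, 0.2)

# Complement of the intersection equals the union of the complements.
print(fuzzy_complement(fuzzy_intersection(A, B)) == fuzzy_union(A_c, B_c))
# Complement of the union equals the intersection of the complements.
print(fuzzy_complement(fuzzy_union(A, B)) == fuzzy_intersection(A_c, B_c))
```

Both comparisons print True for these vectors, which is the fuzzy-set form of De Morgan's laws under the max/min definitions used in the text.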
Applications of Fuzzy Logic
Applications of fuzzy sets and fuzzy logic are found in many fields, including artificial intelligence, engineering, computer
science, operations research, robotics, and pattern recognition. These fields are also ripe for applications for neural networks.
So it seems natural that fuzziness should be introduced in neural networks themselves. Any area where humans need to
indulge in making decisions, fuzzy sets can find a place, since information on which decisions are to be based may not always
be complete and the reliability of the supposed values of the underlying parameters is not always certain.
Examples of Fuzzy Logic
Let us say five tasks have to be performed in a given period of time, and each task requires one person dedicated to it.
Suppose there are six people capable of doing these tasks. As you have more than enough people, there is no problem in
scheduling this work and getting it done. Of course who gets assigned to which task depends on some criterion, such as total
time for completion, on which some optimization can be done. But suppose these six people are not necessarily available during the particular period of time in question.
Suddenly, the equation is seen in less than crisp terms. The availability of the people is fuzzy-valued. Here is an example of an
assignment problem where fuzzy sets can be used.
Commercial Applications
Many commercial uses of fuzzy logic exist today. A few examples are listed here:
• A subway in Sendai, Japan uses a fuzzy controller to control a subway car. This controller has outperformed human and
conventional controllers in giving a smooth ride to passengers in all terrain and external conditions.
• Cameras and camcorders use fuzzy logic to adjust autofocus mechanisms and to cancel the jitter caused by a shaking hand.
• Some automobiles use fuzzy logic for different control applications. Nissan has patents on fuzzy logic braking systems,
transmission controls, and fuel injectors. GM uses a fuzzy transmission system in its Saturn vehicles.
• FuziWare has developed and patented a fuzzy spreadsheet called FuziCalc that allows users to incorporate fuzziness in their
data.
• Software applications to search and match images for certain pixel regions of interest have been developed. Avian Systems
has a software package called FullPixelSearch.
• A stock market charting and research tool called SuperCharts, from Omega Research, uses fuzzy logic in one of its modules to determine whether the market is bullish, bearish, or neutral.
Intelligent Agent: Design and Construction
An agent is anything that can be viewed as
--- perceiving its environment through sensors and
---acting upon that environment through actuators or effectors.
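The perceive-then-act cycle can be sketched with a two-square vacuum world. Everything concrete in this sketch is an assumption made for illustration: the class name, the square labels "A" and "B", and the action names "Suck", "Left" and "Right". The environment's `percept` method plays the role of the sensors and `execute` the role of the actuators.

```python
class VacuumEnvironment:
    """Hypothetical two-square world; each square is either dirty or clean."""

    def __init__(self):
        self.dirty = {"A": True, "B": True}
        self.location = "A"

    def percept(self):
        """What the agent's sensors report: current square and whether it is dirty."""
        return self.location, self.dirty[self.location]

    def execute(self, action):
        """Apply the agent's chosen action to the environment."""
        if action == "Suck":
            self.dirty[self.location] = False
        elif action == "Right":
            self.location = "B"
        elif action == "Left":
            self.location = "A"


def reflex_vacuum_agent(percept):
    """Choose an action from the current percept alone."""
    location, is_dirty = percept
    if is_dirty:
        return "Suck"
    return "Right" if location == "A" else "Left"


def run(environment, agent, steps):
    """Repeat the perceive-act cycle and record the actions taken."""
    history = []
    for _ in range(steps):
        action = agent(environment.percept())
        environment.execute(action)
        history.append(action)
    return history


world = VacuumEnvironment()
print(run(world, reflex_vacuum_agent, 3))  # ['Suck', 'Right', 'Suck']
```

Because the agent looks only at the current percept, it is an instance of the simplest agent design; richer designs keep internal state or reason about goals.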
Specifying the task environment (PEAS)
- Performance Measure
- Environment
- Sensors
- Actuators
In designing an agent, the first step must always be to specify the task environment (PEAS) as fully as possible.
PEAS for an automated taxi driver
- Performance measure: Safe, fast, legal, comfortable trip, maximize profits
- Environment: Roads, other traffic, pedestrians, customers
- Actuators: Steering wheel, accelerator, brake, signal, horn
- Sensors: Cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard
PEAS for a medical diagnosis system
- Performance measure: Healthy patient, minimize costs, lawsuits
- Environment: Patient, hospital, staff
- Actuators: Screen display (questions, tests, diagnoses, treatments, referrals)
- Sensors: Keyboard (entry of symptoms, findings, patient's answers)
PEAS for Interactive English tutor
- Performance measure: Maximize student's score on test
- Environment: Set of students
- Actuators: Screen display (exercises, suggestions, corrections)
- Sensors: Keyboard
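A PEAS description is just a four-part record, so it can be written down as a small data structure before any agent code exists. The sketch below is one possible encoding; the class name `PEAS` and its field names are assumptions made here, and the entries are copied from the taxi-driver specification above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """Task-environment specification: one tuple of entries per PEAS component."""
    performance_measure: tuple
    environment: tuple
    actuators: tuple
    sensors: tuple


taxi_driver = PEAS(
    performance_measure=("safe", "fast", "legal", "comfortable trip",
                         "maximize profits"),
    environment=("roads", "other traffic", "pedestrians", "customers"),
    actuators=("steering wheel", "accelerator", "brake", "signal", "horn"),
    sensors=("cameras", "sonar", "speedometer", "GPS", "odometer",
             "engine sensors", "keyboard"),
)

print(taxi_driver.sensors)
```

Writing the specification this way makes it easy to compare task environments side by side, for example checking which sensors the taxi driver has that the medical diagnosis system lacks.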
Environment types
The critical decision an agent faces is determining which action to perform to best satisfy its design objectives.
• Accessible vs. Inaccessible
• Deterministic vs. stochastic
• Episodic vs. sequential
• Static vs. dynamic
• Discrete vs. continuous
• Single agent vs. multi agent
Accessible vs. Inaccessible:- An environment is fully accessible if an agent's sensors give it access to the complete state of the environment at each point in time, whereas an environment might be inaccessible because of noisy and inaccurate sensors.
• Examples of inaccessible environments: vacuum cleaner with local dirt sensor, taxi driver
Deterministic vs. stochastic:- The environment is deterministic if the next state of the environment is completely determined by the current state and the action executed by the agent; otherwise it is stochastic. If the environment is partially observable, it could appear to be stochastic.
• Examples: Vacuum world is deterministic while taxi driving is not
Episodic vs. sequential:- In episodic environments, the agent's experience is divided into atomic "episodes" (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself. In sequential environments, the current decision could affect all future decisions.
• Examples: classification tasks are episodic; chess and taxi driving are sequential
Static vs. dynamic: A static environment is unchanged while an agent is deliberating, whereas a dynamic environment can change during deliberation and so continuously asks the agent what it wants to do.
• Examples: taxi driving is dynamic, crossword puzzles are static.
Discrete vs. continuous: A discrete environment has a limited number of distinct, clearly defined states, percepts and actions.
• Examples: Chess has a discrete set of percepts and actions, while taxi driving has continuous states and actions.
Single agent vs. multiagent: An agent operating by itself in an environment is a single agent; when other agents are present, the environment is multi-agent.
• Examples: Crossword is a single agent environment while taxi driving is a multi agent environment
Environment types
Task Environment           Observable  Deterministic  Episodic    Static   Discrete    Agents
Crossword puzzle           Fully       Deterministic  Sequential  Static   Discrete    Single
Chess with a clock         Fully       Strategic      Sequential  Semi     Discrete    Multi
Poker                      Partially   Stochastic     Sequential  Static   Discrete    Multi
Backgammon                 Fully       Stochastic     Sequential  Static   Discrete    Multi
Taxi driving               Partially   Stochastic     Sequential  Dynamic  Continuous  Multi
Medical diagnosis          Partially   Stochastic     Sequential  Dynamic  Continuous  Single
Image analysis             Fully       Deterministic  Episodic    Semi     Continuous  Single
Part-picking robot         Partially   Stochastic     Episodic    Dynamic  Continuous  Single
Refinery controller        Partially   Stochastic     Sequential  Dynamic  Continuous  Single
Interactive English tutor  Partially   Stochastic     Sequential  Dynamic  Discrete    Multi
• The environment type largely determines the agent design
• The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, multi-agent
Agent types
Four basic types in order of increasing generality:
- Simple reflex agents
- Model-based reflex agents
- Goal based agents
- Utility based agents
Simple reflex agents
Model-based reflex agents
Goal-based agents
Utility-based agents
Learning agents
Agents and Objects
The designers of an object oriented system work towards a common goal, whereas agents may be built for different organizations, so no such common goal can be assumed.
“Objects invoke, agents request” or, as one researcher said, “Objects do it for free; agents do it for money”.
Agents and Expert systems
Expert systems are generally not considered agents. They typically do not exist in an environment; they are disembodied. Expert systems do not act on any environment but instead give feedback or advice to a third party. This does not mean that an expert system cannot be an agent. In fact, some real-time (typically process control) expert systems are agents.
What Kinds of Things Can Intelligent Agents Do?
- Search for information automatically
- Answer specific questions
- Inform you when an event has occurred.
- Provide custom news to you in a just-in-time format
- Provide intelligent tutoring
- Find you the best prices on nearly any item
- Provide automatic services, such as checking web pages for changes or broken links
Features of an Agent
- Responsive (explicit: programmed, implicit:learn)
- Predictable
- Interactive (accessible)
- Trustworthy
- Expertise
- Skill
- Quick
- Accurate
Other properties
• Mobility: the ability to move around an electronic environment
• Veracity: an agent will not knowingly communicate false information.
• Benevolence: agents do not have conflicting goals and every agent will therefore always try to do what is asked of it.
• Rationality: an agent will act in order to achieve its goals insofar as its beliefs permit.
• Learning/adaptation: agents improve performance over time
Summary
• Agent-based systems technology is a vibrant and rapidly expanding field of academic research and business world applications.
• Agent technology is greatly hyped as a panacea for the current ills of system design and development, but the developer is cautioned to be aware of the pitfalls inherent in any new and untested technology.
• The potential is there but the full benefit is yet to be realized.
• Much work is yet to be done.