
Thoughts on the Dipmeter Rules

D. Lenat, 9/12/80

Dealing with a very large number of rules

One of the problems that this project will face, as the number of rules reaches ever higher, is the increasing cost of running all the rules all the time (that is, evaluating ALL their "IF" parts to see which ones are relevant and should have their "THEN" parts obeyed). In any given situation, only a small subset of the rules will be even POTENTIALLY relevant; most of them would be ridiculously nonsensical because they check for situations having NOTHING to do with the current one. Below are four techniques for reducing the size of the set of rules which have to have their "IF" parts tested -- all the rest can be safely assumed to be irrelevant. One good thing about these is that the four techniques are independent: any or all of them can be employed.

Imagine all the dipmeter concepts arranged in a big tree, a hierarchy, linked by arcs labelled "more general than" and "more specialized than". At the top would be the few most general concepts, like "log", and at the bottom would be the most specific ones, like "multiple small red patterns within a blue". Attach each rule to the concept to which it's most relevant; try to make it the MOST GENERAL concept to which it's relevant. Notice that the rule is also relevant to the specializations of that concept (e.g., any rule that tells you about ALL log patterns -- such as that one wild tadpole can be ignored -- certainly also is relevant to a specialization of that concept, such as the "blue within red log pattern" concept). Now, given the recognition of a very specific pattern, all the program needs to do is run the rules associated with that specific concept, and its generalizations; the other rules can be presumed to be irrelevant. Thus, you are only dealing with LOG(n) rules, instead of the full set of n rules.
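A minimal sketch of this "ripple up" scheme, written in modern Common Lisp for concreteness (the concept names, property names, and rules here are hypothetical illustrations, not the project's actual code):

    ;; Each concept knows its generalizations and its attached rules,
    ;; both stored on its property list.
    (setf (get 'log-pattern 'generalizations) '(log)
          (get 'blue-within-red 'generalizations) '(log-pattern)
          (get 'log-pattern 'rules) '(ignore-wild-tadpole)
          (get 'blue-within-red 'rules) '(check-blue-within-red))

    ;; Collect the rules attached to CONCEPT and to everything reachable
    ;; by "rippling up" the generalization links.
    (defun relevant-rules (concept)
      (let ((seen '()) (rules '()))
        (labels ((ripple (c)
                   (unless (member c seen)
                     (push c seen)
                     (setf rules (append (get c 'rules) rules))
                     (mapc #'ripple (get c 'generalizations)))))
          (ripple concept))
        rules))

    ;; (relevant-rules 'blue-within-red)
    ;;   => (IGNORE-WILD-TADPOLE CHECK-BLUE-WITHIN-RED)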

A second technique can be used to cut a simple CONSTANT factor off the number of rules that have to be considered: divide up the task into phases, or subtasks, such as Validity-check, user-input, missing-sections, stratigraphy, etc. Decide for each rule which of these subtasks it is relevant to, and let each subtask know the names of the rules relevant to it. Then it need only look at those n/8 rules, rather than n (assuming you've split the task into 8 subtasks).

A third technique can be used. Like the second one, this also cuts only a linear factor away from the number of rules that need to be considered as potentially relevant. Pick a few aspects that all the subtasks have in common, such as Initialization, Checking, Finding, Reporting, etc., and classify each rule in that way, too.
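Together, the second and third techniques amount to indexing each rule under a (subtask, aspect) pair. A sketch, with hypothetical rule names:

    ;; Index every rule under a (subtask . aspect) pair in one table.
    (defvar *rule-index* (make-hash-table :test #'equal))

    (defun index-rule (rule subtask aspect)
      (push rule (gethash (cons subtask aspect) *rule-index*)))

    (defun rules-for (subtask aspect)
      (gethash (cons subtask aspect) *rule-index*))

    ;; Classify a few hypothetical rules:
    (index-rule 'rule-13 'stratigraphy 'checking)
    (index-rule 'rule-40 'stratigraphy 'reporting)
    (index-rule 'rule-7  'missing-sections 'finding)

    ;; Now only the matching pile needs its IF parts tested:
    ;; (rules-for 'stratigraphy 'checking) => (RULE-13)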

The fourth technique is to split the IF part of a rule into two parts: an IF-potentially-relevant part and an IF-truly-relevant part. The former is always filled with a VERY FAST test, whereas the latter contains a much slower but more complete predicate. To use this, the program first runs the IF-potentially-relevant parts of all the rules, and only runs the more costly IF part on those who survive (return True for) the quick pre-condition check. This doesn't reduce the NUMBER of rules which must be run, but can dramatically speed up the time involved in examining them all. On a small scale, the same effect is achieved by carefully ordering the conjuncts of an AND expression in Lisp, so that the first ones are very fast.
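A sketch of such a two-part rule (the predicates shown are hypothetical stand-ins for real dipmeter tests):

    ;; Stand-in for a slow, complete geological test (hypothetical).
    (defun expensive-dip-analysis (situation)
      (> (getf situation :dip-spread 0) 5))

    ;; Each rule carries a cheap pre-test and the full predicate,
    ;; stored on its property list.
    (setf (get 'rule-13 'if-potentially-relevant)
          (lambda (situation) (eq (getf situation :type) 'red-pattern)))
    (setf (get 'rule-13 'if-truly-relevant)
          #'expensive-dip-analysis)

    ;; Run the fast pre-tests first; only survivors pay for the slow test.
    (defun surviving-rules (rules situation)
      (loop for rule in rules
            when (and (funcall (get rule 'if-potentially-relevant) situation)
                      (funcall (get rule 'if-truly-relevant) situation))
              collect rule))

    ;; (surviving-rules '(rule-13) '(:type red-pattern :dip-spread 12))
    ;;   => (RULE-13)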

Suppose the data base contains 500 concepts, and a total of 5,000 rules. Employing technique 1 alone, the number of rules found by "rippling up" the generalization links is only about 100. Techniques 2 and 3 reduce this to about 4. Technique 4 makes it very fast to tell which of these four remaining rules have some chance of firing. Of the, say, 2 which remain, one or both of them will actually be relevant. Note how the techniques essentially eliminate the problem of "too many rules".

Another benefit comes when it's necessary to modify the set of rules -- by adding a new one, modifying an existing rule, etc. -- and it's necessary to understand the impact of that change on related rules. The organization and classification of rules means that there will be very few rules which the new/deleted/changed rule could POSSIBLY interact with, and all of these can be found automatically very quickly. Also, by following the advice of suggestion 1, each rule will be at its "most general possible level", hence there is as good a chance as possible that it will be "above" any change you later make to the data base, i.e., that it won't have to be changed in any way.

The way the rules are currently written, they each know a LOT about each other. To make the rule system truly additive, to make the augmentation of the system by a new rule an easy event, it is necessary to eliminate as much of this "innate knowledge" as possible. One of the chief culprits currently is the long idiosyncratic list of parameters each rule gets and passes along, the attribute called VARIABLES on the property list of each rule. That would be better off being replaced by VARIABLE-TYPE. For instance, rule 13 currently has that attribute filled with the list (type pattern-top pattern-bot azimuth trend-top trend-bot). If instead it simply had an attribute called VARIABLE-TYPE, we could have it filled with the atom LogPatternSegment. The latter would have a property list, among whose attributes would be VARIABLES. Then, when the format of that long list of variables changed, only one place would have to get changed, not hundreds. This is the kind of change which should be effected as soon as possible.
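In property-list terms, the proposed indirection looks something like this (a sketch; the rule-13 variable list is the one above, but the accessor is a hypothetical illustration):

    ;; Before: every rule carries its own copy of the parameter list.
    ;; (setf (get 'rule-13 'variables)
    ;;       '(type pattern-top pattern-bot azimuth trend-top trend-bot))

    ;; After: the rule names only its variable TYPE ...
    (setf (get 'rule-13 'variable-type) 'LogPatternSegment)

    ;; ... and the type holds the list, in exactly one place.
    (setf (get 'LogPatternSegment 'variables)
          '(type pattern-top pattern-bot azimuth trend-top trend-bot))

    ;; Hypothetical accessor: changing LogPatternSegment's VARIABLES
    ;; now updates every rule of that type at once.
    (defun rule-variables (rule)
      (get (get rule 'variable-type) 'variables))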

What are the rules based on?

Several simple solutions to this are possible: store a textual response, store a list of machine-usable reasons, store a single numeric certainty factor, classify each rule as either Definite or merely Probable, etc. The advantage of the "list of machine-readable reasons" is that from it can be generated each of the other types of epistemological tags mentioned. Also, you may at times need to compare the types of justifications for various rules, e.g. if some higher-order rule (metarule) told you "If two rules conflict, and one is based on empirical evidence and the other is logically true, Then prefer the latter". The program can always cache away (store) the values of the certainty factors (CFs) it computes from these more descriptive reasons. These CFs can then be used when a quick and dirty estimate of the overall validity of the rule is required.
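A sketch of caching a quick-and-dirty CF computed from a list of machine-usable reasons (the reason vocabulary and the weights are hypothetical):

    ;; Hypothetical weights for each kind of justification.
    (defvar *reason-weights*
      '((logically-true . 1.0) (empirical-evidence . 0.8) (expert-hunch . 0.5)))

    ;; Store descriptive reasons on the rule; derive and cache a CF.
    (setf (get 'rule-13 'reasons) '(empirical-evidence expert-hunch))

    (defun certainty-factor (rule)
      (or (get rule 'cached-cf)            ; use the cached value if present
          (setf (get rule 'cached-cf)
                (apply #'max
                       (mapcar (lambda (r) (cdr (assoc r *reason-weights*)))
                               (get rule 'reasons))))))

    ;; (certainty-factor 'rule-13) => 0.8, computed once and then cached.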

Other attributes of rules

Rules can be related to each other in many ways, and this could come in handy; e.g., if some metarule said "When a rule fails, try a different rule which has roughly the same purpose but which is known to succeed more frequently". To do this easily, it would be nice if each rule had a "Percentage of success" attribute. Some other useful attributes (slots, properties on its property list, etc.) would be: average-cpu-time, average-number-of-questions-asked-of-the-user, names-of-more-general-rules, etc.
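That metarule could then be implemented as a simple lookup over those attributes (the purpose names and percentages below are hypothetical):

    ;; Hypothetical bookkeeping on two rules with the same purpose.
    (setf (get 'rule-21 'purpose) 'find-structural-dip
          (get 'rule-21 'percentage-of-success) 40
          (get 'rule-22 'purpose) 'find-structural-dip
          (get 'rule-22 'percentage-of-success) 75)

    ;; "When a rule fails, try a different rule which has roughly the
    ;; same purpose but which is known to succeed more frequently."
    (defun more-reliable-alternative (failed-rule all-rules)
      ;; Return the same-purpose rule with the best success rate above
      ;; that of FAILED-RULE, or NIL if there is none.
      (let ((best nil))
        (dolist (r all-rules best)
          (when (and (not (eq r failed-rule))
                     (eq (get r 'purpose) (get failed-rule 'purpose))
                     (> (get r 'percentage-of-success 0)
                        (get failed-rule 'percentage-of-success 0))
                     (or (null best)
                         (> (get r 'percentage-of-success 0)
                            (get best 'percentage-of-success 0))))
            (setf best r)))))

    ;; (more-reliable-alternative 'rule-21 '(rule-21 rule-22)) => RULE-22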

How should the overall log interpretation process be managed?

Currently, the problem is broken into 7 phases (valid-check, user-input, valid-check-II, user-input-II, missing-sections, pattern recognition, stratigraphy), which are rigidly run one after the other, and then the program halts. Instead, it might be more natural to have an AGENDA containing a few "tasks", such as Check-validity, Missing-Sections, Stratigraphy, and each of these would have some symbolic reasons (and a cached number representing its overall worth). The program chooses the task with the highest overall worth, and works on it for a while. It is then suspended and placed back on the agenda, and a new task will probably have a higher priority rating and be run next. During the running of a task, it may suggest new tasks for the agenda. This kind of control structure allows the program to switch back and forth between the tasks of guessing the structural dip and the possible faults, letting the latest results from one influence a new round of hypothesizing about the other.
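A minimal sketch of such an agenda loop (task names and worth numbers are hypothetical; a real version would recompute each worth from the symbolic reasons rather than use a fixed number):

    ;; A task is (name worth), kept sorted by its cached worth.
    (defvar *agenda* '())

    (defun add-task (name worth)
      (setf *agenda* (sort (cons (list name worth) *agenda*) #'> :key #'second)))

    (defun run-agenda (work-on)
      ;; Repeatedly pop the highest-worth task and work on it for a while.
      ;; A task may suggest new tasks, or re-add itself, via ADD-TASK.
      (loop while *agenda*
            do (let ((task (pop *agenda*)))
                 (funcall work-on (first task)))))

    ;; (add-task 'check-validity 90)
    ;; (add-task 'stratigraphy 60)
    ;; (run-agenda (lambda (name) (format t "working on ~a~%" name)))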

How can the program "really understand"?

One step along the way is to make as much knowledge as possible EXPLICIT. Thus, the experts know that sections of the log can repeat, and they know what that means geologically. The program should have that kind of information explicitly represented within it. Thus it might have the horseshoe warping "script", which accounts for a repeated reversed pattern; it might have the "crack and slide" script to account for a repeated nonreversed segment; etc. Geological knowledge per se may seem superfluous to the functioning of the program, but it will in the long run be a necessity, if the program is not to be continually reworked in excruciating detail by humans each time the task it's to perform changes a little. General knowledge, such as that of causality, physics, etc., is also required.
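One way to make such scripts explicit is to represent each as a symbol whose property list records the log signature it accounts for (a sketch; the two scripts are the ones named above, but the slot names and signature vocabulary are hypothetical):

    ;; Each geological "script" records the log signature it explains.
    (setf (get 'horseshoe-warping 'accounts-for) '(repeated reversed)
          (get 'crack-and-slide   'accounts-for) '(repeated nonreversed))

    (defvar *scripts* '(horseshoe-warping crack-and-slide))

    ;; Given an observed signature, retrieve the scripts that explain it.
    (defun explaining-scripts (signature)
      (remove-if-not (lambda (s) (equal (get s 'accounts-for) signature))
                     *scripts*))

    ;; (explaining-scripts '(repeated reversed)) => (HORSESHOE-WARPING)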

Another example of troubles you get into when you know more than is explicitly represented in your program is the following: You have a rule whose IF part has three conjuncts; how do you distinguish the ones you want the system to quickly check are already known to be true, from those which you want the system to spend time WORKING to see if they are true (i.e., by recurring), from those which you want the system to ACHIEVE (i.e., make true if possible)? The answer is to replace the vague IF part by several specialized parts: an IF-known-to-be-true, an IF-you-can-show, and an IF-you-can-achieve part. A similar solution is to specify, for each conjunct in the IF part of each rule, how many resources of each type may be brought to bear in testing it (how much cpu time, real time, list cells, queries to the user, etc.).
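A sketch of a rule carved into those three specialized IF parts (the facts and conjuncts are hypothetical placeholders):

    ;; Hypothetical working memory of facts already known to be true.
    (defvar *known-facts* '((pattern-type red) (log-valid t)))

    (defun known-p (fact) (member fact *known-facts* :test #'equal))

    ;; The three specialized IF parts of one rule, on its property list:
    ;;   if-known-to-be-true : cheap lookup only
    ;;   if-you-can-show     : may spend work deriving the fact
    ;;   if-you-can-achieve  : may act to MAKE the fact true
    (setf (get 'rule-30 'if-known-to-be-true) '(log-valid t)
          (get 'rule-30 'if-you-can-show)     '(dip-trend north)
          (get 'rule-30 'if-you-can-achieve)  '(user-confirmed t))

    (defun rule-applicable-p (rule show achieve)
      (and (known-p (get rule 'if-known-to-be-true))        ; just look it up
           (funcall show (get rule 'if-you-can-show))       ; work to derive it
           (funcall achieve (get rule 'if-you-can-achieve)))) ; try to make it true

    ;; (rule-applicable-p 'rule-30
    ;;                    (lambda (goal) ...derive GOAL, e.g. by recurring...)
    ;;                    (lambda (goal) ...act to make GOAL true...))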