BURC: Bootstrapping Using ResearchCyc

By Kino Coursey


Introduction to the Problem

• Goal: To extend Cyc’s knowledge base using “relationships implied to be possible, normal or commonplace in the world”
• Prior work with Cyc knowledge entry has been manually oriented
• How will we collect commonsense without a body and manual labor…?

Read, Parse, Mine!

• Proposal: Read text, parse it into a database, extract relations between words, and propose hypothetical relations between concepts

Basic Analogy

• The Shotgun approach to the Human Genome
• Extract millions of fragments then knit them back together by finding commonalities
• Will it work for the Human Menome?

What is Cyc?

• “the world's largest and most complete general knowledge base and commonsense reasoning engine”
• Started in the mid 1980s (“should take only 10 years….”)
• Logic based
• LISP oriented
• For WordNet users, each Concept ≈ Synset
• Available from
  • http://www.opencyc.org
  • http://researchcyc.cyc.com
• Big (ResearchCyc v0.8)
  • Constants 89,379
  • Assertions 968,985
  • Deduction 361,185
• Sample Collection Extents
  • EnglishWord 18,007
  • Event 6,050
  • PartiallyTangible 24,387
  • Microtheory 1,688

Example of what Cyc currently knows about fingers

Collection : Finger

GAF Arg : 1
Mt : UniversalVocabularyMt
  isa : AnimalBodyPartType
  quotedIsa : DensoOntologyConstant
  genls : Digit-AnatomicalPart
  comment : "The collection of all digits of all Hands (q.v.). Fingers are (typically) flexibly jointed and are necessary to enabling the hand (and its owner) to perform grasping and manipulation actions."
Mt : BaseKB
  definingMt : AnimalPhysiologyVocabularyMt
Mt : AnimalPhysiologyMt
  properPhysicalPartTypes : Fingernail
Mt : WordNetMappingMt
  (synonymousExternalConcept Finger WordNet-Version2_0 "N05247839")
  (synonymousExternalConcept Finger WordNet-1997Version "N04312497")

GAF Arg : 2
Mt : UniversalVocabularyMt
  (genls LittleFinger Finger)
  (genls IndexFinger Finger)
  (genls Thumb Finger)
  (genls RingFinger Finger)
  (genls MiddleFinger Finger)
Mt : HumanActivitiesMt
  (bodyPartsUsed-TypeType Typing Finger)
Mt : HumanSocialLifeMt
  (bodyPartsUsed-TypeType PointingAFinger Finger)

Example of what Cyc currently knows about fingers - 2

Mt : AnimalPhysiologyMt
  (conceptuallyRelated Fingernail Finger)
  (properPhysicalPartTypes Hand Finger)
  (relationAllInstance age Finger (YearsDuration 0 200))
  (relationAllInstance widthOfObject Finger (Meter 0.001 0.2))
  (relationAllInstance heightOfObject Finger (Meter 0.001 0.2))
  (relationAllInstance lengthOfObject Finger (Meter 0.01 0.5))
  (relationAllInstance massOfObject Finger (Kilogram 0.001 1))

GAF Arg : 3
Mt : HumanPhysiologyMt
  (relationAllExists anatomicalParts HomoSapiens Finger)
Mt : VertebratePhysiologyMt
  (relationAllExistsCount physicalParts Hand Finger 5)
Mt : UniversalVocabularyMt
  (relationAllOnly wornOn Ring-Jewelry Finger)
Mt : AnimalPhysiologyMt
  (relationExistsAll physicalParts Hand Finger)

GAF Arg : 4
Mt : GeneralEnglishMt
  (denotation Finger-TheWord CountNoun 0 Finger)

Bootstrapping with ResearchCyc

• Cyc has vocabulary about objects in the world and relationships
• Cyc could still use more common relationships
• BURC uses what Cyc already has + lots of parsed text to create new Cyc entries for common relationships found in the text
• Lenat’s Bootstrap Hypothesis: once Cyc reaches a certain level/scale it can help in its own development and start using NLP to augment its knowledge base
• BURC should help test this hypothesis

The BURC Process

From seeds… Hypothe-seed’s

• Use the link grammar parser for bulk parsing of text, primarily narratives based in ‘worlds like ours’. Other text styles could be included.
• Operates in two directions:
  • Forward from text to CycL
  • Backwards from existing CycL to the text to find new forward patterns

BURC Process - 2

• Load the link fragments into a database (one-link and two-link fragments), and compute the frequency of fragment occurrences. The database will be in a SQL format so multiple queries can be formed dynamically.
• Using Cyc knowledge as a starting point (the seeds), extract knowledge for use in Cyc:
  • Given a set of seed facts in Cyc, identify how those facts are represented as link fragments in the database
  • Generate conjectures as to new knowledge AND new knowledge extraction patterns using the fragment patterns.
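The loading step can be sketched roughly as follows. This is a minimal illustration, not the actual BURC loader: the column names (Term1, Link1, Term2, Link2, Term3, NumLinks, Freq) are assumptions inferred from the queries shown on later slides, and SQLite stands in for whatever SQL database was really used.

```python
import sqlite3
from collections import Counter


def load_fragments(conn, fragments):
    """Store link fragments with occurrence counts.

    Each fragment is (term1, link1, term2, link2, term3). One-link
    fragments leave link2 and term3 as empty strings. The schema is an
    assumption made for this sketch.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS LGPTable (
               Term1 TEXT, Link1 TEXT, Term2 TEXT, Link2 TEXT, Term3 TEXT,
               NumLinks INTEGER, Freq INTEGER,
               PRIMARY KEY (Term1, Link1, Term2, Link2, Term3))"""
    )
    for frag, count in Counter(fragments).items():
        num_links = 2 if frag[3] else 1
        # Create the row on first sight, then add this batch's count.
        conn.execute(
            "INSERT OR IGNORE INTO LGPTable VALUES (?, ?, ?, ?, ?, ?, 0)",
            (*frag, num_links),
        )
        conn.execute(
            """UPDATE LGPTable SET Freq = Freq + ?
               WHERE Term1 = ? AND Link1 = ? AND Term2 = ?
                 AND Link2 = ? AND Term3 = ?""",
            (count, *frag),
        )
    conn.commit()
```

Calling `load_fragments` again with fragments from another text adds to the existing counts, which is one simple way to merge results from many parsed documents.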

BURC Process - 3

• Use Cyc knowledge directly to conjecture new statements:
  • Cyc has lexical knowledge, which can be used as templates against the DB to form new statements
  • For example, common adjectives applied to noun classes
  • Cyc knows “WhiteColor” and “Blouse” but does not know that white is a common blouse color, although it becomes apparent after reading some text
• Optionally, gather supporting background statistics for hypothesis verification using other sources:
  • Perhaps Google desktop with a larger than fully parsed corpus
  • Perhaps check against answer extraction engines

KNEXT (KNowledge EXtraction from Text)

• Deriving general world knowledge from texts and taxonomies:
  • http://www.cs.rochester.edu/~schubert/projects/world-knowledge-mining.html
  • Lenhart K. Schubert and Matthew Tong, "Extracting and evaluating general world knowledge from the Brown Corpus", Proc. of the HLT-NAACL Workshop on Text Meaning, May 31, 2003, Edmonton, Alberta, pp. 7-13.
• System extracts commonsense relationships from text
• Limited to the pre-parsed Penn Treebank
• Generated 117,326 propositions (about 2 per sentence)
• About 60% judged reasonable by any given judge

KNEXT (Example)

(BLANCHE KNEW 0 SOMETHING MUST BE CAUSING STANLEY 'S NEW, STRANGE BEHAVIOR BUT SHE NEVER ONCE CONNECTED IT WITH KITTI WALKER.)

A FEMALE-INDIVIDUAL MAY KNOW A PROPOSITION.
SOMETHING MAY CAUSE A BEHAVIOR.
A MALE-INDIVIDUAL MAY HAVE A BEHAVIOR.
A BEHAVIOR CAN BE NEW.
A BEHAVIOR CAN BE STRANGE.
A FEMALE-INDIVIDUAL MAY CONNECT A THING-REFERRED-TO WITH A FEMALE-INDIVIDUAL.

((:I (:Q DET FEMALE-INDIVIDUAL) KNOW[V] (:Q DET PROPOS))
 (:I (:F K SOMETHING[N]) CAUSE[V] (:Q THE BEHAVIOR[N]))
 (:I (:Q DET MALE-INDIVIDUAL) HAVE[V] (:Q DET BEHAVIOR[N]))
 (:I (:Q DET BEHAVIOR[N]) NEW[A])
 (:I (:Q DET BEHAVIOR[N]) STRANGE[A])
 (:I (:Q DET FEMALE-INDIVIDUAL) CONNECT[V] (:Q DET THING-REFERRED-TO)
  (:P WITH[P] (:Q DET FEMALE-INDIVIDUAL))))

Other Extraction Pattern Research

• Towards Terascale Knowledge Acquisition (Pantel, Ravichandran and Hovy, 2004)
• Learning Surface Text Patterns for a Question Answering System (Ravichandran & Hovy, 2002)
  • Defined Pattern Precision P = Ca/Co
  • Ca = total number of patterns with answer term present
  • Co = total number of patterns with any term present
• DIRT – Discovery of Inference Rules from Text (Lin & Pantel, 2001)
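The pattern precision measure P = Ca/Co mentioned above can be illustrated with a short sketch. It assumes, for illustration only, that we already have the list of terms a single pattern pulled out of a corpus; the function name and the input format are not from the cited paper.

```python
def pattern_precision(extracted_terms, answer_term):
    """Return P = Ca / Co for one extraction pattern.

    extracted_terms: every term the pattern matched in the answer slot.
    Co counts all matches (any term present); Ca counts the matches
    where the slot holds the known answer term.
    """
    co = len(extracted_terms)
    ca = sum(1 for term in extracted_terms if term == answer_term)
    return ca / co if co else 0.0
```

For example, a pattern that matched four times and produced the known answer twice gets a precision of 0.5.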

Other Lexical Knowledge Research

• VerbOcean (Chklovski & Pantel): Collecting pairs and searching to verify relationships
• Lexical Acquisition via Constraint Solving (Pedersen & Chen): Acquiring syntactic and semantic classification rules of unknown words for LGP
• Information Extraction Using Link Grammar papers
• Automatic Meaning Discovery Using Google

The General Backwards Model

• Given some Cyc relation Pred(?X,?Y)
• Create SQL search query
  • Lookup in Cyc lexical entries for X & Y → LX, LY
  • Select * from LGPTable where Term1="<LX>" and Term3="<LY>"
  • System returns records [LX | Link1 | Term2 | Link2 | LY] (Freq)
• Generate new hypothetical extraction patterns
  • Select * from LGPTable where Link1="<L1>" and Link2="<L2>" and Term2="<T2>"
  • [* L1 T2 L2 *] → generate hypothetical record ( Pred |?S1|?S3 )
  • Frequency information is propagated forward
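The two-query backwards step can be sketched as follows. This is only an illustration under stated assumptions: the table layout, the link labels, the sample rows and the `wornOn` seed pairing are all invented for the demo, and SQLite stands in for the real database.

```python
import sqlite3


def demo_db():
    """Build a tiny in-memory fragment table (schema and rows are invented)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE LGPTable (Term1, Link1, Term2, Link2, Term3, Freq)")
    conn.executemany(
        "INSERT INTO LGPTable VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("ring.n", "s", "on.p", "j", "finger.n", 12),  # matches the seed
            ("glove.n", "s", "on.p", "j", "hand.n", 7),    # same pattern
            ("glove.n", "s", "in.p", "j", "drawer.n", 4),  # different pattern
        ],
    )
    return conn


def backward_hypotheses(conn, pred, lx, ly):
    """Conjecture (pred, term1, term3, freq) records from one seed pred(X, Y)."""
    # Query 1: how is the seed fact expressed as link fragments?
    patterns = conn.execute(
        "SELECT DISTINCT Link1, Term2, Link2 FROM LGPTable "
        "WHERE Term1 = ? AND Term3 = ?", (lx, ly)).fetchall()
    hypotheses = []
    for link1, term2, link2 in patterns:
        # Query 2: which other term pairs occur in the same pattern?
        rows = conn.execute(
            "SELECT Term1, Term3, Freq FROM LGPTable "
            "WHERE Link1 = ? AND Term2 = ? AND Link2 = ?",
            (link1, term2, link2))
        for s1, s3, freq in rows:
            if (s1, s3) != (lx, ly):  # skip the seed itself
                hypotheses.append((pred, s1, s3, freq))
    return hypotheses
```

With the demo table, `backward_hypotheses(demo_db(), "wornOn", "ring.n", "finger.n")` proposes `wornOn` for the glove/hand pair and carries its frequency of 7 forward. Parameterized queries are used instead of string substitution so that terms containing quotes do not break the SQL.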

The General Backwards Model - 2

• Optional: Search Cyc for ?PRED (X,Y) and use the set to form a local ambiguity class to reduce search labor and identify ambiguity. One rule → multiple relations.
  • Stored as “SQLTemplate \ Pattern \ Pred1/Pred2/…/PredN”
  • Need to explore (canidateBinaryPred ARG1 ARG2 RELN)
• Optional: Form more specific patterns for Pred(X,_) and Pred(_,Y)

Update the LGParser’s CycL Rules

• There are rules for translation of LGP output into CycL
• If the frequency information warrants it then we can generate new LGP rules
• Results in expanded parser precision

<rule>
<pattern>* {Link1} {Term2} {Link2}*</pattern>
<define>?ITEM%r </define>
<body>(#$is-node ?ITEM%r "%R")</body>
<define>?ITEM%l </define>
<body>(#$is-node ?ITEM%l "%R")</body>
<body>({?PRED1} ?ITEM%l ?TERM%r)</body>
...
<body>({?PREDN} ?ITEM%l ?TERM%r)</body>
</rule>

Forward Mining Adjective Relations

• There are 1941 GAFs on adjSemTrans, the primary lexical adjective predicate
• Find applicable fragments and use definitions:
  • “Select * from LGPTable Where NumLinks=1 and Link1='a' and Term1 like '%.a' and Term2 like '%.n'”
  • Returns records [Term1.a | a | Term2.n]
  • Potentially test using either an internal or search engine based relevancy metric
  • Query Cyc for “(adjSemTrans <term1>-TheWord ?N RegularAdjFrame (?Pred :NOUN ?Val))”
  • Generate (plausiblePredValueOfType <term2> <?Pred> <?Val>)
  • Possibly generate parsing rule

Mining Adjective Knowledge Example

• “white blouse” as factoid
• [white.a | a | blouse.n]
• Potentially test using an internal or search engine relevancy metric [GC=70400]
• (adjSemTrans White-TheWord 11 RegularAdjFrame (mainColorOfObject :NOUN WhiteColor))
• Hypothesis: (plausiblePredValueOfType Blouse mainColorOfObject WhiteColor)
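The white blouse example can be traced in a few lines of code. This sketch replaces the live Cyc queries with two small dictionaries, so `ADJ_SEM_TRANS` and `NOUN_DENOTATION` are hypothetical stand-ins for the adjSemTrans and denotation lookups, not a real Cyc API.

```python
# Hypothetical stand-ins for Cyc lookups (assumptions for this sketch):
# adjective word -> (predicate, value) taken from an adjSemTrans assertion
ADJ_SEM_TRANS = {"white": ("mainColorOfObject", "WhiteColor")}
# noun word -> Cyc collection taken from a denotation assertion
NOUN_DENOTATION = {"blouse": "Blouse"}


def adjective_hypothesis(fragment):
    """Turn a one-link fragment [adj.a | a | noun.n] into a CycL hypothesis.

    Returns None when the fragment is not an adjective-noun link or when
    either word has no entry in the lookup tables.
    """
    term1, link, term2 = fragment
    if link != "a" or not term1.endswith(".a") or not term2.endswith(".n"):
        return None
    adjective, noun = term1[:-2], term2[:-2]
    if adjective not in ADJ_SEM_TRANS or noun not in NOUN_DENOTATION:
        return None
    pred, value = ADJ_SEM_TRANS[adjective]
    collection = NOUN_DENOTATION[noun]
    return f"(plausiblePredValueOfType {collection} {pred} {value})"
```

Passing `("white.a", "a", "blouse.n")` yields the hypothesis shown on the slide. In a fuller version the fragment frequency would travel with the hypothesis so that rare pairings can be ranked low or dropped.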

Update the LGParser’s CycL Rules - 2

• There are rules for translation of LGP output into CycL
• We can use the adjSemTrans data to generate new translation rules
• Results in expanded parser precision

<rule>
<pattern> {Term1.a} a *</pattern>
<define>?ITEM%r </define>
<body>(#$is-node ?ITEM%r "%R")</body>
<define>?ITEM%l </define>
<body>({?PRED} ?ITEM%r {?VAL})</body>
</rule>

<rule>
<pattern> white.a a *</pattern>
<define>?ITEM%r </define>
<body>(#$is-node ?ITEM%r "%R")</body>
<define>?ITEM%l </define>
<body>(mainColorOfObject ?ITEM%r WhiteColor)</body>
</rule>

Mined Finger Descriptions

000010:(#$plausiblePredValueOfType #$Finger #$feelsSensation (#$PositiveAmountFn #$LevelOfSoreness))
000037:(#$plausiblePredValueOfType #$Finger #$forceCapacity #$Strong)
000025:(#$plausiblePredValueOfType #$Finger #$forceCapacity #$Strong)
000025:(#$plausiblePredValueOfType #$Finger #$hardnessOfObject #$Hard)
000037:(#$plausiblePredValueOfType #$Finger #$hardnessOfObject (#$MediumToVeryHighAmountFn #$Hardness))
000037:(#$plausiblePredValueOfType #$Finger #$hardnessOfObject (#$MediumToVeryHighAmountFn #$Hardness))
000002:(#$plausiblePredValueOfType #$Finger #$hasEvaluativeQuantity (#$MediumToVeryHighAmountFn #$Goodness-Generic))
000002:(#$plausiblePredValueOfType #$Finger #$hasPhysicalAttractiveness #$GoodLooking)
000047:(#$plausiblePredValueOfType #$Finger #$isa (#$LeftObjectOfPairFn :REPLACE))
000015:(#$plausiblePredValueOfType #$Finger #$isa (#$RightObjectOfPairFn :REPLACE))
000155:(#$plausiblePredValueOfType #$Finger #$lengthOfObject (#$RelativeGenericValueFn #$lengthOfObject :REPLACE #$highAmountOf))
000155:(#$plausiblePredValueOfType #$Finger #$lengthOfObject (#$RelativeGenericValueFn #$lengthOfObject :REPLACE #$highToVeryHighAmountOf))
000003:(#$plausiblePredValueOfType #$Finger #$mainColorOfObject #$BlackColor)
000010:(#$plausiblePredValueOfType #$Finger #$mainColorOfObject #$LightYellowishBrown-Color)
000010:(#$plausiblePredValueOfType #$Finger #$mainColorOfObject #$ModerateYellowishBrown-Color)
000010:(#$plausiblePredValueOfType #$Finger #$mainColorOfObject #$SunTan-FleshColor)
000002:(#$plausiblePredValueOfType #$Finger #$possessiveRelation #$SuddenChange)

Mined Finger Descriptions

000006:(#$plausiblePredValueOfType #$Finger #$possessiveRelation (#$HighAmountFn #$Speed))
000094:(#$plausiblePredValueOfType #$Finger #$rigidityOfObject (#$HighAmountFn #$Rigidity))
000060:(#$plausiblePredValueOfType #$Finger #$sizeParameterOfObject (#$RelativeGenericValueFn #$sizeParameterOfObject :REPLACE #$highAmountOf))
000052:(#$plausiblePredValueOfType #$Finger #$sizeParameterOfObject (#$RelativeGenericValueFn #$sizeParameterOfObject :REPLACE #$highToVeryHighAmountOf))
000060:(#$plausiblePredValueOfType #$Finger #$sizeParameterOfObject (#$RelativeGenericValueFn #$sizeParameterOfObject :REPLACE #$highToVeryHighAmountOf))
000285:(#$plausiblePredValueOfType #$Finger #$sizeParameterOfObject (#$RelativeGenericValueFn #$sizeParameterOfObject :REPLACE #$veryLowToLowAmountOf))
000074:(#$plausiblePredValueOfType #$Finger #$sizeParameterOfObject (#$RelativeGenericValueFn #$sizeParameterOfObject :REPLACE #$veryLowToLowAmountOf))
000029:(#$plausiblePredValueOfType #$Finger #$speedOfObject-Underspecified (#$LowAmountFn #$Speed))
000138:(#$plausiblePredValueOfType #$Finger #$surfaceFeatureOfObj #$Slippery)
000074:(#$plausiblePredValueOfType #$Finger #$temperatureOfObject #$Warm)
000004:(#$plausiblePredValueOfType #$Finger #$textureOfObject #$Rough)
000168:(#$plausiblePredValueOfType #$Finger #$thicknessOfObject (#$RelativeGenericValueFn #$thicknessOfObject :REPLACE #$highAmountOf))
000168:(#$plausiblePredValueOfType #$Finger #$thicknessOfObject (#$RelativeGenericValueFn #$thicknessOfObject :REPLACE #$highToVeryHighAmountOf))
000182:(#$plausiblePredValueOfType #$Finger #$wetnessOfObject #$Wet)

Verb Semantic Filtering - 1

Discovering what a finger can do…

• A similar process can be used to find information based on verb semantic parsing frames
• For each potential <NOUNWORD>-<VERB> pair, query Cyc to find basic relationships using the verb semantic templates

(#$and
  (#$denotation <NOUNWORD> ?NOUNTYPE ?N ?CYCTERM)
  (#$wordForms ?WORD ?PRED "<VERB>")
  (#$speechPartPreds ?POS ?PRED)
  (#$semTransPredForPOS ?POS ?SEMTRANSPRED)
  (?SEMTRANSPRED ?WORD ?NUM ?FRAME ?TEMPLATE))

• Verify for each potential relationship (<SPRED> <VERBTERM> <CYCTERM>) derivable from ?TEMPLATE that it makes sense in the ontology

(#$and
  (#$arg1Isa <SPRED> ?VTYP)
  (#$arg2Isa <SPRED> ?CTYP)
  (#$genls <CYCTERM> ?CTYP)
  (#$genls <VERBTERM> ?VTYP))
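The argument-type check in the second query can be mimicked outside Cyc with a toy type hierarchy. Everything in the tables below is invented for illustration (including the `Rock` and `Person` entries); only the `performedBy` argument constraints follow the arg1Isa and arg2Isa assertions quoted in this deck.

```python
# Toy genls hierarchy: collection -> direct generalizations (invented data).
GENLS = {
    "ChangeOfResidence": {"Action"},
    "Person": {"Agent-Generic"},
    "Rock": {"PartiallyTangible"},
}
# Argument constraints: predicate -> (arg1Isa, arg2Isa).
ARG_ISA = {"performedBy": ("Action", "Agent-Generic")}


def is_spec_of(term, ancestor):
    """True if ancestor is reachable from term by following genls links."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return False


def plausible(pred, verb_term, cyc_term):
    """Keep (pred verb_term cyc_term) only if both arguments fit the types."""
    arg1_type, arg2_type = ARG_ISA[pred]
    return is_spec_of(verb_term, arg1_type) and is_spec_of(cyc_term, arg2_type)
```

In this toy setup `plausible("performedBy", "ChangeOfResidence", "Person")` is accepted, while the same relation with `"Rock"` is rejected because nothing links `Rock` to `Agent-Generic`.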

Verb Semantic Filtering - 2

Templates of Movement…

(verbSemTrans Move-TheWord 0 IntransitiveVerbFrame
    (and
        (isa :ACTION MovementEvent)
        (primaryObjectMoving :ACTION :SUBJECT)))

(verbSemTrans Move-TheWord 1 IntransitiveVerbFrame
    (and
        (isa :ACTION ChangeOfResidence)
        (performedBy :ACTION :SUBJECT)))

(verbSemTrans Move-TheWord 2 TransitiveNPFrame
    (and
        (isa :ACTION CausingAnotherObjectsTranslationalMotion)
        (objectActedOn :ACTION :OBJECT)
        (doneBy :ACTION :SUBJECT)))

(arg1Isa performedBy Action)
(arg2Isa performedBy Agent-Generic)

Verb Semantic Filtering - 3

• BURC can use Cyc’s knowledge of what things can perform what actions or have what attributes to filter out implausible relationships.

(#$behaviorCapableOf #$Finger #$CausingAnotherObjectsTranslationalMotion #$doneBy)
(#$behaviorCapableOf #$Finger #$ChangeOfResidence #$performedBy)
(#$behaviorCapableOf #$Finger #$Inspecting #$performedBy)
(#$behaviorCapableOf #$Finger #$Movement-TranslationEvent #$primaryObjectMoving)
(#$behaviorCapableOf #$Finger #$MovementEvent #$primaryObjectMoving)
(#$behaviorCapableOf #$Finger #$PushingAnObject #$providerOfMotiveForce)
(#$behaviorCapableOf #$Finger #$Sliding-Generic #$objectMoving)
(#$behaviorCapableOf #$Finger #$Sliding-Generic #$primaryObjectMoving)
(#$behaviorCapableOf #$Finger #$Slipping #$objectMoving)
(#$behaviorCapableOf #$Finger #$Slipping #$primaryObjectMoving)

• Cyc can help in its own knowledge entry process. 62% of generated hypotheses were filtered out using semantic role filtering.

Other Direct Extraction Rules

• Some “underspecified” patterns exist just based on the links
• This could be used to extract ConceptNet-like output directly from link records
• Examples:
  • [<obj1>|ss|<act>.v|os|<obj2>] → capableOf(<obj1>, “<act> <obj2>”)
  • [<act>.v |os|<obj>] → CapableOfReceivingAction(<obj>,<act>)
  • [<obj>|s*|<act>.v] → capableOf(<obj>,<act>)
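The three link-only patterns can be written as a small rule function. This is a sketch under assumptions: records are represented as plain tuples of alternating terms and link labels, and `s*` is read as "any link label starting with s", which is an interpretation of the slide notation rather than something the source spells out.

```python
def extract_relations(record):
    """Map one link record to ConceptNet-like (relation, arg1, arg2) tuples.

    record is a tuple of alternating terms and link labels, either
    (term, link, term) or (term, link, term, link, term).
    """
    if len(record) == 5:
        obj1, link1, act, link2, obj2 = record
        # [<obj1>|ss|<act>.v|os|<obj2>] -> capableOf(obj1, "act obj2")
        if link1 == "ss" and link2 == "os" and act.endswith(".v"):
            return [("capableOf", obj1, f"{act[:-2]} {obj2}")]
    if len(record) == 3:
        left, link, right = record
        # [<act>.v|os|<obj>] -> CapableOfReceivingAction(obj, act)
        if link == "os" and left.endswith(".v"):
            return [("CapableOfReceivingAction", right, left[:-2])]
        # [<obj>|s*|<act>.v] -> capableOf(obj, act)
        if link.startswith("s") and right.endswith(".v"):
            return [("capableOf", left, right[:-2])]
    return []
```

Returning a list keeps the door open for rules that emit more than one relation per record; records that match no rule simply produce an empty list.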

Quest for Metrics

• Percentage of hypotheses that make sense to a panel of judges
• Percentage of hypotheses that are already known to Cyc
• Percentage of hypotheses that are known in other knowledge sources (WordNet, Sumo/Milo, VerbOcean, MIT OpenMind…)
• Number of hypotheses generated vs. number of records
• What percentage of relations in Cyc can be found in the fragment pool
• The Pattern Precision measure
• Maybe compare against KNEXT but need to see if they return real numbers
• Unfortunately we don’t know all possible knowledge (otherwise we wouldn’t be doing this); if we did, we could measure recall and precision.
• Simple space estimate (2.3K binary predicates * 85K constants * 85K constants = 16.6175 T simple possibilities)

Desired Outputs

• Version of link grammar for bulk reading and generating fragments
• Database control program to queue texts, monitor their processing, and merge the fragment results
• The database of fragments with fragment counts for some corpus
• The hypothesis set generated by the system
• Optionally an OpenMind / ConceptNet like set of commonsense factoids
• Open enough that others could duplicate

Did any of that make sense?

Comments? Questions? Suggestions?