The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?
![Page 1: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/1.jpg)
The Knowledge Acquisition Bottleneck Revisited:
How can we build large KBs?
Illustrations of different approaches
Peter Clark and John Thompson
Boeing Research
2004
![Page 2: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/2.jpg)
Premise
• Intelligent machines need lots of knowledge, for
– question-answering
– intelligent search
– information integration
– natural language understanding
– decision support
– modeling
– etc.
• Much of this knowledge can be drawn from some general repository of reusable knowledge
– e.g., WordNet
• How does one build such a repository?
“No-one considers hand-building a large KB to be a realistic proposition these days” [paraphrase of Daphne Koller, 2004]
![Page 3: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/3.jpg)
1. Build it by Hand
• “Let’s roll up our sleeves and get on with it!”
• But: It’s a daunting task
– Our own work
• Cyc
+ Lots in it, (relatively) well-designed ontology
- 650 person-years effort so far
- Still patchy coverage (why?)
- Difficult to use outside Cycorp
![Page 4: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/4.jpg)
1. Build it by Hand (cont)
• WordNet
+ Easy to use
+ Comprehensive
- Little inference-supporting knowledge in
- Ad hoc ontology
![Page 5: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/5.jpg)
1. Build it by Hand (cont)
• The Component Library
Claim: can bound the required knowledge by working at a coarse-grained level
+ Large, more doable
- Hard to use, still very incomplete
![Page 6: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/6.jpg)
2. Extract from Dictionaries
• MindNet
+ Automatically built
- Unusable?
• Extended WordNet
+ Won TREC competition
- Still somewhat incoherent
- Lots of manual labor
![Page 7: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/7.jpg)
3. Corpus-based Text/Web Mining
• Schubert’s system
+ Automatic
+ Lots of knowledge
- Noisy
- No word senses
- Only grabs certain kinds of knowledge
30M entries…
![Page 8: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/8.jpg)
3. Corpus-based Text/Web Mining (cont)
• KnowItAll (Etzioni)
+ automatic
- only factoids
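To make the "only factoids" limitation concrete, here is a minimal sketch of pattern-based extraction, assuming a single hand-written "such as" pattern. The regular expression and the function name `extract_instances` are illustrative assumptions, not the method of any system named on the slides.

```python
import re

# Illustrative pattern (an assumption): "<class> such as <A>, <B> and <C>",
# where each instance is a single capitalized word.
SUCH_AS = re.compile(
    r"\b(\w+) such as ([A-Z]\w*(?:(?:, | and |, and )[A-Z]\w*)*)"
)


def extract_instances(text):
    """Return (instance, class) pairs found by the one 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        category = match.group(1)
        # Split the matched list of names on its separators.
        for name in re.split(r", and |, | and ", match.group(2)):
            pairs.append((name, category))
    return pairs


if __name__ == "__main__":
    sentence = "He visited cities such as Paris, London and Berlin last year."
    print(extract_instances(sentence))
```

Everything this kind of pattern yields is an isolated instance-of fact. It carries no word senses and nothing about how the extracted things behave, which is the gap the slide points to.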
![Page 9: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/9.jpg)
4. Community-Based Acquisition
• Knowledge entry by the masses
• OpenMind
+ Large
- Full of junk, unusable (?)
- Would this work with better acquisition tools?
(see next slide for illustration)
![Page 10: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/10.jpg)
![Page 11: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/11.jpg)
5. Use Existing Resources
• e.g.,
– databases
– CIA World Factbook
– Web data/services
• e.g., SRI/ISI’s ARDA QA system
+ Syntactically simple
+ Available
- Largely limited to factoids
- Information integration is a major challenge
– different ontologies, contradictory data
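The integration challenge can be illustrated with a small sketch, assuming two sources that describe the same entity under different attribute names. The sources, the invented entity "Freedonia" and its values, the `ATTRIBUTE_MAP` table, and the `integrate` function are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Toy, invented facts: (entity, attribute, value). Each source has its own
# attribute vocabulary, standing in for "different ontologies".
source_a = [("Freedonia", "capital", "Fredville"),
            ("Freedonia", "population", "1M")]
source_b = [("Freedonia", "capital_city", "Fredville"),
            ("Freedonia", "pop", "2M")]

# Hand-written mapping from source-specific names to a shared vocabulary.
ATTRIBUTE_MAP = {
    "capital": "capital",
    "capital_city": "capital",
    "population": "population",
    "pop": "population",
}


def integrate(*sources):
    """Merge facts under the shared vocabulary; separate agreed from conflicting."""
    merged = defaultdict(set)
    for facts in sources:
        for entity, attribute, value in facts:
            merged[(entity, ATTRIBUTE_MAP[attribute])].add(value)
    agreed = {key: next(iter(vals)) for key, vals in merged.items() if len(vals) == 1}
    conflicts = {key: sorted(vals) for key, vals in merged.items() if len(vals) > 1}
    return agreed, conflicts


if __name__ == "__main__":
    agreed, conflicts = integrate(source_a, source_b)
    print("agreed:", agreed)
    print("conflicts:", conflicts)
```

Even in this toy case the attribute mapping has to be written by hand, and the code can only flag the contradictory population values, not decide which source is right.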
![Page 12: The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?](https://reader036.vdocuments.mx/reader036/viewer/2022070405/56813fd8550346895daabe10/html5/thumbnails/12.jpg)
Where to?
• Can we bound the knowledge needed
– for a particular application
– for a useful, sharable, general resource?
• Which of these approaches seems most realistic?
– build by hand
– extract from dictionaries
– mine text corpora
– community knowledge entry
– use existing resources